Western Digital SATA Disk Failure Due To Aborted Command Error

Publication Date: 2013-06-13
Issue Description
The output of showdisk -l on the CLI shows that the Western Digital SATA disk has a logical fault. The output of showdisk -p shows that the physical status of the disk is also faulty.
1. On the CLI, run showdisk -p. In this case, the disk SN is WD-XXXX and the disk firmware version is 03.00C06.
2. Analyze the messages log of controller A and search for the keyword sensekey. Determine whether the failure was caused by an aborted command (sensekey=0xb asc=0x0 ascq=0x0).

The disk returns the aborted command error code (sense key 0xb). When this error code is returned, I/Os accessing the disk fail. As a result, the disk can no longer be read or written as a member of the RAID set, and the disk is reported as logically faulty. Currently, we cannot handle this error code. In normal cases, Western Digital SATA disks do not return the 0xb error code.
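The log search in step 2 can be sketched in Python as follows. This is a minimal sketch, not a vendor tool: it assumes the messages log has already been exported as text and that each entry carries the fields in the form quoted above (sensekey=0xb asc=0x0 ascq=0x0). The function name find_aborted_commands and the tolerance for case and leading zeros are illustrative assumptions; the real log layout may differ.

```python
import re

# Assumed field layout, taken from the example quoted in this article:
#   sensekey=0xb asc=0x0 ascq=0x0
# The actual messages log format may differ; adjust the pattern if so.
SENSE_RE = re.compile(
    r"sensekey\s*=\s*(0x[0-9a-f]+)\s+asc\s*=\s*(0x[0-9a-f]+)\s+ascq\s*=\s*(0x[0-9a-f]+)",
    re.IGNORECASE,
)

# Sense key 0xb with ASC 0x0 and ASCQ 0x0, as described in this case.
ABORTED_COMMAND = (0xB, 0x0, 0x0)


def find_aborted_commands(lines):
    """Return (line_number, line) pairs whose sense data is 0xb/0x0/0x0.

    `lines` is any iterable of log lines (hypothetical helper, not part
    of the storage array CLI).
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        match = SENSE_RE.search(line)
        if match is None:
            continue  # line carries no sense data
        # Compare numerically so that 0xb, 0xB and 0x0b all match.
        triple = tuple(int(value, 16) for value in match.groups())
        if triple == ABORTED_COMMAND:
            hits.append((number, line.rstrip("\n")))
    return hits
```

To use it, open the exported log file and pass the file object to find_aborted_commands. Any returned line points to an aborted command returned by the disk and should be checked against the SN of the faulty disk.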


Alarm Information
None
Handling Process
You are advised to upgrade the disk firmware.
1. Workaround: If the failure is confirmed to be caused by an aborted command error, reinsert the faulty disk or replace it with a new disk.
2. Permanent solution: Upgrade the firmware of the faulty disk. Services must be stopped before the upgrade.
3. The attached upgrade guides describe how to upgrade the disk firmware.
Make sure that no internal or external I/Os are accessing the storage array while you perform the upgrade. Otherwise, the upgrade may fail.
The following upgrade guide is suitable for S2600 R1:
see .
The following upgrade guide is suitable for S2600 R5:
see 
The following files must be uploaded to the storage array to upgrade the disk firmware. Decompress the package before uploading.

Root Cause
Western Digital has confirmed that its SATA disks can return the aborted command error code and that the problem is caused by a disk firmware incompatibility.
Suggestions
After the disk recovers, run showdisk -p and showdisk -l to check that both the physical status and the logical status of the disk are normal.
