S2600 Reconstruction Failure

Publication Date: 2012-11-27

Issue Description

At one site, disks (0,4) and (0,5) of an S2600 both became invalid (a double-disk failure). Because the data is saved locally, the video surveillance service was not affected, but the video footage could not be retrieved. Engineers brought spare parts to the site, forcibly restored disk (0,5), and replaced disk (0,4) to start reconstruction, but the reconstruction failed.

Alarm Information


Handling Process

1. Remove disk (0,4), revive disk (0,5) and LUN 1, insert a new disk in slot (0,4), and start reconstruction, rebuilding LUN 1 first.
2. After LUN 1 has been reconstructed, remove disks (0,4) and (0,5) so that the RAID group becomes invalid.
3. Insert disk (0,5) and revive disk (0,5) and LUN 0. With the RAID group in the degraded state, mark the LBAs on disk (0,5) that report the 5101 error bit as BST bad extents, and then reconstruct LUN 0.
4. After LUN 0 has been reconstructed and the RAID group status is "normal", remove disk (0,5) and revive LUN 1.
5. Insert a new disk in slot (0,5) to start automatic reconstruction, then insert a disk in slot (0,11) and set it as the hot spare disk.

Root Cause

1. The logical disks for both (0,4) and (0,5) became invalid.

2. The SMART information of disk (0,4) shows 1449 bad sectors.

Disk (0,4) became invalid because of its large number of bad sectors: within 30 minutes, the number of I/Os delayed by more than 10 s reached 10, so the system judged it to be a slow disk and isolated it.
3. Disk (0,5) became invalid because the hard disk returned the 5101 (AMNF: Address Mark Not Found) error bit. This error was made obsolete in the ATA-8 protocol, but the ** still uses it, so I/O reads on disk (0,5) failed and the disk could not be read or written.
This error bit also appears in the SMART information, where the log records it as "Aborted Command".
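
The slow-disk rule described in cause 2 (10 I/Os delayed by more than 10 s within 30 minutes) can be illustrated with a small sliding-window counter. This is only a sketch of the rule as stated in this case, not the S2600 firmware logic: the class name `SlowDiskDetector`, the method `record_io`, and the constant names are hypothetical, and the real array may count or age out delayed I/Os differently.

```python
from collections import deque

# Values taken from the rule stated in this case; the names are hypothetical.
SLOW_IO_THRESHOLD_S = 10.0   # an I/O counts as delayed if it takes longer than 10 s
WINDOW_S = 30 * 60           # look-back window of 30 minutes, in seconds
SLOW_IO_LIMIT = 10           # isolate the disk once 10 delayed I/Os fall in the window


class SlowDiskDetector:
    """Illustrative sliding-window check for the slow-disk rule (assumed model)."""

    def __init__(self):
        # Completion timestamps (seconds) of delayed I/Os still inside the window.
        self._delayed = deque()

    def record_io(self, completed_at, latency_s):
        """Record one completed I/O and return True if the disk should be isolated."""
        if latency_s > SLOW_IO_THRESHOLD_S:
            self._delayed.append(completed_at)
        # Drop delayed I/Os that are older than the 30-minute window.
        while self._delayed and completed_at - self._delayed[0] > WINDOW_S:
            self._delayed.popleft()
        return len(self._delayed) >= SLOW_IO_LIMIT
```

Under this model, ten 12-second I/Os arriving one minute apart would trip the check on the tenth I/O, whereas the same ten I/Os spread four minutes apart would not, because the earliest ones have left the 30-minute window by then.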