What are the Workarounds to Unstable Links Caused by Poor Disk Connections in the S5100?

Publication Date: 2012-07-20
Issue Description
At a commercial deployment, an S5100 is used to store videos. All disks in the S5100 are organized in RAID 5, with five disks configured as hot spares. When a disk became faulty, a hot spare disk completed reconstruction in its place. After that, resource consumption on the service server soared and storage speed deteriorated. Alarm messages on the OSM show that the disk in slot (2, 6) is faulty, followed by repeated disk error messages reporting that the disk was inserted or removed.
Alarm Information
None
Handling Process

Step 1 To completely solve the problem, replace the disk with a new disk of improved hardware architecture.

 Step 2 Because the problem is critical, lock the disk slot as an interim measure to keep the links stable.

 To lock the slot, complete the following steps:

 1. Log in to the OSM, identify the enclosure ID and slot ID of the faulty disk, and record its serial number.

 2. Log in to a controller by executing the cli command. Switch to debug mode and then to mml mode. Execute the dev disk [enclosure ID] command to confirm and record the enclosure ID, slot ID, and serial number. Then execute the dev setphyswitch [frame ID] [PHY ID] [0:off; 1:on] command, where [frame ID] is the enclosure ID, [PHY ID] is the slot ID, 0 locks the slot, and 1 unlocks the slot.

 3. Execute the dev setphyswitch command on both controller A and controller B to lock the slot, which in practice is equivalent to removing the faulty disk. After replacing the disk with a new one, execute the same command with the value 1 to unlock the slot.
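
As an illustration of the steps above, the following minimal Python sketch formats the two MML commands for a given disk position. It assumes that slot (2, 6) means enclosure ID 2 and slot ID 6. The helper name phy_switch_commands is hypothetical and not part of any S5100 tool. The sketch only builds command strings and does not connect to a controller; confirm the exact syntax on the system before running anything.

```python
from typing import List


def phy_switch_commands(enclosure_id: int, slot_id: int, lock: bool) -> List[str]:
    """Build the MML commands to check a disk and lock or unlock its slot.

    Hypothetical helper: it only formats the command strings quoted in
    this article and never talks to a storage controller.
    """
    if enclosure_id < 0 or slot_id < 0:
        raise ValueError("enclosure ID and slot ID must be non-negative")
    state = 0 if lock else 1  # 0 = slot locked (off), 1 = slot unlocked (on)
    return [
        f"dev disk {enclosure_id}",  # confirm enclosure ID, slot ID, serial number
        f"dev setphyswitch {enclosure_id} {slot_id} {state}",
    ]


if __name__ == "__main__":
    # Assumption: slot (2, 6) is enclosure ID 2, slot ID 6.
    # The same commands are entered on both controllers.
    for controller in ("A", "B"):
        print(f"controller {controller}:")
        for command in phy_switch_commands(2, 6, lock=True):
            print("  " + command)
```

Calling the helper with lock=False yields the command that unlocks the slot once the new disk is installed.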

Root Cause
The connection between the disk conversion board and the disk backplane is poor. The RAID group is in fact restored once a hot spare disk completes reconstruction for the faulty disk. However, the intermittent disk insertion and removal events repeatedly trigger copyback and then cause it to fail, which degrades the link stability of the S5100.
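
The repeated insertion and removal pattern described here can be spotted by counting removal events per slot. The sketch below is only an illustration: the event tuple format, the function name find_flapping_slots, and the threshold of three removals are all assumptions, and it does not parse real OSM alarm output.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Hypothetical record format: (enclosure ID, slot ID, "inserted" or "removed").
Event = Tuple[int, int, str]


def find_flapping_slots(
    events: Iterable[Event], min_removals: int = 3
) -> Dict[Tuple[int, int], int]:
    """Return slots that report at least min_removals removal events."""
    removals: Counter = Counter()
    for enclosure_id, slot_id, kind in events:
        if kind == "removed":
            removals[(enclosure_id, slot_id)] += 1
    # Keep only the slots that reach the assumed threshold.
    return {slot: count for slot, count in removals.items() if count >= min_removals}


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = [(2, 6, "removed"), (2, 6, "inserted")] * 3 + [(1, 0, "removed")]
    print(find_flapping_slots(sample))
```

A slot flagged this way is a candidate for the interim lock described in the handling process.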
Suggestions
This solution is interim. Hardware optimization is required to solve the problem completely.

END