[T Series] A RAID Group Failure Led to a Copy Progress Halt on a Healthy RAID Group

Publication Date: 2012-07-19
Issue Description

Product and version information:

  • S5500T V100R001 V100R002
  • S5600T V100R001 V100R002
  • S5800T V100R001 V100R002
  • S6800T V100R001 V100R002
  • Application server operating system: Windows Server 2008 R2

Operations:
1. We mapped two LUNs from two separate RAID groups to a Windows Server 2008 R2 server.
2. We then copied files to the two mapped LUNs on the server.
3. We removed a member disk from one of the RAID groups. That RAID group and the LUN on it failed, and the copy operation to the failed LUN halted.
4. The copy operation to the healthy LUN also halted.
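
To make the symptom in steps 2 to 4 easier to observe, the two copies can be driven from a script that records when each one last made progress. The following is a minimal Python sketch, not part of the original test procedure. The function names (copy_with_progress, find_stalled, start_copies), the job keys, and the 30-second threshold are all hypothetical. Because the operating system may cache writes, a stall might only show up some time after the LUN fails.

```python
import threading
import time


def copy_with_progress(src, dst, progress, key, chunk_size=1024 * 1024):
    """Copy src to dst in chunks, recording the time of each completed write."""
    progress[key] = {"bytes": 0, "last_update": time.monotonic(), "done": False}
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
            progress[key]["bytes"] += len(chunk)
            progress[key]["last_update"] = time.monotonic()
    progress[key]["done"] = True


def find_stalled(progress, now, threshold_s=30.0):
    """Return keys of unfinished copies with no progress for threshold_s seconds."""
    return sorted(
        key
        for key, state in progress.items()
        if not state["done"] and now - state["last_update"] > threshold_s
    )


def start_copies(jobs, progress):
    """Start one daemon thread per (key, src, dst) job.

    Daemon threads are used so that a copy blocked in the operating system
    cannot keep the monitoring script itself from exiting.
    """
    threads = []
    for key, src, dst in jobs:
        t = threading.Thread(
            target=copy_with_progress,
            args=(src, dst, progress, key),
            daemon=True,
        )
        t.start()
        threads.append(t)
    return threads
```

With one job per mapped LUN, calling find_stalled(progress, time.monotonic()) periodically after the member disk is removed would be expected to list the job on the failed LUN. If the job on the healthy LUN is listed as well, the symptom described in step 4 has been reproduced.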

Alarm Information
None
Handling Process
1. Reinsert the removed member disk as soon as possible. The failed RAID group may recover, and the copy operation may resume.
2. Close the copy progress dialog boxes, and copy the files to the healthy LUNs again.
3. If the copy progress dialog boxes cannot be closed, restart the host.
Root Cause

Comparison test:

  1. The same symptom occurred in a comparison test after UltraPath was uninstalled.
  2. We mapped LUN A and LUN B from two separate S5000T storage devices. When LUN A failed, the copy progress on LUN B halted. Since LUN A and LUN B were mapped from two separate S5000T storage devices, the failure of the RAID group where LUN A resided was not supposed to affect LUN B.
  3. We ran the same test on peer vendors' storage devices, and the same problem occurred.
  4. We used another Windows Server 2008 R2 host to run the same test, and the same problem occurred.
  5. Even after we reinstalled the operating system on the host, the problem persisted.
  6. Even after we installed SP1, the problem persisted.

The first test indicated that the problem was unrelated to UltraPath. The second and third tests indicated that the problem was unrelated to the storage devices. The fourth and fifth tests indicated that the problem was not specific to a particular Windows Server 2008 R2 host, but was related to the Windows Server 2008 R2 operating system itself. The sixth test showed that installing SP1 did not solve the problem.
Conclusion: The problem was caused by an inherent defect in the Windows Server 2008 R2 operating system and was unrelated to the storage devices.
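
The reasoning behind this conclusion is a process of elimination: each comparison test changed or removed one factor, and because the symptom persisted, that factor was ruled out. The following Python sketch only illustrates that logic. The factor labels and the function name remaining_suspects are hypothetical, and the test results are transcribed from the first five comparison tests above. The sixth test is omitted because it evaluated a possible fix (SP1) rather than ruling out a factor.

```python
# Each entry: (test number, factor that was changed or removed,
#              whether the symptom persisted afterwards).
TESTS = [
    (1, "UltraPath", True),
    (2, "storage device", True),
    (3, "storage device", True),
    (4, "specific host", True),
    (5, "specific host", True),
]

# Candidate causes considered in the analysis (labels are illustrative).
SUSPECTS = [
    "UltraPath",
    "storage device",
    "specific host",
    "Windows Server 2008 R2 operating system",
]


def remaining_suspects(suspects, tests):
    """Drop every factor whose change or removal did not make the symptom go away."""
    ruled_out = {factor for _, factor, persisted in tests if persisted}
    return [s for s in suspects if s not in ruled_out]
```

Calling remaining_suspects(SUSPECTS, TESTS) leaves only the Windows Server 2008 R2 operating system, which matches the conclusion above.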

Suggestions
None

END