2.11 TC-A2011 The Device Performance Declines and the Whole Service Network Slows Down

Publication Date: 2012-07-18
Issue Description
Product and version: T3500
When testing a T3500 configured with an Areca RAID card under Scientific Linux (SLC 4.6), the customer finds that data is written quickly right after the device joins the network, but the write speed gradually drops to only 200 MB/s to 300 MB/s. Meanwhile, the whole Lustre network slows down. When the device is taken offline, the whole Lustre network becomes fast again.
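The gradual decline described above can be observed by sampling the write rate over successive intervals. The following is a minimal sketch and not the customer's test tool: the function name `sample_write_rates`, the block sizes, and the target path are assumptions chosen for illustration.

```python
import os
import tempfile
import time


def sample_write_rates(path, block_mb=4, blocks_per_sample=8, samples=5):
    """Write fixed-size blocks to `path` and return the MB/s of each sample."""
    block = b"\0" * (block_mb * 1024 * 1024)
    rates = []
    with open(path, "wb") as f:
        for _sample in range(samples):
            start = time.perf_counter()
            for _block in range(blocks_per_sample):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # push data to the device, not only the page cache
            # Guard against a zero interval on very coarse clocks.
            elapsed = max(time.perf_counter() - start, 1e-9)
            rates.append(block_mb * blocks_per_sample / elapsed)
    return rates


if __name__ == "__main__":
    # Hypothetical target: replace with a file on the file system under test.
    target = os.path.join(tempfile.gettempdir(), "write_rate_probe.bin")
    try:
        for i, rate in enumerate(sample_write_rates(target), start=1):
            print(f"sample {i}: {rate:.1f} MB/s")
    finally:
        if os.path.exists(target):
            os.remove(target)
```

A steady downward trend across samples, rather than a single slow sample, would match the symptom reported here.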
 
Alarm Information
None
Handling Process
Step 1     Set the hard disks in slots a and b (the two rightmost slots) to RAID 1.
Step 2     Install the operating system on the RAID 1 group in slots a and b. For details, see Scenarios 2: Typical Configuration of Red Hat Linux in Oceanspace T3000 Storage Node Initial Configuration Guide-(V100R001C01).
Step 3     Divide the remaining 22 hard disks into three RAID groups of seven hard disks each, and set the one remaining hard disk as a hot spare disk. After this change, the write speed no longer declines and the whole Lustre network speeds up accordingly.

----End
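
The disk arithmetic in the procedure above can be checked with a short sketch. The total of 24 disks is inferred from the 2 system disks plus the 22 remaining disks; the function name `plan_raid_groups` and its parameters are hypothetical and are not part of any RAID card tool.

```python
def plan_raid_groups(total_disks, os_disks, group_count, hot_spares):
    """Split the disks left after the OS mirror and hot spares into equal groups."""
    data_disks = total_disks - os_disks - hot_spares
    if data_disks <= 0:
        raise ValueError("no disks left for data RAID groups")
    if data_disks % group_count != 0:
        raise ValueError("data disks do not divide evenly into the requested groups")
    per_group = data_disks // group_count
    return {
        "os_raid1": os_disks,
        "groups": [per_group] * group_count,
        "hot_spares": hot_spares,
    }


# Layout from the handling process: 2 OS disks, 3 groups, 1 hot spare.
layout = plan_raid_groups(total_disks=24, os_disks=2, group_count=3, hot_spares=1)
print(layout)  # {'os_raid1': 2, 'groups': [7, 7, 7], 'hot_spares': 1}
```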

Root Cause
The fault is caused by an improper RAID group layout. The hard disks in slots a and b (the two rightmost slots) were set to RAID 1, while all the other hard disks were placed in a single RAID group. During the test, the large number of concurrent I/O operations lowered the device performance. In addition, because of the inherent characteristics of the service network, a fault on one node affects the whole network.
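
One way to picture why a single slow node drags down the whole network is a toy model in which data is spread evenly across all storage nodes, so every node must absorb an equal share. This is an assumption made for illustration: the source does not describe the striping configuration, and the per-node rates below are hypothetical apart from the 250 MB/s figure, which falls in the reported 200 MB/s to 300 MB/s range.

```python
def striped_write_rate(node_rates_mb_s):
    """Aggregate MB/s when each node must take an equal share of the data.

    Simplified model: the slowest node sets the pace for all of them.
    """
    if not node_rates_mb_s:
        raise ValueError("at least one node is required")
    return len(node_rates_mb_s) * min(node_rates_mb_s)


healthy = [800, 800, 800, 800]          # hypothetical per-node rates
with_slow_node = [800, 800, 800, 250]   # one node degraded to 250 MB/s
slow_node_offline = [800, 800, 800]     # degraded node removed

print(striped_write_rate(healthy))            # 3200
print(striped_write_rate(with_slow_node))     # 1000
print(striped_write_rate(slow_node_offline))  # 2400
```

Under this model, taking the degraded node offline raises the aggregate rate even though fewer nodes remain, which is consistent with the behavior reported in the issue description.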
 
Suggestions
Plan the RAID groups properly, for example by dividing the data disks into several smaller RAID groups as in the handling process, so that device performance is not degraded.
 
