2.16 TC-A2016 Multiple Hard Disks in a RAID Group Are Faulty Due to Bugs in Solaris

Publication Date:  2012-07-18
Issue Description
Product and version: T3500
In Solaris 10.5, after the customer uses 20 hard disks to create a zpool that consists of 10 groups of mirrored hard disks, the system reports a large number of errors and the ZFS file system cannot work.
Run the zpool status command. For multiple hard disks in the zpool, STATE displays FAULTED and CKSUM displays too many errors, as shown in Figure 2-18.
Figure 2-18  zpool information
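When the pool contains 20 disks, it can help to filter the zpool status output down to the faulted rows. The following is a minimal sketch, not part of the original case: the function name faulted_devices, the pool name tank, the device names, and the sample text are all hypothetical, and the sketch assumes the usual NAME STATE READ WRITE CKSUM column layout.

```python
def faulted_devices(status_text):
    """Return (name, cksum, note) for every config row whose STATE is FAULTED."""
    rows = []
    for line in status_text.splitlines():
        parts = line.split()
        # Assumed row layout: NAME STATE READ WRITE CKSUM [note...]
        if len(parts) >= 5 and parts[1] == "FAULTED":
            rows.append((parts[0], parts[4], " ".join(parts[5:])))
    return rows


# Invented sample for illustration only; not the output from this case.
SAMPLE_STATUS = """\
        NAME        STATE     READ WRITE CKSUM
        tank        DEGRADED     0     0     0
          mirror    DEGRADED     0     0     0
            c1t0d0  FAULTED      0     0    45  too many errors
            c1t1d0  ONLINE       0     0     0
"""

for name, cksum, note in faulted_devices(SAMPLE_STATUS):
    print(name, cksum, note)
```

On a live system, the text would come from the real output of zpool status rather than from the sample string.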

Run the dmesg command. The output contains the message "unable to kmem_alloc enough memory for scatter/gather list", as shown in Figure 2-19.
Figure 2-19  dmesg information
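To confirm that this case applies, the dmesg output can be searched for the allocation failure message. The sketch below is an illustration under assumptions: the function name count_sg_alloc_failures and the sample lines are hypothetical, and only the message text itself is taken from this case.

```python
# Message text quoted in this case; the surrounding log prefix varies by system.
SG_ALLOC_MESSAGE = "unable to kmem_alloc enough memory for scatter/gather list"


def count_sg_alloc_failures(dmesg_text):
    """Count the lines that contain the scatter/gather allocation failure."""
    return sum(1 for line in dmesg_text.splitlines() if SG_ALLOC_MESSAGE in line)


# Invented sample for illustration only; real dmesg lines carry other prefixes.
SAMPLE_DMESG = (
    "some unrelated kernel message\n"
    "WARNING: unable to kmem_alloc enough memory for scatter/gather list\n"
    "WARNING: unable to kmem_alloc enough memory for scatter/gather list\n"
)

print(count_sg_alloc_failures(SAMPLE_DMESG))
```

A nonzero count together with FAULTED disks whose indicators are still green is consistent with the root cause described in this case.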

The online status indicators of all hard disks are steady green (normal), and no indicator is red.
Alarm Information
Handling Process
Back up the data on the device, restart the OS, and ask the administrator to clear the errors of the zpool. For details, see the Solaris ZFS documentation. The hard disks do not need to be replaced.
•  This problem is caused by bugs in Solaris and must be resolved by the customer.
•  The customer must perform all operations required to clear this fault. Technical support engineers are responsible only for giving advice.
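The case does not list the exact commands for clearing the zpool errors. As a hedged sketch, an administrator would typically check the pool, clear its error counters with zpool clear, and check it again. The helper name zpool_recovery_commands and the pool name tank are hypothetical, and the sequence is an assumption based on standard ZFS administration, not a procedure taken from this case. The sketch only builds and prints the commands; it does not run them.

```python
def zpool_recovery_commands(pool):
    """Build the command lines to run after the backup and the OS restart."""
    return [
        ["zpool", "status", "-x", pool],  # show whether the pool still reports errors
        ["zpool", "clear", pool],         # reset the device error counters in the pool
        ["zpool", "status", "-v", pool],  # verify the resulting pool state in detail
    ]


# "tank" is a placeholder pool name.
for argv in zpool_recovery_commands("tank"):
    print(" ".join(argv))
```

The commands are printed rather than executed so that the administrator can review them before running anything on the customer's system.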
Root Cause
Due to bugs in Solaris, memory cannot be allocated for the scatter/gather list. As a result, the file systems become abnormal, and some hard disks cannot be accessed and are mistakenly marked as FAULTED.