Diagnose and rectify drive I/O faults based on the observed symptoms.
- If a fault can be located using logs or tools, see "Handling Procedure". If a fault needs to be rectified quickly onsite, see "Quick Recovery Method".
- For more fault symptoms and solutions, see the Computing Product Case Library, which is available only to Huawei partners and Huawei engineers.
Fault Symptom | Handling Procedure | Quick Recovery Method
A "Disk Fault" alarm is reported to iBMC.
|
- If the drive is in a RAID array and the RAID array is not functioning correctly, troubleshoot the RAID array.
- If the server has stopped, use Smart Provisioning to inspect the server hardware. If the server is operating, replace the drive.
- If the server has stopped, check whether the cables are properly connected. For example, swap in known-good cables to determine whether the original cables are faulty.
- If the fault persists, insert the new drive into the suspect slot to check whether the slot itself is faulty.
NOTE: For RAID controller cards that support out-of-band management, if a drive is in the Unconfigured Good (Foreign) state, an iBMC alarm will be generated but the fault indicator will not light up.
|
- If the faulty drive is not in a RAID array (drives in passthrough mode excepted), the drive cannot be used and must be replaced. It is recommended that you configure RAID for all drives before deploying redundant services.
- Back up the data of redundant RAID arrays to avoid data loss.
- Follow the handling procedure to replace any faulty modules.
|
A RAID controller card fails to identify one or more drives.
|
- Check the number of drives supported by the RAID controller card. If the number of installed drives exceeds the maximum number supported by the RAID controller card, adjust the number of drives or replace the RAID controller card based on the site requirements.
- Power off the server, swap the drive that cannot be identified with a normal drive, and power on the server to check whether the drive is faulty.
  - If the fault is caused by the drive, replace the drive.
  - If the fault is caused by the drive slot, check whether SAS cables are connected properly to all SAS ports on the drive backplane. For details, see the server user guide.
  - If the fault persists, proceed to the next step.
- Replace the RAID controller card first, the SAS cables second, and the drive backplane third.
|
- Check the number of drives supported by the RAID controller card. If the number of installed drives exceeds the maximum number supported by the RAID controller card, adjust the number of drives or replace the RAID controller card based on the site requirements.
- If the redundant RAID array fails or no RAID array is configured, the related drive partitions are unavailable.
- Move the unidentified drives or all drives in the RAID array to a standby server, keeping the drives in their original order, and then attempt to back up the data.
- Follow the handling procedure to replace any faulty modules.
|
A RAID controller card cannot identify any drives.
|
- Check whether the drive activity indicators light up. If they are off, ensure that both the drive backplane power cables and drives are installed properly.
- If the fault persists, check that the SAS cables and signal cables are connected properly. For details, see "Internal Cabling" in the user guide.
- If the fault persists, replace the RAID controller card first, the SAS cables second, and the drive backplane third.
|
Follow the handling procedure to replace any faulty modules without changing the drive installation positions.
|
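The handling procedures in the table can be read as a small decision tree. The following Python sketch encodes that reading so the branching is easier to follow. It is illustrative only: `Symptom`, `Observation`, and `next_steps` are hypothetical names introduced here, the inputs are values an engineer would enter by hand, and the code does not call iBMC, Smart Provisioning, or any RAID controller interface.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List


class Symptom(Enum):
    """The three fault symptoms listed in the table (names are hypothetical)."""
    DISK_FAULT_ALARM = auto()          # "Disk Fault" alarm reported to iBMC
    SOME_DRIVES_UNIDENTIFIED = auto()  # controller misses one or more drives
    NO_DRIVES_IDENTIFIED = auto()      # controller sees no drives at all


@dataclass
class Observation:
    """Facts gathered onsite and entered manually (hypothetical structure)."""
    symptom: Symptom
    server_running: bool = True
    drive_in_raid: bool = False
    raid_healthy: bool = True
    installed_drives: int = 0
    max_supported_drives: int = 0
    activity_leds_on: bool = True


# Last-resort replacement order stated in the handling procedures.
REPLACEMENT_ORDER = ["RAID controller card", "SAS cables", "drive backplane"]


def next_steps(obs: Observation) -> List[str]:
    """Return the handling steps from the table that apply to an observation."""
    replace_in_order = (
        "If the fault persists, replace in order: "
        + ", ".join(REPLACEMENT_ORDER) + "."
    )
    steps: List[str] = []

    if obs.symptom is Symptom.DISK_FAULT_ALARM:
        if obs.drive_in_raid and not obs.raid_healthy:
            steps.append("Troubleshoot the RAID array.")
        if obs.server_running:
            steps.append("Replace the drive.")
        else:
            steps.append("Inspect the hardware with Smart Provisioning.")
            steps.append("Check the cable connections.")
        steps.append("If the fault persists, test the suspect slot with the new drive.")

    elif obs.symptom is Symptom.SOME_DRIVES_UNIDENTIFIED:
        if obs.installed_drives > obs.max_supported_drives:
            steps.append("Adjust the number of drives or replace the RAID controller card.")
        steps.append("Power off, swap the unidentified drive with a normal drive, and power on.")
        steps.append(replace_in_order)

    else:  # Symptom.NO_DRIVES_IDENTIFIED
        if not obs.activity_leds_on:
            steps.append("Check the drive backplane power cables and drive seating.")
        steps.append("Check the SAS cables and signal cables.")
        steps.append(replace_in_order)

    return steps


if __name__ == "__main__":
    example = Observation(
        Symptom.SOME_DRIVES_UNIDENTIFIED,
        installed_drives=10,
        max_supported_drives=8,
    )
    for step in next_steps(example):
        print("-", step)
```

The sketch covers only the Handling Procedure column. The Quick Recovery Method column and the note about the Unconfigured Good (Foreign) state are deliberately left out to keep the example short.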