FusionServer Pro Rack Server iBMC Alarm Handling 30

This document describes iBMC alarms in terms of the meaning, impact on the system, possible causes, and handling suggestions.
ALM-0x02000007 Hard Disk Fault (Disk, Major Alarm)

Description

Alarm message:

The [arg1] disk arg2 failure (SN: arg3, PN: arg4).
NOTE:

From iBMC V316, the CPU and disk alarms also include the SN and part number, and the mainboard and memory alarms also include the part number.

This alarm is generated when a hard disk is faulty.

NOTE:

If arg2 is DISKA, DISKB, DISKC, or DISKD, this alarm is generated for the rear hard disk of a V2 or V3 server. For details about the slot information of the hard disks on a V3 server, see "Removing a Hard Disk" in the user guide of the server you use.

Alarm object: disk

NOTE:

This alarm also applies to SATADOM and M.2 disks.

Attribute

Alarm ID      Alarm Severity    Auto Clear
0x02000007    Major             Yes

Parameters

Name    Meaning
arg1    Location of the hard disk, for example, FM or CMn.
arg2    Slot number of the hard disk.
arg3    Serial number (SN) of the hard disk.
arg4    Part number (PN) of the hard disk.
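
As an illustration of how the parameters map onto the alarm message, the following is a minimal sketch that extracts arg1 to arg4 from a message string. The exact rendered form of the message, the sample values, and the names parse_disk_fault and DiskFaultAlarm are assumptions made for this example and are not part of iBMC. The sketch treats the location and the SN/PN suffix as optional, because versions earlier than V316 do not include the SN and part number.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed rendering of the alarm message template. The location (arg1) and
# the "(SN: ..., PN: ...)" suffix are treated as optional.
_PATTERN = re.compile(
    r"The (?:(?P<location>\S+) )?disk (?P<slot>\S+) failure"
    r"(?: \(SN: (?P<sn>[^,]*), PN: (?P<pn>[^)]*)\))?\.?"
)

# Slot names that this document associates with rear hard disks
# of V2 and V3 servers.
REAR_SLOTS = {"DISKA", "DISKB", "DISKC", "DISKD"}


@dataclass
class DiskFaultAlarm:
    """Hypothetical container for the four alarm parameters."""
    location: Optional[str]       # arg1
    slot: str                     # arg2
    serial_number: Optional[str]  # arg3
    part_number: Optional[str]    # arg4

    @property
    def is_rear_disk(self) -> bool:
        return self.slot in REAR_SLOTS


def parse_disk_fault(message: str) -> Optional[DiskFaultAlarm]:
    """Return the parsed parameters, or None if the text does not match."""
    match = _PATTERN.fullmatch(message.strip())
    if match is None:
        return None
    return DiskFaultAlarm(
        match["location"], match["slot"], match["sn"], match["pn"]
    )
```

For example, under these assumptions, parse_disk_fault("The FM disk 3 failure (SN: ABC123, PN: XYZ789).") yields location FM, slot 3, and the sample serial and part numbers.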

Impact on the System

Services related to the faulty hard disk are affected, and data may be lost.

Possible Causes

Component

Possible Causes

SAS or SATA disk

  • The SAS cable is faulty.

  • The hard disk is faulty.

  • The hard disk has a foreign configuration.

  • The hard disk backplane is faulty.

  • The RAID controller card is faulty.

NVMe SSD (not managed by the RAID controller card)

  • The hard disk is faulty.

  • The NVMe cable is faulty.

  • The hard disk backplane is faulty.

  • The CPU is faulty.

  • The mainboard is faulty.

Procedure

If the RAID card supports out-of-band management and the iBMC version is V328 or later, perform the following steps:

  1. Replace the hard disk. Then, check whether the alarm is cleared.
    • If yes, no further action is required.
    • If no, go to 2.
  2. Contact Huawei technical support.
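
The choice between this short procedure and the longer one that follows can be sketched as a small check. This is only an illustration: it assumes that version strings use the "V328" form shown in this document, and it treats every combination other than "out-of-band management supported and V328 or later" as falling under the longer procedure. The function names are hypothetical.

```python
import re


def parse_ibmc_version(text: str) -> int:
    """Convert a version string such as "V328" to the integer 328.

    Assumes the "V" + digits form used in this document.
    """
    match = re.fullmatch(r"[Vv](\d+)", text.strip())
    if match is None:
        raise ValueError(f"unexpected iBMC version format: {text!r}")
    return int(match.group(1))


def short_procedure_applies(raid_oob_supported: bool, ibmc_version: str) -> bool:
    """True if only "replace the disk, then contact support" is needed."""
    return raid_oob_supported and parse_ibmc_version(ibmc_version) >= 328
```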

If the RAID card does not support out-of-band management or the iBMC version is earlier than V328, perform the following steps:

  1. Determine the component for which the alarm is generated.
    • If the alarm is generated for a SAS/SATA disk, go to 2.

    • If the alarm is generated for an NVMe SSD (not managed by the RAID controller card), go to 9.

  2. Check whether there is a SAS cable fault alarm.
    • If yes, go to 3.
    • If no, go to 4.
  3. Rectify the SAS cable fault according to the alarm handling suggestions. Then, check whether the hard disk alarm is cleared.
    • If yes, no further action is required.
    • If no, go to 4.
  4. Check whether the hard disk firmware status is FOREIGN.

    You can choose System Info > Storage > Views on the iBMC WebUI to check the hard disk firmware status.

    • If yes, go to 5.
    • If no, go to 6.
  5. Clear or import the RAID configuration. Then, check whether the alarm is cleared.
    • If yes, no further action is required.
    • If no, go to 6.
  6. Replace the hard disk. Then, check whether the alarm is cleared.

    For details, see the user guide of the server you use.

    • If yes, no further action is required.
    • If no, go to 7.
  7. Replace the hard disk backplane. Then, check whether the alarm is cleared.
    • If yes, no further action is required.
    • If no, go to 8.
  8. Replace the RAID controller card. Then, check whether the alarm is cleared.
    • If yes, no further action is required.
    • If no, go to 14.
  9. Replace the hard disk. Then, check whether the alarm is cleared.

    For details, see the user guide of the server you use.

    • If yes, no further action is required.
    • If no, go to 10.
  10. Replace the NVMe cable. Then, check whether the alarm is cleared.
    • If yes, no further action is required.
    • If no, go to 11.
  11. Replace the hard disk backplane or transfer card. Then, check whether the alarm is cleared.
    • If yes, no further action is required.
    • If no, go to 12.
  12. Replace the CPU. Then, check whether the alarm is cleared.
    • If yes, no further action is required.
    • If no, go to 13.
  13. Replace the mainboard. Then, check whether the alarm is cleared.
    • If yes, no further action is required.
    • If no, go to 14.
  14. Contact Huawei technical support.
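
The branching in steps 1 to 14 can be summarized as a lookup table. The following is a minimal sketch for checking which steps a given situation leads through. The names FLOW, ENTRY, and walk are hypothetical, and the code only models the order of the steps; it does not query or control the iBMC.

```python
DONE = 0      # alarm cleared, no further action is required
SUPPORT = 14  # contact Huawei technical support

# step -> (action, next step on "yes", next step on "no")
# For steps 2 and 4, "yes" means the condition is present.
# For all other steps, "yes" means the alarm is cleared after the action.
FLOW = {
    2: ("Check for a SAS cable fault alarm", 3, 4),
    3: ("Rectify the SAS cable fault", DONE, 4),
    4: ("Check whether the firmware status is FOREIGN", 5, 6),
    5: ("Clear or import the RAID configuration", DONE, 6),
    6: ("Replace the hard disk", DONE, 7),
    7: ("Replace the hard disk backplane", DONE, 8),
    8: ("Replace the RAID controller card", DONE, SUPPORT),
    9: ("Replace the hard disk", DONE, 10),
    10: ("Replace the NVMe cable", DONE, 11),
    11: ("Replace the hard disk backplane or transfer card", DONE, 12),
    12: ("Replace the CPU", DONE, 13),
    13: ("Replace the mainboard", DONE, SUPPORT),
}

# Step 1 decides where to start, based on the component type.
ENTRY = {"sas_sata": 2, "nvme": 9}


def walk(component: str, answers: dict) -> list:
    """Return the step numbers visited for a component.

    answers maps a step number to True ("yes") or False ("no");
    steps that are not listed are treated as "no".
    """
    path = [1]
    step = ENTRY[component]
    while step not in (DONE, SUPPORT):
        path.append(step)
        _, on_yes, on_no = FLOW[step]
        step = on_yes if answers.get(step, False) else on_no
    if step == SUPPORT:
        path.append(SUPPORT)
    return path
```

With these definitions, walk("nvme", {}) follows the NVMe branch to the end when no replacement clears the alarm, and walk("sas_sata", {4: True, 5: True}) stops after the RAID configuration is cleared or imported.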
Updated: 2019-08-05

Document ID: EDOC1000054724
