Huawei Rack Server iBMC Alarm Handling 28

This document describes iBMC alarms in terms of the meaning, impact on the system, possible causes, and handling suggestions.
ALM-0x0100002F Fatal Error Detected in Memory Initialization (Memory, Critical Alarm)

Description

Alarm message:

[arg1] arg2 memory MRC fatal error detected. Error code: arg3 (SN: arg4, PN: arg5).
NOTE:

From iBMC V316, CPU and disk alarms also include the SN and part number, and mainboard and memory alarms also include the part number.

This alarm is generated when a fatal error is detected during the memory initialization process. The error code is arg3.

NOTE:

For details about DIMM layout, see "Installing DIMMs" in the troubleshooting manual of the server you use.

Alarm object: memory

Attribute

Alarm ID      Alarm Severity    Auto Clear
0x0100002F    Critical          Yes
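As an illustration, the attributes above can be held in a small record when filtering an exported event list for this alarm. This is a minimal sketch in Python. The `AlarmAttributes` class, the `is_this_alarm` helper, and the assumption that event codes arrive as hexadecimal strings are hypothetical and are not part of any iBMC interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlarmAttributes:
    """Attributes of one alarm, as listed in the Attribute table."""
    alarm_id: int
    severity: str
    auto_clear: bool


# Values taken from the Attribute table of this alarm.
MEMORY_INIT_FATAL = AlarmAttributes(
    alarm_id=0x0100002F,
    severity="Critical",
    auto_clear=True,
)


def is_this_alarm(event_code: str) -> bool:
    """Return True if event_code matches the alarm ID.

    Assumption: event_code is a hexadecimal string such as "0x0100002F".
    Strings that are not valid hexadecimal are treated as non-matching.
    """
    try:
        return int(event_code, 16) == MEMORY_INIT_FATAL.alarm_id
    except ValueError:
        return False
```

Comparing the numeric value rather than the raw string makes the check independent of letter case in the exported code.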

Parameters

Name    Meaning

arg1    Slot number of the memory board.

arg2    Location of the memory, given in one of the following forms:

  • DIMM silkscreen, for example, DIMM020 (A) or DIMM010 (B).
  • CPU socket number and channel number. For example, on a 2488 V5 server, "CPU 1 channel 2" indicates memory channel 2 of CPU 1, that is, DIMM020 and DIMM021.
    NOTE:

    The number of DIMMs corresponding to a memory channel varies depending on the server model. You can obtain the CPU socket number and channel number corresponding to a DIMM from the server user guide. For example, if you use a 2488 V5 server, see Components > DIMM Slot Locations in the 2488 V5 Server V100R005 User Guide.

arg3    Error code of the alarm.

arg4    Serial number (SN) of the memory.

arg5    Part number (PN) of the memory.
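When alarm text is collected from logs, the parameters above can be extracted with a regular expression. This is a minimal sketch in Python that follows the alarm message template in the Description. The function name `parse_mrc_alarm` is hypothetical, and the messages in the usage note are made-up samples rather than real iBMC output. The sketch assumes that the "(SN: ..., PN: ...)" part may be absent. Because the template alone does not show where arg1 ends and arg2 begins, both are returned together as `location`.

```python
import re
from typing import Dict, Optional

# Built from the template:
#   [arg1] arg2 memory MRC fatal error detected. Error code: arg3 (SN: arg4, PN: arg5).
_PATTERN = re.compile(
    r"^(?P<location>.+?) memory MRC fatal error detected\. "
    r"Error code: (?P<error_code>[^\s.()]+)"
    r"(?: \(SN: (?P<sn>[^,]*), PN: (?P<pn>[^)]*)\))?\.?$"
)


def parse_mrc_alarm(message: str) -> Optional[Dict[str, Optional[str]]]:
    """Split an alarm message into location, error_code, sn, and pn.

    Returns None if the message does not follow the template.
    sn and pn are None when the message carries no "(SN: ..., PN: ...)" part.
    """
    match = _PATTERN.match(message.strip())
    if match is None:
        return None
    return match.groupdict()
```

For example, with the made-up message `"DIMM020 memory MRC fatal error detected. Error code: 0x1234 (SN: ABC123, PN: XYZ789)."`, the function returns `location` as `DIMM020`, `error_code` as `0x1234`, `sn` as `ABC123`, and `pn` as `XYZ789`.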

Impact on the System

The DIMM cannot be used, which affects server performance.

Possible Causes

  • The DIMM is faulty.
  • The DIMM slot is faulty.

Procedure

  1. Check whether the alarm information provides the silkscreen of the DIMM.

    • If yes, go to 2.

    • If no, go to 3.

  2. Replace the DIMM, and power on the server. After the server is powered on, check whether the alarm is cleared.

    • If yes, no further action is required.

    • If no, go to 3.

  3. Check whether the DIMM slot has contaminants.

    • If yes, go to 4.

    • If no, go to 5.

  4. Clean the DIMM slot, reinstall the DIMM, and power on the server. After the server is powered on, check whether the alarm is cleared.

    • If yes, no further action is required.

    • If no, go to 5.

  5. Replace the memory board or mainboard. After the server is powered on, check whether the alarm is cleared.

    • If yes, no further action is required.

    • If no, go to 6.

  6. Contact Huawei technical support.
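
The branching in the procedure above can be sketched as a small lookup table, which may help when scripting a checklist or ticket workflow. This is an illustrative sketch in Python. The names `STEPS`, `next_step`, and `walk` are hypothetical, and the action texts are paraphrased from the steps above.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# step number -> (action, next step if "yes", next step if "no")
# None means the procedure ends: either no further action is required
# or the last step has been reached.
STEPS: Dict[int, Tuple[str, Optional[int], Optional[int]]] = {
    1: ("Check whether the alarm information provides the DIMM silkscreen.", 2, 3),
    2: ("Replace the DIMM, power on the server, check whether the alarm is cleared.", None, 3),
    3: ("Check whether the DIMM slot has contaminants.", 4, 5),
    4: ("Clean the slot, reinstall the DIMM, power on, check whether the alarm is cleared.", None, 5),
    5: ("Replace the memory board or mainboard, check whether the alarm is cleared.", None, 6),
    6: ("Contact Huawei technical support.", None, None),
}


def next_step(step: int, answer_yes: bool) -> Optional[int]:
    """Return the step that follows `step` for a yes or no answer."""
    _, on_yes, on_no = STEPS[step]
    return on_yes if answer_yes else on_no


def walk(answers: Sequence[bool]) -> List[int]:
    """Return the steps visited, starting at step 1, for a sequence of answers."""
    path = [1]
    step: Optional[int] = 1
    for answer in answers:
        step = next_step(step, answer)
        if step is None:
            break
        path.append(step)
    return path
```

For example, `walk([False, False, False])` visits steps 1, 3, 5, and 6, which corresponds to an alarm without a silkscreen, a clean slot, and an alarm that persists after the board is replaced.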
Updated: 2019-06-04

Document ID: EDOC1000054724
