Huawei Rack Server iBMC Alarm Handling 28

This document describes iBMC alarms in terms of the meaning, impact on the system, possible causes, and handling suggestions.
ALM-0x070CFFFF Correctable Machine Check Error (CPUN Status)


Description

Alarm message:

Correctable Machine Check Error

This alarm is generated when the sensor detects that a self-check exception has occurred in a CPU.

This alarm applies only to the RH5885 V3, RH5885H V3, and RH8100 V3.

Sensor triggering the alarm: CPUN Status

Attribute

Alarm ID      Alarm Severity   Auto Clear
0x070CFFFF    Minor            Yes

Parameters

Name   Meaning
N      Serial number of the CPU.
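
As an illustration of the N parameter, the sketch below extracts the CPU number from a concrete sensor name. It assumes that real sensor names substitute a decimal number for N (for example "CPU2 Status"); that naming pattern and the helper name cpu_number_from_sensor are assumptions for illustration, not something this document specifies.

```python
import re

# Assumption: concrete sensor names replace "N" with the CPU number,
# e.g. "CPU2 Status". This pattern is illustrative only.
_SENSOR_RE = re.compile(r"^CPU(\d+) Status$")


def cpu_number_from_sensor(sensor_name):
    """Return the CPU number encoded in a sensor name, or None if it does not match."""
    match = _SENSOR_RE.match(sensor_name.strip())
    return int(match.group(1)) if match else None
```

A name that does not follow the assumed pattern, such as the literal placeholder "CPUN Status", yields None rather than raising an error.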

Impact on the System

The DIMMs corresponding to the CPU cannot be used. As a result, server performance may deteriorate.

Possible Causes

  • The SMI2 link has failed in memory mirroring mode.
  • An internal error has occurred in Jordan Creek.
  • The number of errors that occur during data transmission between Jordan Creek and the memory controller has reached the alarm threshold.

Procedure

  1. Check whether there is any alarm generated for the memory board or DIMM corresponding to the CPU.

    • For RH5885 V3, check the DIMMs corresponding to the CPU.
    • For RH5885H V3 and RH8100 V3, check the memory boards corresponding to the CPU.
    • If yes, go to 2.
    • If no, go to 3.

  2. Replace the memory board or DIMM. Then, check whether the alarm is cleared.

    • For RH5885 V3, replace the DIMM.
    • For RH5885H V3 and RH8100 V3, replace the memory board.
    • If yes, no further action is required.
    • If no, go to 3.

  3. Replace the mainboard or the system compute module (SCM). Then, check whether the alarm is cleared.

    • For RH5885 V3 and RH5885H V3, replace the mainboard.
    • For RH8100 V3, replace the SCM.
    • If yes, no further action is required.
    • If no, go to 4.

  4. Contact Huawei technical support.
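
The procedure above can be summarized as a small decision table. The following is a minimal sketch, not a Huawei tool: the function name handling_steps and the dictionary names are hypothetical, and the code only encodes the model-to-part mapping and step order stated in steps 1 to 4. Each returned action is meant to be tried in order, stopping as soon as the alarm is cleared.

```python
# Part to replace in step 2, keyed by server model (taken from the procedure).
MEMORY_PART = {
    "RH5885 V3": "DIMM",
    "RH5885H V3": "memory board",
    "RH8100 V3": "memory board",
}

# Part to replace in step 3, keyed by server model (taken from the procedure).
BOARD_PART = {
    "RH5885 V3": "mainboard",
    "RH5885H V3": "mainboard",
    "RH8100 V3": "SCM",
}


def handling_steps(model, memory_alarm_present):
    """Return the ordered actions to try, stopping once the alarm is cleared."""
    if model not in MEMORY_PART:
        raise ValueError(f"alarm does not apply to model: {model}")
    steps = []
    # Step 1 -> step 2: replace the memory part only if it has its own alarm.
    if memory_alarm_present:
        steps.append(f"Replace the {MEMORY_PART[model]}")
    # Step 3: replace the mainboard or SCM.
    steps.append(f"Replace the {BOARD_PART[model]}")
    # Step 4: escalate if nothing else cleared the alarm.
    steps.append("Contact Huawei technical support")
    return steps
```

For example, handling_steps("RH8100 V3", True) lists the memory board, then the SCM, then technical support, while passing False for the second argument skips straight to the SCM, matching the "If no, go to 3" branch of step 1.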
Updated: 2019-06-04

Document ID: EDOC1000054724

