E9000 Server V100R001 HMM Alarm Handling 19

This document describes E9000 server alarms in terms of the meaning, impact on the system, possible causes, and solutions.

Event List

An event alarm informs users that a key operation has occurred. No action is required for event alarms.

Table 3-1 HMM event list

Event Code

Event Description

Meaning

0x00000079

CPU arg1 health status degradation detected by PFAE.

A CPU CE hard failure occurred.

arg1 indicates the CPU socket number.

0x0200001F

The [arg1] disk arg2 health status degradation detected by PFAE.

A hard drive CE hard failure occurred.

  • arg1 indicates the hard drive location.
  • arg2 indicates the slot number of the hard drive.
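The event descriptions in this table are templates: placeholders such as arg1 and arg2 stand for concrete values supplied when the event is logged. A minimal Python sketch of that substitution follows. The helper name `render_description` and the sample values "Front" and "3" are hypothetical and do not come from the HMM software.

```python
import re

def render_description(template: str, *args: str) -> str:
    """Replace arg1, arg2, ... in a description template with supplied values."""
    def substitute(match: re.Match) -> str:
        index = int(match.group(1)) - 1
        # Leave the placeholder untouched if no value was supplied for it.
        return args[index] if 0 <= index < len(args) else match.group(0)
    return re.sub(r"arg(\d+)", substitute, template)

# Description template of event 0x0200001F, filled with made-up values.
template = "The [arg1] disk arg2 health status degradation detected by PFAE."
print(render_description(template, "Front", "3"))
```

Placeholders without a matching value are left as they are, so a partially filled description remains recognizable.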

0x0240FF12

Transition to idle

An active/standby MM switchover was performed. The MM changed to the standby state.

0x0241FF12

Transition to active

An active/standby MM switchover was performed. The MM changed to the active state.

NOTE:
  • The standby HMM becomes active. Examples:
    • An active/standby switchover is manually triggered.
    • A critical alarm causes an active/standby switchover.
    • A hardware fault causes an active/standby switchover.
  • You should view operation logs to check whether the active/standby switchover is normal.

0x0341FF06

arg1 certificate is about to expire or has expired.

The certificate is about to expire or has expired.

arg1 indicates the certificate type.

0x0341FF14

The UID button on the panel is pressed.

The UID button was pressed.

0x0441FF12

Power capping failed.

Power capping failed.

0x0540FF28

Limit not exceeded

The certificate is about to expire.

0x0541FF07

The CPU usage (arg1%) exceeds the threshold (arg2%).

The CPU usage is too high.

0x0541FF0C

The memory usage (arg1%) exceeds the threshold (arg2%).

The memory usage is too high.
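The two usage events above follow the same rule: the event is reported when the measured usage exceeds the configured threshold. The sketch below illustrates that rule in Python. The function name `usage_event` is hypothetical, and the strict "greater than" comparison is an assumption based on the word "exceeds".

```python
from typing import Optional

# Event codes taken from the table; the resource names are used in the message.
USAGE_EVENTS = {
    "CPU": 0x0541FF07,
    "memory": 0x0541FF0C,
}

def usage_event(resource: str, usage: int, threshold: int) -> Optional[str]:
    """Return the event text if usage exceeds the threshold, otherwise None."""
    if usage <= threshold:
        return None  # assumption: usage equal to the threshold raises no event
    code = USAGE_EVENTS[resource]
    return (f"0x{code:08X}: The {resource} usage ({usage}%) "
            f"exceeds the threshold ({threshold}%).")

print(usage_event("CPU", 95, 90))
print(usage_event("memory", 50, 90))
```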

0x06000023

RAID controller card arg1 health status degradation detected by PFAE.

A RAID controller card CE hard failure occurred.

arg1 indicates the slot number of the RAID controller card.

0x0707FFFF

CPU arg1 installed.

The CPU was installed.

0x0748FF10

iBMC operation log has reached 90% space capacity.

The space for storing operation logs is almost full.

iBMC security log has reached 90% space capacity.

The space for storing security logs is almost full.

0x0748FF12

The server system crashes or is abnormally reset.

The server system is malfunctioning.

0x0787FFFF

CPU arg1 removed.

The CPU was removed.

0x08000065

[arg1] PCIe card arg2 (arg3) health status degradation detected by PFAE.

A PCIe card CE hard failure occurred.

  • arg1 indicates the PCIe card location.
  • arg2 indicates the PCIe card slot number.
  • arg3 indicates the PCIe card type.

0x0841FF08

Device inserted.

The PSU was installed.

0x0841FF0A

Device inserted.

The fan module was installed.

0x0841FF17

The RAID controller card arg1 installed.

The RAID controller card was installed.

arg1 indicates the slot number of the RAID controller card.

Port arg1 installed.

A device (optical module or DAC cable) was connected to the port.

arg1 indicates the port number.

PIC card arg1 installed.

The flexible NIC was installed.

arg1 indicates the slot number of the flexible NIC.

0x08C1FF17

The RAID controller card arg1 removed.

The RAID controller card was removed.

arg1 indicates the slot number of the RAID controller card.

Port arg1 removed.

A device (optical module or DAC cable) was removed from the port.

arg1 indicates the port number.

PIC card arg1 removed.

The flexible NIC was removed.

arg1 indicates the slot number of the flexible NIC.

0x0940FF07

CPU arg1 Core arg2 isolated.

The CPU core was isolated.

  • arg1 indicates the CPU socket number.
  • arg2 indicates the CPU core number.

0x0941FF16

iBMC is restarted after AC power supply is restored.

The iBMC was restarted after the AC power supply was restored.

iBMC is reset and started.

The iBMC was restarted after being reset.

0x0A40FF12

Transition to running

A switchover between the active and standby compute node resource pools was performed.

0x0B00001F

Mezzanine card arg1 failed to obtain the WFR chip SVID value.

The system failed to obtain the SVID of the WFR chip on the mezzanine card.

arg1 indicates the slot number of the mezzanine card.

0x0B000023

Mezzanine card arg1 health status degradation detected by PFAE.

A mezzanine card CE hard failure occurred.

arg1 indicates the slot number of the mezzanine card.

0x0C00FFFF

memory correctable ECC error occurred

A memory error checking and correcting (ECC) error occurred.

0x0C05FFFF

arg1 memory correctable ECC.

The number of ECC errors exceeds the threshold.

arg1 indicates the DIMM silkscreen.

0x0C06FFFF

arg1 installed.

The DIMM was installed.

arg1 indicates the DIMM silkscreen.

0x0C86FFFF

arg1 removed.

The DIMM was removed.

arg1 indicates the DIMM silkscreen.

0x0D000007

The NIC [arg1] health status degradation detected by PFAE.

A NIC CE hard failure occurred.

arg1 indicates the slot number of the mezzanine card.

0x0D00FFFF

The [arg1] disk arg2 installed.

The hard drive was installed.

  • arg1 indicates the hard drive location.
  • arg2 indicates the hard drive slot number.

SD card arg1 installed.

The SD card was installed.

arg1 indicates the SD card slot number.

0x0D07FFFF

RAID rebuild starts at the [arg1] disk arg2.

A RAID rebuild started on the hard drive.

  • arg1 indicates the hard drive location.
  • arg2 indicates the hard drive slot number.

Data rebuild starts at SD card arg1.

A RAID rebuild started on the SD card.

arg1 indicates the SD card slot number.

0x0D80FFFF

The [arg1] disk arg2 removed.

The hard drive was removed.

  • arg1 indicates the hard drive location.
  • arg2 indicates the hard drive slot number.

SD card arg1 removed.

The SD card was removed.

arg1 indicates the SD card slot number.

0x0D87FFFF

RAID rebuild at the [arg1] disk arg2 is complete.

The RAID rebuild of the hard drive was complete.

  • arg1 indicates the hard drive location.
  • arg2 indicates the hard drive slot number.

Data rebuild stops at SD card arg1.

The RAID rebuild of the SD card was complete.

arg1 indicates the SD card slot number.
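Several codes in this table come in pairs (installed/removed, rebuild started/complete) that differ in a single bit, 0x00800000. This is an observation about the listed codes, not a documented encoding rule, and other pairs in the table (for example the pass-through card codes) do not follow it. A small Python sketch, with the hypothetical helper `counterpart`, checks the pattern against the pairs it applies to:

```python
# Bit that differs between the paired codes below (observed, not documented).
COUNTERPART_BIT = 0x00800000

# Pairs copied from the table: start/installed code -> end/removed code.
PAIRS = {
    0x0707FFFF: 0x0787FFFF,  # CPU installed / removed
    0x0841FF17: 0x08C1FF17,  # RAID card, port, PIC card installed / removed
    0x0C06FFFF: 0x0C86FFFF,  # DIMM installed / removed
    0x0D00FFFF: 0x0D80FFFF,  # hard drive or SD card installed / removed
    0x0D07FFFF: 0x0D87FFFF,  # RAID rebuild started / complete
}

def counterpart(code: int) -> int:
    """Toggle the observed pairing bit to get the other code of a pair."""
    return code ^ COUNTERPART_BIT

for first, second in PAIRS.items():
    # Each listed pair differs only in the pairing bit.
    assert counterpart(first) == second
    print(f"0x{first:08X} <-> 0x{second:08X}")
```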

0x100000CD

The LOM [arg1] health status degradation detected by PFAE.

A LOM CE hard failure occurred.

arg1 indicates the LOM slot number.

0x1002FFFF

iBMC event records are cleared.

The event records were cleared.

0x1004FFFF

SEL full

Log backup was rolled back.

0x1005FFFF

iBMC event record has reached 90% space capacity.

The space for storing event logs is almost full.

0x1400FFFF

The power button on the panel is pressed.

The power button on the panel was pressed.

0x1A000029

iBMC time is stepped by more than arg1 minutes.

The iBMC time was stepped.

arg1 indicates the time length that was stepped.

0x1A00002B

iBMC failed to synchronize time with the NTP server.

The iBMC failed to synchronize time with the NTP server.

0x1D02FFFF

The Base plane of the switch module restarted.

The Base plane was reset.

The Fabric plane of the switch module restarted.

The Fabric plane was reset.

0x1D0700FF

The host was restarted due to unrecognized reason.

The system was restarted due to unknown reasons.

0x1D0701FF

The host was restarted by command.

The system was restarted by a command.

0x1D0703FF

The host was restarted by power button.

The system was restarted because the power button was pressed.

0x1D0704FF

The host was restarted due to watchdog timeout.

The system was restarted because the watchdog timed out.

0x1D0706FF

The host is restarted after being powered on. (Power strategy is "Turn On".)

The system was restarted because the power strategy is "Turn On".

0x1D0707FF

The host is restarted after being powered on. (Power strategy is "Restore Previous State".)

The system was restarted because the power strategy is "Restore Previous State".

0x1D07FFFF

system restart cause unknown

The system was restarted due to unknown reasons.

NOTE:

You should check the OS logs to see whether any relevant operations were performed.

system restart cause chassis control

The system was restarted by a command.

NOTE:

You should check the OS logs to see whether any relevant commands were received.

system restart cause power button pressed

The system was restarted because the power button was pressed.

NOTE:

You should ask the equipment room administrators whether they performed any relevant operations, and check the equipment room surveillance video.

system restart cause Watchdog control

The system was restarted because the watchdog timed out.

NOTE:

You should check the OS logs to see whether the software stopped responding.

system restart cause always power up

The system was restarted because the power strategy is "Turn On".

system restart cause always restore previous state

The system was restarted because the power strategy is "Restore Previous State".
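The restart events above can be collected into a lookup table for log analysis. The Python sketch below does this with the codes and meanings from the table. The function names are hypothetical, and reading the third byte of the code as a "cause field" is an inference from the listed values, not something this document states.

```python
# Restart-cause events from the table, keyed by full event code.
RESTART_CAUSES = {
    0x1D0700FF: "unknown reason",
    0x1D0701FF: "command",
    0x1D0703FF: "power button",
    0x1D0704FF: "watchdog timeout",
    0x1D0706FF: 'power strategy "Turn On"',
    0x1D0707FF: 'power strategy "Restore Previous State"',
    0x1D07FFFF: "cause given in the event description",
}

def restart_cause(code: int) -> str:
    """Return the restart cause for an event code, or a fallback text."""
    return RESTART_CAUSES.get(code, "not a listed restart event code")

def cause_field(code: int) -> int:
    """Extract the byte that varies between the listed codes (assumed meaning)."""
    return (code >> 8) & 0xFF

for code, cause in RESTART_CAUSES.items():
    print(f"0x{code:08X} (field 0x{cause_field(code):02X}): {cause}")
```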

0x1E000041

Failed to obtain the switch chip SVID value for the Fabric plane.

The system failed to obtain the SVID of the switch chip of the Fabric plane.

0x1E00FFFF

The OS cannot start without a boot device.

The OS failed to start because there was no boot medium.

0x1E01FFFF

The OS cannot start without a bootable disk.

The OS failed to start because there was no bootable disk.

0x1E02FFFF

The OS cannot start because the PXE service is unavailable.

The OS failed to start because there was no PXE server.

0x1E03FFFF

The OS cannot start due to the invalid boot partition.

The OS failed to start because there was no valid boot partition.

0x2100FFFF

The board port configuration has changed, or BIOS fails to obtain the stateless computing configuration. Restart the server.

A stateless computing event occurred in the chassis.

Hard disk drawer opened.

The hard disk drawer was opened.

Hard disk drawer closed.

The hard disk drawer was closed.

0x2101FFFF

Heartbeat signals between the iBMC and the system management software (iBMA) are lost.

The iBMA heartbeat signal was lost.

Identify status.

The +5V voltage of USB 1 is abnormal.

0x2200FFFF

ACPI is in the working state.

ACPI is in working state.

0x2206FFFF

ACPI is in the soft-off state.

ACPI is in soft-off state.

0x2300FFFF

The watchdog(arg1) timed out.

The watchdog timed out, and no action was performed.

arg1 indicates the watchdog type. It can be BIOS FRB2, BIOS/POST, OS Load, SMS/OS, or OEM.

0x27000033

PCH health status degradation detected by PFAE.

A PCH CE hard failure occurred.

0x28000015

CPU arg1 QPI/UPI arg2 link health status degradation detected by PFAE.

A CE hard failure occurred on the CPU QPI/UPI link.

  • arg1 indicates the CPU socket number.
  • arg2 indicates the QPI/UPI link number.

0x2902FFFF

The [arg1] PCIe card arg2 (arg3) BBU is present.

The BBU of the PCIe card is present.

  • arg1 indicates the PCIe card location.
  • arg2 indicates the PCIe card slot number.
  • arg3 indicates the PCIe card type.

RAID card arg1 BBU is present.

The BBU of the RAID controller card is present.

0x2908FFFF

The [arg1] PCIe card arg2 (arg3) BBU is absent.

The BBU of the PCIe card is not present.

  • arg1 indicates the PCIe card location.
  • arg2 indicates the PCIe card slot number.
  • arg3 indicates the PCIe card type.

RAID card arg1 BBU is absent.

The BBU of the RAID controller card is not present.

0x2B000003

RAID card (RAID Card1)arg1 PHY0 bit error increased too fast.

The bit error count of the SAS PHY increased sharply.

arg1 indicates xxxxx.

0x2B01FFFF

SDR or FRU info changed.

Version is changed.

iBMC version changed.

0x2C000053

The hard disk partition (arg1) usage (arg2%) exceeds the threshold (arg3%).

The usage of the drive partition is too high.

  • arg1 indicates the drive number.
  • arg2 indicates the current usage.
  • arg3 indicates the usage threshold.

0x2C000063

The host was restarted by the BMC.

The system was restarted by the iBMC.

0x43000001

The pass-through card arg1 installed.

The pass-through card was installed.

arg1 indicates the slot number of the pass-through card.

0x43000003

The pass-through card arg1 removed.

The pass-through card was removed.

arg1 indicates the slot number of the pass-through card.

0xF000FFFF

Transition to M0.cause:Normal State Change.fruid:#0/fru hot swap to M0 status

The device hot swap status changed to M0.

The Base plane of the switch module is not installed.

The Base module is not installed.

The Fabric plane of the switch module is not installed.

The Fabric module is not installed.

The switch mezzanine card arg1 is not installed.

The switch module mezzanine card is not installed.

arg1 indicates the slot number of the mezzanine card.

0xF001FFFF

Transition to M1.cause:Normal State Change.fruid:#0/fru hot swap to M1 status

The device hot swap status changed to M1.

The mainboard is installed but not powered on.

The mainboard is installed but not powered on.

The Base plane of the switch module requests power-on.

The Base module is installed but not powered on.

The Fabric plane of the switch module is installed but powered off.

The Fabric module is installed but not powered on.

The switch mezzanine card arg1 is installed but not powered on.

The switch module mezzanine card is installed but not powered on.

arg1 indicates the slot number of the mezzanine card.

0xF002FFFF

Transition to M2.cause:Normal State Change.fruid:#0/fru hot swap to M2 status

The device hot swap status changed to M2.

The mainboard requests power-on.

The mainboard is requesting power-on.

The Base plane of the switch module requests power-on.

The Base module is requesting power-on.

The Fabric plane of the switch module requests power-on.

The Fabric module is requesting power-on.

The switch mezzanine card arg1 requests power-on.

The mezzanine card is requesting power-on.

arg1 indicates the slot number of the mezzanine card.

0xF003FFFF

Transition to M3.cause:Normal State Change.fruid:#0/fru hot swap to M3 status

The device hot swap status changed to M3.

The mainboard is being powered on.

The mainboard is being powered on.

The Base plane of the switch module is being powered on.

The Base module is being powered on.

The Fabric plane of the switch module is being powered on.

The Fabric module is being powered on.

The switch mezzanine card arg1 is being powered on.

The mezzanine card is being powered on.

arg1 indicates the slot number of the mezzanine card.

0xF004FFFF

Transition to M4.cause:Normal State Change.fruid:#0/fru hot swap to M4 status

The device hot swap status changed to M4.

The mainboard is powered on.

The mainboard is powered on.

The Base plane of the switch module is powered on.

The Base module is powered on.

The Fabric plane of the switch module is powered on.

The Fabric module is powered on.

The switch mezzanine card arg1 is powered on.

The switch mezzanine card is powered on.

arg1 indicates the slot number of the mezzanine card.

0xF005FFFF

Transition to M5.cause:Normal State Change.fruid:#0/fru hot swap to M5 status

The device hot swap status changed to M5.

The mainboard requests power-off.

The mainboard is requesting power-off.

The Base plane of the switch module requests power-off.

The Base module is requesting power-off.

The Fabric plane of the switch module requests power-off.

The Fabric module is requesting power-off.

The switch mezzanine card arg1 requests power-off.

The mezzanine card is requesting power-off.

arg1 indicates the slot number of the mezzanine card.

0xF006FFFF

Transition to M6.cause:Normal State Change.fruid:#0/fru hot swap to M6 status

The device hot swap status changed to M6.

The mainboard is being powered off.

The mainboard is being powered off.

The Base plane of the switch module is being powered off.

The Base module is being powered off.

The Fabric plane of the switch module is being powered off.

The Fabric module is being powered off.

The switch mezzanine card arg1 is being powered off.

The mezzanine card is being powered off.

arg1 indicates the slot number of the mezzanine card.
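The hot swap events 0xF000FFFF through 0xF006FFFF step through states M0 to M6. As a rough aid for reading logs, the Python sketch below maps a code to its state and a short summary of the meanings listed above. The function name `hot_swap_state` is hypothetical, and extracting the state number from the second byte of the code is an inference from the listed values.

```python
# Short summaries of the M0..M6 meanings given in the table rows above.
HOT_SWAP_STATES = {
    0: "not installed",
    1: "installed but not powered on",
    2: "requesting power-on",
    3: "being powered on",
    4: "powered on",
    5: "requesting power-off",
    6: "being powered off",
}

def hot_swap_state(code: int) -> str:
    """Map a hot swap event code (0xF00nFFFF) to its state name and summary."""
    # Everything except the assumed state byte must match 0xF0..FFFF.
    if code & 0xFF00FFFF != 0xF000FFFF:
        raise ValueError(f"0x{code:08X} is not a hot swap event code")
    state = (code >> 16) & 0xFF
    if state not in HOT_SWAP_STATES:
        raise ValueError(f"state M{state} is not listed in the table")
    return f"M{state}: {HOT_SWAP_STATES[state]}"

for n in range(7):
    print(hot_swap_state(0xF000FFFF | (n << 16)))
```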

Updated: 2018-08-16

Document ID: EDOC1000015902
