FusionServer Pro Rack Server iBMC Alarm Handling 30

This document describes iBMC alarms in terms of the meaning, impact on the system, possible causes, and handling suggestions.
Event Alarms by Sensor

An event alarm indicates that an event occurred while a server was running. Generally, this type of alarm does not affect services and does not need to be handled immediately. Users can handle event alarms during off-peak hours. Table 3-3 lists the server events.
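Because several rows of Table 3-3 share one event code, a lookup keyed by code has to return a list of candidate events rather than a single one. The following is a minimal sketch of such a lookup, using a few rows copied from the table; the names ROWS, build_index, and lookup are illustrative and not part of iBMC.

```python
from collections import defaultdict

# A few (code, description) rows copied from Table 3-3.
ROWS = [
    (0x0500FFFF, "Chassis cover opened."),
    (0x0580FFFF, "Chassis cover closed."),
    (0x0840FFFF, "Fan arg1 [arg2] removed."),
    (0x0840FFFF, "LCD removed."),
]


def build_index(rows):
    """Group descriptions by event code; one code may map to several events."""
    index = defaultdict(list)
    for code, description in rows:
        index[code].append(description)
    return dict(index)


def lookup(index, code_text):
    """Return the candidate descriptions for a code given as hex text."""
    return index.get(int(code_text, 16), [])


EVENT_INDEX = build_index(ROWS)
```

When a code returns more than one description, the event text itself is needed to tell the events apart.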

Table 3-3 Event list

Event Code

Event Description

Impact/Suggestions

0x0341FFFF

The UID button on the panel is pressed.

0x0341FFFF

arg1 certificate is about to expire or has expired.

NOTE:

arg1 indicates the certificate type.

Impact: None.

Handling suggestions: Import a new certificate.

0x0441FFFF

Power capping failed.

Impact: The server automatically shuts down or fails to power on, which interrupts services.

Handling suggestions:

  1. Check whether the mains supply meets power consumption requirements of the server.
  2. If it does not, rectify the mains supply so that it meets the requirements.
  3. If the alarm still persists, increase the power cap value for the server.

0x0500FFFF

Chassis cover opened.

Impact: The heat dissipation of the chassis is affected, and the server components are not protected.

Handling suggestions: Close the chassis cover.

0x0541FFFF

The CPU usage (arg1) exceeds the threshold (arg2).

NOTE:
  • arg1 indicates the current reading of the sensor.
  • arg2 indicates the alarm threshold.

Impact: The system performance is affected.

Handling suggestions:

  1. Check whether the CPU usage threshold is set to a value that is too low within the value range.
  2. Stop unnecessary services to release CPU resources.
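The comparison behind this event can be sketched as follows. This is only an illustration: it assumes that the reading and the threshold are percentages and that the event is raised when the reading is strictly greater than the threshold, neither of which is stated in the table. The function name usage_event is hypothetical.

```python
def usage_event(resource, reading, threshold):
    """Return the event text when the reading exceeds the threshold, else None.

    Assumption: both values are percentages, and "exceeds" means strictly
    greater than the threshold.
    """
    if reading > threshold:
        return (
            f"The {resource} usage ({reading}%) "
            f"exceeds the threshold ({threshold}%)."
        )
    return None
```

For example, usage_event("CPU", 95, 90) produces the event text, while a reading equal to the threshold produces nothing. A threshold set very low within its value range would raise the event under normal load, which is why the first handling step is to review the threshold.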

0x0541FFFF

The memory usage (arg1) exceeds the threshold (arg2).

NOTE:
  • arg1 indicates the current reading of the sensor.
  • arg2 indicates the alarm threshold.

Impact: The system performance is affected.

Handling suggestions:

  1. Check whether the memory usage threshold is set to a value that is too low within the value range.
  2. Stop unnecessary services to release memory resources.

0x0580FFFF

Chassis cover closed.

0x0707FFFF

CPU arg1 installed.

NOTE:

arg1 indicates the CPU No.

0x0748FFFF

The server system crashes or is abnormally reset.

Impact: Services will be interrupted.

Handling suggestions:

  1. Collect the iBMC and OS logs.
  2. Contact Huawei technical support.

0x0748FFFF

iBMC operation log has reached 90% space capacity.

Impact: Some historical operation logs will be lost.

Handling suggestions:
  1. Export the operation logs.
  2. Enable remote syslog dump for operation logs.
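The 90% condition can be sketched as a simple capacity check. The unit of capacity (entries or bytes) is not given in the table, so the sketch below treats both arguments as counts in the same unit; the function name needs_export is hypothetical.

```python
def needs_export(used, capacity, limit_percent=90):
    """Return True once the log has reached limit_percent of its capacity.

    Integer arithmetic avoids floating-point rounding at the boundary.
    Assumption: used and capacity are expressed in the same unit.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return used * 100 >= capacity * limit_percent
```

A monitoring script could call this periodically and export the logs as soon as it returns True, before older records are lost.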

0x0748FFFF

iBMC security log has reached 90% space capacity.

Impact: Some historical security logs will be lost.

Handling suggestions:
  1. Export the security logs.
  2. Enable remote syslog dump for security logs.

0x0787FFFF

CPU arg1 removed.

NOTE:

arg1 indicates the CPU No.

Impact: The server OS crashes.

Handling suggestions: None.

0x0840FFFF

Fan arg1 [arg2] removed.

NOTE:
  • arg1 indicates the fan module location, for example front, inner, or rear.

  • arg2 indicates the fan module No.

Impact: The fan redundancy is affected.

Handling suggestions: None.
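Event descriptions in this table are templates in which arg1, arg2, and so on stand for values supplied when the event is reported. The following sketch shows one way to fill such a template; the render function is illustrative and does not reflect how iBMC itself builds the text.

```python
import re


def render(template, *args):
    """Replace arg1, arg2, ... in a description with the given values.

    Placeholders that have no matching value are left unchanged.
    """
    def substitute(match):
        position = int(match.group(1)) - 1  # arg1 maps to args[0]
        if 0 <= position < len(args):
            return str(args[position])
        return match.group(0)

    return re.sub(r"\barg(\d+)\b", substitute, template)
```

For example, render("Fan arg1 [arg2] removed.", "front", 3) yields "Fan front [3] removed.".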

0x0840FFFF

LCD removed.

0x0841FFFF

PSU arg1 installed.

NOTE:

arg1 indicates the PSU No.

0x0841FFFF

The RAID controller card arg1 installed.

NOTE:

arg1 indicates the slot No. of the RAID controller card.

0x0880FFFF

LCD installed.

0x08C0FFFF

Fan arg1 [arg2] installed.

NOTE:
  • arg1 indicates the fan module location, for example front, inner, or rear.

  • arg2 indicates the fan module No.

0x08C1FFFF

PSU arg1 removed.

NOTE:

arg1 indicates the PSU No.

Impact: The PSU redundancy is affected.

Handling suggestions: None.

0x08C1FFFF

The RAID controller card arg1 removed.

NOTE:

arg1 indicates the slot No. of the RAID controller card.

Impact: Services related to the RAID controller card are interrupted.

Handling suggestions: None.

0x0941FFFF

iBMC is restarted after AC power supply is restored.

0x0941FFFF

iBMC is reset and started.

0x0C05FFFF

[arg1] arg2 memory correctable ECC.

NOTE:
  • arg1 indicates the slot No. of the memory board.
  • arg2 indicates the DIMM silkscreen.

Impact: The system performance is affected.

Handling suggestions: None.

0x0C06FFFF

[arg1] arg2 installed.

NOTE:
  • arg1 indicates the slot No. of the memory board.
  • arg2 indicates the DIMM silkscreen.

0x0C86FFFF

[arg1] arg2 removed.

NOTE:
  • arg1 indicates the slot No. of the memory board.
  • arg2 indicates the DIMM silkscreen.

Impact: The system performance is affected.

Handling suggestions: None.

0x0C06FFFF

[arg1] arg2 installed.

NOTE:
  • arg1 indicates the slot No. of the memory board.
  • arg2 indicates the DIMM silkscreen.

0x0C86FFFF

[arg1] arg2 removed.

NOTE:
  • arg1 indicates the slot No. of the memory board.
  • arg2 indicates the DIMM silkscreen.

Impact: The system performance is affected.

Handling suggestions:

  1. Install a DIMM in the slot.
  2. Remove and reinstall the DIMM.
  3. Replace the DIMM.
  4. Replace the mainboard or the board holding the DIMM.

0x0D00FFFF

The [arg1] disk arg2 installed.

NOTE:
  • arg1 indicates the hard disk location, for example rear.
  • arg2 indicates the slot No. of the hard disk, for example diskA1 or diskB1.

0x0D00FFFF

SD card arg1 installed.

NOTE:

arg1 indicates the slot No. of the SD card.

0x0D07FFFF

RAID rebuild starts at the [arg1] disk arg2.

NOTE:
  • arg1 indicates the hard disk location, for example rear.
  • arg2 indicates the slot No. of the hard disk, for example diskA1 or diskB1.

0x0D07FFFF

Data rebuild starts at SD card arg1.

NOTE:

arg1 indicates the slot No. of the SD card.

0x0D80FFFF

The [arg1] disk arg2 removed.

NOTE:
  • arg1 indicates the hard disk location, for example rear.
  • arg2 indicates the slot No. of the hard disk, for example diskA1 or diskB1.

0x0D80FFFF

SD card arg1 removed.

NOTE:

arg1 indicates the slot No. of the SD card.

Impact: The storage capacity of the server is reduced.

Handling suggestions: None.

0x0D87FFFF

RAID rebuild at the [arg1] disk arg2 is complete.

NOTE:
  • arg1 indicates the hard disk location, for example rear.
  • arg2 indicates the slot No. of the hard disk, for example diskA1 or diskB1.

0x0D87FFFF

Data rebuild at SD card arg1 is complete.

NOTE:

arg1 indicates the slot No. of the SD card.

0x1002FFFF

iBMC event records are cleared.

0x1005FFFF

iBMC event record has reached 90% space capacity.

Impact: If the event records keep increasing, an alarm will be generated indicating that the event log is about to be full.

Handling suggestions: None.

0x1400FFFF

The power button on the panel is pressed.

Impact: The server is powered off.

Handling suggestions: None.

0x1D0700FF

The host was restarted for an unrecognized reason.

Impact: The OS restarts, which interrupts services.

Handling suggestions: None.

0x1D0701FF

The host was restarted by command.

Impact: The OS restarts, which interrupts services.

Handling suggestions: None.

0x1D0703FF

The host was restarted by power button.

Impact: The OS restarts, which interrupts services.

Handling suggestions: None.

0x1D0704FF

The host was restarted due to watchdog timeout.

Impact: The OS restarts, which interrupts services.

Handling suggestions: None.

0x1D0705FF

The host was restarted by BMC.

Impact: The OS restarts, which interrupts services.

Handling suggestions: None.

0x1D0706FF

The host is restarted after being powered on. (Power strategy is "Turn On".)

Impact: The OS restarts, which interrupts services.

Handling suggestions: None.

0x1D0707FF

The host is restarted after being powered on. (Power strategy is "Restore Previous State".)

Impact: The OS restarts, which interrupts services.

Handling suggestions: None.
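The host restart events listed above share the prefix 0x1D07 and differ only in the third byte of the code. The sketch below reads that byte to recover the restart cause. The mapping is taken from the rows of this table; treating the third byte as a cause index is an observation from those rows, not a documented encoding, and the names RESTART_CAUSES and restart_cause are hypothetical.

```python
# Third byte of the event code -> restart cause, as listed in Table 3-3.
RESTART_CAUSES = {
    0x00: "unrecognized reason",
    0x01: "command",
    0x03: "power button",
    0x04: "watchdog timeout",
    0x05: "BMC",
    0x06: 'powered on, power strategy "Turn On"',
    0x07: 'powered on, power strategy "Restore Previous State"',
}


def restart_cause(code):
    """Return the restart cause for a 0x1D07xxFF code, or None if unknown."""
    if code >> 16 != 0x1D07:
        return None  # not a host restart event
    return RESTART_CAUSES.get((code >> 8) & 0xFF)
```

For example, restart_cause(0x1D0704FF) returns "watchdog timeout". Values that do not appear in the table, such as 0x02, return None.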

0x1E00FFFF

The OS cannot start without a boot device.

Impact: The OS fails to start.

Handling suggestions: None.

0x1E01FFFF

The OS cannot start without a bootable disk.

Impact: The OS fails to start.

Handling suggestions: None.

0x1E02FFFF

The OS cannot start because the PXE service is unavailable.

Impact: The OS fails to start.

Handling suggestions: None.

0x1E03FFFF

The OS cannot start due to an invalid boot partition.

Impact: The OS fails to start.

Handling suggestions: None.

0x2101FFFF

Heartbeat signals between the iBMC and the system management software (iBMA) are lost.

Impact: The in-band management and monitoring information cannot be obtained or updated in real time.

Handling suggestions: Reinstall the iBMA.

0x2108FFFF

arg1 [arg2] arg3 disconnected.

NOTE:
  • arg1 indicates the network adapter name, for example NIC 1, PCIe Card 1, or LOM.
  • arg2 indicates the network adapter type, for example (NIC) or (FC).
  • arg3 indicates the network port, for example port 1.

Impact: Services over the network port are interrupted.

Handling suggestions:

  1. Remove and reconnect the network cable.
  2. Check whether the network cable is connected to the switch.
  3. Check whether the peer switch is working properly.
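After the cable and the peer switch have been checked, the link state can also be confirmed from the host OS. The sketch below assumes a Linux host, where the kernel exposes the state of each interface in /sys/class/net/<interface>/operstate; the interface name and the function name link_is_up are examples, not values taken from iBMC.

```python
from pathlib import Path


def link_is_up(interface, sysfs_root="/sys/class/net"):
    """Return True if the Linux host reports the interface state as 'up'."""
    state_file = Path(sysfs_root) / interface / "operstate"
    try:
        return state_file.read_text().strip() == "up"
    except OSError:
        # Missing interface or unreadable file: treat the link as not up.
        return False
```

A call such as link_is_up("eth0") gives a quick check that the port has recovered once the cable is reconnected.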

0x2200FFFF

ACPI is in the working state.

0x2206FFFF

ACPI is in the soft-off state.

Impact: The server fails to power on.

Handling suggestions: None.

0x2300FFFF

The watchdog (arg1) timed out.

NOTE:

arg1 indicates the system boot process, for example BIOS FRB 2, BIOS/POST, OS Load, SMS/OS or OEM.

0x2902FFFF

arg1 RAID card arg2 BBU is absent.

NOTE:
  • arg1 indicates the front I/O module (for example FM) or the compute node and its slot No. (for example CMn).
  • arg2 indicates the slot No. of the RAID controller card.

Impact: The cache of the RAID controller card fails.

Handling suggestions: Install the BBU.

0x2902FFFF

The [arg1] PCIe card arg2 (arg3) BBU is present.

NOTE:
  • arg1 indicates the PCIe card location, for example front or rear.
  • arg2 indicates the PCIe card slot No.
  • arg3 indicates the PCIe card type.

0x2982FFFF

The [arg1] PCIe card arg2 (arg3) BBU is absent.

NOTE:
  • arg1 indicates the PCIe card location, for example front or rear.
  • arg2 indicates the PCIe card slot No.
  • arg3 indicates the PCIe card type.

Impact: The cache of the PCIe card fails.

Handling suggestions: Install the BBU.

0x2982FFFF

arg1 RAID card arg2 BBU is present.

NOTE:
  • arg1 indicates the front I/O module (for example FM) or the compute node and its slot No. (for example CMn).
  • arg2 indicates the slot No. of the RAID controller card.
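Many events in Table 3-3 come in pairs, such as cover opened and closed, or component installed and removed, and the two codes of a pair differ only in the high bit of the second byte (for example 0x0500FFFF and 0x0580FFFF, or 0x0841FFFF and 0x08C1FFFF). The sketch below derives the counterpart of a code by toggling that bit. This pattern is an observation from the table rather than a documented rule, and the name paired_code is hypothetical.

```python
PAIR_BIT = 0x00800000  # high bit of the second byte of the event code


def paired_code(code):
    """Return the counterpart code obtained by toggling the pair bit."""
    return code ^ PAIR_BIT


def format_code(code):
    """Format a code the way the table prints it, e.g. 0x0580FFFF."""
    return f"0x{code:08X}"
```

For example, format_code(paired_code(0x0500FFFF)) gives "0x0580FFFF", the code listed for "Chassis cover closed."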

Updated: 2019-08-05

Document ID: EDOC1000054724
