Event Alarms
An event alarm indicates that an event occurred while a server was running. Generally, this type of alarm does not affect services and does not need to be handled immediately. Users can handle event alarms during off-peak hours. Table 10-1 lists the server events.
| Event Code | Event Description | Impact/Suggestions |
|---|---|---|
| 0x00000015 | CPU arg1 installed. NOTE: arg1 indicates the CPU No. | - |
| 0x00000017 | CPU arg1 removed. NOTE: arg1 indicates the CPU No. | Impact: The server may break down. |
| 0x00000079 | CPU arg1 health status degradation detected by PFAE. NOTE: arg1 indicates the CPU No. | Impact: The system reliability is affected. Handling suggestions: |
| 0x00000081 | arg1 CPU arg2 is replaced from SN(arg3) to SN(arg4). NOTE: | - |
| 0x0100000D | [Memory board arg1] arg2 memory correctable ECC. NOTE: | Impact: The system performance is affected. |
| 0x0100000F | [Memory board arg1] arg2 installed. NOTE: | - |
| 0x01000011 | [Memory board arg1] arg2 removed. NOTE: | Impact: The system performance is affected. Handling suggestions: |
| 0x0100002B | [arg1]arg2 memory initialization error. Error code: arg3. NOTE: | - |
| 0x0100002D | [arg1] arg2 health status degradation detected by PFAE. NOTE: | Impact: The system reliability is affected. Handling suggestions: |
| 0x01000041 | arg1 arg2 is replaced from SN(arg3) to SN(arg4). NOTE: | - |
| 0x02000003 | The [arg1] disk arg2 installed. NOTE: | - |
| 0x02000005 | The [arg1] disk arg2 removed. NOTE: | - |
| 0x0200000D | RAID rebuild starts at the [arg1] disk arg2. NOTE: | - |
| 0x0200000F | RAID rebuild at the [arg1] disk arg2 stopped. NOTE: | A stopped rebuild does not mean that the rebuild succeeded. You still need to check for alarms related to the drives and the RAID controller card. |
| 0x0200001F | The [arg1] disk arg2 health status degradation detected by PFAE. NOTE: | Impact: The system reliability is affected. Handling suggestions: |
| 0x02000023 | The arg1 disk arg2 is replaced from SN(arg3) to SN(arg4). NOTE: | - |
| 0x02000033 | The [arg1] disk arg2 disconnected temporarily. NOTE: | - |
| 0x03000003 | PSU arg1 installed. NOTE: arg1 indicates the PSU slot No. | - |
| 0x03000005 | PSU arg1 removed. NOTE: arg1 indicates the PSU slot No. | Impact: The power supply redundancy is affected. |
| 0x03000021 | High output voltage detected on PSU arg1. NOTE: arg1 indicates the PSU slot No. | Impact: The output power voltage is high. |
| 0x03000023 | Low output voltage detected on PSU arg1. NOTE: arg1 indicates the PSU slot No. | Impact: The output power voltage is low. |
| 0x03000025 | High output current detected on PSU arg1. NOTE: arg1 indicates the PSU slot No. | Impact: The PSU is about to fail. |
| 0x03000027 | High input voltage detected on PSU arg1. NOTE: arg1 indicates the PSU slot No. | Impact: The input voltage is too high, and the PSU is about to fail. |
| 0x0300001D | Low input voltage detected on PSU arg1. NOTE: arg1 indicates the PSU slot No. | Impact: The input voltage is too low, and the PSU is about to fail. |
| 0x03000029 | High temperature detected on PSU arg1. NOTE: arg1 indicates the PSU slot No. | Impact: The PSU internal temperature is too high, and the PSU is about to fail. |
| 0x0300002B | Fan alarm detected on PSU arg1. NOTE: arg1 indicates the PSU slot No. | Impact: A PSU fan alarm is generated, and the PSU is about to fail. |
| 0x04000001 | Fan arg1 [arg2] installed. NOTE: | - |
| 0x04000003 | Fan arg1 [arg2] removed. NOTE: | Impact: The fan redundancy is affected. |
| 0x06000001 | The RAID controller card arg1 installed. NOTE: arg1 indicates the slot No. of the RAID controller card. | - |
| 0x06000003 | The RAID controller card arg1 removed. NOTE: arg1 indicates the slot No. of the RAID controller card. | Impact: The services related to the RAID controller card will be interrupted. |
| 0x06000013 | arg1 RAID card arg2 BBU is absent. NOTE: | Impact: The cache function of the RAID controller card fails. Handling suggestions: Replace the BBU. |
| 0x06000015 | arg2 RAID card arg1 BBU is present. NOTE: | - |
| 0x06000023 | The arg1 RAID controller card arg2 health status degradation detected by PFAE. NOTE: | Impact: The system is still running properly, but the reliability is affected. Handling suggestions: |
| 0x08000019 | The [arg1] PCIe card arg2 (arg3) starting arg4. NOTE: | - |
| 0x0800003D | The [arg1] PCIe card arg2 (RAID) BBU is absent. NOTE: | Impact: The cache function of the PCIe card fails. Handling suggestions: Replace the BBU. |
| 0x0800003F | The [arg1] PCIe card arg2 (RAID) BBU is present. NOTE: | - |
| 0x08000065 | arg1 arg2 [arg3] health status degradation detected by PFAE. NOTE: | Impact: The system reliability is affected. Handling suggestions: |
| 0x0800008F | The arg1 PCIe card arg2 (arg3) arg4 chip was reset. [arg5] [arg6] NOTE: | - |
| 0x0D000007 | The NIC arg1 health status degradation detected by PFAE. NOTE: arg1 indicates the NIC slot No. | Impact: The system reliability is affected. Handling suggestions: |
| 0x0F000001 | PCIe riser card arg1 installed. NOTE: arg1 indicates the slot No. of the PCIe riser card. | - |
| 0x0F000003 | PCIe riser card arg1 removed. NOTE: arg1 indicates the slot No. of the PCIe riser card. | Impact: The services related to the PCIe card will be interrupted. |
| 0x100000C3 | Failed to obtain the RTC Time on the mainboard. | Impact: The log time on the iBMC is inaccurate. Suggestions: |
| 0x100000CD | The LOM [arg1] health status degradation detected by PFAE. NOTE: arg1 indicates the LOM slot No. | Impact: The system reliability is affected. Handling suggestions: |
| 0x12000005 | Chassis cover opened. | Impact: Heat dissipation and component protection will be affected. Handling suggestions: Close the chassis cover. |
| 0x1A00000B | arg1 changed. NOTE: arg1 indicates the SDR or FR information and iBMC version. | - |
| 0x1A00000D | iBMC is restarted after AC power supply is restored. | - |
| 0x1A00000F | iBMC event records are cleared. | - |
| 0x1A000011 | iBMC event record has reached 90% space capacity. | Impact: If this alarm is not handled in time, the event records will overflow. Handling suggestions: Clear event records. |
| 0x1A00001B | iBMC operation log has reached 90% space capacity. | Impact: If this alarm is not handled in time, the operation logs will overflow and some historical operation logs may be lost. Handling suggestions: |
| 0x1A00001D | iBMC security log has reached 90% space capacity. | Impact: If this alarm is not handled in time, the security logs will overflow and some historical security logs may be lost. Handling suggestions: |
| 0x1A000021 | iBMC is reset and started. | - |
| 0x1A000023 | arg1 certificate is about to expire or has expired. NOTE: | Handling suggestions: Import a new certificate. |
| 0x1A000025 | Heartbeat signals between the iBMC and the system management software (iBMA) are lost. | Impact: The in-band management and monitoring information cannot be obtained or updated in real time. Handling suggestions: Reinstall the iBMA. |
| 0x1A000029 | iBMC time is stepped by more than arg1 minutes. NOTE: arg1 indicates the time stepped. | Impact: The iBMC log time is inaccurate. Handling suggestions: Restart the iBMC. |
| 0x1A00002B | iBMC failed to synchronize time with the NTP server. NOTE: The event code of this alarm is 0x1A000031 in versions earlier than iBMC V256 and 0x1A00002B in iBMC V256 or later. | Impact: The iBMC system time is inaccurate. Handling suggestions: |
| 0x1A000039 | The iBMC license enters the grace period and can still be used. It will expire in arg1 days. NOTE: arg1 indicates the remaining days in the grace period. | Impact: Advanced features of the iBMC cannot be implemented. Handling suggestions: |
| 0x1A00003B | The iBMC license has expired. | Impact: Advanced features of the iBMC cannot be implemented. Handling suggestions: |
| 0x27000033 | PCH health status degradation detected by PFAE. | Impact: The system reliability is affected. Handling suggestions: |
| 0x28000015 | CPU arg1 QPI/UPI arg2 link health status degradation detected by PFAE. NOTE: | Impact: The system reliability is affected. Handling suggestions: |
| 0x29000001 | arg1 [arg2] portarg3 disconnected. NOTE: | Impact: The network port services will be interrupted. Handling suggestions: |
| 0x2B000003 | arg1 NOTE: arg1 indicates Alarm message: | Impact: The performance of the drive connected to the corresponding slot may deteriorate, or the drive may even go offline. Handling suggestions: |
| 0x2C000001 | The CPU usage (arg1) exceeds the threshold (arg2). NOTE: | Impact: The system performance is affected. Handling suggestions: |
| 0x2C000003 | The memory usage (arg1) exceeds the threshold (arg2). NOTE: | Impact: The system performance is affected. Handling suggestions: |
| 0x2C000009 | ACPI is in the working state. | - |
| 0x2C00000B | ACPI is in the soft-off state. | Impact: The server fails to power on. |
| 0x2C000063 | The host was restarted by BMC arg1. NOTE: arg1 indicates the cause of the restart, for example, "due to an IERR diagnosis failure" or "due to PCIe switch or retimer upgrade". | Impact: Services will be interrupted. Handling suggestions: Restart the iBMC as soon as possible. |
| 0x2C00000F | The host was restarted due to unrecognized reason. | Impact: Services running on the server will be interrupted. |
| 0x2C000011 | The host was restarted by command. | Impact: Services running on the server will be interrupted. |
| 0x2C000013 | The host was restarted by power button. | Impact: Services running on the server will be interrupted. |
| 0x2C000015 | The host was restarted due to watchdog timeout. | Impact: Services running on the server will be interrupted. |
| 0x2C000017 | The host is restarted after being powered on (Power strategy is "Turn On"). | Impact: Services running on the server will be interrupted. |
| 0x2C000019 | The host is restarted after being powered on (Power strategy is "Restore Previous State"). | Impact: Services running on the server will be interrupted. |
| 0x2C00001B | The OS cannot start without a boot device. | Impact: The server OS fails to start. |
| 0x2C00001D | The OS cannot start without a bootable disk. | Impact: The server OS fails to start. |
| 0x2C00001F | The OS cannot start because the PXE service is unavailable. | Impact: The server OS fails to start. |
| 0x2C000021 | The OS cannot start due to the invalid boot partition. | Impact: The server OS fails to start. |
| 0x2C000023 | The watchdog(arg1) timed out. NOTE: arg1 indicates the watchdog type, which can be BIOS FRB2, BIOS/POST, OS Load, SMS/OS, or OEM. | - |
| 0x2C00002D | Power capping failed. | Impact: The server automatically powers off, which interrupts services. Handling suggestions: |
| 0x2C00002F | The server system crashes or is abnormally reset. | Impact: The server OS is abnormal, and related services are interrupted. |
| 0x2C000051 | arg1 arg2 arg3 arg4 memory initialization error. Error code: 0xarg5. NOTE: arg1 indicates the slot number of the memory board. arg2 indicates the DIMM silkscreen or CPU socket number and memory channel No. arg3 indicates the error code of the alarm. arg4 indicates the memory serial number. arg5 indicates the BOM code. | Impact: The server performance is affected, or the server fails to start. |
| 0x2C000053 | The hard disk partition (arg1) usage (arg2) exceeds the threshold (arg3). NOTE: | Impact: The system performance is affected. Handling suggestions: |
| 0x2C000059 | System power is abnormal, reading value:arg1,threshold value:arg2. NOTE: arg1 indicates the current reading of the sensor. arg2 indicates the alarm threshold. | Impact: The management software cannot correctly report the total system power consumption in real time. Handling suggestions: 1. Restart the iBMC. 2. Replace the PSU. |
| 0x2C000061 | Network arg1 [arg2] arg3 bandwidth usage (arg4) exceeds the threshold (arg5). NOTE: | Impact: The packet loss rate of the NIC port increases, and the communication quality deteriorates. Handling suggestions: |
| 0x31000001 | The power button on the panel is pressed. | Impact: The server will be powered off. |
| 0x31000003 | The UID button on the panel is pressed. | - |
| 0x2C000085 | After the AC is powered on, the host is restarted because the SP information collection is completed. | - |
| 0x05000015 | The disk backplane arg1 is replaced from SN(arg2) to SN(arg3). NOTE: | - |
| 0x080000BD | arg1 is replaced from SN(arg2) to SN(arg3). NOTE: | - |
| 0x03000043 | PSU arg1 is replaced from SN(arg2) to SN(arg3). NOTE: | - |
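
The event descriptions in the table use positional placeholders (arg1, arg2, and so on), and the note for 0x1A00002B says that versions earlier than iBMC V256 report the same event as 0x1A000031. The following Python sketch shows one way a monitoring script might look up a code and fill in the placeholders. It is illustrative only: `EVENTS`, `LEGACY_CODES`, `render`, and `describe` are hypothetical names, not part of any iBMC interface, and the dictionary holds only three rows copied from the table.

```python
import re

# A few rows copied from the table: code -> (description template, impact).
# Hypothetical structure for illustration; extend it with the rows you need.
EVENTS = {
    0x00000017: ("CPU arg1 removed.", "The server may break down."),
    0x03000005: ("PSU arg1 removed.", "The power supply redundancy is affected."),
    0x1A00002B: ("iBMC failed to synchronize time with the NTP server.",
                 "The iBMC system time is inaccurate."),
}

# Per the note on 0x1A00002B: versions earlier than iBMC V256 use 0x1A000031.
LEGACY_CODES = {0x1A000031: 0x1A00002B}

_ARG = re.compile(r"arg(\d+)")


def render(template, args):
    """Replace arg1, arg2, ... with args[0], args[1], ...

    Placeholders without a matching value are left unchanged.
    """
    def substitute(match):
        index = int(match.group(1)) - 1
        if 0 <= index < len(args):
            return str(args[index])
        return match.group(0)

    return _ARG.sub(substitute, template)


def describe(code, args=()):
    """Return a one-line summary for a known event code, or None if unknown."""
    code = LEGACY_CODES.get(code, code)  # map the pre-V256 code to the current one
    entry = EVENTS.get(code)
    if entry is None:
        return None
    template, impact = entry
    return f"0x{code:08X}: {render(template, args)} Impact: {impact}"
```

For example, `describe(0x00000017, [2])` produces a line for the removal of CPU 2, and `describe(0x1A000031)` is reported under the current code 0x1A00002B. Matching `arg` followed by digits with a regular expression, rather than chaining `str.replace` calls, avoids a value that itself contains text such as "arg2" being substituted a second time.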