KunLun 9008 V5 Alarm Handling 05

ALM-0x01000045 DCPMM Overtemperature (Memory, Minor Alarm)

Description

Alarm message:

The DCPMM (arg1 arg2) temperature (arg3 degrees C) exceeds the overtemperature threshold (arg4 degrees C)

This alarm is generated when the DCPMM temperature exceeds the alarm threshold. It is cleared automatically when the DCPMM temperature returns to the normal range.

Alarm object: memory

Attribute

Alarm ID: 0x01000045

Alarm Severity: Minor

Auto Clear: Yes

Parameters

Name: Meaning

arg1: Socket number of the CPU, for example, CPUn.

arg2: Either of the following:

  • DIMM silkscreen, for example, DIMM020 (A) or DIMM010 (B)
  • CPU socket number and channel number. For example, CPU 1 channel 2 indicates memory channel 2 of CPU 1, that is, DIMMs DIMM020 and DIMM021.

You can obtain the CPU socket number and channel number corresponding to a DIMM from the server user guide. For details, see System Compute Modules > DIMM Slot Locations in the KunLun 9008 V5 User Guide.

arg3: Current reading of the sensor.

arg4: Alarm threshold.
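
For log processing, the alarm message can be split back into its four parameters. The following is a minimal sketch, assuming the message is emitted exactly as in the template under Description. The names `parse_alarm` and `DcpmmOvertemp` and the sample values (CPU1, DIMM020, 85, 83) are illustrative only and do not come from the product.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Pattern derived from the alarm message template in this document.
# \s* before "degrees" tolerates a missing space (an assumption about real output).
_ALARM_RE = re.compile(
    r"The DCPMM \((?P<arg1>\S+) (?P<arg2>.+?)\) temperature "
    r"\((?P<arg3>-?\d+(?:\.\d+)?)\s*degrees C\) exceeds the "
    r"overtemperature threshold \((?P<arg4>-?\d+(?:\.\d+)?)\s*degrees C\)"
)


@dataclass
class DcpmmOvertemp:
    cpu_socket: str     # arg1, for example CPUn
    dimm: str           # arg2, DIMM silkscreen or CPU/channel description
    reading_c: float    # arg3, current sensor reading
    threshold_c: float  # arg4, alarm threshold

    @property
    def excess_c(self) -> float:
        """Degrees C by which the reading exceeds the threshold."""
        return self.reading_c - self.threshold_c


def parse_alarm(message: str) -> Optional[DcpmmOvertemp]:
    """Return the parsed alarm, or None if the text does not match the template."""
    m = _ALARM_RE.search(message)
    if m is None:
        return None
    return DcpmmOvertemp(m["arg1"], m["arg2"], float(m["arg3"]), float(m["arg4"]))


# Made-up sample message that follows the template.
sample = ("The DCPMM (CPU1 DIMM020) temperature (85 degrees C) "
          "exceeds the overtemperature threshold (83 degrees C)")
alarm = parse_alarm(sample)
print(alarm)
```

The non-greedy match for arg2 keeps a silkscreen such as DIMM020 (A) intact even though it contains parentheses.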

Impact on the System

Overheating affects DCPMM stability and server performance.

Possible Causes

  • The fan module is faulty.
  • The equipment room temperature exceeds the normal range.
  • The air inlet or outlet is blocked.
  • The DCPMM is faulty.

Procedure

  1. Check whether there are fan module alarms.

    • If yes, go to Step 2.
    • If no, go to Step 3.

  2. Replace the alarmed fan module. After 5 minutes, check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to Step 3.

  3. Check whether the equipment room temperature exceeds the normal range.

    • If yes, go to Step 4.
    • If no, go to Step 5.

  4. Reduce the equipment room temperature to the normal range. After 5 minutes, check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to Step 5.

  5. Check whether the air inlet or outlet of the server is blocked.

    • If yes, go to Step 6.
    • If no, go to Step 7.

  6. Clear the blockage. After 5 minutes, check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to Step 7.

  7. Replace the DCPMM. After the server is powered on, check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to Step 8.

  8. Contact Huawei technical support.
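
The procedure above can be read as an ordered decision flow: each check decides whether its remedial action applies, and after each action the alarm state is checked again. The following sketch models that flow under the assumption that a negative check skips straight to the next check. The function `plan_actions` and the `alarm_cleared` callback are hypothetical names for illustration. The callback stands in for the operator waiting 5 minutes (or powering on the server) and inspecting the alarm.

```python
from typing import Callable, List

FAN = "Replace the alarmed fan module"
COOL = "Reduce the equipment room temperature to the normal range"
UNBLOCK = "Clear the air inlet or outlet blockage"
DCPMM = "Replace the DCPMM"
SUPPORT = "Contact Huawei technical support"


def plan_actions(
    fan_alarm: bool,
    room_too_hot: bool,
    airflow_blocked: bool,
    alarm_cleared: Callable[[str], bool],
) -> List[str]:
    """Return the actions taken, in order, until the alarm clears.

    alarm_cleared(action) reports whether the alarm is gone after the
    given action has been carried out (hypothetical hook).
    """
    steps = [
        (fan_alarm, FAN),           # Steps 1-2
        (room_too_hot, COOL),       # Steps 3-4
        (airflow_blocked, UNBLOCK), # Steps 5-6
        (True, DCPMM),              # Step 7 always applies once reached
    ]
    taken: List[str] = []
    for applies, action in steps:
        if not applies:
            continue
        taken.append(action)
        if alarm_cleared(action):
            return taken            # no further action is required
    taken.append(SUPPORT)           # Step 8
    return taken


# Example: a fan alarm is present, but only clearing the blockage fixes it.
print(plan_actions(True, False, True, lambda action: action == UNBLOCK))
```

Keeping the alarm check as a callback leaves the waiting time and the way the alarm state is read outside the sketch, since the document does not specify a programmatic interface for either.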
Updated: 2019-05-25

Document ID: EDOC1100023838
