E9000 Server V100R001 HMM Alarm Handling 19

This document describes E9000 server alarms in terms of the meaning, impact on the system, possible causes, and solutions.
Slot: CPU QPI/UPI Connection Failed (Major, CPUN QPI Link)

Description

Alarm message:

Incorrect cable connected/Incorrect interconnection

or

CPU arg1 QPI/UPI arg2 link failed.

This alarm is generated when the QuickPath Interconnect (QPI) or Ultra Path Interconnect (UPI) bus between CPUs is faulty.

This alarm is generated by the following sensors:

CPUN QPI Link

Attribute

Alarm ID      Alarm Severity   Auto Clear
0x1B01FFFF    Major            Yes

Parameters

Name      Meaning
arg1, N   Slot number of the CPU.
arg2      Location of the QPI/UPI link.
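
As an illustration of how arg1 and arg2 map onto the second alarm message form, the following is a minimal Python sketch that extracts them from the message text. It assumes both parameters are rendered as plain decimal integers, which this document does not confirm; the names ALARM_PATTERN and parse_qpi_alarm are hypothetical and are not part of any HMM tool.

```python
import re
from typing import Optional, Tuple

# Pattern for the message form quoted in this document:
#   "CPU arg1 QPI/UPI arg2 link failed."
# Assumption: arg1 and arg2 appear as decimal integers.
ALARM_PATTERN = re.compile(r"CPU\s*(\d+)\s+QPI/UPI\s*(\d+)\s+link failed\.?")


def parse_qpi_alarm(message: str) -> Optional[Tuple[int, int]]:
    """Return (cpu_slot, link_location), or None if the text does not match."""
    match = ALARM_PATTERN.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

The first message form ("Incorrect cable connected/Incorrect interconnection") carries no parameters in its text, so the function returns None for it; in that case the CPU is identified by the sensor name (CPUN QPI Link).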

Impact on the System

Server performance is affected.

Possible Causes

  • The QPI/UPI link is faulty.
  • The CPU is faulty.

Procedure

  1. Power off the compute node and remove it from the chassis. Then check whether the CPU socket is damaged.

    You can power off the compute node using either of the following methods:
    • On the HMM CLI, run the smmset -l bladeN -d powerstate -v poweroff command, where N is the slot number of the compute node.
    • On the HMM WebUI, choose PSUs & Fans > PSU Management > Power Control, and click the power control button to power off the compute node.

    After checking the CPU socket:
    • If it is damaged, go to 4.
    • If it is not damaged, go to 2.

  2. Check whether the CPU is faulty.

    • If yes, go to 3.
    • If no, go to 4.

    The following example shows how to locate the fault when this alarm is reported:

    Incorrect cable connected/Incorrect interconnection (CPU1 QPI Link)
    1. Swap CPU1 with a CPU that is known to be working properly.
    2. Power on the server. If the alarm now indicates a different CPU, CPU1 is faulty. If the alarm still indicates CPU1, the QPI link on the mainboard is faulty.

  3. Replace the CPU. Then check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 5.

  4. Replace the mainboard. Then check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 5.

  5. On the HMM WebUI, choose System Management > Information Collection, and collect logs.
  6. Contact Huawei technical support.
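
The decision logic in steps 1 to 4 can be summarized in a short Python sketch. It does not communicate with the HMM; it only builds the power-off command string quoted in step 1 and encodes the branching of the procedure. All function and class names are hypothetical, and the swap-test inputs (the CPU indicated by the alarm before and after the swap) are assumed to be read manually from the alarm message.

```python
from enum import Enum


class Action(Enum):
    REPLACE_CPU = "replace the CPU (step 3)"
    REPLACE_MAINBOARD = "replace the mainboard (step 4)"


def poweroff_command(slot: int) -> str:
    """Build the HMM CLI power-off command from step 1 for blade slot N."""
    return f"smmset -l blade{slot} -d powerstate -v poweroff"


def cpu_is_faulty(alarmed_cpu_before: int, alarmed_cpu_after: int) -> bool:
    """Swap test from step 2: if the alarm moves with the CPU, the CPU is faulty."""
    return alarmed_cpu_after != alarmed_cpu_before


def choose_action(socket_damaged: bool,
                  alarmed_cpu_before: int,
                  alarmed_cpu_after: int) -> Action:
    """Pick the replacement step according to steps 1 and 2."""
    if socket_damaged:
        # Step 1: a damaged CPU socket leads directly to step 4.
        return Action.REPLACE_MAINBOARD
    if cpu_is_faulty(alarmed_cpu_before, alarmed_cpu_after):
        return Action.REPLACE_CPU
    # The alarm stayed on the same CPU position: the mainboard link is suspect.
    return Action.REPLACE_MAINBOARD
```

For example, choose_action(False, 1, 2) returns Action.REPLACE_CPU, because the alarm followed the CPU after the swap. If the alarm persists after the replacement, the procedure continues with log collection and contacting technical support (steps 5 and 6), which the sketch does not model.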
Updated: 2018-08-16

Document ID: EDOC1000015902
