RH5885H V3 Reports "smpboot: CPU1: Not responding" After Running for a Period of Time

Publication Date: 2019-04-23

Issue Description

The RH5885H V3 is configured with two physical CPUs. After Oracle Linux 6.5 has been running for a period of time, the following alarm is reported: "smpboot: CPU1: Not responding; smpboot: CPU3: Not responding; smpboot: CPU5: Not responding". When the FusionServer Tools-Toolkit is mounted and used to start the server, the message "CPU1: Not responding" is still displayed. However, the BMC shows that the hardware is normal and no alarm log is generated, and the power-on self-test at server startup completes normally.
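To confirm which CPUs the kernel reports, the alarm text can be scanned for the message pattern quoted above. The following is a minimal sketch, not part of the original case: the function name `unresponsive_cpus` is hypothetical, and the sample string is the alarm text from this case. Note that the numbers in these messages are logical CPU indexes assigned by the kernel, so they do not directly identify a physical CPU socket.

```python
import re

# Matches kernel messages such as "smpboot: CPU1: Not responding".
_NOT_RESPONDING = re.compile(r"smpboot: CPU(\d+): Not responding")


def unresponsive_cpus(log_text):
    """Return the sorted, de-duplicated logical CPU numbers reported as not responding.

    unresponsive_cpus is a hypothetical helper written for this example.
    """
    return sorted({int(n) for n in _NOT_RESPONDING.findall(log_text)})


# Alarm text quoted in the issue description.
sample = (
    "smpboot: CPU1: Not responding; "
    "smpboot: CPU3: Not responding; "
    "smpboot: CPU5: Not responding"
)
print(unresponsive_cpus(sample))  # [1, 3, 5]
```

On a running system, the same function could be applied to the saved kernel log or to the output of `dmesg`, assuming the messages are still in the log buffer.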

Alarm Information

The following figure shows the alarm displayed during Oracle Linux 6.5 startup.

The following figure shows the alarm displayed when FusionServer Tools-Toolkit-V119 is mounted.

Handling Process

1. Use the minimization test method to locate the faulty physical CPU.

2. Test each of the two physical CPUs separately. The test results show that both CPUs are normal.

3. Replace the mainboard (PCBA) to rule out a mainboard fault. After the mainboard is replaced, the system still starts normally with a single CPU but fails with two CPUs, and the message "smpboot: CPU1: Not responding" is displayed.

4. Remove all cables from the rear panel. The system starts properly.

5. Remove and reconnect the cables one at a time. The host starts properly once the USB cable of the KVM is removed.
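Steps 4 and 5 amount to an elimination procedure: start from a state with no cables attached, then reconnect cables one at a time and note which one brings the fault back. The sketch below illustrates that logic only. The function name `find_faulty_cables`, the cable names, and the simulated boot check are all assumptions made for the example; in practice the check is a manual restart and inspection of the startup screen.

```python
def find_faulty_cables(cables, boots_ok):
    """Reconnect cables one at a time and return those that break startup.

    cables   -- names of the cables removed from the rear panel
    boots_ok -- callable that takes the set of connected cables and returns
                True if the host starts without the smpboot alarm
    """
    connected = set()
    faulty = []
    for cable in cables:
        trial = connected | {cable}
        if boots_ok(trial):
            connected = trial      # startup is fine, keep this cable connected
        else:
            faulty.append(cable)   # startup fails, leave this cable disconnected
    return faulty


# Hypothetical cable names; the simulated check fails whenever the
# KVM USB cable is attached, mirroring the outcome of this case.
cables = ["network", "vga", "serial", "kvm_usb"]


def simulated_boot(connected):
    return "kvm_usb" not in connected


print(find_faulty_cables(cables, simulated_boot))  # ['kvm_usb']
```

Testing one cable per restart keeps each result attributable to a single change, which is the point of the minimization approach.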

Root Cause

The BMC information collected during the minimization test shows that the host hardware is normal. The fault lies in the USB cable connected to the external KVM. With the faulty cable connected, the CPU cannot respond to the OS during startup, so the "smpboot: CPU1: Not responding" alarm is displayed on the system startup screen.

Solution

Coordinate with the customer to replace the external KVM and its USB cable.

Suggestions

Log in to the BMC management port to query fault alarms and event logs. If no alarm is generated, use the minimization test method to locate the fault.

END