HUAWEI CLOUD Stack 6.5.0 Alarm and Event Reference 04

ALM-70101 Faulty VM OS

Description

This alarm is generated when the VM OS has an internal error or the VM fails to send heartbeat signals to an external watchdog service for a long time.

Attribute

Alarm ID    Alarm Severity    Auto Clear
70101       Critical          Yes

Parameters

Name: Fault Location Info
Meaning:

  • instance_id: specifies the ID of the VM for which the alarm is generated.
  • tenant_id: specifies the ID of the tenant of the VM for which the alarm is generated.
  • region_name: specifies the region of the tenant of the VM for which the alarm is generated.

Name: Additional Info
Meaning:

  • event_id: specifies the event ID of the alarm.
  • availability_zone: specifies the AZ of the VM for which the alarm is generated.
  • tenant_name: specifies the name of the tenant of the VM for which the alarm is generated.
  • instance_name: specifies the name of the VM for which the alarm is generated.
  • hostname: specifies the name of the host accommodating the VM for which the alarm is generated.
  • host_id: specifies the ID of the host accommodating the VM for which the alarm is generated.
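
The parameters above can be pictured as one record per alarm. The following is a minimal sketch only: the field names come from the table above, but the class name FaultyVmOsAlarm, the function parse_alarm, and the assumption that the two parameter groups arrive as key-value mappings are hypothetical and are not part of the product interface.

```python
from dataclasses import dataclass
from typing import Mapping

# Parameter names as listed under Fault Location Info and Additional Info.
LOCATION_FIELDS = ("instance_id", "tenant_id", "region_name")
ADDITIONAL_FIELDS = (
    "event_id", "availability_zone", "tenant_name",
    "instance_name", "hostname", "host_id",
)


@dataclass(frozen=True)
class FaultyVmOsAlarm:
    """Hypothetical container for the ALM-70101 parameters."""
    instance_id: str
    tenant_id: str
    region_name: str
    event_id: int
    availability_zone: str
    tenant_name: str
    instance_name: str
    hostname: str
    host_id: str


def parse_alarm(location: Mapping[str, str],
                additional: Mapping[str, str]) -> FaultyVmOsAlarm:
    """Build an alarm record, rejecting input that lacks a listed parameter."""
    missing = [k for k in LOCATION_FIELDS if k not in location]
    missing += [k for k in ADDITIONAL_FIELDS if k not in additional]
    if missing:
        raise ValueError("missing alarm parameters: " + ", ".join(missing))
    values = {k: location[k] for k in LOCATION_FIELDS}
    values.update({k: additional[k] for k in ADDITIONAL_FIELDS})
    # The event ID is compared numerically later, so store it as an integer.
    values["event_id"] = int(values["event_id"])
    return FaultyVmOsAlarm(**values)
```

Keeping event_id as an integer makes it straightforward to match against the event IDs listed under Possible Causes.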

Impact on the System

The VM OS is faulty, and the internal services of the VM are unavailable. The VM may have been automatically restarted.

Possible Causes

  • The VM OS or other internal software has a bug that causes a kernel panic. The event ID is 0 or 9.
  • The software for sending heartbeat signals to an external watchdog service is not installed in the VM or the software has become unresponsive. The event ID is 1.
  • The VM is restarted due to an internal cause. The event ID is 8 (only for some customized images).
  • The QEMU process of the VM is terminated unexpectedly. The event ID is 3.
  • The CGP VM is powered off. The event ID is 12.
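
The event ID in the alarm's Additional Info identifies which of the causes above applies. The following is a minimal lookup sketch; the mapping restates the list above, while the names EVENT_CAUSES and describe_event are hypothetical helpers introduced only for illustration.

```python
# Event IDs and causes as listed under Possible Causes.
_KERNEL_PANIC = ("Kernel panic caused by a bug in the VM OS "
                 "or other internal software")

EVENT_CAUSES = {
    0: _KERNEL_PANIC,
    9: _KERNEL_PANIC,
    1: ("Software for sending heartbeat signals to the external watchdog "
        "service is not installed or has become unresponsive"),
    8: ("VM restarted due to an internal cause "
        "(only for some customized images)"),
    3: "QEMU process of the VM terminated unexpectedly",
    12: "CGP VM powered off",
}


def describe_event(event_id: int) -> str:
    """Return the documented cause for an event ID, or a fallback message."""
    return EVENT_CAUSES.get(
        event_id, "Unknown event ID: not listed for ALM-70101")


if __name__ == "__main__":
    for eid in (0, 1, 3, 8, 9, 12):
        print(eid, "->", describe_event(eid))
```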

Procedure

  1. On ManageOne Operation Portal, choose Console > Computing > Elastic Cloud Server. On the Elastic Cloud Server page, select the VM to be processed and click Remote Login. Check whether the VM is automatically restarted.

    • Perform the following operations to view restart logs in the Windows OS:

      a. Press the Windows logo key+R to open the Run dialog box, and then run the eventvwr command.

      b. In the Event Viewer window, choose Windows Logs > System, and then click Filter Current Log. In the displayed Filter Current Log dialog box, enter event ID 1074 (indicating restart) and click OK to filter restart logs.

    • Run the last reboot command in the Linux OS to view restart logs.

  2. Check whether the VM has restart logs.

    • If yes, go to 4.
    • If no, go to 3.

  3. On Service OM, stop and then restart the VM. Check whether the VM can be restarted.

    • If yes, go to 4.
    • If no, go to 5.

  4. Clear the alarm manually, and wait for 10 minutes to check whether the alarm is generated again.

    • If yes, the watchdog service may be enabled in the VM but no process is available to restart the watchdog timer. In this case, go to 7.
    • If no, no further action is required.

  5. If the VM cannot be restarted, log in to FusionSphere OpenStack and check whether any QEMU process is in the D (uninterruptible sleep) state.

    1. Run the nova show VM ID | grep host command to obtain the ID of the host where the VM is located.
    2. Run the cps host | grep Host ID command to obtain the management plane IP address of the host.
    3. Run the su fsp command to switch to user fsp.
    4. Run the ssh fsp@IP address of the management plane command to log in to the host where the VM is located.
    5. Run the TMOUT=0 command to disable logout on system timeout.
    6. Import environment variables. For details, see Importing Environment Variables.
    7. Run the ps aux | grep qemu | grep VM ID command to check whether any QEMU process is in the D state.
      • If yes, restart the host and then go to 6.
        NOTE: Restarting the host interrupts the VM services running on it. Migrate all running VMs on this host to other hosts before restarting the host.

      • If no, go to 7.

  6. Clear the alarm manually, and wait for 10 minutes to check whether the alarm is generated again.

    • If yes, go to 7.
    • If no, no further action is required.

  7. Contact technical support for assistance.
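
The D-state check in step 5 can be sketched as a small parser over the text printed by ps aux. This is an illustrative sketch only: it assumes the standard ps aux column order (USER, PID, %CPU, %MEM, VSZ, RSS, TTY, STAT, START, TIME, COMMAND), and the function name qemu_pids_in_d_state and the sample process lines are hypothetical.

```python
from typing import List


def qemu_pids_in_d_state(ps_aux_output: str, vm_id: str) -> List[int]:
    """Return PIDs of QEMU processes for vm_id whose STAT begins with D."""
    pids = []
    for line in ps_aux_output.splitlines():
        # Split into the 10 fixed columns plus the full command line.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        pid, stat, command = fields[1], fields[7], fields[10]
        # Same filter as: ps aux | grep qemu | grep VM ID
        if "qemu" in command and vm_id in command and stat.startswith("D"):
            pids.append(int(pid))
    return pids


if __name__ == "__main__":
    # Hypothetical output; real process names and arguments may differ.
    sample = (
        "USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND\n"
        "qemu 4321 12.0 3.1 9000000 500000 ? Dl Aug29 10:15 "
        "/usr/bin/qemu-kvm -uuid 11111111-2222-3333-4444-555555555555\n"
        "qemu 4400 2.0 1.0 8000000 300000 ? Sl Aug29 1:02 "
        "/usr/bin/qemu-kvm -uuid aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee\n"
    )
    print(qemu_pids_in_d_state(
        sample, "11111111-2222-3333-4444-555555555555"))
```

A non-empty result corresponds to the "If yes" branch of step 5 (restart the host after migrating its running VMs). An empty result corresponds to the "If no" branch.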

Related Information

None

Updated: 2019-08-30

Document ID: EDOC1100062365
