HUAWEI CLOUD Stack 6.5.0 Alarm and Event Reference 04

ALM-70100 VM Audit Alarm


Description

This alarm is generated when any of the following conditions is met: a VM in an abnormal state exists in the system; a CPS host has been deleted but residual data of the nova-compute service on that host remains in the system; or uncommitted transactions have existed in the Nova database for more than one hour.
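
The one-hour condition on uncommitted transactions can be illustrated with a minimal sketch. It assumes a hypothetical list of (transaction ID, start time) pairs; how the audit actually obtains this information from the Nova database is not described here, and the function name is illustrative only.

```python
from datetime import datetime, timedelta

# Threshold from the alarm description: uncommitted for more than one hour.
IDLE_THRESHOLD = timedelta(hours=1)


def find_stale_transactions(transactions, now):
    """Return the IDs of transactions open for longer than IDLE_THRESHOLD.

    `transactions` is a hypothetical list of (transaction_id, start_time)
    pairs; it stands in for whatever the real audit reads from the database.
    """
    return [tx_id for tx_id, started in transactions
            if now - started > IDLE_THRESHOLD]


# Sample data (made up for illustration).
now = datetime(2019, 8, 30, 12, 0, 0)
sample = [
    ("tx-1", datetime(2019, 8, 30, 10, 30, 0)),  # open for 90 minutes
    ("tx-2", datetime(2019, 8, 30, 11, 30, 0)),  # open for 30 minutes
]
print(find_stale_transactions(sample, now))
```

Only transactions strictly older than one hour are reported, matching the "more than one hour" wording of the description.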

Attribute

Alarm ID    Alarm Severity    Auto Clear
70100       Major             Yes

Parameters

Name                          Meaning
Fault Location Info           servicename: specifies the name of the service for which the alarm is generated.
Additional Info               detail_info: provides detailed information about the alarm.
IP Address/URL/Domain Name    Specifies the IP address, URL, or domain name of the host for which the alarm is generated.

Impact on the System

  • VMs in an abnormal state cannot be managed by the system, which adversely affects VM management.
  • An orphan VM occupies computing and network resources.
  • Users can query an invalid VM, but the VM does not exist on the host.
  • A VM becomes unavailable if the host recorded for it in the system database is inconsistent with its actual host.
  • A stuck VM becomes unavailable and occupies system resources.
  • If the VM status recorded in the database is inconsistent with that in the system, tenants' operation rights on the VM are restricted.
  • If a VM is stuck in an intermediate state of cold migration, or a cold-migrated VM is adversely affected by a faulty source host, the tenant cannot perform maintenance on the VM.
  • If the VM attributes recorded in the system are inconsistent with the actual attributes, the VM becomes unavailable.
  • If residual data of the nova-compute service on the host remains in the system, the nova-compute service is unavailable.
  • Transactions that remain uncommitted in the Nova database for more than one hour occupy database connections and reduce the number of available connections. As a result, other transactions are processed slowly or fail.
  • If the used quota recorded in the quota table of the database is inconsistent with the quota actually used, the tenant may fail to create a VM due to resource limitation.
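
The quota inconsistency in the last item can be illustrated with a minimal sketch that recomputes usage from a list of VMs and compares it with the recorded usage. The resource keys (instances, vcpus, memory_mb) and the dictionary layout are assumptions chosen for illustration; they do not reflect the actual Nova quota table schema or the audit tool's implementation.

```python
def actual_usage(vms):
    """Sum the resources consumed by a hypothetical list of VM records."""
    usage = {"instances": 0, "vcpus": 0, "memory_mb": 0}
    for vm in vms:
        usage["instances"] += 1
        usage["vcpus"] += vm["vcpus"]
        usage["memory_mb"] += vm["memory_mb"]
    return usage


def quota_mismatches(recorded, actual):
    """Return {resource: (recorded, actual)} for every resource that differs."""
    resources = sorted(set(recorded) | set(actual))
    return {
        name: (recorded.get(name, 0), actual.get(name, 0))
        for name in resources
        if recorded.get(name, 0) != actual.get(name, 0)
    }


# Sample data (made up): the quota table still counts a VM that no longer exists.
vms = [
    {"vcpus": 2, "memory_mb": 4096},
    {"vcpus": 4, "memory_mb": 8192},
]
recorded = {"instances": 3, "vcpus": 6, "memory_mb": 12288}
print(quota_mismatches(recorded, actual_usage(vms)))
```

In this sample the recorded instance count is higher than the actual count, which is the kind of drift that can cause VM creation to fail even though resources are free.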

Possible Causes

  • An orphan VM exists in the system.
  • An invalid VM exists in the system.
  • A stuck VM exists in the system.
  • A VM exists whose host recorded in the system database is inconsistent with its actual host.
  • A VM stuck in an intermediate state of cold migration exists in the system.
  • A cold-migrated VM that is adversely affected by a faulty source host exists in the system.
  • A VM exists whose attributes recorded in the system are inconsistent with its actual attributes.
  • The CPS host has been deleted but residual data of the nova-compute service on the host remains in the system.
  • Transactions that have been uncommitted for more than one hour exist in the Nova database.
  • Changes to the quota table and changes to VMs are not guaranteed to be made within a transaction.
  • A network exception occurs while a VM is being created, resized, or deleted, leaving the actually used quota inconsistent with that in the quota table.
NOTE:

Obtain the audit issue that caused the alarm from the alarm details and handle the issue according to the following procedure.

Procedure

NOTE:

If the audit alarm is generated in a cascaded FusionSphere OpenStack system, the alarm must be handled regardless of whether it is also generated in the cascading system. After the alarm is cleared, audit the cascading system (manually or using the automatic routine audit by the cascading system) to ensure data consistency between the cascading and cascaded systems.

  1. Obtain the value of detail_info from Additional Info in the alarm details, and look up the name of the corresponding audit report in Table 3-1.

    Table 3-1 Mapping between detailed information and audit reports

    Details                          Audit Report
    audit_orphan_vms                 orphan_vm.csv
    audit_invalid_vms                invalid_vm.csv
    audit_host_changed_vms           host_changed_vm.csv
    audit_stucking_vms               stucking_vm.csv
    audit_diff_state_vms             diff_state_vm.csv
    audit_stucking_migrations        cold_stuck.csv
    audit_host_invalid_migrations    host_invalid_migration.csv
    audit_diff_property_vms          diff_property_vm.csv
    audit_nova_service_cleaned       nova_service_cleaned.csv
    nova_idle_transaction            nova_idle_transactions.csv
    audit_nova_vcpus                 nova_quota_vcpus.csv
    audit_nova_memory_mb             nova_quota_memory_mb.csv
    audit_nova_quota_instance        nova_quota_instance.csv

  2. Determine the deployment scenario of the current environment and obtain the audit report.

  3. Refer to the section "Analyzing Audit Results" for the deployment scenario of the current environment. Identify the handling method based on the audit report name and handle the audit items.

  4. Refer to the section "Manual Audit" for the deployment scenario of the current environment and start the system audit again.

    • Region Type I scenario:
    • Region Type II and Region Type III scenarios:

  5. Check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to step 6.

  6. Contact technical support for assistance.
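
The lookup in step 1 can be sketched as a simple dictionary built from Table 3-1. The function name report_for is hypothetical, and the sketch assumes detail_info carries a single value exactly as listed in the table; it is not part of any product tooling.

```python
# Mapping copied from Table 3-1: detail_info value -> audit report file name.
AUDIT_REPORTS = {
    "audit_orphan_vms": "orphan_vm.csv",
    "audit_invalid_vms": "invalid_vm.csv",
    "audit_host_changed_vms": "host_changed_vm.csv",
    "audit_stucking_vms": "stucking_vm.csv",
    "audit_diff_state_vms": "diff_state_vm.csv",
    "audit_stucking_migrations": "cold_stuck.csv",
    "audit_host_invalid_migrations": "host_invalid_migration.csv",
    "audit_diff_property_vms": "diff_property_vm.csv",
    "audit_nova_service_cleaned": "nova_service_cleaned.csv",
    "nova_idle_transaction": "nova_idle_transactions.csv",
    "audit_nova_vcpus": "nova_quota_vcpus.csv",
    "audit_nova_memory_mb": "nova_quota_memory_mb.csv",
    "audit_nova_quota_instance": "nova_quota_instance.csv",
}


def report_for(detail_info):
    """Return the audit report name for a detail_info value, or None if unknown."""
    return AUDIT_REPORTS.get(detail_info.strip())


print(report_for("audit_stucking_migrations"))
```

Returning None for an unrecognized value leaves it to the caller to decide how to proceed, for example by contacting technical support as in step 6.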

Related Information

None

Updated: 2019-08-30

Document ID: EDOC1100062365
