HUAWEI CLOUD Stack 6.5.0 Alarm and Event Reference 04

ALM-1126001 Bare Metal Server Is Unavailable


Description

This alarm is generated if the management status of a bare metal server becomes deploy failed, inspect failed, clean failed, or error while the server is being provisioned, reclaimed, formatted, or expanded in capacity.

Attribute

Alarm ID    Alarm Severity    Auto Clear
1126001     Major             Yes

Parameters

Fault Location Info

  • BaremetalNode ID: specifies the ID of the bare metal server node for which the alarm is generated.

Additional Info

  • Service: specifies the name of the service for which the alarm is generated. The default value is BMS.
  • MicroService: specifies the name of the microservice for which the alarm is generated. The default value is ironic.
  • HostIP: specifies the IP address of the host for which the alarm is generated.
  • BaremetalMACAddress: specifies the BMC IP address of the bare metal server for which the alarm is generated.
  • CurrentProvisionState: specifies the management status of the bare metal server node for which the alarm is generated.

Impact on the System

The provisioning, reclamation, capacity expansion, or formatting of the bare metal server fails, and the bare metal server cannot be provisioned again.

Possible Causes

An exception occurs on the network or in the database.

Procedure

  1. Use PuTTY to log in to the first FusionSphere OpenStack node through the IP address of the External OM plane.

    The default user name is fsp. The default password is Huawei@CLOUD8.

    The system supports both password authentication and public-private key pair authentication. To log in using a key pair, see Using PuTTY to Log In to a Node in Key Pair Authentication Mode.

    NOTE:
    To obtain the IP address of the External OM plane, search for the required parameter on the Tool-generated IP Parameters sheet of the xxx_export_all.xlsm file exported from HUAWEI CLOUD Stack Deploy during software installation. The parameter names in different scenarios are as follows:
    • Region Type I scenario:

      Cascading system: Cascading-ExternalOM-Reverse-Proxy

      Cascaded system: Cascaded-ExternalOM-Reverse-Proxy

    • Region Type II and Region Type III scenarios: ExternalOM-Reverse-Proxy

  2. Run the following command and enter the password of user root to switch to user root:

    su - root

    The default password of user root is Huawei@CLOUD8!.

  3. Run the following command to disable user logout upon system timeout:

    TMOUT=0

  4. Run the following command to import environment variables:

    source set_env

    Information similar to the following is displayed:

      please choose environment variable which you want to import: 
      (1) openstack environment variable (keystone v3) 
      (2) cps environment variable 
      (3) openstack environment variable legacy (keystone v2) 
      (4) openstack environment variable of cloud_admin (keystone v3) 
      please choose:[1|2|3|4] 

  5. Enter 1 to enable Keystone V3 authentication and enter the password of OS_USERNAME as prompted.

    Default account format: DCname_admin; default password: FusionSphere123.

  6. Run the ironic node-show node_uuid command to query the value of provision_state of the bare metal server.

    • If the value is clean failed, go to 7.
    • If the value is deploy failed or error, go to 8.
    • If the value is inspect failed, go to 9.
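The branching in this step can be sketched as a small shell helper. This is only an illustration: it assumes that ironic node-show prints a two-column table with rows shaped like "| provision_state | clean failed |", and the function names get_property and next_step, as well as the sample table, are hypothetical and not part of the product.

```shell
#!/bin/sh
# Extract one property from `ironic node-show` table output read on stdin.
# Assumption: rows look like "| provision_state | clean failed |".
get_property() {
    awk -F'|' -v key="$1" '
        {
            name = $2; value = $3
            gsub(/^[ \t]+|[ \t]+$/, "", name)    # trim the property name
            gsub(/^[ \t]+|[ \t]+$/, "", value)   # trim the property value
            if (name == key) { print value; exit }
        }'
}

# Map a provision_state value to the step of this procedure to go to next.
next_step() {
    case "$1" in
        "clean failed")          echo 7 ;;
        "deploy failed"|"error") echo 8 ;;
        "inspect failed")        echo 9 ;;
        *)                       echo "none" ;;
    esac
}

# Illustrative sample only, not output captured from a real node.
sample='+-----------------+--------------+
| Property        | Value        |
+-----------------+--------------+
| last_error      | None         |
| provision_state | clean failed |
+-----------------+--------------+'

state=$(printf '%s\n' "$sample" | get_property provision_state)
echo "provision_state=$state -> go to step $(next_step "$state")"
```

On a real node, the sample would be replaced by piping the actual command output into the helper, for example ironic node-show node_uuid | get_property provision_state.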

  7. Run the ironic node-show node_uuid command to query the value of last_error to locate the cause.

    If the following information is displayed, the sdb disk is read-only. In this case, remove the write protection from the disk.

    /dev/sdb: failed to open for writing: Read-only file system

    • Locate the cause based on the command output, rectify the fault, and then run the following commands in sequence:

      ironic node-set-provision-state node_uuid manage

      ironic node-set-provision-state node_uuid provide

      Initialize the bare metal server again. After successful initialization, the bare metal server should be in the available state. If the alarm is cleared, no further action is required. Otherwise, go to 10.

    • If the cause cannot be located based on the command output, go to 10.
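The clean failed handling above can be sketched as follows. This is a hedged illustration, not a supported tool: the helper names is_read_only_error and recover_clean_failed and the DRY_RUN variable are hypothetical. By default the sketch only prints the two state-transition commands from this step; it runs them only when DRY_RUN is set to 0.

```shell
#!/bin/sh
# Return success if the last_error text reports a read-only disk.
is_read_only_error() {
    case "$1" in
        *"Read-only file system"*) return 0 ;;
        *)                         return 1 ;;
    esac
}

# Print (default) or run the manage and provide transitions, in that order.
# DRY_RUN is a hypothetical switch introduced for this sketch.
recover_clean_failed() {
    node_uuid=$1
    for target in manage provide; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ironic node-set-provision-state $node_uuid $target"
        else
            ironic node-set-provision-state "$node_uuid" "$target" || return 1
        fi
    done
}

# Error text taken from this step; on a real node it comes from last_error.
last_error='/dev/sdb: failed to open for writing: Read-only file system'
if is_read_only_error "$last_error"; then
    echo "disk is read-only: remove the write protection before retrying"
fi
recover_clean_failed node_uuid
```

Keeping the default as a dry run lets the command order be reviewed before anything changes the node state.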

  8. Run the ironic node-show node_uuid command to query the value of last_error to locate the cause.

    • Locate the fault based on the command output. After the fault is rectified, run the ironic node-set-provision-state node_uuid deleted command. During the deletion process, the bare metal server is initialized. The bare metal server should be in the available state after being successfully initialized. If the alarm is cleared, no further action is required. Otherwise, go to 10.
    • If the cause cannot be located based on the command output, go to 10.

  9. Run the ironic node-show node_uuid command to query the value of last_error and locate the cause of the inspect failed state.

    • Locate the fault based on the command output. After the fault is rectified, run the ironic node-set-provision-state node_uuid inspect command. During the inspection process, the capacity of the bare metal server is expanded. The bare metal server should be in the manageable state after the capacity is expanded. If the alarm is cleared, no further action is required. Otherwise, go to 10.
    • If the cause cannot be located based on the command output, go to 10.
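The recovery paths described in this procedure can be summarized in one lookup. The following is a minimal sketch under stated assumptions: recovery_plan and print_plan are hypothetical helper names, and the sketch only prints the commands named in the steps together with the state the node is expected to reach. It does not call ironic.

```shell
#!/bin/sh
# Print "<targets>;<expected state>" for a failed provision_state,
# following the recovery paths described in this procedure.
recovery_plan() {
    case "$1" in
        "clean failed")          echo "manage provide;available" ;;
        "deploy failed"|"error") echo "deleted;available" ;;
        "inspect failed")        echo "inspect;manageable" ;;
        *)                       return 1 ;;
    esac
}

# Print the commands for a node without running them.
print_plan() {
    node_uuid=$1
    failed_state=$2
    plan=$(recovery_plan "$failed_state") || {
        echo "no recovery plan for state: $failed_state" >&2
        return 1
    }
    targets=${plan%%;*}     # part before the semicolon
    expected=${plan##*;}    # part after the semicolon
    for target in $targets; do
        echo "ironic node-set-provision-state $node_uuid $target"
    done
    echo "# expected provision_state afterwards: $expected"
}

print_plan node_uuid "inspect failed"
```

The commands should be run only after the fault reported in last_error has been rectified. If the node does not reach the expected state, contact technical support.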

  10. Contact technical support for assistance.

Related Information

None

Updated: 2019-08-30

Document ID: EDOC1100062365
