HUAWEI CLOUD Stack 6.5.0 Alarm and Event Reference 04

ALM-73011 Key Process Fault

Description

The system checks key processes every 9 seconds. This alarm is generated when the system detects that a key process or service is abnormal.
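The periodic check described above can be pictured with a small polling loop. This is only an illustrative sketch, not the product's actual monitor: the function name poll_check, its arguments, and the ALARM message text are all hypothetical.

```shell
#!/bin/sh
# Illustrative only: run a check command at a fixed interval and report
# the first failure. This is NOT the product's monitor implementation.
# Usage: poll_check INTERVAL_SECONDS MAX_ROUNDS COMMAND [ARGS...]
poll_check() {
    interval="$1"
    rounds="$2"
    shift 2
    i=0
    while [ "$i" -lt "$rounds" ]; do
        # A non-zero exit status from the check is treated as "abnormal".
        if ! "$@"; then
            echo "ALARM: check failed: $*"
            return 1
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    return 0
}
```

With an interval of 9, the loop would mirror the 9-second check cycle; the check command itself is whatever health probe applies to the process in question.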

Attribute

  Alarm ID: 73011
  Alarm Severity: Major
  Auto Clear: Yes

Parameters

Name: Fault Location Info
Meaning:

  • host_id: specifies the ID of the host for which the alarm is generated.
  • process_name: specifies the name of the abnormal process.

Name: Additional Info
Meaning:

  • error_info: provides alarm exception information.
  • host_id: specifies the ID of the host for which the alarm is generated.
  • hostname: specifies the name of the host for which the alarm is generated.
  • HostIP: specifies the IP address of the host for which the alarm is generated.

Impact on the System

The system or the affected service function may not work properly.

Possible Causes

  • The processes stop abnormally.
  • The services are abnormal.

Procedure

  1. Log in to the FusionSphere OpenStack web client.

    For details, see Logging In to the FusionSphere OpenStack Web Client (ManageOne Mode).

  2. On the Summary page, obtain the management IP address of the host in the OM IP Address column based on the host ID or host name in the alarm additional information.
  3. Use PuTTY and the management IP address obtained in the previous step to log in to the host for which the alarm is generated.

    The default user name is fsp. The default password is Huawei@CLOUD8.

    The system supports both password authentication and public-private key pair authentication. To log in with a key pair, see Using PuTTY to Log In to a Node in Key Pair Authentication Mode.

  4. Run the following command and enter the password of user root to switch to user root:

    su - root

    The default password of user root is Huawei@CLOUD8!.

  5. Run the following command to prevent the session from being logged out automatically on timeout:

    TMOUT=0

  6. Run the following command to import environment variables:

    source set_env

    Information similar to the following is displayed:

      please choose environment variable which you want to import: 
      (1) openstack environment variable (keystone v3) 
      (2) cps environment variable 
      (3) openstack environment variable legacy (keystone v2) 
      (4) openstack environment variable of cloud_admin (keystone v3) 
      please choose:[1|2|3|4] 

  7. Enter 1 to enable Keystone V3 authentication and enter the password of OS_USERNAME as prompted.

    Default account format: DCname_admin; default password: FusionSphere123.

  8. Run the following command to switch to the specified directory:

    cd /etc/sysmonitor/process

  9. Using the abnormal process or service named in the alarm information as the keyword, locate its configuration file, run the cat command to query the RECOVER_COMMAND value in that file, and then run that command to try to restore the failed process or service.

    For example, if the hirmd service is abnormal, log in to the host identified by the host name in the alarm information, switch to the /etc/sysmonitor/process/ directory, and locate the hirmd-monitor configuration file using the keyword hirmd. If the RECOVER_COMMAND field in the command output is service hirmd restart, run the following command to restore the hirmd service process:

    service hirmd restart

  10. Using the abnormal process or service described in the alarm information as the keyword, run the cat command to query the MONITOR_COMMAND value in the configuration file and check whether the process or service is restored.

    For example, if the MONITOR_COMMAND field of the hirmd service is /usr/bin/serviceStatusCheck -s hirmd -b /usr/bin/hirmd -n hirmd -p /var/run/hirmd.pid, run the following command to check whether the hirmd service status is normal:

    /usr/bin/serviceStatusCheck -s hirmd -b /usr/bin/hirmd -n hirmd -p /var/run/hirmd.pid

    Then run the following command to display the exit status of the check and determine whether the process or service is restored:

    echo $?

    • If 0 is returned, the process or service is restored. No further action is required.
    • If a value other than 0 is returned, the process or service is not restored. Go to step 11.

  11. Contact technical support for assistance.
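Steps 8 to 10 can be sketched as a pair of shell helpers that look up a field in the monitor configuration file and run the status check. This is a minimal sketch under stated assumptions: the function names get_monitor_field and check_restored and the PROCESS_DIR variable are hypothetical, and the configuration file is assumed to contain lines of the form KEY=value or KEY="value", which the source does not confirm. Verify the file contents with cat before relying on it.

```shell
#!/bin/sh
# Hypothetical helpers for steps 8 to 10; not part of the product.
# Assumption: config files contain lines like KEY=value or KEY="value".
PROCESS_DIR="${PROCESS_DIR:-/etc/sysmonitor/process}"

# get_monitor_field KEYWORD KEY
# Prints the value of KEY (for example RECOVER_COMMAND) from the first
# file in PROCESS_DIR whose name contains KEYWORD (for example hirmd).
get_monitor_field() {
    keyword="$1"
    key="$2"
    file=$(ls "$PROCESS_DIR" 2>/dev/null | grep -- "$keyword" | head -n 1)
    [ -n "$file" ] || return 1
    # Drop the "KEY=" prefix, then strip optional surrounding double quotes.
    sed -n "s/^[[:space:]]*$key[[:space:]]*=[[:space:]]*//p" "$PROCESS_DIR/$file" \
        | head -n 1 | sed 's/^"\(.*\)"$/\1/'
}

# check_restored KEYWORD
# Runs the MONITOR_COMMAND for KEYWORD and returns its exit status
# (0 means restored). Returns 2 if no MONITOR_COMMAND could be found.
check_restored() {
    monitor_cmd=$(get_monitor_field "$1" MONITOR_COMMAND) || return 2
    [ -n "$monitor_cmd" ] || return 2
    sh -c "$monitor_cmd"
}
```

For a configuration file that contains the line RECOVER_COMMAND="service hirmd restart", calling get_monitor_field hirmd RECOVER_COMMAND prints service hirmd restart. The sketch deliberately prints the recovery command instead of running it, so that the operator can review it first.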

Related Information

None

Updated: 2019-08-30

Document ID: EDOC1100062365
