HUAWEI CLOUD Stack 6.5.0 Alarm and Event Reference 04

ALM-14102 Abnormal Instance Status

Description

This alarm is generated when the instance status is faulty.

Attribute

Alarm ID    Alarm Severity    Alarm Type
14102       Major             Communications alarm

Alarm Parameters

Table 17-3 Parameters in location information

Parameter     Description
instanceId    Specifies the instance ID.
alarmId       Specifies the alarm ID.
causeId       Specifies the alarm cause ID.

Table 17-4 Parameters in additional information

Parameter    Description
hostName     Specifies the name of the host for which the alarm is generated.
hostIP       Specifies the IP address of the host for which the alarm is generated.
cause        Specifies the description of the alarm cause.

Impact on the System

The instance is not running properly. The only operation you can perform on it is deletion.

Possible Causes

The application that generates the alarm is not working as expected.

Procedure

  1. Obtain the information about the application that generates the alarm.

    1. Log in to FusionStage.

      1. Use a browser to log in to ManageOne Operation Portal (ManageOne Tenant Portal in B2B scenarios) as a VDC administrator or VDC operator.
        Login address:
        • Non-B2B scenario: https://<address for accessing ManageOne Operation Portal>, for example, https://console.demo.com
        • B2B scenario: https://<address for accessing ManageOne Tenant Portal>, for example, https://tenant.demo.com
      2. Select your region and then project from the drop-down list on the top menu bar.
      3. Choose Console > Application > FusionStage from the main menu.

    2. Choose Application Publishing > Application Management from the main menu.
    3. Select the application that generates the alarm and copy the application name.

  2. Use PuTTY to log in to the manage_lb1_ip node.

    The default username is paas, and the default password is QAZ2wsx@123!.

  3. Run the following command and enter the password of the root user to switch to the root user:

    su - root

    Default password: QAZ2wsx@123!

  4. Run the following command to query the container:

    kubectl get pod --all-namespaces | grep dcs-server-component-90478e21

    NOTE:

    dcs-server-component-90478e21 is the application name copied in step 1.

    Information similar to the following is displayed:

    06954f029140408d844c21beb9a04407   dcs-server-component-90478e21-869fbc6d77-7gh9g    1/1       Running             0          4h
    06954f029140408d844c21beb9a04407   dcs-server-component-90478e21-869fbc6d77-g7bjh    1/1       Running             0          4h

  5. Run the following command to access the container:

    kubectl exec -ti dcs-server-component-90478e21-869fbc6d77-7gh9g -n 06954f029140408d844c21beb9a04407 /bin/bash

    NOTE:

    In the preceding command, dcs-server-component-90478e21-869fbc6d77-7gh9g is the pod name and 06954f029140408d844c21beb9a04407 is the namespace, both obtained in step 4.

  6. Run the following command to view the error information:

    grep "${instance_id}" /opt/dcs/logs/dcs/dcs_status.log | grep "ERROR"

    In the preceding command, replace ${instance_id} with the value of instanceId in the alarm location information.

  7. Perform the following operations based on the log information:

    • If the system displays a message indicating that the DCS-Server process cannot access the service VM, perform the following operations:
      1. Log in to the faulty node in VNC mode.

        The default username is paas, and the default password is QAZ2wsx@123!.

      2. Check whether the SSH service runs properly.
        systemctl status sshd
        • If the service is running properly, the following information is displayed. Go to 7.d.

        • If the service is not running properly, run the following command to restart the SSH service:

          systemctl restart sshd

      3. Run the following command and enter the password QAZ2wsx@123! of the root user to switch to the root user:

        su - root

      4. Run the following command to check whether the paas user can log in remotely:

        vim /etc/ssh/sshd_config

        • If the following information is specified in the configuration file, the paas user can remotely log in to the service VM.

        • If the information cannot be found, add it to the file.
      5. Check whether the network information is correctly configured.
      6. Run the following command to check whether the disk space is used up:

        df -h

        If the disk space is used up, the program cannot run properly.

    • If the system displays a message indicating that the om command fails to be executed, perform the following operations to check whether the Redis process on the service VM is normal:
      1. Use PuTTY to log in to the Redis node.

        The default username is paas, and the default password is QAZ2wsx@123!.

      2. Run the following command to check whether the Redis process is started:

        ps -ef | grep redis

        • If the process has started, go to 7.d.
        • If the process has not started, go to 7.c.
      3. Run the following command to start the Redis service:

        /opt/dcs/redis/redis/data/ctl/redis_ctl start

      4. Open the /var/log/dcs/redis/redis_run.log file, view the service run logs, and check whether Redis is running properly.

        If Redis is running properly, contact technical support.

        If Redis is not running properly, go to 7.e.

      5. Run the following commands to restart the Redis service:

        /opt/dcs/redis/redis/data/ctl/redis_ctl stop

        /opt/dcs/redis/redis/data/ctl/redis_ctl start

    • If the system displays a message indicating that both instances are active, perform the following operations:
      1. Use PuTTY to log in to each node of the active and standby instances.

        The default username is paas, and the default password is QAZ2wsx@123!.

      2. Run the following command and enter the password QAZ2wsx@123! of the root user to switch to the root user:

        su - root

      3. Run the following command to check whether the Keepalived process has started:

        ps -ef | grep keepalived

        • If the process has started, go to 7.d.
        • If the process has not started, run the following command to start it:

          /opt/dcsroot/redis/keepalived/ctl/keepalived_ctl start

      4. Go to the /var/log/dcsroot/redis directory, view the logs, and check whether the Keepalived process is abnormal.

  8. If the issue persists, contact technical support.
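
The filtering done in steps 4, 6, and 7 can be sketched as a few small shell helpers. This is a minimal sketch, not part of FusionStage or DCS: the function names are hypothetical, the pod listing is assumed to have the namespace in column 1 and the pod name in column 2 (as in the sample output in step 4), and the disk check assumes df is run with the -P option in addition to -h so that each file system stays on one line.

```shell
# Hypothetical helpers for this procedure. All function names are
# illustrative assumptions; each one only filters text read from stdin.

# Step 4: print "<namespace> <pod>" for pods belonging to an application.
# Assumes column 1 is the namespace and column 2 is the pod name.
find_app_pods() {
    awk -v app="$1" 'index($2, app "-") == 1 { print $1, $2 }'
}

# Step 6: print the ERROR lines that mention one instance ID.
# Fixed-string matching (-F) keeps the ID from being read as a regex.
instance_errors() {
    grep -F -- "$1" | grep -F "ERROR"
}

# Step 7: print mount points that are 100% full.
# Assumes `df -hP` layout: usage in column 5, mount point in column 6.
# Mount points that contain spaces are not handled.
full_mounts() {
    awk 'NR > 1 && $5 == "100%" { print $6 }'
}

# Step 7: succeed if a process name appears in `ps -ef` output,
# ignoring the grep command itself, which plain `ps -ef | grep` also lists.
process_listed() {
    awk -v name="$1" 'index($0, name) && !index($0, "grep") { found = 1 }
                      END { exit !found }'
}

# Example use on the nodes described above (not run automatically):
#   kubectl get pod --all-namespaces | find_app_pods dcs-server-component-90478e21
#   instance_errors "<instanceId>" < /opt/dcs/logs/dcs/dcs_status.log
#   df -hP | full_mounts
#   ps -ef | process_listed keepalived && echo "keepalived is running"
```

Because the helpers only read from standard input, they can be tried on saved command output before being used on a live node.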

Alarm Clearing

After the fault is rectified and the instance returns to the Running state, the system automatically clears the alarm.

Related Information

None

Updated: 2019-08-30

Document ID: EDOC1100062365
