HUAWEI CLOUD Stack 6.5.0 Alarm and Event Reference 04

ALM-14105 Abnormal Instance Node Status

Description

This alarm is generated when the standby node is faulty. If the alarm is not cleared and the status of the active node is abnormal, the instance service is unavailable or the data is abnormal.

Attribute

Alarm ID    Alarm Severity    Alarm Type
14105       Major             Communications alarm

Alarm Parameters

Table 17-7 Parameters in location information

Parameter     Description
instanceId    Specifies the instance ID.
alarmId       Specifies the alarm ID.
causeId       Specifies the alarm cause ID.

Table 17-8 Parameters in additional information

Parameter            Description
hostName             Specifies the name of the host for which the alarm is generated.
hostIP               Specifies the IP address of the host for which the alarm is generated.
floatingIpAddress    Specifies the IP address of the faulty node.
cause                Specifies the description of the alarm cause.

Impact on the System

  • Data cannot be automatically synchronized between the active and standby instances.

Possible Causes

The application that generates the alarm is not working as expected.

Procedure

  1. Obtain the information about the application that generates the alarm.

    1. Log in to FusionStage.

      1. Use a browser to log in to ManageOne Operation Portal (ManageOne Tenant Portal in B2B scenarios) as a VDC administrator or VDC operator.
        Login address:
        • Non-B2B scenario: https://<address for accessing ManageOne Operation Portal>, for example, https://console.demo.com
        • B2B scenario: https://<address for accessing ManageOne Tenant Portal>, for example, https://tenant.demo.com
      2. Select your region and then project from the drop-down list on the top menu bar.
      3. Choose Console > Application > FusionStage from the main menu.

    2. Choose Application Publishing > Application Management from the main menu.
    3. Select the application that generates the alarm and copy the application name.

  2. Use PuTTY to log in to the manage_lb1_ip node.

    The default username is paas, and the default password is QAZ2wsx@123!.

  3. Run the following command and enter the password of the root user to switch to the root user:

    su - root

    Default password: QAZ2wsx@123!

  4. Run the following command to query the container:

    kubectl get pod --all-namespaces |grep dcs-server-component-90478e21

    NOTE:

    dcs-server-component-90478e21 is an example. Replace it with the application name copied in substep 3 of step 1.

    Information similar to the following is displayed:

    06954f029140408d844c21beb9a04407   dcs-server-component-90478e21-869fbc6d77-7gh9g    1/1       Running             0          4h
    06954f029140408d844c21beb9a04407   dcs-server-component-90478e21-869fbc6d77-g7bjh    1/1       Running             0          4h

  5. Run the following command to access the container:

    kubectl exec -ti dcs-server-component-90478e21-869fbc6d77-7gh9g -n 06954f029140408d844c21beb9a04407 -- /bin/bash

    NOTE:

    In the preceding command, dcs-server-component-90478e21-869fbc6d77-7gh9g is the pod name and 06954f029140408d844c21beb9a04407 is the namespace. Both are taken from the output in step 4.

  6. Go to the /opt/dcs/logs/dcs directory and run the following command to view the error information:

    grep "${instance_id}" /opt/dcs/logs/dcs/dcs_status.log | grep "WARN"

    In the preceding command, replace ${instance_id} with the value of instanceId in the alarm location information.

  7. Perform the following operations based on the log information:

    • If the system displays a message indicating that the DCS-Server process cannot access the service VM, perform the following operations:
      1. Log in to the faulty node in VNC mode.

        The default username is paas, and the default password is QAZ2wsx@123!.

      2. Check whether the SSH service runs properly.
        systemctl status sshd
        • If the service is running properly, the output shows the service as active (running). Go to substep 4.

        • If the service is not running properly, run the following command to restart the SSH service:

          systemctl restart sshd

      3. Run the following command and enter the password QAZ2wsx@123! of the root user to switch to the root user:

        su - root

      4. Run the following command to check whether the paas user can log in remotely:

        vim /etc/ssh/sshd_config

        • If the following information is specified in the configuration file, the paas user can remotely log in to the service VM.

        • If the information cannot be found, add it to the file.
      5. Check whether the network information is correctly configured.
      6. Run the following command to check whether the disk space is used up:

        df -h

        If the disk space is used up, the program cannot run properly.

    • If the system displays a message indicating that the om command fails to be executed, perform the following operations to check whether the Redis process on the service VM is normal:
      1. Use PuTTY to log in to the Redis node.

        The default username is paas, and the default password is QAZ2wsx@123!.

      2. Run the following command to check whether the Redis process is started:

        ps -ef | grep redis | grep -v grep

        • If the process has started, go to substep 4.
        • If the process has not started, go to substep 3.
      3. Run the following command to start the Redis service:

        /opt/dcs/redis/redis/data/ctl/redis_ctl start

      4. Open the /var/log/dcs/redis/redis_run.log file, view the service run logs, and check whether Redis is running properly.
        • If Redis is running properly, contact technical support.
        • If Redis is not running properly, go to substep 5.
      5. Run the following commands to restart the Redis service:

        /opt/dcs/redis/redis/data/ctl/redis_ctl stop

        /opt/dcs/redis/redis/data/ctl/redis_ctl start

  8. If the issue persists, contact technical support.
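
Steps 4 to 6 can be sketched as two small shell helpers. This is a minimal sketch, not a tool shipped with FusionStage or DCS: the function names find_running_pods and warn_lines_for_instance are hypothetical, and the first helper assumes the column layout shown in step 4 (namespace, pod name, ready count, status).

```shell
#!/usr/bin/env bash
# Hypothetical helpers for steps 4 to 6; the names are illustrative only.

# Print "<namespace> <pod>" for each Running pod whose name starts with the
# application name. Reads `kubectl get pod --all-namespaces` output on stdin.
find_running_pods() {
    local app_name="$1"
    awk -v app="$app_name" \
        'index($2, app) == 1 && $4 == "Running" { print $1, $2 }'
}

# Print the WARN lines that mention one instance ID in a status log file.
# grep -F treats the instance ID as a fixed string, not a regular expression.
warn_lines_for_instance() {
    local instance_id="$1"
    local log_file="$2"
    grep -F -- "$instance_id" "$log_file" | grep -F "WARN"
}
```

On the manage_lb1_ip node, the first helper would be fed with `kubectl get pod --all-namespaces | find_running_pods dcs-server-component-90478e21`. Inside the container, the second would be called as `warn_lines_for_instance "<instanceId>" /opt/dcs/logs/dcs/dcs_status.log`, where <instanceId> is the value from the alarm location information.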

Alarm Clearing

After the fault is rectified, the system automatically clears the alarm.
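
Before waiting for the alarm to clear, it can help to confirm that the conditions checked in step 7 no longer hold. The following is a minimal sketch under stated assumptions: the function names full_mounts and redis_running are hypothetical, the 100% default threshold is an illustrative choice, and the disk check parses `df -P` (one line per file system) rather than the `df -h` output used in the procedure.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for re-checking the node; the names are illustrative only.

# Print the mount points whose usage is at or above a threshold in percent
# (default 100). Reads `df -P` output on stdin and skips the header line.
full_mounts() {
    local threshold="${1:-100}"
    awk -v limit="$threshold" \
        'NR > 1 { use = $5; sub(/%/, "", use); if (use + 0 >= limit) print $6 }'
}

# Succeed if the process listing on stdin contains a line that mentions redis
# and is not the grep command itself. Reads `ps -ef` output on stdin.
redis_running() {
    grep -F "redis" | grep -v "grep" > /dev/null
}
```

Typical use on the service VM would be `df -P | full_mounts` to list full file systems and `ps -ef | redis_running && echo "redis is up"` to confirm that the Redis process has started.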

Related Information

None

Updated: 2019-08-30

Document ID: EDOC1100062365
