HUAWEI CLOUD Stack 6.5.0 Alarm and Event Reference 04

ALM-0001000300030001 Elasticsearch Cluster Heartbeat Exception

Alarm Description

This alarm is generated when heartbeat detection between nodes of the Elasticsearch cluster fails five consecutive times and more than 50% of the nodes are faulty.
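The trigger condition can be illustrated with a minimal sketch. It assumes one reading of the description: a node counts as faulty once its five most recent heartbeat checks have all failed, and the alarm is raised when the share of faulty nodes is strictly greater than 50%. The function names, the data layout, and the node addresses in the usage note are hypothetical and are not part of the product.

```python
from typing import Dict, List

# Thresholds taken from the alarm description.
CONSECUTIVE_FAILURE_THRESHOLD = 5
FAULTY_NODE_RATIO_THRESHOLD = 0.5


def is_node_faulty(heartbeat_results: List[bool]) -> bool:
    """Return True if the five most recent heartbeat checks all failed.

    heartbeat_results is ordered oldest to newest; True means the check passed.
    """
    recent = heartbeat_results[-CONSECUTIVE_FAILURE_THRESHOLD:]
    if len(recent) < CONSECUTIVE_FAILURE_THRESHOLD:
        return False  # not enough samples yet to call the node faulty
    return not any(recent)


def should_raise_alarm(history: Dict[str, List[bool]]) -> bool:
    """Return True if more than 50% of the nodes are faulty (assumed semantics)."""
    if not history:
        return False
    faulty = sum(1 for results in history.values() if is_node_faulty(results))
    return faulty / len(history) > FAULTY_NODE_RATIO_THRESHOLD
```

With three nodes of which two have failed their last five checks, `should_raise_alarm` returns `True`, because 2/3 exceeds 50%. With one faulty node out of two it returns `False`, because exactly 50% does not exceed the threshold.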

Attributes

Alarm ID            Alarm Severity    Alarm Type
0001000300030001    Critical          Environmental alarm

Parameters

Name             Description
Module           Elasticsearch cluster
State            Heartbeat exception
Abnormal node    IP address of the abnormal node

Impact on the System

  • Fault locating and demarcation based on the call chain are unavailable.
  • No data is displayed on the management operation log page.
  • No data is displayed on the Cluster Status page in the Run Logs tab.

Possible Causes

  • Network exception.
  • Heartbeats of the Elasticsearch cluster nodes have been abnormal during the last ten minutes.

Handling Procedure

  1. Log in to ManageOne Deployment Portal and view the nodes where services are deployed.

    1. Log in to ManageOne Deployment Portal.

      URL: https://IP address of ManageOne Deployment Portal.

      Default username: admin; default password: Huawei12#$

    2. Choose Application > Software Management > Manage Product Software from the main menu.
    3. Click the ManageOne card.
    4. In the Deployment History List area, click in the Operation column corresponding to the most recently deployed microservice.

      Enter the service name mo_om_adminplane in the search box and press Enter to query the deployment list of the service.

      If the deployment list of the service is not found, go back to Deployment History List. In the Deployment History List area, click in the Operation column of the row under the row containing the most recently deployed microservice to continue the query.

    5. Click the instance name to go to the service deployment details page.
    6. In the Deployment History List area, click Successful in the Deployment Status column corresponding to the most recently deployed microservice.
    7. In the Resources Deploy Status area, find the resource whose Name contains MOLogStorageService and Type is Stage, and click in the Operation column.
    8. In the Node IP Address column, obtain the IP address of the node where MOLogStorageService is deployed.

  2. Use PuTTY to log in to the node where MOLogStorageService is deployed. If an exception is reported, log in to any node.

    Default account: sopuser; default password: D4I$awOD7k

  3. Run the following command to disable user logout upon timeout:

    TMOUT=0

  4. Run the following command to switch to the root user:

    sudo su root

    The default password of the root user is Changeme_123.

  5. Run the following command to switch to the ossadm user:

    su ossadm

  6. Run the following command to check whether the VMs can communicate with each other:

    ping MOLogStorageService IP address of the abnormal node

    If information similar to the following is displayed, the communication between VM nodes is normal.

    64 bytes from node IP: icmp_seq=1 ttl=64 time=0.019 ms
    64 bytes from node IP: icmp_seq=2 ttl=64 time=0.036 ms

    If the nodes can communicate, go to 7.

    If they cannot, go to 8.

  7. Log in to the abnormal node and run the following command to restart the service:

    /opt/oss/manager/agent/bin/ipmc_adm -cmd restartapp -app MOLogStorageService

    If information similar to the following is displayed after the restart, the restart is successful:

    Stopping process mologstorageservice-10-0 ... success
    Starting process mologstorageservice-10-0 ... success

    After all nodes are restarted, wait for 20 minutes and check whether the alarm is cleared. If the alarm persists, go to 8.

  8. Contact technical support for assistance.
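The two decision points in the procedure (the ping check in step 6 and the restart check in step 7) can be sketched as small text checks on captured command output. This is only an illustration, and it rests on three assumptions: the output has already been captured as a string, a successful ping prints reply lines in the format shown in step 6, and any restart status other than "success" counts as a failure. The actual failure wording of ipmc_adm is not given in this document. All function names are hypothetical.

```python
import re

# Reply line as shown in step 6, e.g.
# "64 bytes from <node IP>: icmp_seq=1 ttl=64 time=0.019 ms"
PING_REPLY = re.compile(r"\d+ bytes from .+: icmp_seq=\d+ ttl=\d+ time=[\d.]+ ms")

# Status line as shown in step 7, e.g.
# "Stopping process mologstorageservice-10-0 ... success"
RESTART_LINE = re.compile(r"^(Stopping|Starting) process \S+ \.\.\. (\w+)$")


def nodes_can_communicate(ping_output: str) -> bool:
    """Return True if at least one ping reply line is present."""
    return any(PING_REPLY.search(line) for line in ping_output.splitlines())


def next_step(ping_output: str) -> int:
    """Map the ping result to the next step of the handling procedure."""
    return 7 if nodes_can_communicate(ping_output) else 8


def restart_succeeded(restart_output: str) -> bool:
    """Return True if both a stop and a start were reported, all with 'success'."""
    seen = set()
    for line in restart_output.splitlines():
        match = RESTART_LINE.match(line.strip())
        if match is None:
            continue  # ignore lines that are not status lines
        action, status = match.group(1), match.group(2)
        if status != "success":
            return False  # assumption: any other status word means failure
        seen.add(action)
    return seen == {"Stopping", "Starting"}
```

If `restart_succeeded` returns `False`, or the alarm persists 20 minutes after all nodes have been restarted, the procedure continues with step 8.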

Alarm Clearing

After the fault is rectified, the system will automatically clear the alarm. Manual clearing is not required.

Related Information

None

Updated: 2019-08-30

Document ID: EDOC1100062365
