HUAWEI CLOUD Stack 6.5.0 Alarm and Event Reference 04

ALM-73403 Data Inconsistency Between Active and Standby GaussDB Databases

Description

The active and standby GaussDB servers check the data synchronization status once every minute. This alarm is generated when data synchronization fails for 3 consecutive checks. The alarm is cleared when the synchronization status returns to normal.
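
The alarm condition described above can be sketched as follows. This is a minimal illustration (not the actual GaussDB implementation): one simulated check result per minute, and the alarm fires only after three consecutive failures.

```shell
# Simulated per-minute synchronization check results (illustrative only).
statuses="ok fail fail fail"
consecutive=0
alarm="cleared"
for s in $statuses; do
  if [ "$s" = "fail" ]; then
    consecutive=$((consecutive + 1))
  else
    consecutive=0          # a successful check resets the counter
    alarm="cleared"        # and the alarm stays (or becomes) cleared
  fi
  if [ "$consecutive" -ge 3 ]; then
    alarm="ALM-73403"      # three consecutive failures raise the alarm
  fi
done
echo "$alarm"
```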

Attribute

  • Alarm ID: 73403
  • Alarm Severity: Critical
  • Auto Clear: Yes

Parameters

  • Fault Location Info

    component: specifies the name of the component for which the alarm is generated.

  • Additional Info

    Detail_info: sync_abnormal

    Local_Address: specifies the synchronization IP address of the local node.

    Peer_address: specifies the synchronization IP address of the peer node.

Impact on the System

GaussDB data synchronization fails, and data may be lost.

Possible Causes

  • The host where the database is located is powered off.
  • Improper start and stop commands are executed on the databases.
  • The network is faulty.
  • The GaussDB partition on the standby node is fully occupied.

Procedure

  1. Obtain the name of the component for which the alarm is generated based on the alarm location information.
  2. Use PuTTY to log in to the first FusionSphere OpenStack node through the IP address of the External OM plane.

    The default user name is fsp. The default password is Huawei@CLOUD8.

    The system supports both password and public-private key pair for identity authentication. If the public-private key pair is used for login authentication, see detailed operations in Using PuTTY to Log In to a Node in Key Pair Authentication Mode.

    NOTE:
    To obtain the IP address of the External OM plane, search for the required parameter on the Tool-generated IP Parameters sheet of the xxx_export_all.xlsm file exported from HUAWEI CLOUD Stack Deploy during software installation. The parameter names in different scenarios are as follows:
    • Region Type I scenario:

      Cascading system: Cascading-ExternalOM-Reverse-Proxy

      Cascaded system: Cascaded-ExternalOM-Reverse-Proxy

    • Region Type II and Region Type III scenarios: ExternalOM-Reverse-Proxy

  3. Run the following command and enter the password of user root to switch to user root:

    su - root

    The default password of user root is Huawei@CLOUD8!.

  4. Run the following command to disable user logout upon system timeout:

    TMOUT=0

  5. Run the following command to import environment variables:

    source set_env

    Information similar to the following is displayed:

      please choose environment variable which you want to import: 
      (1) openstack environment variable (keystone v3) 
      (2) cps environment variable 
      (3) openstack environment variable legacy (keystone v2) 
      (4) openstack environment variable of cloud_admin (keystone v3) 
      please choose:[1|2|3|4] 

  6. Enter 2 to enable CPS authentication and enter the username and password of CPS_USERNAME as prompted.

    The default username is cps_admin, and the default password is FusionSphere123.

Check whether the host where the database is located is powered off and whether the network is normal.

  7. Run the following command to obtain the name of the service for which the alarm is generated:

    cps template-list | grep Component name
    5F0B42C1-EFD0-C685-E811-16AD2A2D1793:/home/fsp # cps template-list | grep gaussdb
    | gaussdb_keystone    | gaussdb_keystone               | DataBase for Keystone service.            |
    | gaussdb_cinder      | gaussdb_cinder                 | DataBase for Cinder service.              |
    | gaussdb_nova        | gaussdb_nova                   | DataBase for Nova service.                |
    | gaussdb_neutron     | gaussdb_neutron                | DataBase for Neutron service.             |
    | gaussdb             | gaussdb                        | DataBase for OpenStack service.           |
    NOTE:

    In the preceding command output, the first column is the service name and the second column is the component name. Obtain the service name based on the component name.
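
The lookup described in the NOTE can be sketched as below. This is a hypothetical helper, not part of the cps tool: the sample lines are copied from the output above and stand in for a live `cps template-list` run.

```shell
# Sample lines captured from `cps template-list | grep gaussdb` above.
sample='| gaussdb_keystone    | gaussdb_keystone               | DataBase for Keystone service.            |
| gaussdb             | gaussdb                        | DataBase for OpenStack service.           |'
component="gaussdb"
# Match on the component name (second column) and print the service name
# (first column); leading/trailing spaces are trimmed from both fields.
service=$(printf '%s\n' "$sample" | awk -F'|' -v c="$component" '
  {
    gsub(/^[ \t]+|[ \t]+$/, "", $2)   # trim the service-name field
    gsub(/^[ \t]+|[ \t]+$/, "", $3)   # trim the component-name field
  }
  $3 == c { print $2; exit }')
echo "$service"
```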

  8. Run the following command to query the host where the database is located:

    cps template-instance-list --service Service name Component name

  9. Run the following command to check whether the host status is normal:

    cps host-list | grep fault

    Check whether the command output contains the host ID obtained in 8.
    • If yes, go to 10 to check the cause of the host fault.
    • If no, go to 11.
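
The decision in this step can be sketched as follows. The captured line stands in for the real `cps host-list | grep fault` output, and both IDs are illustrative examples, not values from a live system.

```shell
# One line of illustrative `cps host-list | grep fault` output.
fault_output='A68692F3-DE60-11B4-E811-FC323C32F17C | fault'
# Example host ID obtained in 8 (illustrative only).
host_id='5F0B42C1-EFD0-C685-E811-16AD2A2D1793'
if printf '%s\n' "$fault_output" | grep -q "$host_id"; then
  next="10"   # the host is in the faulty list: investigate the host fault
else
  next="11"   # the host is not faulty: check the network instead
fi
echo "go to $next"
```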

  10. Check whether the host is powered off.

    • If yes, power on the host. After the host is powered on, go to 18.
    • If no, go to 11.

  11. Run the following command to check whether the network status is normal:

    • In IPv4 scenarios, run the ping manageip command.
    • In IPv6 scenarios, run the ping6 manageip command.
    Check whether the management IP address can be pinged.
    • If yes, go to 16.
    • If no, the faulty node is unreachable. Rectify the network fault and then go to 18.
    NOTE:

    manageip indicates the management IP address of the node where the faulty GaussDB is located.
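
The choice between ping and ping6 can be sketched as below, using the fact that an IPv6 address contains colons. The manageip value here is a documentation placeholder address, not one from a real deployment.

```shell
# Placeholder management IP address (IPv4 documentation range).
manageip="192.0.2.10"
case "$manageip" in
  *:*) cmd="ping6" ;;   # address contains a colon: IPv6 scenario
  *)   cmd="ping"  ;;   # otherwise: IPv4 scenario
esac
echo "$cmd -c 3 $manageip"
```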

Check whether the disk partition of the standby node is used up.

  12. Run the following command to obtain the management IP address of the standby GaussDB node:

    cps template-instance-list --service Service name Component name | grep standby

  13. Use PuTTY to log in to the standby GaussDB node using the management IP address.
  14. Run the following command to check whether the GaussDB partition is fully occupied:

    df -h

    A68692F3-DE60-11B4-E811-FC323C32F17C:~ # df -h
    Filesystem                              Size  Used Avail Use% Mounted on
    /dev/mapper/cpsVG-rootfs                7.8G  5.2G  2.2G  71% /
    devtmpfs                                220G  4.0K  220G   1% /dev
    tmpfs                                   220G   92K  220G   1% /dev/shm
    tmpfs                                   220G   77M  220G   1% /run
    tmpfs                                   220G     0  220G   0% /sys/fs/cgroup
    /dev/mapper/cpsVG-fsp                   976M  4.1M  905M   1% /home/fsp
    /dev/mapper/cpsVG-data                  488M   15M  438M   4% /opt/fusionplatform/data
    /dev/mapper/cpsVG-log                    20G  990M   18G   6% /var/log
    /dev/sda2                               471M   72M  371M  17% /boot
    /dev/sda1                               493M  152K  493M   1% /boot/efi
    /dev/mapper/cpsVG-backup                 20G   45M   19G   1% /opt/backup
    /dev/mapper/cpsVG-repo                   20G   45M   19G   1% /opt/fusionplatform/data/fusionsphere/repo
    /dev/mapper/cpsVG-database               20G  1.4G   18G   8% /opt/fusionplatform/data/gaussdb_data
    /dev/mapper/cpsVG-zookeeper             4.8G  121M  4.5G   3% /opt/fusionplatform/data/zookeeper
    tmpfs                                    44G     0   44G   0% /run/user/1002
    /dev/mapper/cpsVG-upgrade               3.4G   15M  3.2G   1% /opt/fusionplatform/data/upgrade
    /dev/mapper/cpsVG-image                 276G  9.4G  255G   4% /opt/HUAWEI/image
    /dev/mapper/cpsVG-image--cache           50G  8.9G   39G  19% /opt/HUAWEI/image_cache
    /dev/mapper/cpsVG-rabbitmq              3.9G   19M  3.6G   1% /opt/fusionplatform/data/rabbitmq
    tmpfs                                    44G     0   44G   0% /run/user/2001
    tmpfs                                    44G     0   44G   0% /run/user/1008
    tmpfs                                    44G     0   44G   0% /run/user/0
    /dev/mapper/zk_vol_1                     59G  400M   56G   1% /opt/dsware/agent/zk/data
    /dev/mapper/extend_vg-ceilometer--data  118G  1.8G  110G   2% /var/ceilometer
    /dev/mapper/extend_vg-swift             500G   57G  444G  12% /opt/HUAWEI/swift
    NOTE:

    In the command output, check the value of Use% in the row that contains /opt/fusionplatform/data/gaussdb_data. If the value is 100%, the disk partition is fully occupied.
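
The Use% check described in the NOTE can be sketched as follows. The sample line is taken from the df -h output above; on a live node you would pipe `df -h` directly instead of using a captured string.

```shell
# Line captured from the df -h output above for the GaussDB data partition.
sample='/dev/mapper/cpsVG-database               20G  1.4G   18G   8% /opt/fusionplatform/data/gaussdb_data'
# Extract the Use% value (field 5) for the gaussdb_data mount point (field 6).
use=$(printf '%s\n' "$sample" | awk '$6 == "/opt/fusionplatform/data/gaussdb_data" { sub(/%/, "", $5); print $5 }')
if [ "$use" -ge 100 ]; then
  echo "GaussDB partition fully occupied"
else
  echo "GaussDB partition at ${use}% usage"
fi
```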

  15. After 1 to 3 minutes, check whether this alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 16.

Check whether improper start and stop commands are executed on the databases.

  16. Run the following command to stop the faulty database:

    cps host-template-instance-operate --service Service name Component name --action stop --host HOSTID
    • If the command is successfully executed, go to 17.
    • If the command fails, try again after one minute. If three consecutive retries fail, go to 19.
      NOTE:

      HOSTID indicates the ID of the node where the faulty GaussDB database is located.

  17. Run the following command to start the faulty GaussDB database:

    cps host-template-instance-operate --service Service name Component name --action start --host HOSTID
    • If the command is successfully executed, go to 18.
    • If the command fails, try again after one minute. If three consecutive retries fail, go to 19.
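
The retry policy in the stop and start steps above can be sketched as follows. Here cps_operate is a hypothetical stand-in for the real cps command line, so the sketch can run on its own; on a live node you would call the actual command instead.

```shell
# Hypothetical stand-in for the real cps host-template-instance-operate call.
cps_operate() { true; }
attempt=1
result="failed"
while [ "$attempt" -le 3 ]; do
  if cps_operate; then
    result="succeeded"
    break
  fi
  attempt=$((attempt + 1))
  [ "$attempt" -le 3 ] && sleep 60   # wait one minute before retrying
done
echo "operation $result"
# If all three attempts fail, contact technical support (19).
```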

  18. After one to three minutes, check whether this alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 19.

  19. Contact technical support for assistance.

Related Information

None

Updated: 2019-08-30

Document ID: EDOC1100062365