FusionCloud 6.3.1.1 Troubleshooting Guide 02

Common Troubleshooting Cases on the Management Plane

Faults on the management plane include VM, network, and database faults. Some of these faults may affect services. This chapter describes how to check, locate, and troubleshoot them.

GaussDB Faults

GaussDB faults include data inconsistency between the primary and standby databases, failover failures, and data loss.

Data Inconsistency Between the Primary and Standby GaussDB Databases
Symptom

The primary and standby GaussDB databases are faulty, causing data inconsistency.

Possible Causes
  • The communication between the primary and standby GaussDB databases is abnormal.
  • The primary GaussDB is powered off.
  • The standby GaussDB is promoted to the primary.
  • Data loss occurs.
Locating Method

Check the process status. Run the gs_ctl query -U gausscore -P clouddb@123 command to query the status of the primary and standby databases.
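
If only the synchronization progress is of interest, the query output can be filtered. The following is a minimal sketch, assuming the output contains the SYNC_PERCENT field referenced in the Procedure below:

gs_ctl query -U gausscore -P clouddb@123 | grep -i sync_percent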

Procedure
  1. Check whether an alarm is generated for Metastore.

    Log in as the database administrator (such as gausscore or gaussbase) and run the following command to ping the peer database IP address.

    ping IP

  2. Use PuTTY to log in to the newRDS-Database01 node and check the ha_monitor process.

    Default account: dbs; default password: Changeme_123

    If the process is not running, check whether it is monitored and has been added to the system startup items.

    Run the $(ha_install_path)/ha/module/hamon/script/start_ha_monitor.sh script to manually start the ha_monitor process (see the command sketch after this procedure).

  3. Use PuTTY to log in to the database server host as the root user and view the /var/log/messages file.

    cat /var/log/messages

  4. Query the system log and check whether GaussDB shutdown information exists.
  5. Log in to the primary and standby GaussDB nodes as the database system user and run gs_ctl query -U gausscore -P clouddb@123 to check whether SYNC_PERCENT is 100%. If it is not, check whether data on the primary and standby GaussDB databases is consistent and whether the network is normal.
  6. After the network fault is rectified, run gs_ctl query -U gausscore -P clouddb@123 to check whether SYNC_PERCENT is 100%. If the fault persists, contact technical support for assistance.
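
The checks in steps 2 to 4 can also be run as a short command sequence. This is only a sketch: the grep patterns are illustrative and may need to be adapted to the actual wording in your system log.

ps -ef | grep ha_monitor | grep -v grep    # check whether the ha_monitor process is running (step 2)
$(ha_install_path)/ha/module/hamon/script/start_ha_monitor.sh    # start it manually if it is not running (step 2)
grep -i gaussdb /var/log/messages | grep -iE "shut|stop"    # search the system log for GaussDB shutdown records (steps 3 and 4)
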
GaussDB Failover Failed
Symptom

The primary GaussDB database is faulty, but the standby GaussDB database fails to be automatically promoted to the primary.

Possible Causes
  • An alarm indicating a management database fault is generated.
  • The synchronization between the primary and standby GaussDB databases is delayed or has failed. You can run the gs_ctl query -U xxx -P xxx command to check whether the synchronization is delayed.
  • OMMHA is faulty.
Locating Method
  • Check whether the standby GaussDB database is faulty.
  • Check the I/O usage and network status to determine whether an I/O or network bottleneck is causing the synchronization delay.
Procedure
  1. Use PuTTY to log in to the newRDS-Database01 node.

    Default account: dbs; default password: Changeme_123

  2. Check the status of the standby GaussDB database. If it is faulty, for example, if it has broken down and its process has stopped, the failover fails.
  3. Run the iostat -x N command, where N is the sampling interval in seconds, to view the disk I/O usage.

    Calculate the throughput and IOPS to determine whether the I/O bottleneck has been reached. Use a dedicated tool, such as nmon, to check the network status and determine whether the network traffic has reached its bottleneck (see the sketch after this procedure).
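
The following sketch shows one way to read the I/O figures and sample the network throughput. It assumes the sysstat package (which provides iostat and sar) is installed; sar is used here only as an illustrative alternative to nmon.

iostat -x 5    # extended device statistics every 5 seconds; %util close to 100% and high await values indicate disk saturation
sar -n DEV 5    # per-interface network statistics every 5 seconds; compare rxkB/s and txkB/s with the NIC bandwidth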
