HUAWEI CLOUD Stack 6.5.0 Troubleshooting Guide 02

monitordbsvr Instance Status Is Abnormal After Its Large Amount of Data Is Restored


Symptom

  • The master/slave status of the monitordbsvr instance is normal, but the data volume is large and data restoration using the physical restoration command fails. As a result, the slave instance reports error code 101.
  • The master/slave status of the monitordbsvr instance is normal, but the data volume is large and data restoration using the physical restoration command fails. As a result, the master instance reports error code 101 and the slave instance reports error code 303.

Possible Cause

  • After data restoration fails, a failover occurs, but the slave instance cannot be started. Information similar to the following is displayed:
    $ ./dbsvc_adm -cmd query-db-instance -type gauss -tenant fst-manage | grep monitor
    monitordbsvr-10_60_40_61-16@10_60_40_12-16  primary  monitordbsvr-10_60_40_12-16  fst-manage  10.60.40.12  32084  Up     gauss   V100R003C20SPC112  Master  Normal          --                         --         48            off
    monitordbsvr-10_60_40_61-16@10_60_40_12-16  primary  monitordbsvr-10_60_40_61-16  fst-manage  10.60.40.61  32084  Down   gauss   --                 Slave   Abnormal (101)  --                         --         --            off
  • After data restoration fails, no failover occurs. The master instance fails to start and reports error code 101, and the slave instance fails to connect to the master instance and reports error code 303. Information similar to the following is displayed:
    $ cd /opt/paas/oss/manager/apps/DBAgent/bin/
    $ ./dbsvc_adm -cmd query-db-instance -type gauss | grep monitor
    monitordbsvr-10_60_40_52-2@10_60_40_27-2    primary  monitordbsvr-10_60_40_27-2   fst-manage        10.60.40.27  32082  Up     gauss   V100R003C20SPC112  Slave   Abnormal (303)  --                            --         48            off
    monitordbsvr-10_60_40_52-2@10_60_40_27-2    primary  monitordbsvr-10_60_40_52-2   fst-manage        10.60.40.52  32082  Down   gauss   --                 Master  Abnormal (101)  --                            --         --            off
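
The status table can be scanned for faulty rows with a short filter. The following is a minimal sketch, not part of the product tooling: list_abnormal is a hypothetical helper name, and the column positions (instance name in column 3, IP address in column 5, role in column 10, status in column 11, error code in column 12) are assumed from the sample output above and may differ in other versions.

```shell
#!/bin/sh
# Hypothetical helper: reads query-db-instance output on stdin and prints
# "<instance name> <IP address> <role> <error code>" for each abnormal row.
# Column positions are assumed from the sample output in this section.
list_abnormal() {
    awk '$11 == "Abnormal" {
        code = $12
        gsub(/[()]/, "", code)   # "(101)" becomes "101"
        print $3, $5, $10, code
    }'
}
```

For example, piping the first sample output above through the helper (./dbsvc_adm -cmd query-db-instance -type gauss -tenant fst-manage | grep monitor | list_abnormal) would print one line: monitordbsvr-10_60_40_61-16 10.60.40.61 Slave 101.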

Troubleshooting Method

NOTE:
  • If the original master instance fails to start and reports error code 101, and the slave instance cannot connect to the master instance and reports error code 303, perform steps 1 to 5.
  • If the original master instance cannot be started, perform steps 6 to 9.

  1. Use PuTTY to log in to the manage_lb1_ip node.

    The default username is paas, and the default password is QAZ2wsx@123!.

  2. Run the following command to switch to the root user, and enter the root password when prompted:

    su - root

    Default password: QAZ2wsx@123!

  3. Run the following command to obtain the name of the pod corresponding to DBHASwitchService:

    kubectl get pod -n fst-manage | grep dbhaswitch | grep Running | awk '{ print $1 }'

    Information similar to the following is displayed:

    dbhaswitchservice-3302270813-1n452
    dbhaswitchservice-3302270813-cp154

  4. Run the following command to enter the container where DBHASwitchService resides:

    kubectl exec dbhaswitchservice-3302270813-1n452 -n fst-manage -it sh

    NOTE:

    dbhaswitchservice-3302270813-1n452 is the pod name obtained in the preceding step. If multiple pod names are returned, use any one of them.

  5. Run the following commands to clear the failover time:

    cd /opt/apps/DBHASwitchService/bin

    ./switchtool.sh -cmd del-failover-time -instid monitordbsvr-10_60_40_52-2@10_60_40_27-2

    Information similar to the following is displayed:

    sh-4.2$ ./switchtool.sh -cmd del-failover-time -instid monitordbsvr-10_60_40_52-2@10_60_40_27-2
    Successful.
    sh-4.2$ exit

  6. Use PuTTY to log in to the node hosting the faulty instance that reports error code 101.

    The default username is paas, and the default password is QAZ2wsx@123!.

  7. Run the following commands to switch to the dbuser user:

    su root

    su dbuser

  8. Import GaussDB environment variables.

    . ~/appgsdb.bashrc

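  9. Run the following commands to rebuild the faulty instance. The instance name monitordbsvr-10_60_40_61-16 in the -D path is an example; use the name of the faulty instance: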

    cd /opt/gauss/app/bin/

    ./gs_ctl -D /opt/gauss/data/monitordbsvr-10_60_40_61-16/ build

    Information similar to the following is displayed:
    [dbuser@fst65-b053-overlay-micro-manage-db01 bin]$ ./gs_ctl -D /opt/gauss/data/monitordbsvr-10_60_40_61-16/ build
    gs_ctl: connect to server, build started.
    xlog start point: 3/610000B8
    gs_ctl: starting background WAL receiver
    3804773/3804773 kB (100%), 1/1 tablespace
    xlog end point: 3/610005A0
    gs_ctl: waiting for background process to finish streaming...
    gs_ctl: build completed.
    server starting.... done
    server started

  10. If the fault persists, contact technical support for assistance.
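
Two of the steps above involve reading command output by eye: choosing a running DBHASwitchService pod (steps 3 and 4) and confirming that the build finished (step 9). The following is a minimal sketch of both checks, not part of the product tooling. The function names first_running_pod and build_succeeded are hypothetical, and the success check assumes that the messages "gs_ctl: build completed." and "server started" appear exactly as in the sample output of step 9.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Reads `kubectl get pod -n fst-manage` output on stdin and prints the name
# of the first dbhaswitch pod whose row contains "Running".
first_running_pod() {
    grep dbhaswitch | grep Running | awk '{ print $1 }' | head -n 1
}

# Reads saved `gs_ctl ... build` output on stdin. Returns 0 only if both
# completion messages from the sample output are present (assumed wording).
build_succeeded() {
    build_log=$(cat)
    printf '%s\n' "$build_log" | grep -qF 'gs_ctl: build completed.' &&
        printf '%s\n' "$build_log" | grep -qF 'server started'
}
```

A possible use is pod=$(kubectl get pod -n fst-manage | first_running_pod) before step 4, and saving the step 9 output to a file and running build_succeeded < that file afterwards. A non-zero return value from build_succeeded only means the expected messages were not found, so the full output should still be reviewed.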
Updated: 2019-06-01

Document ID: EDOC1100062375
