HUAWEI CLOUD Stack 6.5.0 Troubleshooting Guide 02

etcd Cluster Troubleshooting

Symptom

  • etcd cluster instances are unavailable.
  • Hosts where etcd cluster instances reside are unavailable.

Possible Causes

  • Hosts where etcd instances reside are faulty. As a result, the etcd instances are unavailable.
  • The etcd cluster is overloaded. As a result, etcd instances exit unexpectedly.
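
To help tell these two causes apart, the events of the suspect instance are a useful first check: node-related or scheduling events usually point to a host fault, while repeated restarts or out-of-memory kills usually point to overload. The following command is only a sketch, to be run after logging in as described in the method below; etcd-network-server-paas-10-118-29-73 is the example instance name used in the method and must be replaced with the actual faulty instance name.

    kubectl -n fst-manage describe pod etcd-network-server-paas-10-118-29-73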

Troubleshooting Method

  1. Use PuTTY to log in to the manage_lb1_ip node.

    The default username is paas, and the default password is QAZ2wsx@123!.

  2. Run the following command and enter the password of the root user to switch to the root user:

    su - root

    Default password: QAZ2wsx@123!

  3. Run the following command to check whether all etcd instances are in the Running state:

    kubectl -n fst-manage get pod | grep etcd | grep -v cse

    etcd-network-server-paas-10-118-29-153       1/1       Running          0          1h
    etcd-network-server-paas-10-118-29-169       1/1       Running          0          1h
    etcd-network-server-paas-10-118-29-73        1/1       Error            0          1h

    The preceding command output shows that etcd-network-server-paas-10-118-29-73 is abnormal.
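
    Optionally, you can check the logs of the abnormal instance to see why it exited, for example whether it exited unexpectedly under load. This is only a sketch; the --previous option shows the logs of the last terminated container and is available only if the container has already restarted.

    kubectl -n fst-manage logs etcd-network-server-paas-10-118-29-73
    kubectl -n fst-manage logs --previous etcd-network-server-paas-10-118-29-73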

  4. Check whether the node where etcd-network-server-paas-10-118-29-73 resides is normal.

    a. Run the following command to obtain the name of the host where the faulty instance resides. In the command, etcd-network-server-paas-10-118-29-73 is the name of the faulty instance obtained in 3:

      kubectl -n fst-manage get pod etcd-network-server-paas-10-118-29-73 -o wide

      NAME                                    READY     STATUS    RESTARTS   AGE       IP              NODE
      etcd-network-server-paas-10-118-29-73   1/1       Error     0          1h        10.118.29.73    paas-10-118-29-73
    b. Run the following command to check whether the host is normal. In the command, paas-10-118-29-73 is the name of the faulty node obtained in 4.a:

      kubectl -n fst-manage get node | grep paas-10-118-29-73

      • If the following information is displayed, go to 5.
        fst-manage       paas-10-118-29-73            Ready      <none>    24d       v2.1.22-FusionStage6.5.RP3-B010-dirty
      • If the following information is displayed, the host is abnormal. Rectify the fault by referring to Node in NotReady State, and then go to 5.
        fst-manage       paas-10-118-29-73            NotReady      <none>    24d       v2.1.22-FusionStage6.5.RP3-B010-dirty
        NOTE:

        For an etcd cluster with N instances, if the number of faulty nodes exceeds (N-1)/2, the etcd cluster is unavailable.
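
        For example, a three-instance cluster tolerates one faulty node, and a five-instance cluster tolerates two. As a rough check, you can count the hosts that are not Ready; this sketch assumes the node listing format shown in 4.b and counts all NotReady hosts, not only the hosts where etcd instances reside.

        kubectl -n fst-manage get node | grep -c NotReady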

  5. Run the following command to delete the faulty instance etcd-network-server-paas-10-118-29-73. FusionStage then automatically restores the instance. If the instance is not restored, contact technical support for assistance.

    kubectl -n fst-manage delete pod etcd-network-server-paas-10-118-29-73
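
    After the deletion, FusionStage should recreate the instance automatically. To confirm, re-run the check from 3 and verify that all etcd instances return to the Running state. The second command is an optional sketch; -w keeps watching the pod list for status changes.

    kubectl -n fst-manage get pod | grep etcd | grep -v cse
    kubectl -n fst-manage get pod -w | grep etcd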
