FusionCloud 6.3.1.1 Troubleshooting Guide 02

etcd Cluster Troubleshooting

Symptom

  • etcd cluster instances are unavailable.
  • The hosts where the etcd cluster instances reside are unavailable.

Possible Causes

  • The hosts where the etcd instances reside are faulty. As a result, the etcd instances are unavailable.
  • The etcd cluster is overloaded. As a result, etcd instances exit unexpectedly.
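
Before starting the procedure below, you can optionally probe etcd health directly from one of its pods. The following commands are a sketch only: they assume that the etcd container provides a shell and the etcdctl client, that the v3 API is selected through the ETCDCTL_API variable, and that no additional TLS flags are required; adjust them to match how your cluster is secured.

    # Check whether the local etcd endpoint reports healthy (sketch; TLS flags may be required).
    kubectl -n manage exec etcd-0 -- sh -c 'ETCDCTL_API=3 etcdctl endpoint health'

    # List the etcd cluster members (sketch; same assumptions as above).
    kubectl -n manage exec etcd-0 -- sh -c 'ETCDCTL_API=3 etcdctl member list'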

Troubleshooting Method

  1. Use PuTTY to log in to the om_core1_ip node.

    The default username is paas, and the default password is QAZ2wsx@123!.

  2. Run the following command to check whether all etcd instances are in the Running state:

    kubectl -n manage get pod|grep etcd|grep -v cse

    etcd-0                   1/1       Running             0    1d
    etcd-1                   1/1       Running             0    1d
    etcd-2                   1/1       Error               0    1d

    The preceding command output shows that etcd-2 is abnormal.
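
    If many pods are listed, the following filter (a convenience sketch, not part of the standard procedure) prints only the etcd instances that are not in the Running state:

    kubectl -n manage get pod | grep etcd | grep -v cse | grep -vw Running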

  3. Check whether the node where the abnormal etcd-2 instance resides is normal.

    1. Run the following command to obtain the name of the node:
      kubectl -n manage get pod etcd-2 -o wide
      NAME      READY     STATUS   RESTARTS   AGE       IP               NODE   
      etcd-2    0/4       Error    0          40m       10.109.176.180   manage-cluster1-87c05eac-9dmpc

      The preceding command output shows that the node name is manage-cluster1-87c05eac-9dmpc.

      NOTE:

      Run the following command to obtain the IP address of the node:

      kubectl -n manage get node manage-cluster1-87c05eac-9dmpc -ojson |grep address

      Information similar to the following is displayed:

              "address": "10.109.176.180",
              "addresses": [
                      "address": "10.109.176.180",
                      "address": "10.109.176.180",
                      "address": "manage-cluster1-87c05eac-9dmpc",
    2. Run the following command to check whether the node is normal:
      kubectl -n manage get node | grep manage-cluster1-87c05eac-9dmpc
      • If information similar to the following is displayed, the node is normal and the etcd instance is recovered automatically. In this case, no further action is required.
        manage-cluster1-87c05eac-9dmpc Ready 2d
      • If information similar to the following is displayed, the node is abnormal. In this case, go to 4.
        manage-cluster1-87c05eac-9dmpc NotReady   2d
        NOTE:

        For an etcd cluster with N instances, the cluster becomes unavailable if the number of faulty nodes exceeds (N-1)/2. For example, a three-instance cluster remains available with one faulty node but becomes unavailable with two.
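
    As an alternative to filtering the JSON output in 3.1, a jsonpath query returns the node name and its addresses directly. This is a sketch only and assumes the same pod and node names as in the preceding examples:

    # Name of the node that hosts the abnormal instance.
    kubectl -n manage get pod etcd-2 -o jsonpath='{.spec.nodeName}'

    # Addresses (IP address and hostname) recorded for that node.
    kubectl -n manage get node manage-cluster1-87c05eac-9dmpc -o jsonpath='{.status.addresses}'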

  4. Run the following command to delete etcd-2. Then FusionStage automatically restores etcd-2.

    kubectl -n manage delete pod etcd-2

    Run the following command to check whether the etcd instance is recovered:

    kubectl -n manage get pod|grep etcd|grep -v cse

    If information similar to the following is displayed, all etcd instances are in the Running state and the abnormal instance is recovered:

    etcd-0                   1/1       Running             0    1d
    etcd-1                   1/1       Running             0    1d
    etcd-2                   1/1       Running             0    2m
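
    If you prefer not to run the check repeatedly, kubectl wait can block until the recreated pod reports Ready. This is a sketch only: it assumes that your kubectl version supports the wait subcommand (v1.11 or later), that the recreated pod keeps the name etcd-2, and that a five-minute timeout is acceptable.

    # Wait up to 300 seconds for etcd-2 to report the Ready condition (sketch; see assumptions above).
    kubectl -n manage wait --for=condition=Ready pod/etcd-2 --timeout=300s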
