FusionCloud 6.3.1.1 Troubleshooting Guide 02

Failure in etcd Backup and Restoration Because the Node Where the etcd-backup Service Is Deployed Is Powered Off or Restarted (OM Zone)

Symptom

The etcd backup in the OM zone succeeds. However, when one node of the etcd cluster is powered off, the etcd restoration fails, the etcd pods may work improperly, and kubectl commands become unavailable. After the node is powered on again and etcd is rectified manually, the restoration succeeds.

Possible Causes

The powered-off node is the node where the etcd-backup service is deployed. After the node is powered off, the other two etcd-backup nodes cannot connect to it, so the restoration fails.
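Before starting the procedure below, you can confirm this cause by checking which nodes host the etcd-backup pods. This reuses the same kubectl query as the troubleshooting steps; the last column of the wide output is the node name:

    kubectl get pod -nom -owide | grep etcd-backup

If an etcd-backup pod is scheduled on the node that was powered off, the failure described here applies.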

Troubleshooting Method

  1. Use PuTTY to log in to the om_core1_ip node.

    The default username is paas, and the default password is QAZ2wsx@123!.
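    PuTTY is one option; any SSH client can reach the node. For example, from a Linux host (om_core1_ip stands in for the actual IP address, as above):

    ssh paas@om_core1_ip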

  2. Run the following command to query the etcd status:

    kubectl get pod -nom -owide | grep etcd

    Information similar to the following is displayed:

    apm-etcd-0                                   1/1       Running   0          1d        172.17.1.131   paas-10-60-20-81
    apm-etcd-1                                   1/1       Running   0          1d        172.17.1.117   paas-10-60-20-115
    apm-etcd-2                                   1/1       Running   0          1d        172.17.1.144   paas-10-60-20-10
    cse-etcd-0                                   1/1       Running   0          21h       172.17.2.112   paas-cse-03-10-60-20-232
    cse-etcd-1                                   1/1       Running   0          21h       172.17.2.136   paas-cse-02-10-60-20-59
    cse-etcd-2                                   1/1       Running   2          21h       172.17.2.185   paas-cse-04-10-60-20-109
    cse-etcd-backup-6b89f6fb4d-lmw59             1/1       Running   0          21h       172.17.2.134   paas-cse-02-10-60-20-59
    cse-etcd-backup-6b89f6fb4d-qxw4n             1/1       Running   0          21h       172.17.2.183   paas-cse-04-10-60-20-109
    etcd-backup-server-paas-192-168-20-187       1/1       Running   0          2d        192.168.20.187   paas-192-168-20-187
    etcd-backup-server-paas-192-168-20-204       1/1       Running   0          2d        192.168.20.204   paas-192-168-20-204
    etcd-backup-server-paas-192-168-20-239       1/1       Running   0          2d        192.168.20.239   paas-192-168-20-239
    etcd-event-server-paas-192-168-20-187        1/1       Running   0          2d        192.168.20.187   paas-192-168-20-187
    etcd-event-server-paas-192-168-20-204        1/1       Running   0          2d        192.168.20.204   paas-192-168-20-204
    etcd-event-server-paas-192-168-20-239        1/1       Running   0          2d        192.168.20.239   paas-192-168-20-239
    etcd-network-server-paas-192-168-20-187       1/1       Running   0          2d        192.168.20.187   paas-192-168-20-187
    etcd-network-server-paas-192-168-20-204       1/1       Running   0          2d        192.168.20.204   paas-192-168-20-204
    etcd-network-server-paas-192-168-20-239       1/1       Running   0          2d        192.168.20.239   paas-192-168-20-239
    etcd-server-paas-192-168-20-187               1/1       Running   0          2d        192.168.20.187   paas-192-168-20-187
    etcd-server-paas-192-168-20-204               1/1       Running   0          2d        192.168.20.204   paas-192-168-20-204
    etcd-server-paas-192-168-20-239               1/1       Running   0          2d        192.168.20.239   paas-192-168-20-239
    • If all etcd pods are in the Running state, go to 6.
    • If the command fails or any pod is abnormal, go to 3.
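    In these commands, -nom and -owide are condensed spellings of -n om (the om namespace) and -o wide (which adds the IP and node columns shown above; grep removes kubectl's header line, which is why no column headers appear). The expanded form is equivalent:

    kubectl get pod -n om -o wide | grep etcd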

  3. Log in to each etcd node and run the following commands to check whether the etcd-event.manifest, etcd.manifest, and etcd-network.manifest files exist in /var/paas/kubernetes:

    cd /var/paas/kubernetes/

    ls etcd*.manifest

    • If yes, move these three files back to /var/paas/kubernetes/manifests (the directory from which kubelet runs the static etcd pods) and then go to 4.

      mv etcd* manifests

    • If no, go to 4.

  4. Run the following command on the OM-Core01 node to check the etcd status:

    kubectl get pod -nom -owide | grep etcd

    • If the pod status of etcd, etcd-event, and etcd-network is Running, go to 6.
    • If the pod status is Pending, go to 5.

  5. Log in to the three etcd nodes one after another as the paas user and run the following commands to restart kubelet and wait until the kubelet status is Running:

    monit restart kubelet

    monit summary

  6. Access the OM zone as the admin user and perform the restoration operations again.
  7. Log in to the OM-Core01 node after the restoration operations succeed and run the following command to check the etcd status:

    kubectl get pod -nom -owide | grep etcd

    If the pod is in the Pending state, log in to the three etcd nodes as the paas user. Run the following commands to restart kubelet and wait until the kubelet status is Running:

    monit restart kubelet

    monit summary

    If all pods are in the Running state, etcd is restored successfully and the fault is rectified.
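    A quick way to confirm that no etcd-related pod remains abnormal is to filter out the Running ones; an empty result means every etcd pod is in the Running state:

    kubectl get pod -nom -owide | grep etcd | grep -v Running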
