HUAWEI CLOUD Stack 6.5.0 Troubleshooting Guide 02

CFE Troubleshooting

Symptom

A component process of the cloud fabric engine (CFE) enters the Z state (TASK_DEAD - EXIT_ZOMBIE, the exit state) and becomes a zombie process. As a result, the CFE service is abnormal and unavailable.

If CFE's kube-apiserver process turns into a zombie process, an error will be reported when the portal is accessed.

  • The kube-apiserver process is normal.
    1. Use PuTTY to log in to the manage_lb1_ip node.

      The default username is paas, and the default password is QAZ2wsx@123!.

    2. Run the following command and enter the password of the root user to switch to the root user:

      su - root

      Default password: QAZ2wsx@123!

    3. Run the following command to obtain the pod name (multiple pod names may exist):

      kubectl get pods -n fst-manage | grep kube-apiserver | awk '{print $1}'

      Run the following command to query the node where the kube-apiserver component is located (if there are multiple pods, run the command multiple times). In the command, PODNAME is the pod name obtained in the previous step.

      kubectl get pods PODNAME -n fst-manage -o yaml | grep hostIP

      Information similar to the following is displayed, showing the IP address of the node that hosts each pod:

          hostIP: 10.120.183.194
          hostIP: 10.120.183.138
    4. Log in to the node where the component is deployed and run the following command:

      ps -ef | grep kube-apiserver

      Information similar to the following is displayed, indicating that the kube-apiserver process is running normally:

      paas 15723 30552 0 Nov15 ? 00:31:01 /usr/local/bin/kube-apiserver xxx

      On the manage_lb1_ip node, run the following command to access apiserver:

      curl https://$(kubectl get svc -n fst-manage|grep kube-apiserver|awk '{print $3}'):5443 -k

      The following is displayed. The request is rejected because no credentials are supplied, but the response shows that apiserver is reachable:

      Unauthorized
  • The kube-apiserver process turns into a zombie process.

    Use PuTTY to log in to the node where the component is deployed and run the following command:

    ps -ef | grep kube-apiserver

    Information similar to the following is displayed. The <defunct> marker indicates that the process has turned into a zombie process:

    paas 15723 30552 0 Nov15 ? 00:31:01 [kube-apiserver] <defunct>

    On the manage_lb1_ip node, run the following command to access apiserver:

    curl https://$(kubectl get svc -n fst-manage|grep kube-apiserver|awk '{print $3}'):5443 -k

    The following error message is displayed, indicating that the attempt to access apiserver fails:

    Error from server: an error on the server has prevented the request from succeeding
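
The two symptom checks above can be wrapped in small helpers. This is a minimal sketch, not part of the product: the function names classify_ps_line and classify_curl_reply are hypothetical, and the matched strings are taken only from the sample outputs in this section, so other replies are reported as unknown.

```shell
#!/bin/sh
# Hypothetical helpers; the matched strings come from the sample outputs above.

# Classify one line of `ps -ef` output for kube-apiserver.
classify_ps_line() {
    case "$1" in
        *"<defunct>"*)    echo "zombie" ;;   # process has exited but was not reaped
        *kube-apiserver*) echo "running" ;;
        *)                echo "unknown" ;;
    esac
}

# Classify the reply returned by the curl command against port 5443.
classify_curl_reply() {
    case "$1" in
        *Unauthorized*)        echo "reachable" ;;  # apiserver answered
        *"Error from server"*) echo "failed" ;;
        *)                     echo "unknown" ;;
    esac
}
```

For example, on the node where the component is deployed, `classify_ps_line "$(ps -ef | grep '[k]ube-apiserver' | head -n 1)"` prints the state of the first matching process. The `[k]` in the pattern keeps grep from matching its own command line.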

Possible Causes

  • An attacker logs in to the node where the component process is deployed and changes the process into a zombie process.
  • The OS is abnormal.

Troubleshooting Method

  1. Use PuTTY to log in to the node where the abnormal component process (zombie process) is deployed as the paas user.
  2. Run the following command and enter the password QAZ2wsx@123! of the root user as prompted to switch to the root user:

    su - root

  3. Run the following command to query the parent process of the zombie process:

    ps -elf | grep 'bin\/kube-apiserver'

    Information similar to the following is displayed. Here, the 21790 process is the parent process of the zombie process 21807.

    4 S paas     21807 21790  0  80   0 -  2938 do_wai Feb20 ?        00:00:00 /bin/sh -c umask 077; sudo chmod a+r /etc/hosts; sudo chown -hR paas:paas /var/paas; chmod 750 /var/paas /var/paas/log /var/paas/auditlog; touch /var/paas/auditlog/audit-runtime.log; chmod 600 
    ...

  4. Run the following command to manually kill the parent process of the zombie process.

    kill -9 Parent process ID

    Parent process ID indicates the ID of the parent process obtained in step 3.

  5. Run the following commands to switch to the paas user and restart the Docker service:

    su paas

    monit restart docker

  6. If the fault persists, contact technical support for assistance.
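
Steps 3 and 4 depend on reading the parent process ID correctly from the ps output. As a cross-check, the sketch below lists every zombie process with its parent. It is an illustration under stated assumptions, not the documented procedure: the function name list_zombies is hypothetical, and the input is assumed to be the output of `ps -eo pid,ppid,stat,comm`.

```shell
#!/bin/sh
# Hypothetical helper: reads `ps -eo pid,ppid,stat,comm` output on stdin and
# prints "PID PPID COMMAND" for every process whose state starts with Z.
list_zombies() {
    # NR > 1 skips the header line; $3 is the STAT column.
    awk 'NR > 1 && $3 ~ /^Z/ { print $1, $2, $4 }'
}
```

Run it as `ps -eo pid,ppid,stat,comm | list_zombies`. The second column of each printed line is the parent process ID to use in step 4.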
Updated: 2019-06-01

Document ID: EDOC1100062375