HUAWEI CLOUD Stack 6.5.0 Troubleshooting Guide 02

Node Exceptions in the Management Zone

Symptom

Node exceptions occur. The possible causes are as follows:

  • The OS breaks down.
  • The disk is abnormal.

Troubleshooting

Prerequisites
  • One node is faulty and has been rebuilt. The other nodes are running properly.

    For more information on how to rebuild nodes, see the relevant Infrastructure as a Service (IaaS) guide.

  • The abnormal node has been deleted.
  • The physical and virtual IP addresses of the rebuilt node remain unchanged.
  • The specifications of the rebuilt node remain unchanged.
  • The passwords of all nodes in the management zone are the same.
Procedure
  1. Use PuTTY to log in to the node to be rebuilt.

    The default username is paas, and the default password is QAZ2wsx@123!.

  2. Run the following command to switch to user root:

    su - root

    Password: password of the root user

  3. Check whether any leftover volume-mounting log files whose names end with .flag or create_vol.log exist in the /home/paas/create_vol_tool/, /var/log/tools/create_vol/, or /tmp directory. If any exist, delete them. If none exist, skip the deletion. Then switch to the paas user and mount disks to the rebuilt node.

    For more information, see section "Preparations Before Installation > Configuring Disk Partitions" in FusionStage 6.5.0.SPC100 Product Documentation.
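The check for leftover log files in this step can be sketched as a small shell function. This is a minimal sketch, not part of the product tooling: the function name find_leftover_vol_logs is hypothetical, and the -maxdepth option assumes GNU or BSD find. The function only lists matching files so that they can be reviewed before anything is deleted, which matters for a shared directory such as /tmp.

```shell
#!/bin/sh
# Hypothetical helper: list leftover log files whose names end in .flag or
# create_vol.log directly inside each directory given as an argument.
# Listing only: nothing is deleted here.
find_leftover_vol_logs() {
    for dir in "$@"; do
        # Skip directories that do not exist on this node.
        [ -d "$dir" ] || continue
        find "$dir" -maxdepth 1 -type f \
            \( -name '*.flag' -o -name '*create_vol.log' \)
    done
}

# Example call with the three directories named in this step:
# find_leftover_vol_logs /home/paas/create_vol_tool /var/log/tools/create_vol /tmp
```

Review the printed paths and delete only the files that belong to the volume-mounting tool.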

  4. Prepare the certificate file.

    1. Log in to the rebuilt node as the paas user. Run the following commands to create a directory for storing the certificate file:

      mkdir -p /var/paas/cert

      chmod 700 /var/paas/cert

    2. Log in as the paas user to the manage_lb1_ip or manage_lb2_ip node that is properly running, and run the following command to copy the certificate to the rebuilt VM:

      scp /var/paas/srv/kubernetes/* paas@{IP address of NIC eth0 on the rebuilt node}:/var/paas/cert/

      NOTE:

      Enter the password of the paas user of the VM as prompted.

    3. Run the following command to switch to the root user and enter the password of the root user as prompted:

      su root

    4. Run the following command to copy the docker certificate file to the rebuilt node:
      scp /etc/docker/certs.d/{VIP of NIC eth0 on the manage_lb1_ip or manage_lb2_ip node}:20202/client.key paas@{IP address of NIC eth0 on the rebuilt node}:/var/paas/cert/kubecfg.key
      NOTE:

      Enter the password of the paas user of the rebuilt node as prompted.

    5. Log in to the rebuilt node as the paas user. Run the following command to modify the permission on the certificate file:

      chmod 600 /var/paas/cert/*
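After the certificate files are copied, the permissions set in this step can be double-checked with a short function. This is a sketch under assumptions: check_cert_perms is a hypothetical name, and stat -c '%a' is the GNU coreutils form, which may differ on other systems.

```shell
#!/bin/sh
# Hypothetical helper: report whether the certificate directory and the files
# inside it carry the modes set in this step (700 for the directory, 600 for
# each regular file). Assumes GNU stat (-c '%a').
check_cert_perms() {
    cert_dir=$1
    status=0
    if [ "$(stat -c '%a' "$cert_dir")" != "700" ]; then
        echo "unexpected mode on $cert_dir"
        status=1
    fi
    for f in "$cert_dir"/*; do
        # Ignore anything that is not a regular file (including an empty glob).
        [ -f "$f" ] || continue
        if [ "$(stat -c '%a' "$f")" != "600" ]; then
            echo "unexpected mode on $f"
            status=1
        fi
    done
    return "$status"
}

# Example: check_cert_perms /var/paas/cert
```

The function returns 0 when every mode matches and 1 otherwise, printing each path that differs.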

  5. Prepare the FusionStage Base installation package.

    1. As the paas user, use a file transfer tool to upload the FusionStage Base installation package, of the same version as the installed FusionStage, to the /var/paas directory on the rebuilt node.
    2. Log in as the paas user to the manage_lb1_ip or manage_lb2_ip node that is properly running.
    3. Run the following commands to prepare the configuration file for the rebuilt node:

      cd /var/paas/bootstrap/bin

      ./fsadm addvm LiteCoreBase -m base -f ../knowledge/fusionstage_LiteCoreBase.yaml

    4. Log in to the rebuilt node as the paas user. Run the following command to decompress the FusionStage Base installation package:

      unzip /var/paas/FusionStage-BaseLight-XXX.zip -d /var/paas/FusionStage-Base

      NOTE:

      The preceding version package is for reference only. Replace it as required.

    5. Run the following commands to prepare the bootstrap package:

      scp -r paas@{IP address of NIC eth0 on the manage_lb1_ip or manage_lb2_ip node that is properly running}:/var/paas/bootstrap /var/paas/

      NOTE:

      Enter the password of the paas user of the corresponding VM as prompted.

      rm -rf /var/paas/bootstrap/images

      cp -rf /var/paas/FusionStage-Base/bootstrap/images /var/paas/bootstrap/

      cp -rf /var/paas/FusionStage-Base/bootstrap/package /var/paas/bootstrap/

  6. (Optional) Prepare keepalived and haproxy configuration files.

    NOTE:

    This is required only when manage_lb1_ip or manage_lb2_ip node has been rebuilt.

    1. Log in to the rebuilt VM as the paas user. Run the following commands to copy the keepalived and haproxy configuration files from the manage_lb1_ip or manage_lb2_ip node that is properly running:

      mkdir -p /var/paas/srv/

      chmod 750 /var/paas/srv

      scp -r paas@{IP address of the manage_lb1_ip or manage_lb2_ip node that is properly running}:/var/paas/srv/haproxy /var/paas/srv

      scp -r paas@{IP address of the manage_lb1_ip or manage_lb2_ip node that is properly running}:/var/paas/srv/keepalived /var/paas/srv/

      NOTE:

      Enter the password of the paas user of the manage_lb1_ip or manage_lb2_ip node as prompted.

    2. Run the following command to modify the keepalived configuration file:

      vim /var/paas/srv/keepalived/keepalived.conf

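      Swap the values of the unicast_src_ip and unicast_peer parameters in the keepalived configuration file, so that unicast_src_ip holds the address of the rebuilt node and unicast_peer holds the address of the other lb node.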

    3. Run the following command to change the owner of all files in the /var/paas directory:

      chown -R paas:paas /var/paas/*
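The address swap in the keepalived configuration file can be previewed before editing the file by hand. The sketch below assumes a layout that may not match the file on site: a single unicast_src_ip line, and a unicast_peer block that contains exactly one address on its own line. The function name swap_unicast_addrs is hypothetical. It prints the swapped configuration to standard output and leaves the input file untouched.

```shell
#!/bin/sh
# Hypothetical helper: print a copy of a keepalived configuration with the
# unicast_src_ip value and the single unicast_peer address swapped.
# The input file is not modified.
swap_unicast_addrs() {
    conf=$1
    # Current source address: second field of the unicast_src_ip line.
    src=$(awk '$1 == "unicast_src_ip" { print $2; exit }' "$conf")
    # Current peer address: first non-empty line inside unicast_peer { ... }.
    peer=$(awk '
        $1 == "unicast_peer" { in_peer = 1; next }
        in_peer && $1 == "}" { exit }
        in_peer && NF        { print $1; exit }
    ' "$conf")
    # Give up if either value could not be found in the assumed layout.
    [ -n "$src" ] && [ -n "$peer" ] || return 1
    awk -v src="$src" -v peer="$peer" '
        { match($0, /^[ \t]*/); indent = substr($0, 1, RLENGTH) }
        $1 == "unicast_src_ip" { print indent "unicast_src_ip " peer; next }
        $1 == "unicast_peer"   { in_peer = 1; print; next }
        in_peer && $1 == "}"   { in_peer = 0; print; next }
        in_peer && $1 == peer  { print indent src; next }
        { print }
    ' "$conf"
}

# Example: swap_unicast_addrs /var/paas/srv/keepalived/keepalived.conf
```

Compare the printed result with the original file and apply the change manually in vim once it looks right.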

  7. Log in to the node to be rebuilt as the paas user. Run the following commands to rebuild the node:

    cd /var/paas/bootstrap/bin

    ./fsadm restoreLite {Node type} -m LiteCoreBase

    Node type indicates the type of the node to be rebuilt. The value can be manager-lb1, manager-lb2, manager-core1, manager-core2, manager-core3, manager-db1, manager-db2, or manager-db3.

    If information similar to the following is displayed, the node has been rebuilt successfully. The node name in the last line (manager-core2 in this example) depends on the node being rebuilt.

    ***********************************************************************
    [ 2017-08-14 11:32:16 ] End exec job:  labelNode
    ***********************************************************************
    End of restoring node: manager-core2
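The node type check and the success check in this step can be scripted roughly as follows. This is a sketch only: both function names are hypothetical, the list of node types is copied from the text of this step, and the success check assumes that the command output was saved to a file (for example with tee) and that the final line has the form shown in this step.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the argument is one of the node types
# listed in this step.
is_valid_node_type() {
    case "$1" in
        manager-lb1|manager-lb2|manager-core1|manager-core2|manager-core3|manager-db1|manager-db2|manager-db3) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: succeed if the saved command output contains the final
# success line for the given node type.
restore_succeeded() {
    node_type=$1
    output_file=$2
    grep -qF "End of restoring node: ${node_type}" "$output_file"
}

# Example, with /tmp/restore.log as an arbitrary file holding the saved output:
# is_valid_node_type manager-core2 && restore_succeeded manager-core2 /tmp/restore.log
```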

  8. Run the following command for the environment variables to take effect:

    . ~/.bashrc

  9. (Optional) Restore CFE-ETCD.

    If CFE-ETCD exceptions occur on the manage-db1, manage-db2, or manage-db3 node, rectify the faults by referring to the corresponding sections in 4 CFE-ETCD Database Emergency Restoration. If there are no CFE-ETCD exceptions, or the CFE-ETCD exceptions occur on other nodes, skip this step.

  10. (Optional) Restore GaussDB and Redis.

    Only the manage-db1 and manage-db2 nodes where GaussDB is deployed are involved. For other nodes, skip this step.

    Restore the database by referring to Abnormal Slave Database Instances.

  11. Run the following command to delete the FusionStage Base installation package:

    rm -rf /var/paas/FusionStage-Base

    NOTE:
    • If the rebuilt node is an lb node, add a route after the node is rebuilt.
    • After the node is rebuilt, run the kubectl get pods --all-namespaces command as the root user to query the pod status. If a pod is not in the Running state, the pod is abnormal.

      To delete an abnormal pod, run the kubectl delete pod name -n fst-manage --grace-period=0 --force command as the root user. name indicates the pod name, that is, the value in the NAME column.

    • Perform the following steps to change the name of the host on the rebuilt node if needed.
      1. Log in to the rebuilt node as the paas user and switch to the root user. Run the following command to temporarily change the host name:

        hostname {New host name}

      2. Run the following command to modify the configuration file for the new host name to take effect permanently:

        echo '{New host name}' > /etc/hostname

      3. Log in to the rebuilt node again as the paas user so that the new host name takes effect in the login session.
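For the pod status check described in the note above, the kubectl output can be filtered so that only pods that are not Running are shown. This is a minimal sketch: list_not_running_pods is a hypothetical name, and the filter assumes the default column order of kubectl get pods --all-namespaces (NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE). Pods that finished normally, for example with status Completed, are also listed, so review each entry before deleting anything.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get pods --all-namespaces` output on
# stdin and print "namespace name status" for every pod whose STATUS column
# is not Running. The header line is skipped.
list_not_running_pods() {
    awk 'NR > 1 && $4 != "Running" { print $1, $2, $4 }'
}

# Example: kubectl get pods --all-namespaces | list_not_running_pods
```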

Updated: 2019-06-01

Document ID: EDOC1100062375
