FusionCloud 6.3.1.1 Troubleshooting Guide 02

Node Exceptions in the OM Zone

This section describes how to perform emergency recovery for OM zone nodes OM-Core01, OM-Core02, and OM-Core03.

Symptom

OM-Core node exceptions include:

  • OS crash
  • Hard disk faults

Troubleshooting

Prerequisites
  • One OM-Core node (OM-Core01, OM-Core02, or OM-Core03) is faulty and has been rebuilt. The other two OM-Core nodes and the OM-Core cluster are running properly.

    For more information on how to rebuild nodes, see the relevant Infrastructure as a Service (IaaS) guide.

  • The physical and virtual IP addresses of the rebuilt OM-Core node remain unchanged.
  • The specifications of the rebuilt OM-Core node remain unchanged.
Procedure

If security hardening is enabled, grant sudo privileges to the paas user before the operation and revoke them after the operation.

  1. Assign the sudo permission to the paas user.

    1. Log in to the rebuilt node as the paas user.
    2. Run the following command to switch to the root user:

      su - root

      The system then prompts you to type the password of the root user.

    3. Run the following command to grant the sudo permission:

      echo "%paas ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
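Running the echo command above more than once leaves duplicate lines in /etc/sudoers. The following is a minimal sketch of an append that checks for the rule first. The function name grant_paas_sudo and the optional file argument are assumptions made for illustration and are not part of the product tooling.

```shell
# Hypothetical helper (not part of FusionStage): append the sudoers rule
# from this step only if the exact line is not already present.
SUDO_RULE='%paas ALL=(ALL) NOPASSWD: ALL'

grant_paas_sudo() {
    # $1: sudoers file to modify (defaults to /etc/sudoers)
    sudoers_file=${1:-/etc/sudoers}
    # -x matches whole lines only, -F treats the rule as a fixed string
    if grep -qxF "$SUDO_RULE" "$sudoers_file"; then
        echo "rule already present in $sudoers_file"
    else
        echo "$SUDO_RULE" >> "$sudoers_file"
        echo "rule added to $sudoers_file"
    fi
}
```

Run grant_paas_sudo as the root user. Running visudo -c afterward checks the syntax of the sudoers file.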

  2. Prepare a docker certificate file.

    1. Log in to the rebuilt node as the paas user. Run the following commands to create a directory in which the docker certificate file will be located:

      mkdir -p /var/paas/cert

      chmod 700 /var/paas/cert

    2. Log in to any of the running OM-Core nodes (OM-Core01, for example) as the paas user. Run the following command to copy the certificate file to the rebuilt node:

      scp /var/paas/srv/kubernetes/* paas@{IP address of the rebuilt node}:/var/paas/cert/

      NOTE:
      • Enter the password of the paas user as prompted.
      • If authentication fails because the host key of the rebuilt node has changed, run the vim ~/.ssh/known_hosts command and delete the entry for the rebuilt node.
    3. Run the following command to switch to the root user:

      su root

      The system then prompts you to enter the password of the root user.

    4. Run the following command to copy the docker certificate file to the rebuilt node:

      scp /etc/docker/certs.d/{virtual IP address of OM-Core01}:20202/client.key paas@{IP address of the rebuilt node}:/var/paas/cert/kubecfg.key

      NOTE:
      • Enter the password of the paas user as prompted.
      • If authentication fails because the host key of the rebuilt node has changed, run the vim ~/.ssh/known_hosts command and delete the entry for the rebuilt node.
    5. Log in to the rebuilt node as the paas user. Run the following command to modify the permissions on the certificate files:

      chmod 600 /var/paas/cert/*
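The permissions set in this step (700 on the directory, 600 on the files) can be verified before continuing. The following is a minimal sketch that assumes GNU stat is available on the node. The function name check_cert_perms is hypothetical.

```shell
# Hypothetical helper: confirm that the certificate directory has mode 700
# and that every regular file in it has mode 600.
check_cert_perms() {
    # $1: certificate directory (defaults to /var/paas/cert)
    cert_dir=${1:-/var/paas/cert}
    # stat -c %a prints the octal permission bits (GNU coreutils)
    dir_mode=$(stat -c %a "$cert_dir") || return 1
    if [ "$dir_mode" != "700" ]; then
        echo "unexpected directory mode $dir_mode on $cert_dir" >&2
        return 1
    fi
    for f in "$cert_dir"/*; do
        [ -f "$f" ] || continue
        file_mode=$(stat -c %a "$f") || return 1
        if [ "$file_mode" != "600" ]; then
            echo "unexpected file mode $file_mode on $f" >&2
            return 1
        fi
    done
    echo "permissions OK"
}
```

Calling check_cert_perms without arguments checks /var/paas/cert. A non-zero exit status indicates that a chmod command from this step needs to be repeated.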

  3. Prepare a FusionStage Base release package.

    1. As the paas user, use a remote file transfer tool to upload the current FusionStage Base release package to the /var/paas directory on the rebuilt node.
    2. Prepare to rebuild the configuration files of the node. Run the following commands on a running OM-Core01 or OM-Core02 node:

      cd /var/paas/bootstrap/bin

      ./fsadm addvm CorebaseHA -m base -f ../knowledge/fusionstage_CorebaseHA.yaml

    3. Log in to the rebuilt node as the paas user. Run the following command to decompress the FusionStage release package:

      unzip /var/paas/FusionStage-Base-XXX.zip -d /var/paas/FusionStage-Base

      NOTE:

      XXX indicates the version number. Replace it with the version number of the FusionStage release package in use.

    4. Run the following commands to prepare the bootstrap installation package:

      scp -r paas@{IP address of the node used in substep 2}:/var/paas/bootstrap /var/paas/

      NOTE:

      Enter the password of the paas user as prompted.

      rm -rf /var/paas/bootstrap/images

      cp -r /var/paas/FusionStage-Base/bootstrap/images /var/paas/bootstrap/

      cp -r /var/paas/FusionStage-Base/bootstrap/package /var/paas/bootstrap/
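Because XXX in the package name depends on the release in use, the decompress command in this step can be driven by a lookup instead of a typed file name. The following is a minimal sketch that assumes exactly one FusionStage-Base-*.zip file has been uploaded to /var/paas. The function name find_release_package is hypothetical.

```shell
# Hypothetical helper: print the path of the single FusionStage-Base-*.zip
# package in a directory, or fail if there is none or more than one.
find_release_package() {
    # $1: directory to search (defaults to /var/paas)
    pkg_dir=${1:-/var/paas}
    count=0
    found=
    for pkg in "$pkg_dir"/FusionStage-Base-*.zip; do
        [ -f "$pkg" ] || continue      # skip the unexpanded glob pattern
        count=$((count + 1))
        found=$pkg
    done
    if [ "$count" -ne 1 ]; then
        echo "expected exactly one FusionStage-Base-*.zip in $pkg_dir, found $count" >&2
        return 1
    fi
    echo "$found"
}
```

For example, unzip "$(find_release_package)" -d /var/paas/FusionStage-Base decompresses whichever single package is present.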

  4. (Optional) Prepare keepalived and haproxy configuration files.

    NOTE:

    This step is required only when OM-Core01 or OM-Core02 has been rebuilt.

    1. Log in to the rebuilt node as the paas user. Run the following commands to copy the keepalived and haproxy configuration files from a running OM-Core node (either OM-Core01 or OM-Core02) to the rebuilt node:

      mkdir -p /var/paas/srv/

      chmod 750 /var/paas/srv

      scp -r paas@{IP address of a running OM-Core01 or OM-Core02 node}:/var/paas/srv/haproxy /var/paas/srv

      scp -r paas@{IP address of a running OM-Core01 or OM-Core02 node}:/var/paas/srv/keepalived /var/paas/srv/

      NOTE:

      Enter the paas user password of the running OM-Core node as prompted.

    2. Run the following command to open the keepalived configuration file, and then swap the values of the unicast_src_ip and unicast_peer parameters:

      vim /var/paas/srv/keepalived/keepalived.conf
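The swap in this step can also be scripted. The following is a minimal sketch that assumes the standard keepalived layout, with unicast_src_ip followed by one address on the same line and a unicast_peer { ... } block that contains exactly one address. The function name swap_unicast and the sample addresses in the comments are hypothetical. If the copied file lists several peers, edit it manually instead.

```shell
# Hypothetical helper: print a copy of a keepalived configuration file with
# the unicast_src_ip address and the single unicast_peer address swapped.
swap_unicast() {
    # $1: path to keepalived.conf
    conf=$1
    # Current source address, for example 10.0.0.1
    src=$(awk '$1 == "unicast_src_ip" { print $2; exit }' "$conf")
    # First address inside the unicast_peer block, for example 10.0.0.2
    peer=$(awk '$1 == "unicast_peer"    { inblock = 1; next }
                inblock && $1 == "{"    { next }
                inblock && $1 == "}"    { exit }
                inblock && NF           { print $1; exit }' "$conf")
    if [ -z "$src" ] || [ -z "$peer" ]; then
        echo "unicast_src_ip or unicast_peer not found in $conf" >&2
        return 1
    fi
    # Rewrite only the two address occurrences; sub() keeps the indentation.
    awk -v src="$src" -v peer="$peer" '
        $1 == "unicast_src_ip"  { sub(src, peer); print; next }
        $1 == "unicast_peer"    { inblock = 1; print; next }
        inblock && $1 == "}"    { inblock = 0; print; next }
        inblock && $1 == peer   { sub(peer, src); print; next }
        { print }' "$conf"
}
```

The function writes to standard output only. Redirect the result to a new file, compare it with the original by using diff, and replace /var/paas/srv/keepalived/keepalived.conf only after reviewing the change.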

  5. Log in to the rebuilt node as the paas user. Run the following commands to add the rebuilt node to the node cluster:

    cd /var/paas/bootstrap/bin

    ./fsadm addvm CorebaseHA -m base -f ../knowledge/fusionstage_CorebaseHA.yaml

    ./fsadm restore {node type} -m CorebaseHA

    Replace {node type} with om-core1, om-core2, or om-core3, which indicates the node (OM-Core01, OM-Core02, or OM-Core03) that will be added to the cluster.

    If information similar to the following is displayed, the rebuilt node (for example, OM-Core02) has been successfully added to the cluster:

    *********************************************************************** 
    [ 2017-08-14 11:32:16 ] End exec job:  labelNode 
    *********************************************************************** 
    End of restoring node: om-core2
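When the restore runs unattended, the success message shown above can be checked in a captured log. The following is a minimal sketch that assumes the output of fsadm restore was saved to a file and that the final line has the format shown in the sample. The function name restore_succeeded and the log path are hypothetical.

```shell
# Hypothetical helper: succeed if the captured fsadm output contains the
# "End of restoring node" line for the given node type.
restore_succeeded() {
    # $1: node type (om-core1, om-core2, or om-core3)
    # $2: file that contains the captured output of ./fsadm restore
    grep -qF "End of restoring node: $1" "$2"
}

# Example usage (log path chosen for illustration):
#   ./fsadm restore om-core2 -m CorebaseHA 2>&1 | tee /tmp/restore.log
#   restore_succeeded om-core2 /tmp/restore.log && echo "om-core2 restored"
```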

  6. Run the following command to delete the installation package:

    rm -rf /var/paas/FusionStage-Base

    NOTE:
    • Check pod status. Run the kubectl get pods --all-namespaces command to view pod status.

      If pod status is not Running, perform one of the subsequent steps:

      • If pod status of CFE-ETCD is faulty, follow the procedure in CFE-ETCD Restoration.
      • If the ICAgent pod on the OM zone node is faulty, run the following command to delete the pod:

        kubectl delete pod name -n om --grace-period=0 --force

        Replace name with the pod name displayed in the NAME column of the pod query command output.

      • If the DBAgent pod on the OM zone node is faulty, run the following command to delete the pod:

        kubectl delete pod name -n om --grace-period=0 --force

        Replace name with the pod name displayed in the NAME column of the pod query command output.

        If the pod is still faulty, contact technical support.

      • If the kube-apiserver pod on the OM zone node is faulty, run the ps -ef | grep kube-api command to query the process ID corresponding to the pod and then run the kill -9 {process ID} command to terminate the process.
    • Check database instance status. Run the following commands to view database instance status:

      cd /opt/paas/oss/manager/apps/DBAgent/bin

      ./dbsvc_adm -cmd query-db-instance

      If Rpl Status is displayed as Abnormal, the DBM database of OM-Core01 or OM-Core02 is faulty. Follow the procedure in DBM Database Recovery.
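The pod status check in the note above can be narrowed to the pods that need attention. The following is a minimal sketch that assumes the default column layout of kubectl get pods --all-namespaces (NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE). The function name list_not_running is hypothetical.

```shell
# Hypothetical helper: read the output of
# `kubectl get pods --all-namespaces` on standard input and print
# "namespace/name status" for every pod whose STATUS is not Running.
list_not_running() {
    # NR > 1 skips the header row; STATUS is assumed to be the 4th column.
    awk 'NR > 1 && $4 != "Running" { print $1 "/" $2 " " $4 }'
}

# Example usage:
#   kubectl get pods --all-namespaces | list_not_running
```

Pods that have finished normally (status Completed) are also listed by this filter, so review each line before deleting a pod.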

  7. Cancel sudo privileges.

    Open the /etc/sudoers file as the root user and delete or comment out the line "%paas ALL=(ALL) NOPASSWD: ALL".
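The removal can also be scripted rather than done in an editor. The following is a minimal sketch that deletes only exact matches of the rule added in step 1. The function name revoke_paas_sudo and the optional file argument are assumptions made for illustration.

```shell
# Hypothetical helper: remove the paas sudoers rule added in step 1.
SUDO_RULE='%paas ALL=(ALL) NOPASSWD: ALL'

revoke_paas_sudo() {
    # $1: sudoers file to modify (defaults to /etc/sudoers)
    sudoers_file=${1:-/etc/sudoers}
    tmp_file=$(mktemp) || return 1
    # Copy every line except exact matches of the rule. grep exits with 1
    # when no line is selected, which is not an error here.
    status=0
    grep -vxF "$SUDO_RULE" "$sudoers_file" > "$tmp_file" || status=$?
    if [ "$status" -gt 1 ]; then
        rm -f "$tmp_file"
        return 1
    fi
    # Write back in place so that the owner and mode of the file are kept.
    cat "$tmp_file" > "$sudoers_file"
    rm -f "$tmp_file"
}
```

Run revoke_paas_sudo as the root user, and then run visudo -c to confirm that the sudoers file still passes the syntax check.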

Updated: 2019-06-10

Document ID: EDOC1100063248