HUAWEI CLOUD Stack 6.5.0 Troubleshooting Guide 02

FAQs

How Do I Avoid Incorrect VM Time Due to Node Power-Off?

Symptom

No external NTP clock source is configured for management nodes of ManageOne. When a management node is powered off unexpectedly, the time on the node is inconsistent with the local time.

Procedure
  1. Use PuTTY to log in to the regionAlias-ManageOne-Deploy01 node as the sopuser user.

    The default password is D4I$awOD7k.

  2. Run the following command to switch to the root user:

    sudo su root

    The default password is Changeme_123.

  3. Run the following command to edit the NTP service configuration file of the OS:

    vim /etc/ntp.conf

  4. Append the NTP clock sources to the end of the file. Example entries:

    server <IP address of NTP clock source 1> maxpoll 4 minpoll 3 prefer
    server <IP address of NTP clock source 2> maxpoll 4 minpoll 3

    NOTE:

    prefer indicates that time is preferentially synchronized from NTP clock source 1.

  5. Press Esc. Then, run the following command to save the configuration and exit:

    :wq!

  6. After the patch packages are installed, run the following command to restart the NTP service:

    service ntp restart

  7. Run the following command to query the status of the NTP service:

    ntpq -p

    Information similar to the following is displayed:

       remote           refid      st t when poll reach   delay   offset  jitter 
    ============================================================================== 
    *192.168.8.12    192.168.8.11    1 u   29   64  177    0.240    0.093   1.222 

    192.168.8.12 is the IP address of the NTP clock source. The asterisk (*) in front of the address indicates that the NTP service is synchronizing normally. The asterisk may take about 5 minutes to appear after the service restarts.
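
The check in step 7 can also be scripted. The following is a minimal sketch, assuming the ntpq -p output has the layout shown above. The function name selected_peer is hypothetical and is not part of the product.

```shell
#!/bin/sh
# Hypothetical helper: read "ntpq -p" output on stdin and print the address
# of the clock source marked with "*". Exits non-zero when no source is
# marked yet (for example, during the first minutes after a restart).
selected_peer() {
    awk '/^\*/ { print substr($1, 2); found = 1 } END { exit !found }'
}

# Usage on the node:
#   ntpq -p | selected_peer || echo "No clock source selected yet"
```

If the function prints nothing and returns a non-zero status, wait a few minutes and run the check again before investigating further.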

Clearing Disk Space

Symptom

Operations such as logging in to the system or installing software packages sometimes fail.

Possible Causes

The system has been running for a long time but the disk space has not been cleared. As a result, the disk space is insufficient.

Procedure

Deleted files cannot be restored. Exercise caution when performing the following operations.

  1. Log in to the regionAlias-ManageOne-Deploy01 node as the sopuser user in SSH mode.

    The default password is D4I$awOD7k.

  2. Run the following command to switch to the root user:

    sudo su root

    The default password is Changeme_123.

  3. Run the following command to verify the usage of each partition:

    df -h

    In the command output, if the usage of a partition exceeds 80%, clear the space.

    Filesystem                     Size  Used Avail Use% Mounted on
    /dev/xvda3                      17G  2.5G   14G  16% /
    devtmpfs                       7.8G  152K  7.8G   1% /dev
    tmpfs                          7.8G     0  7.8G   0% /dev/shm
    /dev/xvda1                    1003M   50M  903M   6% /boot
    /dev/xvda5                    1003M   18M  935M   2% /home
    /dev/xvda10                    5.0G  915M  3.9G  19% /usr
    /dev/xvda6                     3.0G  176M  2.7G   7% /var
    /dev/xvda7                     5.0G  3.0G  1.7G  64% /var/log
    /dev/xvda8                    1003M   18M  935M   2% /var/log/audit
    /dev/xvda9                    1003M   18M  935M   2% /var/tmp
    /dev/mapper/oss_vg-opt_vol      89G   76G   13G  85% /opt
    /dev/mapper/oss_vg-optlog_vol   30G  178M   28G   1% /opt/log

  4. For example, to clear the /opt directory, run the following commands to go to the /opt directory and sort the directories according to the directory sizes (in MB) in descending order:

    cd /opt

    du -sm * |sort -rn

    The following command output shows that the pub directory occupies the largest space, that is, about 12.2 GB (12492 MB).

    12492    pub
    5887     tools
    2598     mysql
    1092     oss
    762      log
    96       share
    15       redis
    1        sudobin2
    1        lost+found
    1        aquota.user
    1        aquota.group

  5. Run the following commands to go to the pub directory and sort the directories according to the directory sizes (in MB) in descending order:

    cd pub

    du -sm * |sort -rn

    The following command output shows that the software directory occupies the largest space.

    12492    software
    557      upload
    1        manager
    1        backup_local

  6. Go to the software directory and use the commands described in 5 to find large files that are no longer needed.
  7. Run the following command to delete the unnecessary files:

    rm -r xxx

    NOTE:

    xxx indicates the file or directory to be deleted.

  8. If the usage of any other partition exceeds 80%, repeat the preceding operations to clear it. Otherwise, skip this step.
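
The size listing used in steps 4 to 6 can be wrapped in a small function. The following is a sketch only: the name largest_entries is hypothetical, and the -mindepth and -maxdepth options assume GNU find, which is an assumption about the node OS. Unlike du -sm *, this variant also counts hidden files and directories.

```shell
#!/bin/sh
# Hypothetical helper: list the entries directly under a directory,
# sizes in MB, largest first. $1 is the directory (default: current
# directory) and $2 is the number of entries to show (default: 10).
largest_entries() {
    dir=${1:-.}
    count=${2:-10}
    find "$dir" -mindepth 1 -maxdepth 1 -exec du -sm {} + 2>/dev/null |
        sort -rn | head -n "$count"
}

# Usage on the node:
#   largest_entries /opt 5
#   largest_entries /opt/pub
```
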
Follow-up Procedure

Check and clear the disk space periodically.
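
A periodic check can be automated with a short script. The following is a minimal sketch, assuming the 80% threshold used in this procedure. The function name over_threshold is hypothetical. The script reads df -P output, because the -P option keeps each file system on a single line even when the device name is long.

```shell
#!/bin/sh
# Hypothetical helper: read "df -P" output on stdin and print the mount
# point and usage of every file system whose usage exceeds the limit.
# $1 is the limit in percent (default: 80).
# Assumption: mount points do not contain spaces.
over_threshold() {
    awk -v limit="${1:-80}" '
        NR > 1 {
            use = $5
            sub(/%$/, "", use)              # strip the trailing percent sign
            if (use + 0 > limit + 0) print $6, $5
        }'
}

# Usage on the node:
#   df -P | over_threshold 80
```

With the sample values shown in step 3, only the /opt partition (85%) would be reported.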

How Do I Configure a Floating IP Address?

Context

Before a service is installed, a temporary floating IP address is configured to meet service requirements. If the server OS is unexpectedly restarted before the service takes over the temporary floating IP address, the address becomes invalid and the service cannot be installed. In this case, reconfigure the temporary floating IP address.

Prerequisites

You have obtained the floating IP address of the server.

Procedure
  1. Use PuTTY to log in to the operating system of the server as the sopuser user.

    The default password is D4I$awOD7k.

  2. Run the following command to switch to the root user:

    sudo su root

    The default password is Changeme_123.

  3. Run the following command to configure the floating IP address:

    ifconfig eth0:0 <Floating IP address of the server>

  4. Run the following command to check whether the floating IP address is configured successfully:

    ifconfig eth0:0
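
The check in step 4 can be scripted as follows. This is a sketch under stated assumptions: the function name has_ipv4 is hypothetical, and the parser accepts both ifconfig output styles (inet <address> and the older inet addr:<address>).

```shell
#!/bin/sh
# Hypothetical helper: read ifconfig output on stdin and succeed (exit 0)
# if it shows the expected IPv4 address given as $1.
has_ipv4() {
    awk -v ip="$1" '
        $1 == "inet" {
            addr = $2
            sub(/^addr:/, "", addr)   # older net-tools print "inet addr:<ip>"
            if (addr == ip) found = 1
        }
        END { exit !found }'
}

# Usage on the server (replace the placeholder with the actual address):
#   ifconfig eth0:0 | has_ipv4 "<Floating IP address of the server>" \
#       && echo "Floating IP address is configured"
```

An address configured with ifconfig in this way does not persist across an OS restart, which is consistent with the situation described in Context.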

How Do I Detect IP Address Conflicts?

Symptom

The network connection is unstable or is intermittently disconnected.

Procedure

The following assumes that the node IP address is 192.168.1.100 and that the NIC name of the node is eth0. Replace them with the actual IP address and NIC name.

  1. Use PuTTY to log in to the node as the sopuser user.

    The default password is D4I$awOD7k.

  2. Run the following command to switch to the root user:

    sudo su root

    The default password is Changeme_123.

  3. Run the following command to check whether an IP address conflict exists:

    arping -D -I eth0 -c 2 192.168.1.100

    • If command output similar to the following is displayed, no IP address conflict exists.
      ARPING 192.168.1.100 from 0.0.0.0 eth0
      Sent 2 probes (2 broadcast(s))
      Received 0 response(s)
    • If command output similar to the following is displayed, an IP address conflict exists.
      ARPING 192.168.1.100 from 0.0.0.0 eth0
      Unicast reply from 192.168.1.100 [20:0B:C7:A0:32:31] for 192.168.1.100 [20:0B:C7:A0:32:31] 0.810ms
      Sent 1 probes (1 broadcast(s))
      Received 1 response(s)
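
The two outcomes above can be told apart in a script by reading the reply count. The following is a minimal sketch, assuming the output layout shown in this procedure. The function name conflict_detected is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "arping -D" output on stdin and succeed
# (exit 0) if at least one reply was received, that is, if another host
# already answers for the probed address.
conflict_detected() {
    awk '/^Received/ { replies = $2 } END { exit !(replies + 0 > 0) }'
}

# Usage on the node (replace the NIC name and IP address with actual ones):
#   if arping -D -I eth0 -c 2 192.168.1.100 | conflict_detected; then
#       echo "IP address conflict detected"
#   fi
```

According to the iputils arping manual page, the -D option also makes the command return 0 when no reply is received, so the exit status of arping can be checked directly as an alternative.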

How Do I Power On the System After an Abnormal Power-Off?

Symptom

The system is powered on after it is abnormally powered off. Services need to be recovered.

NOTE:
Impact of the Operation

The service deployment system can automatically start the services after all the server OSs in the system are powered on (you can power on the servers in any sequence).

Procedure

Power on all servers. The servers can be powered on in any sequence.

How Do I Handle Disk Exceptions When VMs Are Powered On and Off Abnormally?

Question

VM disk exceptions occur after an abnormal power-off, typically a forcible VM stop or restart, or an unexpected storage disconnection.

Description

Perform operations provided in this section if information similar to the following is displayed during the startup:

systemd-fsck[605]: /dev/sda2: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.  
systemd-fsck[605]: (i.e., without -a or -p options)  
[ 13.652068] systemd-fsck[605]: fsck failed with error code 4.  
Welcome to emergency mode. Use "systemctl default" or ^D to activate default  mode.  
Give root password for maintenance  
(or type Control-D to continue): 
Procedure
  1. Enter the password of user root and switch to the emergency mode to repair the faulty disk.
  2. Run the following command to rectify the fault:

    fsck -y <Disk name>

    For example, if the name of the faulty disk is /dev/sdb2, run the fsck -y /dev/sdb2 command.

    If information similar to the following is displayed, the repair cannot proceed:

    fsck.ext4 /dev/sda2
    e2fsck 1.42.7 (21-Jan-2013)
    /dev/sda2 is mounted.
    e2fsck: Cannot continue, aborting.

    The information indicates that disk /dev/sda2 is mounted and therefore cannot be repaired by running the fsck command. In this case, run the mount -no remount,ro /dev/sda2 command to remount the disk as read-only, and then run the fsck /dev/sda2 command to repair the disk. After the repair is complete, run the exit command.

  3. Check whether you have the password of user root for logging in to EulerOS.

    • If you have forgotten the password, go to 4.
    • If you have the password, go to 5.

  4. Perform the following steps to restore the password of user root:

    1. On the grub menu, press e to enter the editing mode and type the grub account and password, root and Huawei#12, respectively.
    2. Add init=/bin/bash to the end of the line that starts with linux16 /boot/vmlinuz-****.
    3. Press Ctrl+X to start the shell and enter the single-user mode.
    4. Run the following command to change the mounted file system to the writable mode:

      # mount -no remount,rw /

    5. Run the passwd command and change the password of user root as prompted.

      If SELinux is enabled for the system, run the following command:

      # touch /.autorelabel

      Otherwise, the system cannot be started normally.

      After running the exec /sbin/init command to start the system or running the exec /sbin/reboot command to enter the single-user mode, run the following commands to check all file systems listed in /etc/fstab:

      mount -no remount,ro /

      umount -a

      /usr/sbin/fsck -AsCy

      exit

  5. If the fault persists, enter the emergency mode and run the systemctl default command to enter the system.

    After logging in to the system, run the dmesg|grep -i error command to check the startup logs. If the error logs still exist, the storage may be abnormal. In this case, contact technical support for assistance.
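
The error code in the startup message (fsck failed with error code 4) is a bit mask whose values are documented in the fsck(8) manual page. The following sketch decodes such a status. The function name describe_fsck_status is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: translate an fsck exit status ($1) into the
# conditions listed in fsck(8). The status is a bit mask, so more than
# one line may be printed.
describe_fsck_status() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    [ $((code & 1)) -ne 0 ]   && echo "file system errors corrected"
    [ $((code & 2)) -ne 0 ]   && echo "system should be rebooted"
    [ $((code & 4)) -ne 0 ]   && echo "file system errors left uncorrected"
    [ $((code & 8)) -ne 0 ]   && echo "operational error"
    [ $((code & 16)) -ne 0 ]  && echo "usage or syntax error"
    [ $((code & 32)) -ne 0 ]  && echo "checking canceled by user request"
    [ $((code & 128)) -ne 0 ] && echo "shared-library error"
    return 0
}

# Usage after a repair attempt (replace the placeholder with the disk name):
#   fsck -y "<Disk name>"; describe_fsck_status $?
```

A status of 4, as in the sample message, means that errors were left uncorrected, which is why a manual repair with fsck -y is required.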

How Do I Associate a Restored Node on the OM Plane?

After a node is restored, Service Monitoring does not report data about the node. You need to re-associate it.

Procedure
  1. Choose System > Platform Configuration > Service Monitoring from the main menu.
  2. Choose Monitoring Panel > Service Monitoring.

    On the displayed page, you can view the created ManageOne monitoring card, as shown in Figure 23-3.

    Figure 23-3 Service Monitoring
    NOTE:

    Click the filter icon to filter the created service monitoring cards by region and service type.

  3. Click the edit icon in the upper right corner of the ManageOne monitoring card to edit basic information.
  4. Click Next. On the displayed 2.Node Information tab page, select the ManageOne nodes in the Operation node column.

    The selected nodes are displayed in the Selected node area.

  5. Click Next. On the displayed 3.Monitoring Template tab page, select the monitoring template of the nodes and associate the nodes with the template.

What Can I Do If Kafka Is Abnormal When MessagingBrokeService Is Restarted After the VM Is Powered Off and On?

Symptom

Kafka is abnormal when MessagingBrokeService is restarted after the VM is powered off and powered on.

Procedure
  1. Perform the following steps to log in to the regionAlias-ManageOne-Service01, regionAlias-ManageOne-Service02, and regionAlias-ManageOne-Service03 nodes in sequence. In the CSHA scenario, log in to the regionAlias-ManageOne-Service04 node.

    1. Log in to a node where MessagingBrokeService resides as the sopuser user.

      The default password is D4I$awOD7k.

    2. Run the following command to switch to the root user:

      sudo su root

      The default password is Changeme_123.

    3. Run the following command to check the running status of MessagingBrokeService:

      su ossadm -c ". /opt/oss/manager/bin/engr_profile.sh; ipmc_adm -cmd statusapp -app MessagingBrokeService"

    The following information is displayed:

     msgbrksrv-2-0 msgbrksrv MessagingBrokeService Product cluster 192.168.191.125 11006 RUNNING
    NOTE:

    RUNNING indicates that the running status is normal, and STOPPED indicates that the running status is abnormal.

  2. Run the following command to stop MessagingBrokeService on all nodes that are running properly:

    su ossadm -c ". /opt/oss/manager/bin/engr_profile.sh; ipmc_adm -cmd stopapp -app MessagingBrokeService"

  3. Run the following command to modify the MessagingBrokeService configuration file on all nodes in sequence:

    sed -i 's/unclean.leader.election.enable=false/unclean.leader.election.enable=true/g' /opt/oss/Product/apps/MessagingBrokeService/kafka/bin/GenerateConfig/Kafka/server.properties

  4. Run the following command to start MessagingBrokeService on all nodes in sequence, from the abnormal nodes to the normal nodes:

    su ossadm -c ". /opt/oss/manager/bin/engr_profile.sh; ipmc_adm -cmd startapp -app MessagingBrokeService"

    NOTE:

    Start the nodes according to the status queried in 1. Start the next node only after the previous node has started.

  5. Run the following command to check whether MessagingBrokeService is normal:

    su ossadm -c ". /opt/oss/manager/bin/engr_profile.sh; ipmc_adm -cmd statusapp -app MessagingBrokeService"

    • If yes, wait until all services are started, and then run the following command to restore the original setting in the configuration file of MessagingBrokeService:

    sed -i 's/unclean.leader.election.enable=true/unclean.leader.election.enable=false/g' /opt/oss/Product/apps/MessagingBrokeService/kafka/bin/GenerateConfig/Kafka/server.properties

    • If no, contact technical support for assistance.

  6. Run the following command to stop MessagingBrokeService on all nodes in sequence:

    su ossadm -c ". /opt/oss/manager/bin/engr_profile.sh; ipmc_adm -cmd stopapp -app MessagingBrokeService"

  7. Run the following command to start MessagingBrokeService on all nodes in sequence:

    su ossadm -c ". /opt/oss/manager/bin/engr_profile.sh; ipmc_adm -cmd startapp -app MessagingBrokeService"

  8. Run the following command to log out of user root:

    exit
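
Steps 3 and 5 switch the same parameter to true and then back to false. The following sketch wraps both changes in one function and adds a way to confirm the current value. The function names are hypothetical. Unlike the sed commands above, the function overwrites whatever value is currently set, and it assumes that the parameter starts at the beginning of a line.

```shell
#!/bin/sh
# Hypothetical helper: print the current value of
# unclean.leader.election.enable in the properties file given as $1.
get_unclean_election() {
    sed -n 's/^unclean\.leader\.election\.enable=//p' "$1"
}

# Hypothetical helper: set the parameter to "true" or "false" ($1) in the
# properties file ($2). The file is rewritten through a temporary copy so
# that its owner and permissions are kept.
set_unclean_election() {
    value=$1
    file=$2
    case $value in
        true|false) ;;
        *) echo "value must be true or false" >&2; return 2 ;;
    esac
    tmp=$(mktemp) || return 1
    sed "s/^unclean\.leader\.election\.enable=.*/unclean.leader.election.enable=$value/" \
        "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Usage on each node:
#   conf=/opt/oss/Product/apps/MessagingBrokeService/kafka/bin/GenerateConfig/Kafka/server.properties
#   set_unclean_election true "$conf" && get_unclean_election "$conf"
```

Checking the value with get_unclean_election after step 5 helps confirm that the parameter has been restored to false on every node before the final restart.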

Updated: 2019-06-01

Document ID: EDOC1100062375
