HUAWEI CLOUD Stack 6.5.0 Alarm and Event Reference 04

ALM-6020 Host Partition Usage Exceeds the Threshold


Description

FusionSphere OpenStack checks the logical disk usage of every host every 300 seconds. This alarm is generated when the logical disk usage of a host is greater than or equal to the specified alarm threshold.

NOTE:

The default alarm threshold offset is 5%. The default alarm thresholds are as follows:

  • Major: Logical disk usage of a host ≥ 95%
  • Minor: 85% ≤ Logical disk usage of a host < 95%
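
As a minimal sketch of the default thresholds listed above, the following shell function maps an integer usage percentage to a severity. The name classify_usage is hypothetical and is not a FusionSphere OpenStack command, and the sketch does not model the 5% threshold offset.

```shell
# Hypothetical helper, not part of FusionSphere OpenStack.
# Maps an integer disk usage percentage to the default alarm severity.
classify_usage() {
    usage=$1
    if [ "$usage" -ge 95 ]; then
        echo "Major"        # usage >= 95%
    elif [ "$usage" -ge 85 ]; then
        echo "Minor"        # 85% <= usage < 95%
    else
        echo "None"         # below the Minor threshold, no alarm
    fi
}
```

For example, classify_usage 90 prints Minor.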

Attribute

  • Alarm ID: 6020
  • Alarm Severity: Major/Minor
  • Auto Clear: Yes

Parameters

Name: Fault Location Info

Meaning:

  • host: specifies the ID of the host for which the alarm is generated.
  • object: specifies the name of the logical disk for which the alarm is generated on the host.

Name: Additional Info

Meaning:

  • Threshold_level: specifies the alarm threshold. The value is fixed at Minor: 85%-95%, Major: >=95%.
  • hostname: specifies the name of the host for which the alarm is generated.
  • Third parameter:
    • object: specifies the name of the logical disk for which the alarm is generated.
    • total: specifies the size of the logical disk for which the alarm is generated.
    • path: specifies the mount point of the logical disk for which the alarm is generated.
    • used: specifies the current logical disk usage.

Impact on the System

System performance may deteriorate, and new system data cannot be stored.

System services cannot be processed properly if the root partition has insufficient space.

When the Swift partition is full, image registration fails.

Possible Causes

Too many files are stored on the logical disk.

Procedure

  1. Log in to the FusionSphere OpenStack web client.

    For details, see Logging In to the FusionSphere OpenStack Web Client (ManageOne Mode).

  2. On the Summary page, obtain the management IP address of the host in the OM IP Address column based on the host ID or host name in the alarm additional information.
  3. Use PuTTY to log in to the host for which the alarm is generated using the management IP address of the host.

    The default user name is fsp. The default password is Huawei@CLOUD8.

    The system supports identity authentication by password or by public-private key pair. To log in with a key pair, see Using PuTTY to Log In to a Node in Key Pair Authentication Mode.

  4. Run the following command and enter the password of user root to switch to user root:

    su - root

    The default password of user root is Huawei@CLOUD8!.

  5. Switch to the directory where the alarm object is located and delete unnecessary files.

    Choose the next step based on which partition is full. To query the usage of each partition, run the df -h command.

    • If the /var/log partition is used up, go to 6.
    • If the /opt/HUAWEI/image partition is used up, go to 10.
    • If the /opt/HUAWEI/image_cache partition is used up, go to 12.
    • If the /opt/HUAWEI/swift partition is used up, go to 14.
    • If the /var/ceilometer partition is used up, go to 15.
    • If the root partition (/) is used up, go to 17.
    • If the /opt/fusionplatform/data/gaussdb_data database partition is used up, run the following command to check whether it is the active database node ($HOSTNAME indicates the host ID):

      cps host-template-instance-list $HOSTNAME |grep gaussdb

      If active is displayed in the command output, it is the active database node. In this case, go to 21.

    • If another partition is fully occupied, go to 20.
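
To see at a glance which partitions are at or above a given usage, the df output can be filtered. The sketch below assumes df -P (the POSIX output format, one line per file system) instead of df -h so that the column positions are stable. The name over_threshold is a hypothetical helper, not a FusionSphere command.

```shell
# Hypothetical helper: reads `df -P` output on stdin and prints the mount
# point and usage of every file system at or above the given percentage.
# Assumes mount points do not contain spaces.
over_threshold() {
    awk -v limit="$1" '
        NR > 1 {                        # skip the header line
            use = $5
            sub(/%/, "", use)           # strip the trailing percent sign
            if (use + 0 >= limit + 0) print $6, use "%"
        }'
}
# Usage: df -P | over_threshold 85
```

The threshold of 85 in the usage line matches the default Minor threshold; pass 95 to list only partitions at the Major level.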

  6. Run the cd /var/log command to switch to the log partition directory and run the following command to find large log files:

    du -ah --max-depth=4|sort -rn|grep -v K |grep -v 0

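    The output lists entries of 1 MB or larger. Note that sort -rn compares only the leading number and ignores the unit suffix (M or G), and grep -v 0 drops every line that contains the digit 0, so treat the order and completeness of the list as approximate.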
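
As an alternative sketch for finding large files, the function below lists regular files larger than 1 MiB and sorts them by disk usage in KiB, so the numeric sort is exact. It uses only standard find, du, and sort options. The name large_files is hypothetical.

```shell
# Hypothetical helper: lists regular files larger than 1 MiB under a
# directory (default: current directory), largest first. The first column
# is the disk usage in KiB as reported by du -k.
large_files() {
    # -xdev keeps the search on one file system (the full partition).
    find "${1:-.}" -xdev -type f -size +1024k -exec du -k {} + | sort -rn
}
# Usage: large_files /var/log
```

Unlike a depth-limited du, this walks the whole directory tree, so it can take longer on partitions with many files.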

  7. Use either of the following methods to clear the large log files:

    NOTE:

    To back up the log files before clearing them, copy them to the /home/fsp directory and use WinSCP to copy them from the host to your local PC.

    • Run the following command to clear all the content of a log file without deleting the file:

      > /var/log/filepath

      filepath indicates the path to the log files that exceed 1 MB, for example, ./fusionsphere/component/swift-proxy.log.

      This command cannot be used to clear multiple log files at a time.

    • Run the rm command to delete specified log files.

      This command can delete all log files in a directory at a time. However, the space may not be released if a process still holds the deleted files open.

      After the log files are deleted, run df -h to check whether the log space has been released. If it has not, identify the service that holds the files and restart it.

      Run lsof | grep allfilepath to query the log service.

      allfilepath indicates the path to the deleted log files (or log file directory), for example, /var/log/fusionsphere/component/swift-proxy.log.

      • If command output is displayed, the first field of the output is the name of the log service.

        In this case, run the service servicename restart command to restart the service, where servicename is the name from the first field.

      • If no command output is displayed, no further action is required.
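
The two clearing methods above can be sketched as small shell helpers. The first empties several log files in one call, which works around the one-file-at-a-time limit of the redirection command. The second extracts the service names from lsof output, assuming Linux lsof, which appends (deleted) to the name of files that were removed while still open. Both function names are hypothetical.

```shell
# Hypothetical helper: empties each given regular file in place without
# deleting it, so the owning process keeps a valid file handle.
truncate_logs() {
    for f in "$@"; do
        if [ -f "$f" ]; then
            : > "$f"
        fi
    done
}

# Hypothetical helper: reads lsof output on stdin and prints the unique
# command names (first field) that still hold deleted files open.
deleted_file_holders() {
    awk '/\(deleted\)/ { print $1 }' | sort -u
}
# Usage:
#   truncate_logs ./fusionsphere/component/swift-proxy.log
#   lsof | grep /var/log | deleted_file_holders
```

Truncating in place is usually the safer choice, because the space is released immediately and no service restart is needed.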

  8. Run the following command to reload the log service:

    service syslog reload

  9. Run the following commands to restart the MongoDB service:

    cps host-template-instance-operate --action stop --service mongodb mongodb

    cps host-template-instance-operate --action start --service mongodb mongodb

    After this step is complete, go to 20.

  10. On the Service OM web client, query the VMs on local disks and manually delete the unnecessary ones.

    After 3 to 4 minutes, check whether the alarm is cleared.
    • If yes, no further action is required.
    • If no, go to 11.

  11. On the FusionSphere OpenStack web client, choose Configuration > Disk and expand the image partition capacity.

    If the capacity expansion is successful, go to 20.

  12. Check whether new images need to be used to create volumes or VMs on the current node while /opt/HUAWEI/image_cache is used up.

    • If yes, go to 13.
    • If no, the alarm does not affect services, and no further action is required.
    NOTE:

    If an image that has already been used to create a volume or VM on this node is used again for the same purpose, the alarm does not affect the creation.

  13. On the FusionSphere OpenStack web client, choose Configuration > Disk and expand the image-cache partition capacity.

    If the capacity expansion is successful, go to 20.

  14. Increase the available space of the Swift partition using either of the following methods:

    • On the Service OM page, choose Services > IMS, select an image that is not used, and choose More > Delete to release the Swift space.
    • Expand the Swift partition by referring to section "Expanding Service Storage Resources" in the HUAWEI CLOUD Stack 6.5.0 Capacity Expansion Guide.

    If the capacity expansion is successful, go to 20.

  15. On the FusionSphere OpenStack web client, choose Configuration > Disk and expand the MongoDB partition capacity using the following formula:

    Estimated disk space occupied by MongoDB (unit: GB): (50 x VM + 50 x PM) x 1.5 KB / 1024 / 1024

    In this formula, VM is the number of VMs, PM is the number of physical hosts, and 1.5 KB is the average size of a single record.
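
The formula above can be evaluated with a short shell function. This is only a sketch of the arithmetic as written in this step; the name mongodb_estimate_gb is hypothetical.

```shell
# Hypothetical helper: evaluates (50 x VM + 50 x PM) x 1.5 KB / 1024 / 1024.
# $1 = number of VMs, $2 = number of physical hosts.
# Prints the estimate in GB with four decimal places.
mongodb_estimate_gb() {
    awk -v vm="$1" -v pm="$2" \
        'BEGIN { printf "%.4f\n", (50 * vm + 50 * pm) * 1.5 / 1024 / 1024 }'
}
```

For example, mongodb_estimate_gb 1000 100 prints 0.0787, that is, (50000 + 5000) x 1.5 KB = 82500 KB, divided twice by 1024.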

  16. Run the following commands to restart the MongoDB service:

    cps host-template-instance-operate --action stop --service mongodb mongodb

    cps host-template-instance-operate --action start --service mongodb mongodb

  17. Run the df -h command to check the disk partition usage and determine whether /home/fsp is a separately mounted partition.

    • If yes, go to 22.
    • If no, go to 18.
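
Whether a directory is a separately mounted partition can also be checked in a script. The sketch below asks df which file system a path belongs to and compares the reported mount point with the path itself. The name is_separate_mount is hypothetical.

```shell
# Hypothetical helper: succeeds if the given directory is itself a mount
# point, that is, if df reports the directory as the "Mounted on" value.
# Pass the path without a trailing slash; assumes no spaces in mount points.
is_separate_mount() {
    mp=$(df -P "$1" 2>/dev/null | awk 'NR == 2 { print $NF }')
    [ "$mp" = "$1" ]
}
# Usage:
#   if is_separate_mount /home/fsp; then
#       echo "/home/fsp is a separate partition"
#   else
#       echo "/home/fsp is part of another partition"
#   fi
```

If /home/fsp is not a separate partition, its files count toward the usage of the partition that contains it, which is why the procedure continues there.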

  18. Run the cd /home/fsp command to switch to the required directory and run the following command to find large files:

    du -ah --max-depth=4|sort -rn|grep -v K

    The output lists entries of 1 MB or larger. Because sort -rn ignores the unit suffix (M or G), the order is only approximate.

    • If large files exist, go to 19.
    • If large files do not exist, go to 22.

  19. Run the rm filepath command to manually delete the large files. filepath indicates the path to a file larger than 1 MB found in the previous step, such as ./filename.tar.gz.

    Run the df -h command to check whether the root partition usage is reduced to less than 85%.

    • If yes, go to 20.
    • If no, go to 22.

  20. After 3 to 4 minutes, check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 22.

  21. Run the following command to check whether the token table in the Keystone database is too large and, if it is, delete expired tokens in batches:

    nohup python /etc/gaussdb/gaussdb/db_occupation_checker.py $CLEAN_TOKENS &

    $CLEAN_TOKENS indicates the number of tokens that are deleted at first. The value ranges from 10000 to 50000, preferably 20000.

    Deleting tokens may take a long time. You can check the progress in /var/log/fusionsphere/component/db_occupation_checker/db_occupation_checker.log.

    • If a message "The user having most tokens is " is displayed, the Keystone account that has the most tokens is displayed.
    • If a message "Start to clean expired tokens..." is displayed, the system starts to delete tokens.
    • If a message "Vacuum tokens successfully..." is displayed, the deletion operation is complete.
    NOTE:

    Deleting the tokens does not necessarily release the disk space they occupied, but requests for new tokens are not affected.

    If any of the following messages is displayed, go to 22.

    • If the message "The 'base' directory is not one of the large files(dir) in gaussdb partition, please contact technical staff to check other files." is displayed, the base directory where the physical files of the database tables are located is not the largest item in the GaussDB partition.
    • If the message "The largest file(base)'s size (kb) is more smaller than total partition size (kb), please contact technical staff." is displayed, the size of the base directory where the physical files of the database tables are located is smaller than 70% of the total partition size.
    • If the message "Keystone database is not the largest database, please contact technical staff to check the other database:" is displayed, the Keystone database is not the largest database.
    • If the message "The token table is not the largest table, please contact technical staff to check the other table" is displayed, the token table in the Keystone database is not the largest table.
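
The log messages listed above can be checked with a small script instead of reading the log by hand. The sketch below relies only on the message texts quoted in this step; the function name token_cleanup_status and the status words it prints are hypothetical. If the log keeps entries from earlier runs, old messages may also match, so treat the result as a hint.

```shell
# Hypothetical helper: reports the cleanup state from the checker log, based
# only on the message texts quoted in this step.
token_cleanup_status() {
    log=$1
    if grep -q "please contact technical staff" "$log"; then
        echo "escalate"      # one of the messages that lead to step 22
    elif grep -q "Vacuum tokens successfully" "$log"; then
        echo "done"          # deletion is complete
    elif grep -q "Start to clean expired tokens" "$log"; then
        echo "cleaning"      # deletion is in progress
    else
        echo "unknown"       # none of the known messages found yet
    fi
}
# Usage:
#   token_cleanup_status /var/log/fusionsphere/component/db_occupation_checker/db_occupation_checker.log
```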

  22. Contact technical support for assistance.

Related Information

None

Updated: 2019-08-30

Document ID: EDOC1100062365
