HUAWEI CLOUD Stack 6.5.0 Alarm and Event Reference 04

ALM-1507002 RabbitMQ Resource Usage Exceeds the Threshold


Description

This alarm is generated when the RabbitMQ resource usage exceeds the threshold.

NOTE:

The default alarm threshold offset is 5%. The default alarm thresholds are as follows:

  • Critical: RabbitMQ resource usage ≥ 80%
  • Major: 70% ≤ RabbitMQ resource usage < 80%
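
The default thresholds above can be expressed as a small shell function. This is a minimal sketch, not part of the product: the function name classify_usage is hypothetical, the input is assumed to be an integer percentage, and the 5% threshold offset is not modeled because this section does not describe how it is applied.

```shell
# Hypothetical helper: map a RabbitMQ resource usage percentage (integer)
# to the default alarm severity listed in this section.
# The 5% threshold offset is intentionally not modeled here.
classify_usage() {
    usage=$1
    if [ "$usage" -ge 80 ]; then
        echo "Critical"      # usage >= 80%
    elif [ "$usage" -ge 70 ]; then
        echo "Major"         # 70% <= usage < 80%
    else
        echo "None"          # below both default thresholds
    fi
}
```

For example, classify_usage 79 prints Major and classify_usage 80 prints Critical.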

Attribute

Alarm ID    Alarm Severity    Auto Clear
1507002     Critical/Major    Yes

Parameters

Name                   Meaning
Fault Location Info    instance_name: specifies the name of the instance that hosts the service for which the alarm is generated.
Additional Info        • AlarmInfo: specifies the alarm type and describes specific parameters.
                       • host_id: specifies the ID of the host for which the alarm is generated.
                       • hostname: specifies the name of the host for which the alarm is generated.
                       • Service: specifies the service for which the alarm is generated.
                       • MicroService: specifies the name of the microservice for which the alarm is generated.

Impact on the System

When this alarm is generated, the resources available to RabbitMQ are running low. If the issue persists, services may become abnormal.

Possible Causes

The service load on the host is excessively heavy.

Procedure

  1. Check whether the alarm is automatically cleared.

    • If yes, no further action is required.
    • If no, go to 2.

  2. Locate the threshold alarm item based on the additional information.

    • If the value of fd_use_rate, proc_use_rate, or sockets_use_rate exceeds the threshold, contact technical support for assistance.
    • If the value of mem_use_rate or the memory watermark (memory_high_watermark) exceeds the threshold, go to 3.
      NOTE:
      • fd_use_rate: indicates the usage of file descriptors.
      • proc_use_rate: indicates the usage of Erlang processes.
      • sockets_use_rate: indicates the usage of sockets.

  3. Obtain the ID of the host for which the alarm is reported in the alarm additional information.
  4. Log in to the FusionSphere OpenStack controller node as user fsp, switch to user root, and import the environment variables as described in Importing Environment Variables. Then run the following command to query the management IP address of the host, replacing Host ID with the host ID obtained in 3:

    su - root

    cps host-list | grep Host ID

  5. Use the management IP address to log in to the host for which the alarm is generated.
  6. Run the cps template-params-show --service rabbitmq $rabbitmq_template command. The value of $rabbitmq_template depends on whether the RabbitMQ instance on the alarmed host is a dedicated RabbitMQ database and, if so, which one. For details about the mapping, see Table 3-6. Use the same value of $rabbitmq_template in the subsequent steps.

    In the command output, check the value of the memory_high_watermark configuration item. The default value is empty.

    • Check the host and VM scales. If the memory watermark is empty or is less than the memory watermark value (memory_size) listed for the corresponding scale in Table 3-7, run the following commands ($memory_size indicates memory_size with the unit added, for example, 16G):

      cps template-params-update --service rabbitmq $rabbitmq_template --parameter memory_high_watermark=$memory_size

      cps commit

      Then go to 10.

      Ensure that the memory watermark is not greater than 40% of the physical memory of the host. If the physical memory is insufficient, expand the physical memory before performing the subsequent operations.

    • If the memory watermark is greater than or equal to the watermark of the corresponding scale in Table 3-7, contact technical support for assistance.

  7. Run the df -h command on the RabbitMQ-deployed node for which the alarm is generated.

    Check the available space of the $rabbitmq_template partition (example: /opt/fusionplatform/data/rabbitmq), and record the value as size. For the mapping between $rabbitmq_template and dedicated RabbitMQ databases, see Table 3-6.

  8. Run the cps template-params-show --service rabbitmq $rabbitmq_template command, query the value of memory_high_watermark, and record the value as memory_size.
  9. Run the cps hostcfg-list --type storage command to query the disk group information about the hosts where RabbitMQ is deployed.

    Record the name corresponding to the hostid of the RabbitMQ-deployed host for which the alarm is generated as group_name. If the hostid is not displayed in the command output, the group name is default.

    • If size is less than 50% of the value of memory_size, run the following commands to expand the RabbitMQ data partition:

      cps hostcfg-item-update --item logical-volume --lvname $rabbitmq_template --size $size --type storage $hostcfg_name

      cps commit

      $size indicates the recommended disk size for the current scale, as shown in Table 3-7. Include the unit when setting the value, for example, 16g. $hostcfg_name indicates the value of the name field (group_name) in the cps hostcfg-list --type storage command output. If the hosts where RabbitMQ is deployed belong to multiple groups, perform this step for each group.

      NOTE:

      The recommended disk partition is 50% of the memory watermark.

    • If size is greater than or equal to 50% of the value of memory_size, contact technical support for assistance.

  10. After 3 to 4 minutes, check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, contact technical support for assistance.
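
The decision points in steps 2, 6, and 9 can be sketched as shell functions. This is an illustrative sketch only: the function names and the returned labels are hypothetical, all sizes are assumed to be whole numbers of GB passed as arguments (nothing here parses cps or df output), and a size exactly equal to 50% of memory_size is treated as the "contact technical support" case.

```shell
# Step 2: route by the metric that exceeded its threshold.
route_metric() {
    case "$1" in
        fd_use_rate|proc_use_rate|sockets_use_rate) echo "contact-support" ;;
        mem_use_rate|memory_high_watermark)         echo "go-to-step-3" ;;
        *)                                          echo "unknown" ;;
    esac
}

# Step 6: compare the configured watermark (GB, may be empty) with the
# value recommended for the deployment scale in Table 3-7.
watermark_action() {
    current=$1
    recommended=$2
    if [ -z "$current" ] || [ "$current" -lt "$recommended" ]; then
        echo "update-watermark"
    else
        echo "contact-support"
    fi
}

# Step 9: size is the available partition space (GB) and memory_size is
# the memory watermark (GB). "size < 50% of memory_size" is checked as
# 2 * size < memory_size to stay in integer arithmetic.
disk_action() {
    size=$1
    memory_size=$2
    if [ $((size * 2)) -lt "$memory_size" ]; then
        echo "expand-partition"
    else
        echo "contact-support"
    fi
}
```

For example, watermark_action "" 32 prints update-watermark because an empty watermark is treated the same way as one below the recommended value.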

Related Information

Table 3-6 Mapping between RabbitMQ template names and dedicated RabbitMQ databases

Dedicated RabbitMQ Database      RabbitMQ Template Name
No dedicated database is used    rabbitmq
Dedicated Nova database          rabbitmq_nova
Dedicated Neutron database       rabbitmq_neutron
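
The mapping in Table 3-6 could be captured in a small lookup function so that later commands reuse one value. This is a sketch under assumptions: the function name rabbitmq_template_for and the input labels none, nova, and neutron are hypothetical shorthand for the three table rows.

```shell
# Hypothetical lookup for Table 3-6: print the RabbitMQ template name for
# a deployment type. The labels none/nova/neutron are illustrative only.
rabbitmq_template_for() {
    case "$1" in
        none)    echo "rabbitmq" ;;          # no dedicated database is used
        nova)    echo "rabbitmq_nova" ;;     # dedicated Nova database
        neutron) echo "rabbitmq_neutron" ;;  # dedicated Neutron database
        *)       return 1 ;;                 # not listed in Table 3-6
    esac
}
```

The result can then be assigned once, for example rabbitmq_template=$(rabbitmq_template_for nova), and passed to the cps commands in the procedure.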

Table 3-7 RabbitMQ parameter configuration scale

Scale             Memory Watermark (GB)    Disk Partition Size (GB)
100PM/1000VM      32                       16
256PM/2000VM      32                       16
512PM/5000VM      50                       25
1024PM/10000VM    50                       25
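
As a sketch of how Table 3-7 and the 40% physical memory limit from step 6 fit together, the following functions look up the recommended values for a scale and check a watermark against the host memory. The function names are hypothetical, and all values are assumed to be whole numbers of GB.

```shell
# Hypothetical lookup for Table 3-7: print
# "<memory watermark GB> <disk partition size GB>" for a scale label.
scale_params() {
    case "$1" in
        100PM/1000VM|256PM/2000VM)   echo "32 16" ;;
        512PM/5000VM|1024PM/10000VM) echo "50 25" ;;
        *)                           return 1 ;;   # scale not in Table 3-7
    esac
}

# Succeed (exit status 0) if the watermark is not greater than 40% of the
# host's physical memory. Both arguments are integers in GB.
watermark_fits_host() {
    watermark=$1
    physical=$2
    [ $((watermark * 100)) -le $((physical * 40)) ]
}
```

For example, a 50 GB watermark passes the check on a host with 128 GB of physical memory but fails on a host with 64 GB, in which case the physical memory would need to be expanded first.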

Updated: 2019-08-30

Document ID: EDOC1100062365
