HUAWEI CLOUD Stack 6.5.0 Alarm and Event Reference 04

Device Alarms


ALM-48401 Process Is Not Started

Description

Processes of some API Gateway components are not started.

Attribute

| Alarm ID | Alarm Severity | Alarm Type |
| --- | --- | --- |
| 48401 | Critical | Device alarms |

Alarm Parameters

| Parameter | Description |
| --- | --- |
| alarm_id | Alarm ID. |
| alarm_name | Alarm name. |
| alarm_severity | Alarm severity. |
| alarm_sn | Alarm serial number. |
| alarm_time | UTC timestamp when the alarm was generated. |
| alarm_type | Alarm type. |
| category | Alarm category. |
| clear_type | Alarm clearance type. |
| group_id | Alarm group ID. |
| ne_dn | Alarm source IP address. |
| object_instance | Alarm location information, including the IP address of the source node for which the alarm is generated, IP address of the faulty source node, component name, and alarm cause. |
| probable_cause | Possible cause of the alarm. |
| probable_repair_actions | Alarm handling suggestion, which describes the procedure for rectifying the fault. |
| source | Service where the alarm is generated. |
| sub_ne | Name of the NE where the alarm is generated. |

Impact on the System

If processes on the component that reports the alarm are abnormal, services connected to the component will be affected.

Possible Causes

The component process does not exist.

Procedure

  1. Log in to ManageOne Maintenance Portal using a browser.

    • URL: https://address:31943, where address is the address for accessing the homepage of ManageOne Maintenance Portal (for example, https://oc.type.com:31943)
    • Default username: admin; default password: Huawei12#$

  2. On the menu bar in the upper part of the page, choose Alarms.
  3. In the alarm list, locate and click the target alarm name in the Name column. The Alarm Details and Handling Recommendations dialog box is displayed.
  4. In the Basic Information list of the Alarm Details and Handling Recommendations dialog box, search for Threshold Information.

    • alarm_source_ip indicates the IP address of the source node for which the alarm is generated.
    • component_name indicates the component name.

  5. If the node information is not found in the location information, click > next to the alarm to display the alarm details. On the displayed page that contains a drop-down list, click the Original Alarms menu to query the alarm location information.
  6. Use PuTTY and alarm_source_ip to log in to the source node on which the alarm is generated.

    Default username: paas; default password: Api@shubao88

  7. Run the following commands to switch to the root user and then switch to the apigateway user:

    su - root

    Default password: Cloud12#$

    su - apigateway

  8. Run the following command to disable session logout upon timeout:

    TMOUT=0

  9. Run the following command to check whether the component status is normal:

    sh /opt/apigateway/component_name/shell/health_check.sh

    • If yes, go to 11.
    • If no, go to 10.

  10. Check whether the /opt/apigateway/component_name/shell/component_name.lock file exists.

    • If yes, run the sh /opt/apigateway/component_name/shell/restart.sh command to restart the component. Then, go to 11.
    • If no, go to 11.

  11. Check whether restart records are included in the component start log /var/log/apigateway/component_name/runtime/component_name_shell.log.

    • If yes, go to 13.
    • If no, run the sh /opt/apigateway/component_name/shell/restart.sh command to restart the component. Then, go to 12.

      If the message "xxx start successfully" is displayed, the service is started successfully.

  12. Check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, verify that the component status is normal and then manually clear the alarm. If the component status is not normal, go to 13.

  13. Contact technical support.
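
The checks in steps 10 and 11 can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function names and the APIGW_HOME and APIGW_LOG variables are hypothetical, and the sketch assumes that a restart record is any log line containing the word "restart", which this document does not specify. It only reports the next action and never runs restart.sh.

```shell
# Hypothetical overrides for this example; the defaults are the paths
# given in the procedure.
APIGW_HOME="${APIGW_HOME:-/opt/apigateway}"
APIGW_LOG="${APIGW_LOG:-/var/log/apigateway}"

# Step 10: succeeds if the lock file of component $1 exists.
has_lock_file() {
    [ -f "$APIGW_HOME/$1/shell/$1.lock" ]
}

# Step 11: succeeds if the start log of component $1 mentions a restart.
# Assumption: restart records contain the word "restart".
# A missing log file is treated as "no restart record".
has_restart_record() {
    grep -qi "restart" "$APIGW_LOG/$1/runtime/$1_shell.log" 2>/dev/null
}

# Print the action that step 11 calls for. Nothing is restarted here.
next_action() {
    if has_restart_record "$1"; then
        echo "contact technical support (step 13)"
    else
        echo "run restart.sh, then check the alarm (step 12)"
    fi
}
```

For example, next_action nginx prints which branch of step 11 applies to the nginx component on the current node.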

Alarm Clearance

This alarm will be automatically cleared after the fault is rectified.

Related Information

None

ALM-48402 Port Conflict

Description

The port that an API Gateway component needs is already occupied when the component starts.

Attribute

| Alarm ID | Alarm Severity | Alarm Type |
| --- | --- | --- |
| 48402 | Critical | Device alarms |

Alarm Parameters

| Parameter | Description |
| --- | --- |
| alarm_id | Alarm ID. |
| alarm_name | Alarm name. |
| alarm_severity | Alarm severity. |
| alarm_sn | Alarm serial number. |
| alarm_time | UTC timestamp when the alarm was generated. |
| alarm_type | Alarm type. |
| category | Alarm category. |
| clear_type | Alarm clearance type. |
| group_id | Alarm group ID. |
| ne_dn | Alarm source IP address. |
| object_instance | Alarm location information, including the IP address of the source node for which the alarm is generated, IP address of the faulty source node, component name, and alarm cause. |
| probable_cause | Possible cause of the alarm. |
| probable_repair_actions | Alarm handling suggestion, which describes the procedure for rectifying the fault. |
| source | Service where the alarm is generated. |
| sub_ne | Name of the NE where the alarm is generated. |

Impact on the System

The port required by the component that reports the alarm is already in use. As a result, the component cannot be started properly and services related to the component are affected.

Possible Causes

The port that the component listens on at startup is occupied by another process.

Procedure

  1. Log in to ManageOne Maintenance Portal using a browser.

    • URL: https://address:31943, where address is the address for accessing the homepage of ManageOne Maintenance Portal (for example, https://oc.type.com:31943)
    • Default username: admin; default password: Huawei12#$

  2. On the menu bar in the upper part of the page, choose Alarms.
  3. In the alarm list, locate and click the target alarm name in the Name column. The Alarm Details and Handling Recommendations dialog box is displayed.
  4. In the Basic Information list of the Alarm Details and Handling Recommendations dialog box, search for Threshold Information.

    • alarm_source_ip indicates the IP address of the source node for which the alarm is generated.
    • component_name indicates the component name.
    • port indicates the port number.

  5. If the node information is not found in the location information, click > next to the alarm to display the alarm details. On the displayed page that contains a drop-down list, click the Original Alarms menu to query the alarm location information.
  6. Use PuTTY and alarm_source_ip to log in to the source node on which the alarm is generated.

    Default username: paas; default password: Api@shubao88

  7. Run the following command to switch to the root user:

    su - root

    Default password: Cloud12#$

  8. Run the following command to disable session logout upon timeout:

    TMOUT=0

  9. Run the following command to find the ID of the process that occupies the port:

    netstat -tlnp | grep port

  10. Run the following command to view information about that process, where PID is the process ID obtained in the previous step:

    ps -ef | grep PID

  11. Check whether the process information contains the component_name.

    • If yes, go to 14.
    • If no, run the kill PID command to terminate the invalid process and go to 12.

  12. Switch to the user corresponding to the component, as shown in Table 16-2, and run the sh /opt/apigateway/component_name/shell/restart.sh command to restart the component.

    If the message "xxx start successfully" is displayed, the service is started successfully.

    Table 16-2 Mapping between components and users

    | Component | User | Command |
    | --- | --- | --- |
    | gaussdb | apigw_db | su - apigw_db |
    | adminportal | apigw_portal | su - apigw_portal |
    | apimgr | apigw_apimgr | su - apigw_apimgr |
    | cassandra | apigw_scdb | su - apigw_scdb |
    | Others | apigateway | su - apigateway |

  13. Check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 14.

  14. Contact technical support.
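
The port lookup in step 9 and the user mapping in Table 16-2 can be sketched in shell as follows. This is a minimal illustration, not part of the product: the function names pid_for_port and user_for_component are hypothetical, and the parser assumes the usual Linux netstat -tlnp column layout (local address in column 4, state in column 6, PID/Program name in column 7). The component name apimgr follows the spelling used in the installation path /opt/apigateway/apimgr.

```shell
# Print the PID of the process listening on TCP port $1.
# Reads `netstat -tlnp` output from standard input.
pid_for_port() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $6 == "LISTEN" && $7 ~ /^[0-9]+\// {
            n = split($4, addr, ":")   # local address, e.g. 0.0.0.0:8080 or :::8080
            if (addr[n] == port) {     # compare the port field only
                split($7, prog, "/")   # column 7 is PID/Program name
                print prog[1]
                exit
            }
        }'
}

# Print the OS user that owns component $1, following Table 16-2.
user_for_component() {
    case "$1" in
        gaussdb)     echo apigw_db ;;
        adminportal) echo apigw_portal ;;
        apimgr)      echo apigw_apimgr ;;
        cassandra)   echo apigw_scdb ;;
        *)           echo apigateway ;;
    esac
}

# Example on the alarm node (port is the value from the alarm details):
#   netstat -tlnp | pid_for_port port
```

Unlike a plain grep on the port number, this compares the port field exactly, so looking up port 80 does not also match 8080 or a PID that happens to contain 80.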

Alarm Clearance

This alarm will be automatically cleared after the fault is rectified.

Related Information

None

ALM-48409 File Handle Usage Has Reached the Threshold

Description

This alarm is generated when the file handle usage of an API Gateway node exceeds 80%.

Attribute

| Alarm ID | Alarm Severity | Alarm Type |
| --- | --- | --- |
| 48409 | Minor | Device alarms |

Alarm Parameters

| Parameter | Description |
| --- | --- |
| alarm_id | Alarm ID. |
| alarm_name | Alarm name. |
| alarm_severity | Alarm severity. |
| alarm_sn | Alarm serial number. |
| alarm_time | UTC timestamp when the alarm was generated. |
| alarm_type | Alarm type. |
| category | Alarm category. |
| clear_type | Alarm clearance type. |
| group_id | Alarm group ID. |
| ne_dn | Alarm source IP address. |
| object_instance | Alarm location information, including the IP address of the source node for which the alarm is generated, IP address of the faulty source node, component name, and alarm cause. |
| probable_cause | Possible cause of the alarm. |
| probable_repair_actions | Alarm handling suggestion, which describes the procedure for rectifying the fault. |
| source | Service where the alarm is generated. |
| sub_ne | Name of the NE where the alarm is generated. |

Impact on the System

If the file handle usage reaches or exceeds the threshold, the system may fail to create processes and service performance is reduced.

Possible Causes

The file handle usage of the server where the gateway is deployed reaches the threshold.

Procedure

  1. Log in to ManageOne Maintenance Portal using a browser.

    • URL: https://address:31943, where address is the address for accessing the homepage of ManageOne Maintenance Portal (for example, https://oc.type.com:31943)
    • Default username: admin; default password: Huawei12#$

  2. On the menu bar in the upper part of the page, choose Alarms.
  3. In the alarm list, locate and click the target alarm name in the Name column. The Alarm Details and Handling Recommendations dialog box is displayed.
  4. In the Basic Information list of the Alarm Details and Handling Recommendations dialog box, search for Threshold Information.

    • alarm_source_ip indicates the IP address of the source node for which the alarm is generated.
    • component_name indicates the component name.

  5. If the node information is not found in the location information, click > next to the alarm to display the alarm details. On the displayed page that contains a drop-down list, click the Original Alarms menu to query the alarm location information.
  6. Use PuTTY and alarm_source_ip to log in to the source node on which the alarm is generated.

    Default username: paas; default password: Api@shubao88

  7. Run the following command to switch to the root user:

    su - root

    Default password: Cloud12#$

  8. Run the following command to disable session logout upon timeout:

    TMOUT=0

  9. Run the following command to check the maximum number of file handles:

    ulimit -n

    Check whether the maximum number of file handles is less than 65535.

    • If yes, go to 10.
    • If no, go to 11.

  10. Run the following commands to increase the maximum number of file handles and check whether 65535 is returned:

    ulimit -n 65535
    ulimit -n

    • If yes, go to 12.
    • If no, go to 14.

  11. Run the following command to check the process ID of the alarm component:

    ps -ef | grep component_name

  12. Run the following command to check the number of file handles occupied by the process:

    lsof -p PID | wc -l

    After collecting the information about the faulty process, go to 13.

  13. Wait for 3 to 5 minutes and check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 14.

  14. Contact technical support.
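
The 80% threshold from the alarm description can be checked by hand with a small calculation. The sketch below is an illustration only: the function names are hypothetical, and it assumes that usage means the number of handles held by a process divided by the limit reported by ulimit -n. This document does not state how the gateway itself measures usage.

```shell
# Print the integer percentage (rounded down) of limit $2 used by count $1.
usage_percent() {
    [ "$2" -gt 0 ] || return 1
    echo $(( $1 * 100 / $2 ))
}

# Succeeds if count $1 is strictly more than 80% of limit $2.
# Compares used*100 with limit*80 to avoid rounding errors.
over_threshold() {
    [ "$2" -gt 0 ] || return 1
    [ $(( $1 * 100 )) -gt $(( $2 * 80 )) ]
}

# Example on the alarm node (PID is the process ID found in step 11):
#   used=$(lsof -p PID | wc -l)
#   over_threshold "$used" "$(ulimit -n)" && echo "above 80%"
```

Note that the lsof output includes a header line, so the count is one higher than the number of open handles, and that ulimit -n can print unlimited, which these functions do not handle.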

Alarm Clearance

This alarm will be automatically cleared after the fault is rectified.

Related Information

None

ALM-48413 daemon Task Is Not Added

Description

No scheduled task exists on some components of API Gateway.

Attribute

| Alarm ID | Alarm Severity | Alarm Type |
| --- | --- | --- |
| 48413 | Warning | Device alarms |

Alarm Parameters

| Parameter | Description |
| --- | --- |
| alarm_id | Alarm ID. |
| alarm_name | Alarm name. |
| alarm_severity | Alarm severity. |
| alarm_sn | Alarm serial number. |
| alarm_time | UTC timestamp when the alarm was generated. |
| alarm_type | Alarm type. |
| category | Alarm category. |
| clear_type | Alarm clearance type. |
| group_id | Alarm group ID. |
| ne_dn | Alarm source IP address. |
| object_instance | Alarm location information, including the IP address of the source node for which the alarm is generated, IP address of the faulty source node, component name, and alarm cause. |
| probable_cause | Possible cause of the alarm. |
| probable_repair_actions | Alarm handling suggestion, which describes the procedure for rectifying the fault. |
| source | Service where the alarm is generated. |
| sub_ne | Name of the NE where the alarm is generated. |

Impact on the System

The scheduled task of the component cannot run properly. As a result, if the component becomes abnormal, the scheduled task does not start it again.

Possible Causes

The shell scheduled task of the component that reports the alarm is invalid.

Procedure

  1. Log in to ManageOne Maintenance Portal using a browser.

    • URL: https://address:31943, where address is the address for accessing the homepage of ManageOne Maintenance Portal (for example, https://oc.type.com:31943)
    • Default username: admin; default password: Huawei12#$

  2. On the menu bar in the upper part of the page, choose Alarms.
  3. In the alarm list, locate and click the target alarm name in the Name column. The Alarm Details and Handling Recommendations dialog box is displayed.
  4. In the Basic Information list of the Alarm Details and Handling Recommendations dialog box, search for Threshold Information.

    • alarm_source_ip indicates the IP address of the source node for which the alarm is generated.
    • component_name indicates the component name.

  5. If the node information is not found in the location information, click > next to the alarm to display the alarm details. On the displayed page that contains a drop-down list, click the Original Alarms menu to query the alarm location information.
  6. Use PuTTY and alarm_source_ip to log in to the source node on which the alarm is generated.

    Default username: paas; default password: Api@shubao88

  7. Run the following command to switch to the root user:

    su - root

    Default password: Cloud12#$

  8. Run the following command to disable session logout upon timeout:

    TMOUT=0

  9. Run the following command to check whether a scheduled task exists for the component, where Username is the user that owns the component's scheduled tasks (see the table in step 10):

    crontab -u Username -l | grep component_name

    • If yes, contact technical support.
    • If no, go to 10.

  10. Run the following command to add a scheduled task:

    (crontab -u Username -l; echo 'Scheduled task content') | crontab -u Username -

    Enclose the scheduled task content in single quotation marks so that the double quotation marks inside it are preserved. For example:

    (crontab -u apigateway -l; echo '*/10 * * * * sh /opt/apigateway/common/check_log.sh "opsagent">/dev/null 2>&1') | crontab -u apigateway -

    The usernames and scheduled tasks of each component are as follows:

    | Component | Username | Scheduled Task Content |
    | --- | --- | --- |
    | filebeat | apigateway | */10 * * * * sh /opt/apigateway/common/check_log.sh "filebeat">/dev/null 2>&1 |
    | filebeat | apigateway | */1 * * * * /opt/apigateway/filebeat/shell/monitor.sh >/dev/null 2>&1 |
    | ntp | apigateway | */10 * * * * sh /opt/apigateway/common/check_log.sh "ntp">/dev/null 2>&1 |
    | ntp | apigateway | 0 0 * * * /opt/apigateway/ntp/shell/ntpdate.sh >/dev/null 2>&1 |
    | ntp | apigateway | */1 * * * * /opt/apigateway/ntp/shell/monitor.sh >/dev/null 2>&1 |
    | ftpfilesync | apigateway | */10 * * * * sh /opt/apigateway/common/check_log.sh "ftpfilesync">/dev/null 2>&1 |
    | ftpfilesync | apigateway | */1 * * * * /opt/apigateway/ftpfilesync/shell/monitor.sh >/dev/null 2>&1 |
    | curl | apigateway | */10 * * * * sh /opt/apigateway/common/check_log.sh "curl">/dev/null 2>&1 |
    | jre | apigateway | */10 * * * * sh /opt/apigateway/common/check_log.sh "jre">/dev/null 2>&1 |
    | net-snmp | apigateway | */10 * * * * sh /opt/apigateway/common/check_log.sh "net-snmp">/dev/null 2>&1 |
    | alarmserver | apigateway | */10 * * * * sh /opt/apigateway/common/check_log.sh "alarmserver">/dev/null 2>&1 |
    | alarmserver | apigateway | */1 * * * * /opt/apigateway/alarmserver/shell/monitor.sh >/dev/null 2>&1 |
    | opsagent | apigateway | */10 * * * * sh /opt/apigateway/common/check_log.sh "opsagent">/dev/null 2>&1 |
    | opsagent | apigateway | */1 * * * * /opt/apigateway/opsagent/shell/monitor.sh >/dev/null 2>&1 |
    | keepalived | apigateway | */10 * * * * sh /opt/apigateway/common/check_log.sh "keepalived">/dev/null 2>&1 |
    | keepalived | apigateway | */1 * * * * /opt/apigateway/keepalived/shell/monitor.sh >/dev/null 2>&1 |
    | nginx | apigateway | */10 * * * * sh /opt/apigateway/common/check_log.sh "nginx">/dev/null 2>&1 |
    | nginx | apigateway | */1 * * * * /opt/apigateway/nginx/shell/monitor.sh >/dev/null 2>&1 |
    | nginx-mgr | apigateway | */10 * * * * sh /opt/apigateway/common/check_log.sh "nginx-mgr">/dev/null 2>&1 |
    | nginx-mgr | apigateway | */1 * * * * /opt/apigateway/nginx-mgr/shell/monitor.sh >/dev/null 2>&1 |
    | redis | apigateway | */10 * * * * sh /opt/apigateway/common/check_log.sh "redis">/dev/null 2>&1 |
    | redis | apigateway | */1 * * * * source ~/.bashrc; /opt/apigateway/redis/shell/monitor.sh >/dev/null 2>&1 |
    | redis-mgr | apigateway | */10 * * * * sh /opt/apigateway/common/check_log.sh "redis-mgr">/dev/null 2>&1 |
    | redis-mgr | apigateway | */1 * * * * source ~/.bashrc; /opt/apigateway/redis-mgr/shell/monitor.sh >/dev/null 2>&1 |
    | etcd | apigateway | */10 * * * * sh /opt/apigateway/common/check_log.sh "etcd">/dev/null 2>&1 |
    | etcd | apigateway | */1 * * * * /opt/apigateway/etcd/shell/monitor.sh >/dev/null 2>&1 |
    | gaussdb | apigw_db | */10 * * * * sh /opt/apigateway/common/check_log.sh "gaussdb">/dev/null 2>&1 |
    | apimgr | apigw_apimgr | */10 * * * * sh /opt/apigateway/common/check_log.sh "apimgr">/dev/null 2>&1 |
    | apimgr | apigw_apimgr | */1 * * * * /opt/apigateway/apimgr/shell/monitor.sh >/dev/null 2>&1 |
    | apimgr | apigw_apimgr | 0 0 * * * /opt/apigateway/apimgr/shell/validate_cert.sh >/dev/null 2>&1 |
    | adminportal | apigw_portal | */10 * * * * sh /opt/apigateway/common/check_log.sh "adminportal">/dev/null 2>&1 |
    | adminportal | apigw_portal | */1 * * * * /opt/apigateway/adminportal/shell/monitor.sh >/dev/null 2>&1 |
    | adminportal | apigw_portal | 0 0 * * * /opt/apigateway/adminportal/shell/validate_cert.sh >/dev/null 2>&1 |
    | authadv | apigateway | */10 * * * * sh /opt/apigateway/common/check_log.sh "authadv">/dev/null 2>&1 |
    | authadv | apigateway | */1 * * * * /opt/apigateway/authadv/shell/monitor.sh >/dev/null 2>&1 |
    | Shubao | apigateway | */5 * * * * sh /opt/apigateway/common/check_log.sh "Shubao">/dev/null 2>&1 |
    | Shubao | apigateway | */1 * * * * /opt/apigateway/resty/shell/monitor.sh >/dev/null 2>&1 |
    | Orchestration | apigateway | */10 * * * * sh /opt/apigateway/common/check_log.sh "Orchestration">/dev/null 2>&1 |
    | Orchestration | apigateway | */1 * * * * /opt/apigateway/orchestration/shell/monitor.sh >/dev/null 2>&1 |
    | zookeeper | apigateway | */10 * * * * sh /opt/apigateway/common/check_log.sh "zookeeper">/dev/null 2>&1 |
    | zookeeper | apigateway | */1 * * * * /opt/apigateway/zookeeper/shell/monitor.sh >/dev/null 2>&1 |
    | kafka | apigateway | */10 * * * * sh /opt/apigateway/common/check_log.sh "kafka">/dev/null 2>&1 |
    | kafka | apigateway | */1 * * * * /opt/apigateway/kafka/shell/monitor.sh >/dev/null 2>&1 |
    | statistics | apigateway | */10 * * * * sh /opt/apigateway/common/check_log.sh "statistics">/dev/null 2>&1 |
    | statistics | apigateway | */1 * * * * /opt/apigateway/statistics/shell/monitor.sh >/dev/null 2>&1 |
    | cassandra | apigw_scdb | */10 * * * * sh /opt/apigateway/common/check_log.sh "cassandra">/dev/null 2>&1 |
    | cassandra | apigw_scdb | */1 * * * * /opt/apigateway/cassandra/shell/monitor.sh >/dev/null 2>&1 |
    | API Gateway service | apigateway | */10 * * * * sh /opt/apigateway/common/check_log.sh "APIG">/dev/null 2>&1 |
    | API Gateway service | apigateway | 0 0 * * * sh /opt/apigateway/common/validate_cert.sh >/dev/null 2>&1 |

  11. Manually clear the alarm and check whether the alarm is generated again.

    • If no, no further action is required.
    • If yes, go to 12.

  12. Contact technical support.
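
Running the command in step 10 twice adds the same scheduled task twice. One way to avoid duplicates is to append an entry only when it is missing, as in the sketch below. The function name merge_cron_line and the CRON_ENTRY variable are hypothetical and not part of the product; the function only transforms text and does not touch any crontab by itself.

```shell
# Read an existing crontab from standard input and print it unchanged,
# appending the entry in $1 only if that exact line is not already present.
# The entry is passed through the environment so that awk does not
# interpret any backslashes in it.
merge_cron_line() {
    CRON_ENTRY="$1" awk '
        { print }
        $0 == ENVIRON["CRON_ENTRY"] { found = 1 }
        END { if (!found) print ENVIRON["CRON_ENTRY"] }'
}

# Example, using the same placeholders as step 10:
#   crontab -u Username -l | merge_cron_line 'Scheduled task content' | crontab -u Username -
```

Because the comparison is on the whole line, an entry that differs only in spacing is treated as a different task and is appended.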

Alarm Clearance

You need to manually clear the alarm.

Related Information

None

Updated: 2019-08-30

Document ID: EDOC1100062365
