HUAWEI CLOUD Stack 6.5.0 Alarm and Event Reference 04

Communications Alarms

ALM-48101 Failed to Access Shubao

Description

Nginx fails to access Shubao.

Attribute

Alarm ID    Alarm Severity    Alarm Type
48101       Critical          Communications alarm

Alarm Parameters

Parameter                  Description
alarm_id                   Alarm ID.
alarm_name                 Alarm name.
alarm_severity             Alarm severity.
alarm_sn                   Alarm serial number.
alarm_time                 UTC timestamp when the alarm was generated.
alarm_type                 Alarm type.
category                   Alarm category.
clear_type                 Alarm clearance type.
group_id                   Alarm group ID.
ne_dn                      Alarm source IP address.
object_instance            Alarm location information, including the IP address of the source node for which the alarm is generated, IP address of the faulty source node, component name, and alarm cause.
probable_cause             Possible cause of the alarm.
probable_repair_actions    Alarm handling suggestion, which describes the procedure for rectifying the fault.
source                     Service where the alarm is generated.
sub_ne                     Name of the NE where the alarm is generated.

Impact on the System

The service processing capability decreases, and the number of Shubao processes that can be invoked by the service plane LB decreases.

Possible Causes

  • The Shubao service is abnormal.
  • The Orchestration service is abnormal.
  • The AuthAdv service is abnormal.

Procedure

Shubao service exception:

  1. Log in to ManageOne Maintenance Portal using a browser.

    • URL: https://{ManageOne Maintenance Portal homepage address}:31943, for example, https://oc.type.com:31943
    • Default username: admin; default password: Huawei12#$

  2. On the menu bar in the upper part of the page, choose Alarms.
  3. In the alarm list, locate and click the target alarm name in the Name column. The Alarm Details and Handling Recommendations dialog box is displayed.
  4. In the Basic Information list of the Alarm Details and Handling Recommendations dialog box, search for Threshold Information.

    • alarm_source_ip indicates the IP address of the source node for which the alarm is generated.
    • fault_source_ip indicates the IP address of the faulty node.

  5. If the node information is not found in the location information, click > next to the alarm to display the alarm details. On the displayed page that contains a drop-down list, click the Original Alarms menu to query the alarm location information.
  6. Use PuTTY and {fault_source_ip} to log in to the faulty node.

    Default username: paas; default password: Api@shubao88

  7. Run the following commands to switch to the root user and then switch to the apigateway user:

    su - root

    Default password: Cloud12#$

    su - apigateway

  8. Run the following command to disable session logout upon timeout:

    TMOUT=0

  9. Run the following command to ping the IP address of the source node for which the alarm is generated and check whether the network communication is normal:

    ping alarm_source_ip

    • If yes, go to 10.
    • If no, contact network engineers to rectify the network communication fault and go to 10.

  10. Run the following command to check whether the Shubao status is normal:

    sh /opt/apigateway/resty/shell/health_check.sh

    • If yes, go to 11.
    • If no, wait for 3 to 5 minutes, run the sh /opt/apigateway/resty/shell/restart.sh command to manually restart Shubao, and go to 11.

      If the message "xxx start successfully" is displayed, the service is started successfully.

  11. Wait for 1 minute and check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 12.

  12. Contact technical support.
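
The health check and restart decision in step 10 can be sketched as a small shell function. This is a minimal sketch, not a supported tool: it assumes that health_check.sh exits with status 0 when Shubao is healthy, which this reference does not document, and the function name check_or_restart is hypothetical.

```shell
#!/bin/sh
# Sketch only. Assumption: the health check command exits with status 0
# when the service is healthy. Verify this on your system first.

# check_or_restart HEALTH_CMD RESTART_CMD [WAIT_SECONDS]
# Runs HEALTH_CMD. If it fails, waits WAIT_SECONDS (default 0) and then
# runs RESTART_CMD. Prints "healthy", "restarted", or "restart-failed".
check_or_restart() {
    health_cmd=$1
    restart_cmd=$2
    wait_seconds=${3:-0}

    if sh -c "$health_cmd" >/dev/null 2>&1; then
        echo "healthy"
        return 0
    fi

    # The documented procedure waits 3 to 5 minutes before restarting.
    sleep "$wait_seconds"

    # Restart output goes to stderr so that the "start successfully"
    # message stays visible to the operator.
    if sh -c "$restart_cmd" >&2; then
        echo "restarted"
        return 0
    fi

    echo "restart-failed"
    return 1
}
```

Run as the apigateway user, a call might look like: check_or_restart "sh /opt/apigateway/resty/shell/health_check.sh" "sh /opt/apigateway/resty/shell/restart.sh" 180. Whatever the result, still check in the portal whether the alarm is cleared, as step 11 describes.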

Orchestration service exception:

  1. Rectify the fault by referring to ALM-48103 Shubao Failed to Access Orchestration.
  2. Wait for 1 minute and check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 3.

  3. Contact technical support.

AuthAdv service exception:

  1. Rectify the fault by referring to ALM-48104 Shubao Failed to Access AuthAdv.
  2. Wait for 1 minute and check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 3.

  3. Contact technical support.

Alarm Clearance

This alarm will be automatically cleared after the fault is rectified.

Related Information

None

ALM-48102 Failed to Access etcd

Description

The API Gateway component fails to access etcd.

Attribute

Alarm ID    Alarm Severity    Alarm Type
48102       Minor             Communications alarm

Alarm Parameters

Parameter                  Description
alarm_id                   Alarm ID.
alarm_name                 Alarm name.
alarm_severity             Alarm severity.
alarm_sn                   Alarm serial number.
alarm_time                 UTC timestamp when the alarm was generated.
alarm_type                 Alarm type.
category                   Alarm category.
clear_type                 Alarm clearance type.
group_id                   Alarm group ID.
ne_dn                      Alarm source IP address.
object_instance            Alarm location information, including the IP address of the source node for which the alarm is generated, IP address of the faulty source node, component name, and alarm cause.
probable_cause             Possible cause of the alarm.
probable_repair_actions    Alarm handling suggestion, which describes the procedure for rectifying the fault.
source                     Service where the alarm is generated.
sub_ne                     Name of the NE where the alarm is generated.

Impact on the System

Because the gateway component cannot access etcd, it cannot obtain the latest key-value data and responds slowly. This may affect the normal running of the gateway.

Possible Causes

The etcd cluster is faulty and cannot be accessed.

Procedure

  1. Log in to ManageOne Maintenance Portal using a browser.

    • URL: https://{ManageOne Maintenance Portal homepage address}:31943, for example, https://oc.type.com:31943
    • Default username: admin; default password: Huawei12#$

  2. On the menu bar in the upper part of the page, choose Alarms.
  3. In the alarm list, check whether ALM-48305 Certificate Has Expired or ALM-48306 Certificate Authentication Fails is generated.

  4. In the alarm list, locate and click the target alarm name in the Name column. The Alarm Details and Handling Recommendations dialog box is displayed.
  5. In the Basic Information list of the Alarm Details and Handling Recommendations dialog box, search for Threshold Information.

    • alarm_source_ip indicates the IP address of the source node for which the alarm is generated.
    • fault_source_ip indicates the IP address of the faulty node.

  6. If the node information is not found in the location information, click > next to the alarm to display the alarm details. On the displayed page that contains a drop-down list, click the Original Alarms menu to query the alarm location information.
  7. Use PuTTY and {fault_source_ip} to log in to the faulty node.

    Default username: paas; default password: Api@shubao88

  8. Run the following commands to switch to the root user and then switch to the apigateway user:

    su - root

    Default password: Cloud12#$

    su - apigateway

  9. Run the following command to disable session logout upon timeout:

    TMOUT=0

  10. Run the following command to ping the IP address of the source node for which the alarm is generated and check whether the network communication is normal:

    ping alarm_source_ip

    • If yes, go to 11.
    • If no, contact network engineers to rectify the network communication fault and go to 12.

  11. Run the following command to check whether the etcd status is normal:

    sh /opt/apigateway/etcd/shell/health_check.sh

    • If yes, go to 12.
    • If no, wait for 3 to 5 minutes, run the sh /opt/apigateway/etcd/shell/restart.sh command to manually restart etcd and go to 12.

      If the message "xxx start successfully" is displayed, the service is started successfully.

  12. Check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 13.

  13. Contact technical support.
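
The connectivity check in step 10 can be wrapped so that it stops by itself and gives a clear yes or no answer. On most Linux systems a plain ping runs until it is interrupted, so this sketch sends a fixed number of probes with the -c option. The function name node_reachable and the PING_CMD override are hypothetical and exist only for illustration and testing.

```shell
#!/bin/sh
# Sketch only. node_reachable and PING_CMD are hypothetical names.

# node_reachable IP [COUNT]
# Sends COUNT echo requests (default 3) to IP.
# Prints "reachable" if ping succeeds, otherwise "unreachable".
node_reachable() {
    ip=$1
    count=${2:-3}

    # PING_CMD lets a caller substitute another command for ping.
    if ${PING_CMD:-ping} -c "$count" "$ip" >/dev/null 2>&1; then
        echo "reachable"
        return 0
    fi

    echo "unreachable"
    return 1
}
```

For example, node_reachable 192.0.2.10 would test a placeholder address. Replace it with the alarm_source_ip value read from the alarm details. An "unreachable" result corresponds to the "If no" branch of step 10.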

Alarm Clearance

This alarm will be automatically cleared after the fault is rectified.

Related Information

None

ALM-48103 Shubao Failed to Access Orchestration

Description

The Shubao component of API Gateway fails to access Orchestration.

Attribute

Alarm ID    Alarm Severity    Alarm Type
48103       Major             Communications alarm

Alarm Parameters

Parameter                  Description
alarm_id                   Alarm ID.
alarm_name                 Alarm name.
alarm_severity             Alarm severity.
alarm_sn                   Alarm serial number.
alarm_time                 UTC timestamp when the alarm was generated.
alarm_type                 Alarm type.
category                   Alarm category.
clear_type                 Alarm clearance type.
group_id                   Alarm group ID.
ne_dn                      Alarm source IP address.
object_instance            Alarm location information, including the IP address of the source node for which the alarm is generated, IP address of the faulty source node, component name, and alarm cause.
probable_cause             Possible cause of the alarm.
probable_repair_actions    Alarm handling suggestion, which describes the procedure for rectifying the fault.
source                     Service where the alarm is generated.
sub_ne                     Name of the NE where the alarm is generated.

Impact on the System

If Shubao fails to access Orchestration, APIs that use the orchestration function on the corresponding node cannot be used.

Possible Causes

  • Failed to start the Orchestration process.
  • The port used by the version API of the Orchestration component is occupied.

Procedure

  • Cause 1: Failed to start the Orchestration process.
  1. Rectify the fault by referring to ALM-48401 Process Is Not Started.
  2. If the alarm persists, go to Cause 2.
  • Cause 2: The port used by the version API of the Orchestration component is occupied.
  1. Log in to ManageOne Maintenance Portal using a browser.

    • URL: https://{ManageOne Maintenance Portal homepage address}:31943, for example, https://oc.type.com:31943
    • Default username: admin; default password: Huawei12#$

  2. On the menu bar in the upper part of the page, choose Alarms.
  3. In the alarm list, locate and click the target alarm name in the Name column. The Alarm Details and Handling Recommendations dialog box is displayed.
  4. In the Basic Information list of the Alarm Details and Handling Recommendations dialog box, search for Threshold Information.

    • alarm_source_ip indicates the IP address of the source node for which the alarm is generated.
    • fault_source_ip indicates the IP address of the faulty node.

  5. If the node information is not found in the location information, click > next to the alarm to display the alarm details. On the displayed page that contains a drop-down list, click the Original Alarms menu to query the alarm location information.
  6. Use PuTTY and {fault_source_ip} to log in to the faulty node.

    Default username: paas; default password: Api@shubao88

  7. Run the following command to switch to the root user:

    su - root

    Default password: Cloud12#$

  8. Run the following command to disable session logout upon timeout:

    TMOUT=0

  9. Run the following command to ping the IP address of the source node for which the alarm is generated and check whether the network communication is normal:

    ping alarm_source_ip

    • If yes, go to 10.
    • If no, contact network engineers to rectify the network communication fault and go to 13.

  10. Run the following commands to check whether the port of Orchestration is used by another process:

    1. Query the PID of the Orchestration process.

      ps -ef | grep lib/patches/esdk-orchestration-synapse-ext

    2. Query the PID of the process that listens on the 8280 port.

      lsof -i:8280 | grep LISTEN

    3. Query the PID of the process that listens on the 8281 port.

      lsof -i:8281 | grep LISTEN

  11. Check whether the queried PIDs are the same.

    • If yes, go to 13.
    • If no, go to 12.

  12. Stop the process that listens on the 8280 or 8281 port.

    1. Run the kill pid command to stop the process that listens on the 8280 or 8281 port.
    2. Run the su - apigateway command to switch to the apigateway user.
    3. Run the sh /opt/apigateway/orchestration/shell/restart.sh command to restart Orchestration.

    If the message "xxx start successfully" is displayed, the service is started successfully.

  13. Wait for 1 minute and check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 14.

  14. Contact technical support.
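
The PID comparison in steps 10 and 11 can be sketched as two helper functions that read lsof output from standard input. This is an illustrative sketch: it assumes the usual lsof column layout, in which the PID is the second field and listening sockets end with "(LISTEN)". The function names listener_pids and port_owner_check are hypothetical.

```shell
#!/bin/sh
# Sketch only. Assumes standard lsof output: PID in column 2 and
# "(LISTEN)" at the end of each listening socket row.

# listener_pids
# Reads `lsof -i:PORT` output on stdin and prints the unique PIDs of
# rows in the LISTEN state, one per line.
listener_pids() {
    awk '/\(LISTEN\)/ { print $2 }' | sort -u
}

# port_owner_check EXPECTED_PID
# Reads lsof output on stdin and prints:
#   "same"        if the only listener is EXPECTED_PID
#   "different"   if another process listens on the port
#   "no-listener" if nothing listens on the port
port_owner_check() {
    expected_pid=$1
    pids=$(listener_pids)

    if [ -z "$pids" ]; then
        echo "no-listener"
    elif [ "$pids" = "$expected_pid" ]; then
        echo "same"
    else
        echo "different"
    fi
}
```

With the Orchestration PID from step 10 in hand, a check might be run as lsof -i:8280 | port_owner_check 12345, where 12345 is a placeholder. A "different" result corresponds to the "If no" branch of step 11. Repeat the check for port 8281.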

Alarm Clearance

This alarm will be automatically cleared after the fault is rectified.

Related Information

None

ALM-48104 Shubao Failed to Access AuthAdv

Description

The Shubao component of API Gateway fails to access AuthAdv.

Attribute

Alarm ID    Alarm Severity    Alarm Type
48104       Major             Communications alarm

Alarm Parameters

Parameter                  Description
alarm_id                   Alarm ID.
alarm_name                 Alarm name.
alarm_severity             Alarm severity.
alarm_sn                   Alarm serial number.
alarm_time                 UTC timestamp when the alarm was generated.
alarm_type                 Alarm type.
category                   Alarm category.
clear_type                 Alarm clearance type.
group_id                   Alarm group ID.
ne_dn                      Alarm source IP address.
object_instance            Alarm location information, including the IP address of the source node for which the alarm is generated, IP address of the faulty source node, component name, and alarm cause.
probable_cause             Possible cause of the alarm.
probable_repair_actions    Alarm handling suggestion, which describes the procedure for rectifying the fault.
source                     Service where the alarm is generated.
sub_ne                     Name of the NE where the alarm is generated.

Impact on the System

The AuthAdv service on the faulty node cannot be accessed, and the gateway authentication function is abnormal.

Possible Causes

  • Failed to start the AuthAdv process.
  • The port used by the version API of the AuthAdv component is occupied.

Procedure

  • Cause 1: Failed to start the AuthAdv process.
  1. Rectify the fault by referring to ALM-48401 Process Is Not Started.
  2. If the alarm persists, go to Cause 2.
  • Cause 2: The port used by the version API of the AuthAdv component is occupied.
  1. Log in to ManageOne Maintenance Portal using a browser.

    • URL: https://{ManageOne Maintenance Portal homepage address}:31943, for example, https://oc.type.com:31943
    • Default username: admin; default password: Huawei12#$

  2. On the menu bar in the upper part of the page, choose Alarms.
  3. In the alarm list, locate and click the target alarm name in the Name column. The Alarm Details and Handling Recommendations dialog box is displayed.
  4. In the Basic Information list of the Alarm Details and Handling Recommendations dialog box, search for Threshold Information.

    • alarm_source_ip indicates the IP address of the source node for which the alarm is generated.
    • fault_source_ip indicates the IP address of the faulty node.

  5. If the node information is not found in the location information, click > next to the alarm to display the alarm details. On the displayed page that contains a drop-down list, click the Original Alarms menu to query the alarm location information.
  6. Use PuTTY and {fault_source_ip} to log in to the faulty node.

    Default username: paas; default password: Api@shubao88

  7. Run the following command to switch to the root user:

    su - root

    Default password: Cloud12#$

  8. Run the following command to disable session logout upon timeout:

    TMOUT=0

  9. Run the following command to ping the IP address of the source node for which the alarm is generated and check whether the network communication is normal:

    ping alarm_source_ip

    • If yes, go to 10.
    • If no, contact network engineers to rectify the network communication fault and go to 13.

  10. Run the following commands to check whether the port of AuthAdv is used by another process:

    1. Query the PID of the AuthAdv process.

      ps -ef | grep authadv

    2. Query the PID of the process that listens on the 8743 port.

      lsof -i:8743 | grep LISTEN

  11. Check whether the queried PIDs are the same.

    • If yes, go to 13.
    • If no, go to 12.

  12. Stop the process that listens on the 8743 port.

    1. Run the kill pid command to stop the process that listens on the 8743 port.
    2. Run the su - apigateway command to switch to the apigateway user.
    3. Run the sh /opt/apigateway/authadv/shell/restart.sh command to restart AuthAdv.

      If the message "xxx start successfully" is displayed, the service is started successfully.

  13. Wait for 1 minute and check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 14.

  14. Contact technical support.
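
When the AuthAdv PID is queried with ps -ef | grep authadv in step 10, the output usually also contains the grep process itself, which is easy to mistake for the service. The sketch below filters that row out and prints only the PID column. It assumes the standard ps -ef layout with the PID in the second field, and the function name pids_matching is hypothetical.

```shell
#!/bin/sh
# Sketch only. Assumes standard `ps -ef` output with the PID in column 2.

# pids_matching PATTERN
# Reads `ps -ef` output on stdin and prints the PID of each row whose
# text contains PATTERN, skipping rows that belong to grep itself.
pids_matching() {
    grep -- "$1" | grep -v 'grep' | awk '{ print $2 }'
}
```

A typical call would be ps -ef | pids_matching authadv. The printed PID can then be compared with the PID that listens on port 8743. Where the procps tools are installed, pgrep -f authadv is a common alternative that excludes itself from the results.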

Related Information

None

ALM-48107 Failed to Access FTP

Description

The FTP service on the host with the Ftpfilesync component of API Gateway deployed cannot be accessed.

Attribute

Alarm ID    Alarm Severity    Alarm Type
48107       Major             Communications alarm

Alarm Parameters

Parameter                  Description
alarm_id                   Alarm ID.
alarm_name                 Alarm name.
alarm_severity             Alarm severity.
alarm_sn                   Alarm serial number.
alarm_time                 UTC timestamp when the alarm was generated.
alarm_type                 Alarm type.
category                   Alarm category.
clear_type                 Alarm clearance type.
group_id                   Alarm group ID.
ne_dn                      Alarm source IP address.
object_instance            Alarm location information, including the IP address of the source node for which the alarm is generated, IP address of the faulty source node, component name, and alarm cause.
probable_cause             Possible cause of the alarm.
probable_repair_actions    Alarm handling suggestion, which describes the procedure for rectifying the fault.
source                     Service where the alarm is generated.
sub_ne                     Name of the NE where the alarm is generated.

Impact on the System

The function of downloading files, such as gateway orchestration files, cannot be provided.

Possible Causes

The FTP service of the host where the Ftpfilesync component is located is disabled or abnormal.

Procedure

  1. Log in to ManageOne Maintenance Portal using a browser.

    • URL: https://{ManageOne Maintenance Portal homepage address}:31943, for example, https://oc.type.com:31943
    • Default username: admin; default password: Huawei12#$

  2. On the menu bar in the upper part of the page, choose Alarms.
  3. In the alarm list, locate and click the target alarm name in the Name column. The Alarm Details and Handling Recommendations dialog box is displayed.
  4. In the Basic Information list of the Alarm Details and Handling Recommendations dialog box, search for Threshold Information.

    • alarm_source_ip indicates the IP address of the source node for which the alarm is generated.
    • fault_source_ip indicates the IP address of the faulty node.

  5. If the node information is not found in the location information, click > next to the alarm to display the alarm details. On the displayed page that contains a drop-down list, click the Original Alarms menu to query the alarm location information.
  6. Use PuTTY and {fault_source_ip} to log in to the faulty node.

    Default username: paas; default password: Api@shubao88

  7. Run the following commands to switch to the root user and then switch to the apigateway user:

    su - root

    Default password: Cloud12#$

    su - apigateway

  8. Run the following command to disable session logout upon timeout:

    TMOUT=0

  9. Run the following command to ping the IP address of the source node for which the alarm is generated and check whether the network communication is normal:

    ping alarm_source_ip

    • If yes, go to 10.
    • If no, contact network engineers to rectify the network communication fault and go to 11.

  10. Run the following command to check whether the FTP service is started:

    sh /opt/apigateway/ftpfilesync/shell/health_check.sh

    • If the FTP service is in the normal, primary, or standby state, go to 11.
    • If the FTP service is abnormal, wait for 3 to 5 minutes, run the sh /opt/apigateway/ftpfilesync/shell/restart.sh command to manually restart ftpfilesync and go to 11.

      If the message "xxx start successfully" is displayed, the service is started successfully.

  11. Check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 12.

  12. Run the following command to exit the apigateway user session and return to the root user:

    exit

  13. Run the following command to check whether the FTP service is normal:

    service sftpd status

    • If yes, go to 14.
    • If no, go to 16.

  14. Run the following command to manually connect to the FTP service:

    sftp -P 2022 ftpapimgr@fault_source_ip

    Default password: Rxdz%g2Y

    • If the service is connected, go to 15.
    • If the service is disconnected, go to 16.

  15. Wait for 1 minute and check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 16.

  16. Contact technical support.
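
Step 10 treats three reported states (normal, primary, and standby) as healthy and anything else as abnormal. That decision can be sketched as a small function. The exact output format of health_check.sh is not documented here, so this sketch assumes the operator has already extracted the state word from the output. The function name ftp_state_ok is hypothetical.

```shell
#!/bin/sh
# Sketch only. Assumption: the caller passes the single state word that
# the health check reported. The real output format is not documented.

# ftp_state_ok STATE
# Prints "ok" for normal, primary, or standby. Prints "abnormal" and
# returns a nonzero status for any other value.
ftp_state_ok() {
    case $1 in
        normal|primary|standby)
            echo "ok"
            ;;
        *)
            echo "abnormal"
            return 1
            ;;
    esac
}
```

For example, ftp_state_ok standby prints "ok". An "abnormal" result corresponds to the branch that restarts ftpfilesync.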

Alarm Clearance

This alarm will be automatically cleared after the fault is rectified.

Related Information

None

ALM-48108 Failed to Access Shubao-Orchestration

Description

API Gateway has backend services, and the Shubao-Orchestration service is abnormal.

Attribute

Alarm ID    Alarm Severity    Alarm Type
48108       Major             Communications alarm

Alarm Parameters

Parameter                  Description
alarm_id                   Alarm ID.
alarm_name                 Alarm name.
alarm_severity             Alarm severity.
alarm_sn                   Alarm serial number.
alarm_time                 UTC timestamp when the alarm was generated.
alarm_type                 Alarm type.
category                   Alarm category.
clear_type                 Alarm clearance type.
group_id                   Alarm group ID.
ne_dn                      Alarm source IP address.
object_instance            Alarm location information, including the IP address of the source node for which the alarm is generated, IP address of the faulty source node, component name, and alarm cause.
probable_cause             Possible cause of the alarm.
probable_repair_actions    Alarm handling suggestion, which describes the procedure for rectifying the fault.
source                     Service where the alarm is generated.
sub_ne                     Name of the NE where the alarm is generated.

Impact on the System

The API orchestration function cannot be provided.

Possible Causes

The Shubao-Orchestration service is abnormal, and the Orchestration component fails to access the Shubao-Orchestration service.

Procedure

  1. Log in to ManageOne Maintenance Portal using a browser.

    • URL: https://{ManageOne Maintenance Portal homepage address}:31943, for example, https://oc.type.com:31943
    • Default username: admin; default password: Huawei12#$

  2. On the menu bar in the upper part of the page, choose Alarms.
  3. In the alarm list, locate and click the target alarm name in the Name column. The Alarm Details and Handling Recommendations dialog box is displayed.
  4. In the Basic Information list of the Alarm Details and Handling Recommendations dialog box, search for Threshold Information.

    • alarm_source_ip indicates the IP address of the source node for which the alarm is generated.
    • fault_source_ip indicates the IP address of the faulty node.

  5. If the node information is not found in the location information, click > next to the alarm to display the alarm details. On the displayed page that contains a drop-down list, click the Original Alarms menu to query the alarm location information.
  6. Use PuTTY and {fault_source_ip} to log in to the faulty node.

    Default username: paas; default password: Api@shubao88

  7. Run the following commands to switch to the root user and then switch to the apigateway user:

    su - root

    Default password: Cloud12#$

    su - apigateway

  8. Run the following command to disable session logout upon timeout:

    TMOUT=0

  9. Run the following command to ping the IP address of the source node for which the alarm is generated and check whether the network communication is normal:

    ping alarm_source_ip

    • If yes, go to 10.
    • If no, contact network engineers to rectify the network communication fault and go to 11.

  10. Run the following command to check whether the Shubao-Orchestration status is normal:

    sh /opt/apigateway/resty/shell/health_check.sh

    • If yes, go to 11.
    • If no, run the sh /opt/apigateway/resty/shell/restart.sh command to manually restart Shubao-Orchestration and go to 11.

      If the message "xxx start successfully" is displayed, the service is started successfully.

  11. Wait for 1 minute and check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 12.

  12. Contact technical support.

Alarm Clearance

This alarm will be automatically cleared after the fault is rectified.

Related Information

None

ALM-48110 Failed to Access Databases

Description

The Database service of API Gateway is abnormal. The APIMgr component cannot connect to it and reports that the databases cannot be accessed.

Attribute

Alarm ID    Alarm Severity    Alarm Type
48110       Major             Communications alarm

Alarm Parameters

Parameter                  Description
alarm_id                   Alarm ID.
alarm_name                 Alarm name.
alarm_severity             Alarm severity.
alarm_sn                   Alarm serial number.
alarm_time                 UTC timestamp when the alarm was generated.
alarm_type                 Alarm type.
category                   Alarm category.
clear_type                 Alarm clearance type.
group_id                   Alarm group ID.
ne_dn                      Alarm source IP address.
object_instance            Alarm location information, including the IP address of the source node for which the alarm is generated, IP address of the faulty source node, component name, and alarm cause.
probable_cause             Possible cause of the alarm.
probable_repair_actions    Alarm handling suggestion, which describes the procedure for rectifying the fault.
source                     Service where the alarm is generated.
sub_ne                     Name of the NE where the alarm is generated.

Impact on the System

The gateway management plane and the alarm function cannot be used.

Possible Causes

  • The Keepalived service is abnormal.
  • The floating IP address of the database is lost.
  • The DB service is faulty or the primary and standby DB services are disabled.

Procedure

APIG in the management zone:

  1. Log in to ManageOne Maintenance Portal using a browser.

    • URL: https://{ManageOne Maintenance Portal homepage address}:31943, for example, https://oc.type.com:31943
    • Default username: admin; default password: Huawei12#$

  2. On the menu bar in the upper part of the page, choose Alarms.
  3. In the alarm list, locate and click the target alarm name in the Name column. The Alarm Details and Handling Recommendations dialog box is displayed.
  4. In the Basic Information list of the Alarm Details and Handling Recommendations dialog box, search for Threshold Information.

    • alarm_source_ip indicates the IP address of the source node for which the alarm is generated.
    • fault_source_ip indicates the IP address of the faulty node.

  5. If the node information is not found in the location information, click > next to the alarm to display the alarm details. On the displayed page that contains a drop-down list, click the Original Alarms menu to query the alarm location information.
  6. Use PuTTY and {fault_source_ip} to log in to the faulty node.

    Default username: paas; default password: Api@shubao88

  7. Run the following command to switch to the root user:

    su - root

    Default password: Cloud12#$

  8. Run the following command to switch to the apigw_db user:

    su - apigw_db

  9. Run the following command to disable session logout upon timeout:

    TMOUT=0

  10. Run the following command to check whether GaussDB is connected:

    gsql -p 5432 -h $IP -d console -U apigw -W $gaussdb_password

    $IP is the IP address of the PUB-DB node. $gaussdb_password is the GaussDB database password.

    • If GaussDB is connected, go to 11.

    • If GaussDB is not connected, go to 14.

  11. Run the following commands to switch to the apigw_apimgr user:

    exit

    su - apigw_apimgr

  12. Run the following command to restart APIMgr:

    sh /opt/apigateway/apimgr/shell/restart.sh

    If the message "xxx start successfully" is displayed, the service is started successfully.

  13. Check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 14.

  14. Contact technical support.

Alarm Clearance

This alarm will be automatically cleared after the fault is rectified.

Related Information

None

ALM-48117 Failed to Receive Heartbeat from Opsagent

Description

The heartbeat detection between Alarmserver on the primary node of API Gateway and Opsagent on a node is abnormal.

Attribute

Alarm ID    Alarm Severity    Alarm Type
48117       Critical          Communications alarm

Alarm Parameters

Parameter                  Description
alarm_id                   Alarm ID.
alarm_name                 Alarm name.
alarm_severity             Alarm severity.
alarm_sn                   Alarm serial number.
alarm_time                 UTC timestamp when the alarm was generated.
alarm_type                 Alarm type.
category                   Alarm category.
clear_type                 Alarm clearance type.
group_id                   Alarm group ID.
ne_dn                      Alarm source IP address.
object_instance            Alarm location information, including the IP address of the source node for which the alarm is generated, IP address of the faulty source node, component name, and alarm cause.
probable_cause             Possible cause of the alarm.
probable_repair_actions    Alarm handling suggestion, which describes the procedure for rectifying the fault.
source                     Service where the alarm is generated.
sub_ne                     Name of the NE where the alarm is generated.

Impact on the System

The heartbeat detection between Opsagent and Alarmserver is abnormal, which affects the reporting of alarms and alarm clearance information about Opsagent heartbeat faults.

Possible Causes

The Opsagent service is abnormal or disabled, and Alarmserver cannot detect Opsagent heartbeats.

Procedure

  1. Log in to ManageOne Maintenance Portal using a browser.

    • URL: https://{ManageOne Maintenance Portal homepage address}:31943, for example, https://oc.type.com:31943
    • Default username: admin; default password: Huawei12#$

  2. On the menu bar in the upper part of the page, choose Alarms.
  3. In the alarm list, check whether ALM-48305 Certificate Has Expired or ALM-48306 Certificate Authentication Fails is generated.

  4. In the alarm list, locate and click the target alarm name in the Name column. The Alarm Details and Handling Recommendations dialog box is displayed.
  5. In the Basic Information list of the Alarm Details and Handling Recommendations dialog box, search for Threshold Information.

    • alarm_source_ip indicates the IP address of the source node for which the alarm is generated.
    • fault_source_ip indicates the IP address of the faulty node.

  6. If the node information is not found in the location information, click > next to the alarm to display the alarm details. On the displayed page that contains a drop-down list, click the Original Alarms menu to query the alarm location information.
  7. Use PuTTY and {fault_source_ip} to log in to the faulty node.

    Default username: paas; default password: Api@shubao88

  8. Run the following commands to switch to the root user and then switch to the apigateway user:

    su - root

    Default password: Cloud12#$

    su - apigateway

  9. Run the following command to disable session logout upon timeout:

    TMOUT=0

  10. Run the following command to ping the IP address of the source node for which the alarm is generated and check whether the network communication is normal:

    ping alarm_source_ip

    • If yes, go to 11.
    • If no, contact network engineers to rectify the network communication fault and go to 12.

  11. Run the following command to manually restart Opsagent:

    sh /opt/apigateway/opsagent/shell/restart.sh

    If the message "xxx start successfully" is displayed, the service is started successfully.

  12. Wait for 3 to 5 minutes and check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 13.

  13. Contact technical support.
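
The connectivity check and restart in steps 10 and 11 can be sketched as a small shell helper. This is a minimal sketch, not part of the product: the function names, the RESTART_SCRIPT and PING_COUNT variables, and the use of ping -c to limit the number of probes are assumptions made for illustration. Only the restart script path comes from step 11.

```shell
#!/bin/sh
# Sketch of steps 10 and 11: test connectivity, then restart Opsagent.
# The restart script path is taken from step 11. PING_COUNT and the
# function names are illustrative assumptions, not product interfaces.

RESTART_SCRIPT="${RESTART_SCRIPT:-/opt/apigateway/opsagent/shell/restart.sh}"
PING_COUNT="${PING_COUNT:-3}"

# Returns 0 if the alarm source node answers ping.
check_network() {
    ping -c "$PING_COUNT" "$1" >/dev/null 2>&1
}

# Restarts Opsagent only when the alarm source node is reachable.
# Return codes: 0 restart script succeeded, 1 network unreachable,
# 2 missing argument.
recover_opsagent() {
    alarm_source_ip="$1"
    if [ -z "$alarm_source_ip" ]; then
        echo "usage: recover_opsagent <alarm_source_ip>" >&2
        return 2
    fi
    if ! check_network "$alarm_source_ip"; then
        echo "Network to $alarm_source_ip is unreachable; contact network engineers."
        return 1
    fi
    sh "$RESTART_SCRIPT"
}
```

Run as the apigateway user on the faulty node, for example: recover_opsagent {alarm_source_ip}. The helper does not replace step 12; after it returns, wait 3 to 5 minutes and check whether the alarm is cleared.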

Alarm Clearance

This alarm will be automatically cleared after the fault is rectified.

Related Information

None

ALM-48118 Failed to Access Redis-mgr

Description

The Redis-mgr cluster service of API Gateway is abnormal, and APIMgr and Shubao fail to access the Redis-mgr cluster.

Attribute

Alarm ID

Alarm Severity

Alarm Type

48118

Major

Communications alarm

Alarm Parameters

Parameter

Description

alarm_id

Alarm ID.

alarm_name

Alarm name.

alarm_severity

Alarm severity.

alarm_sn

Alarm serial number.

alarm_time

UTC timestamp when the alarm was generated.

alarm_type

Alarm type.

category

Alarm category.

clear_type

Alarm clearance type.

group_id

Alarm group ID.

ne_dn

Alarm source IP address.

object_instance

Alarm location information, including the IP address of the source node for which the alarm is generated, IP address of the faulty source node, component name, and alarm cause.

probable_cause

Possible cause of the alarm.

probable_repair_actions

Alarm handling suggestion, which describes the procedure for rectifying the fault.

source

Service where the alarm is generated.

sub_ne

Name of the NE where the alarm is generated.

Impact on the System

When the Redis-mgr cluster service is abnormal, APIMgr and Shubao fail to access Redis-mgr, which affects APIMgr, the AppToken authentication services, and the Server-push services.

Possible Causes

  • The Redis-mgr cluster service is abnormal or disabled, so APIMgr and Shubao fail to access Redis-mgr.
  • The Redis-mgr cluster service is normal, but the APIMgr and Shubao components are faulty and fail to access Redis-mgr.

Procedure

  1. Log in to ManageOne Maintenance Portal using a browser.

    • URL: https://{Address for accessing the homepage of ManageOne Maintenance Portal}:31943, for example, https://oc.type.com:31943
    • Default username: admin; default password: Huawei12#$

  2. On the menu bar in the upper part of the page, choose Alarms.
  3. In the alarm list, locate and click the target alarm name in the Name column. The Alarm Details and Handling Recommendations dialog box is displayed.
  4. In the Basic Information list of the Alarm Details and Handling Recommendations dialog box, search for Threshold Information.

    • alarm_source_ip indicates the IP address of the source node for which the alarm is generated.
    • fault_source_ip indicates the IP address of the faulty node.

  5. If the node information is not found in the location information, click > next to the alarm to display the alarm details. On the displayed page that contains a drop-down list, click the Original Alarms menu to query the alarm location information.
  6. Use PuTTY and {fault_source_ip} to log in to the faulty node.

    Default username: paas; default password: Api@shubao88

  7. Run the following commands to switch to the root user and then switch to the apigateway user:

    su - root

    Default password: Cloud12#$

    su - apigateway

  8. Run the following command to disable session logout upon timeout:

    TMOUT=0

  9. Run the following command to ping the IP address of the source node for which the alarm is generated and check whether the network communication is normal:

    ping alarm_source_ip

    • If yes, go to 10.
    • If no, contact network engineers to rectify the network communication fault and go to 14.

  10. Run the following command to check whether the Redis-mgr status is normal:

    sh /opt/apigateway/redis-mgr/shell/health_check.sh

    • If yes, go to 11.
    • If no, run the sh /opt/apigateway/redis-mgr/shell/restart.sh command on all nodes of the Redis-mgr cluster to manually restart Redis-mgr and go to 11.

      If the message "xxx start successfully" is displayed, the service is started successfully.

  11. Determine the name of the NE for which the alarm is generated based on sub_ne in the alarm details.

    • If the value of sub_ne is Shubao, go to 12.
    • If the value of sub_ne is APIMgr, go to 13.

  12. Run the following command to check whether the Shubao status is normal:

    sh /opt/apigateway/resty/shell/health_check.sh

    • If yes, go to 14.
    • If no, run the sh /opt/apigateway/resty/shell/restart.sh command to manually restart Shubao and go to 14.

      If the message "xxx start successfully" is displayed, the service is started successfully.

  13. Perform the following steps to check whether the APIMgr status is normal:

    1. Run the following command to exit the apigateway user:

      exit

    2. Run the following command to switch to the apigw_apimgr user:

      su - apigw_apimgr

    3. Run the sh /opt/apigateway/apimgr/shell/health_check.sh command.
      • If the service status is normal, go to 14.
      • If the service status is abnormal, run the sh /opt/apigateway/apimgr/shell/restart.sh command to manually restart APIMgr and go to 14.

        If the message "xxx start successfully" is displayed, the service is started successfully.

  14. Wait for 1 to 2 minutes and check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 15.

  15. Contact technical support.
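
Steps 10, 12, and 13 repeat one pattern: run a component's health_check.sh and, if the status is abnormal, run its restart.sh. A minimal sketch of that pattern follows. It assumes that health_check.sh exits with status 0 when the component is normal; the source does not document how the script reports its result, so verify this on your system before relying on it. The function name and the BASE_DIR variable are illustrative.

```shell
#!/bin/sh
# Sketch of the check-then-restart pattern in steps 10, 12, and 13.
# ASSUMPTION: health_check.sh exits with status 0 when the component is
# normal. The source does not confirm this; verify it before use.
# BASE_DIR and check_and_restart are illustrative names.

BASE_DIR="${BASE_DIR:-/opt/apigateway}"

# check_and_restart <component-dir>
# Component directories named in the procedure: redis-mgr, resty (Shubao),
# and apimgr.
check_and_restart() {
    component="$1"
    shell_dir="$BASE_DIR/$component/shell"
    if sh "$shell_dir/health_check.sh" >/dev/null 2>&1; then
        echo "$component: status normal"
        return 0
    fi
    echo "$component: status abnormal, restarting"
    sh "$shell_dir/restart.sh"
}
```

For example, check_and_restart redis-mgr covers step 10 on one node, and check_and_restart resty covers step 12. The procedure still applies around the helper: a Redis-mgr restart must be run on all nodes of the Redis-mgr cluster, and the APIMgr scripts are run as the apigw_apimgr user rather than the apigateway user.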

Alarm Clearance

This alarm will be automatically cleared after the fault is rectified.

Related Information

None

Updated: 2019-08-30

Document ID: EDOC1100062365
