HUAWEI CLOUD Stack 6.5.0 Alarm and Event Reference 04

Alarm Reference


1020799 Failed to Create a Replication Copy

Description

This alarm is generated when a replication copy fails to be created.

Attribute

  • ID: 1020799
  • Alarm Level: Major
  • Automatically Cleared: Yes

Impact on the System

The failure to create a replication copy will affect subsequent restore operations.

Possible Causes

  • The connection to eBackupWorkFlow is abnormal.
  • eBackup malfunctions.

Procedure

  1. Possible cause 1: The connection to eBackupWorkFlow is abnormal.

    1. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    2. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    3. Run the check_karbor_connect command to check whether the Check_Result value of eBackupWorkFlow is OK.
      • If yes, go to 2.
      • If no, go to 1.d.
    4. Run the docker exec -ti karborapi bash -c "cat /etc/karbor/karbor.conf" | grep ebackup_lb_ip_address command to obtain the IP address of eBackupWorkFlow. Search for workflow_management_float_ip in the xxx_export_all_EN.xlsm file exported by HUAWEI CLOUD Stack Deploy for deploying or expanding the cloud service. Check whether the two values are the same.
      • If yes, contact technical support for assistance.
      • If no, run the set_ebackup_plugin --ebackup_url IP address of eBackupWorkFlow command to reconfigure the IP address of eBackupWorkFlow. The default password is Huawei@CLOUD8!. Run the check_karbor_connect command again. If the Check_Result value of eBackupWorkFlow is not OK, contact technical support for assistance.
    5. Contact the tenant corresponding to domain_name in the alarm's additional information and instruct the tenant to log in to ManageOne Operation Portal using the VDC administrator account, switch to the project corresponding to project_name in the alarm's additional information, go to the Cloud Server Backup Service page, and copy the policy corresponding to plan_name in the alarm's additional information immediately.

  2. Possible cause 2: eBackup malfunctions.

    1. Use the VDC administrator account of the tenant corresponding to domain_name in the alarm's additional information to log in to ManageOne Operation Portal. Switch to the project corresponding to project_name in the alarm's additional information. On the Cloud Server Backup Service page, obtain the failure cause of the corresponding task or copy.
    2. Use PuTTY to log in to each Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    3. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    4. View the log file /var/log/huawei/dj/services/system/karbor/karbor-protection/karbor-protection.log to collect the details of the error logs generated at the time when the failure occurs.
    5. Contact technical support for assistance.
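The log collection step above can be narrowed to the error entries around the failure time. The following is a minimal sketch, not part of the product tooling. The helper name errors_at is hypothetical, and the sketch assumes that each log entry contains a "YYYY-MM-DD HH:MM:SS" timestamp and the word ERROR; the actual layout of karbor-protection.log may differ.

```shell
# Sketch: print the ERROR lines of a log that contain a given timestamp prefix.
# Assumption: entries carry a "YYYY-MM-DD HH:MM:SS" timestamp and the word ERROR.
errors_at() {
  # $1: log file path
  # $2: timestamp prefix, for example "2019-06-01 10:1" for 10:10 to 10:19
  grep -F -- "$2" "$1" | grep -w 'ERROR'
}
```

For example, errors_at /var/log/huawei/dj/services/system/karbor/karbor-protection/karbor-protection.log "2019-06-01 10:1" would list the matching entries, where the timestamp is a placeholder for the time of the failure.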

Related Information

None

1020791 Failed to Automatically Schedule the Backup Policy

Description

This alarm is generated when the backup policy fails to be scheduled automatically.

Attribute

  • ID: 1020791
  • Alarm Level: Major
  • Automatically Cleared: Yes

Impact on the System

No backup is automatically generated based on the backup policy, which will affect subsequent restore operations.

Possible Causes

  • The component is abnormal.
  • The connection to Nova is abnormal. VM information cannot be obtained.
  • The connection to Cinder is abnormal. Volume information cannot be obtained.
  • The backup quota is insufficient.

Procedure

  1. Possible cause 1: Component status is abnormal.

    1. Check whether any alarm indicating an abnormal component is reported.
      • If yes, rectify the component fault based on recommended suggestions.
      • If no, go to 2.
    2. Contact the tenant corresponding to domain_name in the alarm's additional information and instruct the tenant to log in to ManageOne Operation Portal using the VDC administrator account, switch to the project corresponding to project_name in the alarm's additional information, go to the Cloud Server Backup Service page, and manually back up the policy corresponding to backup_policy_name in the alarm location information.

  2. Possible cause 2: The connection to Nova is abnormal. VM information cannot be obtained.

    1. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    2. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    3. Run the check_karbor_connect command to check the Check_Result value of Nova.
      • If the value is OK, go to 3.
      • If the value is SSLError, update the FusionSphere-PKI certificate of FusionSphere by referring to HUAWEI CLOUD Stack 6.5.0 Security Management Guide and choose Certificate Management > Replacing Certificates in a Unified Manner > Updating a Certificate. Repeat 2.c.
      • If the value is Error, run the set_karbor_endpoints --nova_endpoint URL of Nova command to set the endpoint of Nova. You can search for DMK_g_regions:fsp_Cascading.nova in the xxx_export_all_EN.xlsm file exported by HUAWEI CLOUD Stack Deploy to obtain the URL of Nova. Repeat 2.c.
    4. If the fault persists, contact technical support for assistance.

  3. Possible cause 3: The connection to Cinder is abnormal. Volume information cannot be obtained.

    1. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    2. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    3. Run the check_karbor_connect command to check the Check_Result value of Cinder.
      • If the value is OK, go to 4.
      • If the value is SSLError, update the FusionSphere-PKI certificate of FusionSphere by referring to HUAWEI CLOUD Stack 6.5.0 Security Management Guide and choose Certificate Management > Replacing Certificates in a Unified Manner > Updating a Certificate. Repeat 3.c.
      • If the value is Error, run the set_karbor_endpoints --cinder_endpoint URL of Cinder command to set the endpoint of Cinder. You can search for DMK_g_regions:fsp_Cascading.cinder in the xxx_export_all_EN.xlsm file exported by HUAWEI CLOUD Stack Deploy to obtain the URL of Cinder. Repeat 3.c.
    4. If the fault persists, contact technical support for assistance.

  4. Possible cause 4: The backup quota is insufficient.

    1. Check whether the value of fail_code in the additional information is CSBS.9006.
      • If yes, contact the tenant corresponding to domain_name in the additional information and instruct the tenant to log in to ManageOne Operation Portal using the VDC administrator account, switch to the project corresponding to project_name in the additional information, go to the Cloud Server Backup Service page, and apply for backup space. For details about how to apply for space, see section Applying for Space in Cloud Server Backup Service (CSBS) of HUAWEI CLOUD Stack 6.5.0 User Guide.
      • If no, contact technical support for assistance.
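The handling of the Check_Result value in the Nova and Cinder procedures above can be summarized as a small lookup. This is only a sketch of the decision, assuming the three values named in the procedure (OK, SSLError, Error). The function name next_action and the fallback branch for any other value are hypothetical additions.

```shell
# Sketch: map a Check_Result value to the next step described in the procedure.
# $1: component name (Nova or Cinder)
# $2: Check_Result value reported by check_karbor_connect for that component
next_action() {
  case "$2" in
    OK)       echo "$1 connection is normal: check the next possible cause" ;;
    SSLError) echo "update the FusionSphere-PKI certificate, then rerun check_karbor_connect" ;;
    Error)    echo "reset the $1 endpoint with set_karbor_endpoints, then rerun check_karbor_connect" ;;
    # Hypothetical fallback: the procedure does not list other values.
    *)        echo "unexpected value '$2': contact technical support" ;;
  esac
}
```

Calling next_action Cinder SSLError, for instance, prints the certificate update step.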

Related Information

None

1020790 Failed to Automatically Schedule the Replication Policy

Description

This alarm is generated when the replication policy fails to be scheduled automatically.

Attribute

  • ID: 1020790
  • Alarm Level: Major
  • Automatically Cleared: Yes

Impact on the System

No replica is automatically generated based on the replication policy, which will affect subsequent restore operations.

Possible Causes

  • The component is abnormal.
  • The connection to Nova is abnormal. VM information cannot be obtained.
  • The connection to Cinder is abnormal. Volume information cannot be obtained.
  • The replication quota is insufficient.

Procedure

  1. Possible cause 1: Component status is abnormal.

    1. Check whether any alarm indicating an abnormal component is reported.
      • If yes, rectify the component fault based on recommended suggestions.
      • If no, go to 2.
    2. Contact the tenant corresponding to domain_name in the alarm's additional information and instruct the tenant to log in to ManageOne Operation Portal using the VDC administrator account, switch to the project corresponding to project_name in the alarm's additional information, go to the Cloud Server Backup Service page, and manually back up the policy corresponding to backup_policy_name in the alarm location information.

  2. Possible cause 2: The connection to Nova is abnormal. VM information cannot be obtained.

    1. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    2. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    3. Run the check_karbor_connect command to check the Check_Result value of Nova.
      • If the value is OK, go to 3.
      • If the value is SSLError, update the FusionSphere-PKI certificate of FusionSphere by referring to HUAWEI CLOUD Stack 6.5.0 Security Management Guide and choose Certificate Management > Replacing Certificates in a Unified Manner > Updating a Certificate. Repeat 2.c.
      • If the value is Error, run the set_karbor_endpoints --nova_endpoint URL of Nova command to set the endpoint of Nova. You can search for DMK_g_regions:fsp_Cascading.nova in the xxx_export_all_EN.xlsm file exported by HUAWEI CLOUD Stack Deploy to obtain the URL of Nova. Repeat 2.c.
    4. If the fault persists, contact technical support for assistance.

  3. Possible cause 3: The connection to Cinder is abnormal. Volume information cannot be obtained.

    1. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    2. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    3. Run the check_karbor_connect command to check the Check_Result value of Cinder.
      • If the value is OK, go to 4.
      • If the value is SSLError, update the FusionSphere-PKI certificate of FusionSphere by referring to HUAWEI CLOUD Stack 6.5.0 Security Management Guide and choose Certificate Management > Replacing Certificates in a Unified Manner > Updating a Certificate. Repeat 3.c.
      • If the value is Error, run the set_karbor_endpoints --cinder_endpoint URL of Cinder command to set the endpoint of Cinder. You can search for DMK_g_regions:fsp_Cascading.cinder in the xxx_export_all_EN.xlsm file exported by HUAWEI CLOUD Stack Deploy to obtain the URL of Cinder. Repeat 3.c.
    4. If the fault persists, contact technical support for assistance.

  4. Possible cause 4: The replication quota is insufficient.

    1. Check whether the value of fail_code in the additional information is CSBS.9006.
      • If yes, contact the tenant corresponding to domain_name in the additional information and instruct the tenant to log in to ManageOne Operation Portal using the VDC administrator account, switch to the project corresponding to project_name in the additional information, go to the Cloud Server Backup Service page, and apply for backup space. For details about how to apply for space, see section Applying for Space in Cloud Server Backup Service (CSBS) of HUAWEI CLOUD Stack 6.5.0 User Guide.
      • If no, contact technical support for assistance.

Related Information

None

1020788 Failed to Back Up ECSs

Description

This alarm is generated when ECSs fail to be backed up.

Attribute

  • ID: 1020788
  • Alarm Level: Major
  • Automatically Cleared: Yes

Impact on the System

No backup is generated successfully, affecting subsequent restore operations.

Possible Causes

  • The connection to eBackupWorkFlow is abnormal.
  • A fault occurs on eBackup.

Procedure

  1. Possible cause 1: The connection to eBackupWorkFlow is abnormal.

    1. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    2. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    3. Run the check_karbor_connect command to check whether the Check_Result value of eBackupWorkFlow is OK.
      • If yes, go to 2.
      • If no, go to 1.d.
    4. Run the docker exec -ti karborapi bash -c "cat /etc/karbor/karbor.conf" | grep ebackup_lb_ip_address command to obtain the IP address of eBackupWorkFlow. Search for workflow_management_float_ip in the xxx_export_all_EN.xlsm file exported by HUAWEI CLOUD Stack Deploy for deploying or expanding the cloud service. Check whether the two values are the same.
      • If yes, contact technical support for assistance.
      • If no, run the set_ebackup_plugin --ebackup_url IP address of eBackupWorkFlow command to reconfigure the IP address of eBackupWorkFlow. The default password is Huawei@CLOUD8!. Run the check_karbor_connect command again. If the Check_Result value of eBackupWorkFlow is not OK, contact technical support for assistance.
    5. Contact the tenant corresponding to domain_name in the alarm's additional information and instruct the tenant to log in to ManageOne Operation Portal using the VDC administrator account, switch to the project corresponding to project_name in the alarm's additional information, go to the Cloud Server Backup Service page, and back up the policy corresponding to plan_name in the alarm's additional information immediately.

  2. Possible cause 2: A fault occurs on eBackup.

    1. Use the VDC administrator account of the tenant corresponding to domain_name in the alarm's additional information to log in to ManageOne Operation Portal. Switch to the project corresponding to project_name in the alarm's additional information. On the Cloud Server Backup Service page, obtain the failure cause of the corresponding task or backup.
    2. Use PuTTY to log in to each Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    3. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    4. View the log file /var/log/huawei/dj/services/system/karbor/karbor-protection/karbor-protection.log to collect the details of the error logs generated at the time when the failure occurs.
    5. Contact technical support for assistance.
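The comparison between the configured eBackupWorkFlow address and the planned value in the procedure above can be scripted. The following is a minimal sketch under stated assumptions: the helper names conf_value and same_address are hypothetical, the configuration line is assumed to have the form "key = value", and grep is assumed to return exactly one line.

```shell
# Sketch: extract the value from a "key = value" line as grep would print it
# from karbor.conf. The "key = value" layout is an assumption.
conf_value() {
  # $1: one line of text, for example "ebackup_lb_ip_address = 192.0.2.10"
  # tr removes spaces and the carriage return that "docker exec -t" can append.
  printf '%s' "${1#*=}" | tr -d '[:space:]'
}

# Succeeds when the configured value equals the planned value.
same_address() {
  # $1: configured line from karbor.conf
  # $2: planned address taken from the deployment spreadsheet
  [ "$(conf_value "$1")" = "$2" ]
}
```

A possible use is to store the output of the docker exec command from the procedure in a variable, then run same_address "$line" 192.0.2.10, where 192.0.2.10 stands for the workflow_management_float_ip value. A non-zero exit status indicates that the two values differ.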

Related Information

None

1020786 Failed to Send Operation Logs

Description

This alarm is generated when operation logs fail to be sent to the Cloud Trace Service.

Attribute

  • ID: 1020786
  • Alarm Level: Major
  • Automatically Cleared: Yes

Impact on the System

Operation logs of some users may be lost.

Possible Causes

  • The network between the backup system and Cloud Trace Service is disconnected.
  • The URL of the Cloud Trace Service is incorrect.
  • The Cloud Trace Service is faulty.

Procedure

  1. Possible cause 1: The URL of the Cloud Trace Service is incorrect.

    1. Search for DMK_g_console:silvan.rest_address in the xxx_export_all_EN.xlsm file exported by HUAWEI CLOUD Stack Deploy to obtain the URL of Cloud Trace Service.
    2. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    3. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    4. Run the docker exec -ti karborapi bash -c "cat /etc/karbor/karbor.conf" | grep cts_endpoint command to obtain the configured URL of Cloud Trace Service.
    5. Check whether the preceding two addresses are the same.
      • If yes, go to 2.
      • If no, go to 1.f.
    6. Run the set_karbor_endpoints --cts_endpoint URL of Cloud Trace Service command to configure the correct URL of Cloud Trace Service. If Set karbor endpoints successfully is displayed in the command output, the command is successfully executed.

  2. Possible cause 2: The network between the backup system and Cloud Trace Service is disconnected.

    1. Run the ping command (or the ping6 command in an IPv6 environment) against the Cloud Trace Service address taken from the preceding URL, and check whether the network connection is normal.
      • If yes, go to 3.
      • If no, contact the network administrator to recover the network.

  3. Possible cause 3: The Cloud Trace Service is faulty.

    1. Contact technical support for assistance.
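The network check above needs the host part of the Cloud Trace Service URL and a choice between ping and ping6. The following is a minimal sketch; the helper names url_host and ping_command are hypothetical, and URLs that carry user information are not handled.

```shell
# Sketch: derive the host to ping from a service URL.
url_host() {
  # $1: URL such as https://192.0.2.20:8080/v1 or https://[2001:db8::1]:8080/v1
  host=${1#*://}        # drop the scheme
  host=${host%%/*}      # drop the path
  case "$host" in
    \[*) host=${host#\[}; host=${host%%\]*} ;;  # bracketed IPv6 literal
    *)   host=${host%%:*} ;;                    # drop the port, if any
  esac
  printf '%s\n' "$host"
}

# Choose the command named in the procedure for the address family.
ping_command() {
  # An address that contains ":" is treated as IPv6.
  case "$1" in
    *:*) echo ping6 ;;
    *)   echo ping ;;
  esac
}
```

With these helpers, the check could be run as host=$(url_host "$url") followed by "$(ping_command "$host")" -c 3 "$host", where $url holds the configured cts_endpoint value.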

Related Information

None

1020783 Failed to Enable the Message Queue Service

Description

This alarm is generated when the message queue service fails to be enabled.

Attribute

  • ID: 1020783
  • Alarm Level: Major
  • Automatically Cleared: Yes

Impact on the System

The message queue service failed to be enabled and may be unable to provide services.

Possible Causes

A network fault occurs.

Procedure

  1. Possible cause: A network fault occurs.

    1. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    2. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    3. Log in to the CLI and run the following command to query the status of RabbitMQ:

      show_service --service rabbitmq

      Check whether the status of RabbitMQ in the command output is fault.

      • If yes, go to 1.d.
      • If no, manually clear the alarm. Then, no further action is required.
    4. Log in to the CLI and run the following command to stop RabbitMQ:

      stop_service --service rabbitmq

      View the command output to check whether the execution result is Successfully.

      • If yes, go to 1.e.
      • If no, go to 1.g.
    5. Log in to the CLI and run the following command to restart RabbitMQ:

      start_service --service rabbitmq

      View the command output to check whether the execution result is Successfully.

      • If yes, go to 1.f.
      • If no, go to 1.g.
    6. Log in to the CLI and run the following command to query the status of RabbitMQ again:

      show_service --node rabbitmq

      Check whether the status of RabbitMQ in the command output is fault.

      • If yes, go to 1.g.
      • If no, manually clear the alarm. Then, no further action is required.
    7. Contact technical support for assistance.
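The query, stop, start, and query-again sequence above can be sketched as one function. This is an illustration only. show_service, stop_service, and start_service are the CLI commands named in the procedure, but matching the words fault and Successfully in their output is an assumption about the output format, and the function name restart_rabbitmq_if_faulty is hypothetical.

```shell
# Sketch of the RabbitMQ recovery flow described in the procedure.
# Assumption: the command output contains the word "fault" or "Successfully".
restart_rabbitmq_if_faulty() {
  if ! show_service --service rabbitmq | grep -qw 'fault'; then
    echo "RabbitMQ is not faulty: clear the alarm manually"
    return 0
  fi
  if ! stop_service --service rabbitmq | grep -q 'Successfully'; then
    echo "stop failed: contact technical support"
    return 1
  fi
  if ! start_service --service rabbitmq | grep -q 'Successfully'; then
    echo "start failed: contact technical support"
    return 1
  fi
  # The procedure queries the status again with --node at this point.
  if show_service --node rabbitmq | grep -qw 'fault'; then
    echo "RabbitMQ is still faulty: contact technical support"
    return 1
  fi
  echo "RabbitMQ recovered: clear the alarm manually"
}
```

The function stops at the first failed step, which mirrors the branches of the procedure that lead to technical support.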

Related Information

None

1020782 Responses to Messages in the Message Queue Timed Out

Description

This alarm is generated when responses to messages in the message queue time out.

Attribute

  • ID: 1020782
  • Alarm Level: Major
  • Automatically Cleared: Yes

Impact on the System

Responses timed out. This may cause services, such as backup, to work improperly.

Possible Causes

The message queue service is abnormal.

Procedure

  1. Possible cause: The message queue service is abnormal.

    1. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    2. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    3. Log in to the CLI and run the following command to query the status of RabbitMQ:

      show_service --service rabbitmq

      Check whether the status of RabbitMQ in the command output is fault.

      • If yes, go to 1.d.
      • If no, go to 1.g.
    4. Log in to the CLI and run the following command to stop RabbitMQ:

      stop_service --service rabbitmq

      View the command output to check whether the execution result is Successfully.

      • If yes, go to 1.e.
      • If no, go to 1.g.
    5. Log in to the CLI and run the following command to restart RabbitMQ:

      start_service --service rabbitmq

      View the command output to check whether the execution result is Successfully.

      • If yes, go to 1.f.
      • If no, go to 1.g.
    6. Log in to the CLI and run the following command to query the status of RabbitMQ again:

      show_service --node rabbitmq

      Check whether the status of RabbitMQ in the command output is fault.

      • If yes, go to 1.g.
      • If no, manually clear the alarm. Then, no further action is required.
    7. Contact technical support for assistance.

Related Information

None

1020779 Failed to Create a VBS Replica

Description

This alarm is generated when an EVS disk replica fails to be created.

Attribute

  • ID: 1020779
  • Alarm Level: Major
  • Automatically Cleared: Yes

Impact on the System

The failure to create a replica will affect subsequent restore operations.

Possible Causes

  • The connection to Cinder is abnormal.
  • A fault occurs on eBackup.

Procedure

  1. Possible cause 1: The connection to Cinder is abnormal.

    1. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    2. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    3. Run the check_karbor_connect command to check whether the Check_Result value of Cinder is OK.
      • If yes, go to 2.
      • If no, go to 1.d.
    4. Run the docker exec -ti karborapi bash -c "cat /etc/karbor/karbor.conf" | grep cinder_endpoint command to obtain the Cinder URL. Search for the value of DMK_g_regions:fsp_Cascading.cinder in the xxx_export_all_EN.xlsm file exported by HUAWEI CLOUD Stack Deploy to check whether the value is the same as the Cinder URL.
      • If no, run the set_karbor_endpoints --cinder_endpoint URL of Cinder command to set the endpoint of Cinder. Run the check_karbor_connect command again. If the Check_Result value of Cinder is not OK, contact technical support for assistance.
      • If yes, contact technical support for assistance.
    5. Contact the tenant corresponding to domain_name in the alarm's additional information and instruct the tenant to log in to ManageOne Operation Portal using the VDC administrator account, switch to the project corresponding to project_name in the alarm's additional information, go to the Volume Backup Service page, and copy the policy corresponding to plan_name in the alarm's additional information immediately.

  2. Possible cause 2: A fault occurs on eBackup.

    1. Use the VDC administrator account of the tenant corresponding to domain_name in the alarm's additional information to log in to ManageOne Operation Portal. Switch to the project corresponding to project_name in the alarm's additional information. On the Volume Backup Service page, obtain the failure cause of the corresponding task or replica.
    2. Use PuTTY to log in to each Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    3. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    4. View the log file /var/log/huawei/dj/services/system/karbor/karbor-protection/karbor-protection.log to collect the details of the error logs generated at the time when the failure occurs.
    5. Contact technical support for assistance.

Related Information

None

1020771 Failed to Automatically Schedule a VBS Backup Policy

Description

This alarm is generated when a VBS backup policy fails to be scheduled automatically.

Attribute

  • ID: 1020771
  • Alarm Level: Major
  • Automatically Cleared: Yes

Impact on the System

No backup is automatically generated based on the backup policy, which will affect subsequent restore operations.

Possible Causes

  • The component is abnormal.
  • The connection to Cinder is abnormal. Volume information cannot be obtained.
  • The backup quota is insufficient.

Procedure

  1. Possible cause 1: The component is abnormal.

    1. Check whether any alarm indicating an abnormal component is reported.
      • If yes, rectify the component fault based on recommended suggestions.
      • If no, go to 2.
    2. Contact the tenant corresponding to domain_name in the alarm's additional information. Instruct the tenant to log in to ManageOne Operation Portal using the VDC administrator account, and switch to the project corresponding to project_name in the alarm's additional information. Go to the Volume Backup Service page, and manually back up the policy corresponding to backup_policy_name in the alarm location information.

  2. Possible cause 2: The connection to Cinder is abnormal. Volume information cannot be obtained.

    1. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    2. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    3. Run the check_karbor_connect command to check the Check_Result value of Cinder.
      • If the value is OK, go to 3.
      • If the value is SSLError, update the FusionSphere-PKI certificate of FusionSphere by referring to HUAWEI CLOUD Stack 6.5.0 Security Management Guide and choose Certificate Management > Replacing Certificates in a Unified Manner > Updating a Certificate. Repeat 2.c.
      • If the value is Error, run the set_karbor_endpoints --cinder_endpoint URL of Cinder command to set the endpoint of Cinder. You can search for DMK_g_regions:fsp_Cascading.cinder in the xxx_export_all_EN.xlsm file exported by HUAWEI CLOUD Stack Deploy to obtain the URL of Cinder. Repeat 2.c.
    4. If the fault persists, contact technical support for assistance.

  3. Possible cause 3: The backup quota is insufficient.

    1. Check whether the value of error_code in the additional information is CSBS.9006.
      • If yes, contact the tenant corresponding to domain_name in the alarm's additional information. Instruct the tenant to log in to ManageOne Operation Portal using the VDC administrator account, and switch to the project corresponding to project_name in the alarm's additional information. Go to the Volume Backup Service page and apply for backup space. For details about how to apply for space, see section Applying for Space in Volume Backup Service (VBS) of HUAWEI CLOUD Stack 6.5.0 User Guide.
      • If no, contact technical support for assistance.

Related Information

None

1020770 Failed to Automatically Schedule a VBS Replication Policy

Description

This alarm is generated when a VBS replication policy fails to be scheduled automatically.

Attribute

  • ID: 1020770
  • Alarm Level: Major
  • Automatically Cleared: Yes

Impact on the System

No replica is automatically generated based on the replication policy, which will affect subsequent restore operations.

Possible Causes

  • The component is abnormal.
  • The connection to Cinder is abnormal. Volume information cannot be obtained.
  • The replication quota is insufficient.

Procedure

  1. Possible cause 1: The component is abnormal.

    1. Check whether any alarm indicating an abnormal component is reported.
      • If yes, rectify the component fault based on recommended suggestions.
      • If no, go to 2.
    2. Contact the tenant corresponding to domain_name in the alarm's additional information. Instruct the tenant to log in to ManageOne Operation Portal using the VDC administrator account, and switch to the project corresponding to project_name in the alarm's additional information. Go to the Volume Backup Service page, and manually back up the policy corresponding to backup_policy_name in the alarm location information.

  2. Possible cause 2: The connection to Cinder is abnormal. Volume information cannot be obtained.

    1. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    2. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    3. Run the check_karbor_connect command to check the Check_Result value of Cinder.
      • If the value is OK, go to 3.
      • If the value is SSLError, update the FusionSphere-PKI certificate of FusionSphere by referring to HUAWEI CLOUD Stack 6.5.0 Security Management Guide and choose Certificate Management > Replacing Certificates in a Unified Manner > Updating a Certificate. Repeat 2.c.
      • If the value is Error, run the set_karbor_endpoints --cinder_endpoint URL of Cinder command to set the endpoint of Cinder. You can search for DMK_g_regions:fsp_Cascading.cinder in the xxx_export_all_EN.xlsm file exported by HUAWEI CLOUD Stack Deploy to obtain the URL of Cinder. Repeat 2.c.
    4. If the fault persists, contact technical support for assistance.

  3. Possible cause 3: The replication quota is insufficient.

    1. Check whether the value of error_code in the additional information is CSBS.9006.
      • If yes, contact the tenant corresponding to domain_name in the alarm's additional information. Instruct the tenant to log in to ManageOne Operation Portal using the VDC administrator account, and switch to the project corresponding to project_name in the alarm's additional information. Go to the Volume Backup Service page and apply for backup space. For details about how to apply for space, see section Applying for Space in Volume Backup Service (VBS) of HUAWEI CLOUD Stack 6.5.0 User Guide.
      • If no, contact technical support for assistance.

Related Information

None

1020768 Failed to Back Up an EVS Disk

Description

This alarm is generated when an EVS disk fails to be backed up.

Attribute

  • ID: 1020768
  • Alarm Level: Major
  • Automatically Cleared: Yes

Impact on the System

No backup is generated successfully, affecting subsequent restore operations.

Possible Causes

  • The connection to Cinder is abnormal.
  • eBackup malfunctions.

Procedure

  1. Possible cause 1: The connection to Cinder is abnormal.

    1. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    2. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    3. Run the check_karbor_connect command to check whether the Check_Result value of Cinder is OK.
      • If yes, go to 2.
      • If no, go to 1.d.
    4. Run the docker exec -ti karborapi bash -c "cat /etc/karbor/karbor.conf" | grep cinder_endpoint command to obtain the Cinder URL. Search for the value of DMK_g_regions:fsp_Cascading.cinder in the xxx_export_all_EN.xlsm file exported by HUAWEI CLOUD Stack Deploy and check whether the value is the same as the Cinder URL.
      • If no, run the set_karbor_endpoints --cinder_endpoint URL of Cinder command to set the endpoint of Cinder. Run the check_karbor_connect command again. If the Check_Result value of Cinder is not OK, contact technical support for assistance.
      • If yes, contact technical support for assistance.
    5. Contact the tenant corresponding to domain_name in the alarm's additional information. Instruct the tenant to log in to ManageOne Operation Portal using the VDC administrator account, and switch to the project corresponding to project_name in the alarm's additional information. Go to the Volume Backup Service page, and back up the policy corresponding to plan_name in the alarm's additional information immediately.

  2. Possible cause 2: eBackup malfunctions.

    1. Use the VDC administrator account of the tenant corresponding to domain_name in the alarm's additional information to log in to ManageOne Operation Portal. Switch to the project corresponding to project_name in the alarm's additional information. On the Volume Backup Service page, obtain the failure cause of the corresponding task or replica.
    2. Use PuTTY to log in to each Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    3. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    4. View the log file /var/log/huawei/dj/services/system/karbor/karbor-protection/karbor-protection.log to collect the details of the error logs generated at the time when the failure occurs.
    5. Contact technical support for assistance.
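
The comparison between the configured Cinder URL and the value in the deployment spreadsheet can be sketched in Python as follows. This is a minimal sketch under stated assumptions: it assumes karbor.conf uses key = value lines, the helper names extract_conf_value and same_endpoint are hypothetical, and the sample URL is a placeholder rather than a real endpoint.

```python
# Sketch: pull one key out of karbor.conf-style text and compare it with
# the value recorded in the deployment spreadsheet.
# Assumption: the file uses "key = value" lines. The sample line and URLs
# below are made-up placeholders.
from typing import Optional


def extract_conf_value(conf_text: str, key: str) -> Optional[str]:
    """Return the value of `key = value` from INI-style text, or None."""
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#") or "=" not in stripped:
            continue
        name, _, value = stripped.partition("=")
        if name.strip() == key:
            return value.strip()
    return None


def same_endpoint(configured: Optional[str], expected: str) -> bool:
    """Compare two URLs, ignoring surrounding spaces and trailing slashes."""
    if configured is None:
        return False
    return configured.strip().rstrip("/") == expected.strip().rstrip("/")


if __name__ == "__main__":
    sample = "cinder_endpoint = https://cinder.example.com:443/v2/\n"
    configured = extract_conf_value(sample, "cinder_endpoint")
    print(same_endpoint(configured, "https://cinder.example.com:443/v2"))
```

If the two values differ, the procedure above resets the endpoint with set_karbor_endpoints; the sketch only performs the comparison.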

Related Information

None

1020762 FSP Certificate Verification Failure

Description

Failed to verify the FSP certificate.

Attribute

ID

Alarm Level

Automatically Cleared

1020762

Minor

Yes

Impact on the System

The FSP certificate will not be verified when the system connects to FSP.

Possible Causes

The FSP certificate has expired, has been revoked, or was not issued by a trusted CA.

Procedure

  1. Possible cause: The FSP certificate has expired, has been revoked, or was not issued by a trusted CA.

    1. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    2. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    3. Run the following command to check whether the FSP certificate is valid:

      curl --cacert /etc/DJSecurity/server-cert/karbor/openstack/nova_ca.crt https://nova_domain:nova_port

      In the preceding command, nova_domain indicates the Nova access domain name in the current region and nova_port indicates the Nova access port. You can run the docker exec -ti karborapi bash -c "cat /etc/karbor/karbor.conf" | grep nova_endpoint command to obtain the values.

    4. Check whether any error information about certificate verification is displayed in the command output.
      • If yes, rectify the fault based on the error information by updating the FusionSphere-PKI certificate of FusionSphere. For details, see Certificate Management > Replacing Certificates in a Unified Manner > Updating a Certificate in HUAWEI CLOUD Stack 6.5.0 Security Management Guide.
      • If no, go to 1.e.
    5. Contact technical support for assistance.
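
Assembling the curl command from the nova_endpoint value can be sketched as follows. This is an illustrative Python sketch, not a product tool: it assumes the nova_endpoint value is a full https URL, the function name curl_cert_check_command is hypothetical, and the example host and port are placeholders. It defaults to port 443 when the URL carries no explicit port and does not handle IPv6 literals.

```python
# Sketch: derive nova_domain and nova_port from an endpoint URL and build
# the curl command shown in the procedure.
# Assumption: the endpoint is a full https URL; example values are made up.
from typing import List
from urllib.parse import urlsplit


def curl_cert_check_command(ca_file: str, endpoint_url: str) -> List[str]:
    """Return the curl argument list for the certificate check."""
    parts = urlsplit(endpoint_url)
    if parts.scheme != "https" or parts.hostname is None:
        raise ValueError(f"expected an https URL, got {endpoint_url!r}")
    # Fall back to the HTTPS default port when none is given.
    port = parts.port if parts.port is not None else 443
    return ["curl", "--cacert", ca_file, f"https://{parts.hostname}:{port}"]


if __name__ == "__main__":
    command = curl_cert_check_command(
        "/etc/DJSecurity/server-cert/karbor/openstack/nova_ca.crt",
        "https://nova.example.com:8774/v2.1",
    )
    print(" ".join(command))
```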

Related Information

None

1020761 IAM Certificate Verification Failure

Description

Failed to verify the IAM certificate.

Attribute

ID

Alarm Level

Automatically Cleared

1020761

Minor

Yes

Impact on the System

The IAM certificate will not be verified when the system connects to IAM.

Possible Causes

The IAM certificate has expired, has been revoked, or was not issued by a trusted CA.

Procedure

  1. Possible cause: The IAM certificate has expired, has been revoked, or was not issued by a trusted CA.

    1. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    2. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    3. Run the following command to check whether the IAM certificate is valid:

      curl --cacert /etc/DJSecurity/server-ca/trust.cer https://region_iam_url:iam_port

      In the preceding command, region_iam_url indicates the IAM access address (IP address or domain name) in the current region and iam_port indicates the IAM access port. You can run the docker exec -ti karborapi bash -c "cat /etc/karbor/karbor.conf" | grep auth_url command to obtain the values.

    4. Check whether any error information about certificate verification is displayed in the command output.
      • If yes, rectify the fault based on the error information by updating the ManageOne-PKI certificate of ManageOne. For details, see Certificate Management > Replacing Certificates in a Unified Manner > Updating a Certificate in HUAWEI CLOUD Stack 6.5.0 Security Management Guide.
      • If no, go to 1.e.
    5. Contact technical support for assistance.
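
When the curl check is scripted instead of read by eye, the exit code of curl helps separate certificate problems from plain connectivity problems. The Python sketch below maps a few exit codes from the curl manual to a short explanation. The helper name explain_curl_exit and the wording of the messages are illustrative assumptions, and the sketch does not run curl itself.

```python
# Sketch: interpret the exit code of the curl certificate check.
# The exit codes (0, 6, 7, 35, 60) are documented in the curl manual;
# the messages and the helper name are illustrative.

CURL_EXIT_MEANINGS = {
    0: "Request completed; no certificate verification error was reported.",
    6: "Could not resolve the host; check the address, not the certificate.",
    7: "Could not connect; check the network or port, not the certificate.",
    35: "TLS handshake failed; inspect the command output for details.",
    60: "Certificate verification failed; update the certificate.",
}


def explain_curl_exit(code: int) -> str:
    """Return a short explanation for a curl exit code."""
    default = f"Unexpected curl exit code {code}; inspect the command output."
    return CURL_EXIT_MEANINGS.get(code, default)


if __name__ == "__main__":
    for code in (0, 60, 7):
        print(code, explain_curl_exit(code))
```

Note that without the --fail option, curl exits with 0 even when the server answers with an HTTP error status, so exit code 0 says only that the TLS connection and the request completed.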

Related Information

None

1023299 Abnormal Nodes

Description

The system periodically checks the status of all nodes. This alarm is generated when an abnormal node is detected. This alarm is cleared when the node is restored to normal.

Attribute

ID

Alarm Level

Automatically Cleared

1023299

Major

Yes

Impact on the System

The node cannot provide services.

Possible Causes

  • The VM is powered off.
  • The network connection of the node is abnormal.

Procedure

  1. Possible cause 1: The VM is powered off.

    1. Log in to ManageOne Maintenance Portal as user admin.
    2. On the Alarms page, view the Logical Location of the current alarm.
    3. Go back to the home page and choose ServiceOM > Logical Location > ECS > Compute Instances. Search for the VM corresponding to Service-CSBS by name and check whether the status and power status of the VM are Running.
      • If no, go to 1.d.
      • If yes, go to 2.
    4. Click Restart to restart the VM.

      After the VM starts, wait for several minutes and then check whether the alarm is cleared.

      • If yes, no further action is required.
      • If no, go to 2.

  2. Possible cause 2: The network connection of the node is abnormal.

    1. Check whether the Abnormal Network Between Backup Service Nodes alarm is reported.
      • If yes, rectify the network fault by following the handling suggestions for that alarm.
      • If no, contact technical support for assistance.

Related Information

None

1023298 Abnormal Components

Description

The system periodically checks component status on each node. This alarm is generated if a node has abnormal components. This alarm is cleared after the components are restored to normal.

Attribute

ID

Alarm Level

Automatically Cleared

1023298

Major

Yes

Impact on the System

If a node has abnormal components, this node cannot carry services.

Possible Causes

The component processes exit unexpectedly.

Procedure

  1. Possible cause: The component processes exit unexpectedly.

    1. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    2. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    3. Run the following command to query the component status on this node:

      show_service --node Node name

      Check whether the command output contains components whose status is fault.

      • If yes, go to 1.d.
      • If no, go to 1.h.
    4. Run the following command to stop the faulty components:

      stop_service --service Component name

      View the command output to check whether the execution result is Successfully.

      • If yes, go to 1.e.
      • If no, go to 1.h.
    5. Run the following command to start the faulty components again:

      start_service --service Component name

      View the command output to check whether the execution result is Successfully.

      • If yes, go to 1.f.
      • If no, go to 1.h.
    6. Run the following command to re-query the component status on this node:

      show_service --node Node name

      Check whether the command output contains components whose status is fault.

      • If yes, go to 1.d.
      • If no, go to 1.g.
    7. Wait for 1 minute and check whether the alarm is cleared.
      • If yes, no further action is required.
      • If no, go to 1.h.
    8. Contact technical support for assistance.
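
The query, stop, start, and re-query cycle above forms a loop, which the following Python sketch makes explicit. It is a sketch under stated assumptions: the output format of show_service, stop_service, and start_service is not documented here, so the three operations are passed in as hypothetical callables. The max_rounds limit is also an assumption added to keep the loop from running forever; the procedure itself does not define one.

```python
# Sketch of the recovery loop for faulty components.
# query_faulty, stop, and start are hypothetical callables that would wrap
# show_service, stop_service, and start_service. max_rounds is an added
# safety limit, not part of the documented procedure.
from typing import Callable, List


def recover_components(
    query_faulty: Callable[[], List[str]],
    stop: Callable[[str], bool],
    start: Callable[[str], bool],
    max_rounds: int = 3,
) -> bool:
    """Return True when no faulty component remains, False to escalate."""
    for _ in range(max_rounds):
        faulty = query_faulty()
        if not faulty:
            return True
        for name in faulty:
            # A failed stop or start means the fault must be escalated.
            if not stop(name) or not start(name):
                return False
    return not query_faulty()


if __name__ == "__main__":
    # Fake environment: one faulty component that recovers after a restart.
    state = ["component-a"]

    def fake_start(name: str) -> bool:
        state.remove(name)
        return True

    print(recover_components(lambda: list(state), lambda name: True, fake_start))
```

A return value of False corresponds to contacting technical support; True corresponds to waiting and then checking whether the alarm is cleared.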

Related Information

None

1023296 Abnormal Clock Synchronization Between the System and External NTP Server

Description

This alarm is generated when the clock synchronization between the system and external NTP server is abnormal.

Attribute

ID

Alarm Level

Automatically Cleared

1023296

Major

Yes

Impact on the System

The system does not synchronize time with the external clock source.

Possible Causes

  • The network communication between the system and external NTP server is abnormal.
  • The external NTP server does not run properly.

Procedure

  1. Possible cause 1: The network communication between the system and external NTP server is abnormal.

    1. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    2. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    3. Run the ntpq -p command to obtain the IP address list of the external NTP server.
    4. Run the ping command with the IP address of the external NTP server (for an IPv6 address, run ping6) to check whether the network communication between the system and external NTP server is normal.
      • If yes, go to 2.
      • If no, go to 1.e.
    5. Contact the network administrator to rectify the network fault. After the system automatically synchronizes time with the external NTP server, check whether the alarm is cleared.
      • If yes, no further action is required.
      • If no, go to 2.

  2. Possible cause 2: The external NTP server does not run properly.

    1. Log in to the NTP server using SSH. The default account of the HUAWEI CLOUD Stack NTP server is untp and the default password is Huawei12#$.
    2. Run the sudo su command to switch to user root whose default password is Cloud12#$.
    3. Run the systemctl status ntpd.service command to check whether the NTP service is normal. If active (running) is displayed in the command output, the NTP service is normal.
      • If yes, contact technical support for assistance.
      • If no, run the systemctl start ntpd.service command to start the NTP service.
    4. Wait for 10 minutes and then check whether the alarm is cleared.
      • If yes, no further action is required.
      • If no, contact technical support for assistance.
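
Reading the server list out of the ntpq -p output can be sketched as follows. The Python sketch assumes the standard ntpq peer table layout: a header line, a separator line of equals signs, and one row per peer whose first character is the tally code, with * marking the peer currently selected for synchronization. The function name parse_ntpq_peers and the sample addresses are illustrative.

```python
# Sketch: extract peers from `ntpq -p` output.
# Assumption: standard ntpq layout, where the first character of each peer
# row is the tally code and the remaining columns are whitespace-separated.
# The addresses in the sample are documentation placeholders.
from typing import Dict, List

SAMPLE = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*192.0.2.10      .GPS.            1 u   33   64  377    0.512    0.021   0.004
 192.0.2.11      .INIT.          16 u    -   64    0    0.000    0.000   0.000
"""


def parse_ntpq_peers(output: str) -> List[Dict[str, object]]:
    """Return one dict per peer with its address, sync flag, and reach."""
    peers = []
    for line in output.splitlines():
        body = line.strip()
        if not body or body.startswith("remote") or set(body) == {"="}:
            continue
        tally, fields = line[0], line[1:].split()
        if len(fields) < 10:
            continue
        peers.append({
            "remote": fields[0],
            "synced": tally == "*",
            # reach is an octal bitmask of the last eight polls; 0 means
            # that no recent poll was answered.
            "reach": int(fields[-4], 8),
        })
    return peers


if __name__ == "__main__":
    for peer in parse_ntpq_peers(SAMPLE):
        print(peer)
```

A peer with a reach value of 0 has not answered any recent poll, which is consistent with the network fault described in possible cause 1.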

Related Information

None

1023295 Failed to Back Up System Data

Description

Failed to back up management data.

Attribute

ID

Alarm Level

Automatically Cleared

1023295

Major

Yes

Impact on the System

System data fails to be backed up, which affects system reliability.

Possible Causes

  • The disk where the backup directory is located has insufficient space.
  • The network communication between the system and external backup server is abnormal.

Procedure

  1. Check node-name in the alarm location information to locate the faulty node.

    Search for CSBS_Service1, CSBS_Service2, and CSBS_Service3 on the Tool-generated IP Parameters sheet in the xxx_export_all_EN.xlsm file exported by HUAWEI CLOUD Stack Deploy during software installation to obtain the IP addresses of the karbor1, karbor2, and karbor3 nodes.

  2. Possible cause 1: The disk where the backup directory is located has insufficient space.

    1. Use SSH to log in to the corresponding node.

      The default account is djmanager and the default password is CloudService@123!.

    2. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    3. Run the cd /opt/djbackup/db command to go to the management data backup directory and delete unnecessary backup data from the directory.

  3. Possible cause 2: The network communication between the system and external backup server is abnormal.

    1. Check whether the physical network connection between the system and external backup server is normal.
      • If yes, go to 3.b.
      • If no, contact technical support for assistance.
    2. Check whether the FTPS service on the external backup server is normal. For an FTP server, run the curl -u {ftp_username}:{ftp_password} -k ftp://{ftp_ip}:{ftp_port} --noproxy {ftp_ip} command to check whether any data can be obtained. For an FTPS server, run the curl -u {ftp_username}:{decrypt_pwd} --ftp-ssl-reqd -k ftps://{ftp_ip}:{ftp_port}/ --noproxy {ftp_ip} command to check whether any data can be obtained.
      • If yes, contact technical support for assistance.
      • If no, restore the FTPS service on the external backup server. Then, check whether the alarm is cleared. If no, contact technical support for assistance.
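
Before deleting data from the backup directory in possible cause 1, it can help to list the oldest files first. The Python sketch below only reports candidates and deletes nothing. The function name deletion_candidates and the default of keeping the three newest files are assumptions for illustration; which backups are actually unnecessary must be decided on site.

```python
# Sketch: list the oldest files in a backup directory as deletion candidates.
# Nothing is deleted. keep_newest=3 is an arbitrary illustrative default.
import os
import tempfile
from pathlib import Path
from typing import List


def deletion_candidates(directory: str, keep_newest: int = 3) -> List[Path]:
    """Return all regular files except the `keep_newest` most recent ones."""
    files = [p for p in Path(directory).iterdir() if p.is_file()]
    # Newest first, so everything after the first keep_newest is older.
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[keep_newest:]


if __name__ == "__main__":
    # Demo on a temporary directory rather than /opt/djbackup/db.
    with tempfile.TemporaryDirectory() as demo_dir:
        for index in range(5):
            path = os.path.join(demo_dir, f"backup_{index}.tar")
            open(path, "w").close()
            os.utime(path, (1000 + index, 1000 + index))
        for candidate in deletion_candidates(demo_dir):
            print(candidate.name)
```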

Related Information

None

1023282 Failed to Verify the FTP Server Certificate

Description

This alarm is generated when the FTP server certificate verification fails.

Attribute

ID

Alarm Level

Automatically Cleared

1023282

Minor

Yes

Impact on the System

The system will access the FTP server using an insecure connection.

Possible Causes

The certificate has expired, has been revoked, or was issued by an untrusted CA.

Procedure

  1. Possible cause: The certificate has expired, has been revoked, or was issued by an untrusted CA.

    1. Replace the FTP server certificate by referring to Importing the CA Certificate on an FTPS Server in the HUAWEI CLOUD Stack 6.5.0 Security Management Guide and then check whether the alarm is cleared.
      • If yes, no further action is required.
      • If no, contact technical support for assistance.

Related Information

None

1023279 System Certificate Will Expire Soon

Description

This alarm is generated when the system certificate will expire soon.

Attribute

ID

Alarm Level

Automatically Cleared

1023279

Major

Yes

NOTE:
  • If the remaining validity period of the certificate is less than 7 days, the alarm severity is Critical.
  • If the remaining validity period of the certificate is less than 30 days (but at least 7 days), the alarm severity is Major.
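
The two thresholds in the note can be expressed as a short function. The following Python sketch is illustrative: the function name expiry_severity is hypothetical, and returning None when 30 or more days remain is an assumption meaning that no alarm of this type is expected.

```python
# Sketch: map the remaining certificate validity to the alarm severity.
# Thresholds (7 and 30 days) come from the note; the function name and the
# None result for "no alarm expected" are illustrative assumptions.
from datetime import datetime, timedelta, timezone
from typing import Optional


def expiry_severity(not_after: datetime, now: datetime) -> Optional[str]:
    """Return "Critical", "Major", or None for a certificate expiry time."""
    remaining_days = (not_after - now).total_seconds() / 86400
    # Already-expired certificates also land in this branch here; the
    # product reports them under a separate alarm (1023278).
    if remaining_days < 7:
        return "Critical"
    if remaining_days < 30:
        return "Major"
    return None


if __name__ == "__main__":
    now = datetime(2024, 1, 1, tzinfo=timezone.utc)
    for days in (3, 10, 45):
        print(days, expiry_severity(now + timedelta(days=days), now))
```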

Impact on the System

If the system certificate expires, it is not trusted, and system functions may be affected.

Possible Causes

The period from the current time to the certificate expiry time is less than the threshold.

Procedure

  1. Possible cause: The period from the current time to the certificate expiry time is less than the threshold.

    1. View the value of certificate_name in the alarm location information.
    2. Import a new certificate based on the certificate name. To do so, update the CSBS_VBS-internal certificate of CSBS_VBS. For details, see Certificate Management > Replacing Certificates in a Unified Manner > Updating a Certificate in HUAWEI CLOUD Stack 6.5.0 Security Management Guide.
    3. Check whether the alarm is cleared the next day.
      • If yes, no further action is required.
      • If no, contact technical support for assistance.

Related Information

None

1023278 System Certificate Has Expired

Description

This alarm is generated when the system certificate has expired.

Attribute

ID

Alarm Level

Automatically Cleared

1023278

Critical

Yes

Impact on the System

The system certificate is not trusted, and system functions may be affected.

Possible Causes

The system certificate has expired.

Procedure

  1. Possible cause: The system certificate has expired.

    1. View the value of certificate_name in the alarm location information.
    2. Import a new certificate based on the certificate name. To do so, update the CSBS_VBS-internal certificate of CSBS_VBS. For details, see Certificate Management > Replacing Certificates in a Unified Manner > Updating a Certificate in HUAWEI CLOUD Stack 6.5.0 Security Management Guide.
    3. Check whether the alarm is cleared the next day.
      • If yes, no further action is required.
      • If no, contact technical support for assistance.

Related Information

None

1023277 Message Queue Frozen

Description

Some or all message queues are frozen so that the service cannot normally produce or consume messages.

Attribute

ID

Alarm Level

Automatically Cleared

1023277

Critical

Yes

Impact on the System

Services depending on RabbitMQ may be abnormal.

Possible Causes

RabbitMQ may be suspended after running for a long time.

Procedure

  1. Possible cause: RabbitMQ may be suspended after running for a long time.

    1. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    2. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    3. Run the stop_service --service rabbitmq command to stop all RabbitMQ nodes.
    4. Run the start_service --service rabbitmq command to start all RabbitMQ nodes. If the alarm persists, go to 1.e.
    5. Collect logs from directory /var/log/huawei/dj/services/system/rabbitmq. Then contact technical support for assistance.

Related Information

None

1023276 Network Partition Occurs in the Message Queue

Description

RabbitMQ nodes cannot intercommunicate with each other due to network faults, resulting in data inconsistency and service failures.

Attribute

ID

Alarm Level

Automatically Cleared

1023276

Critical

Yes

Impact on the System

Services depending on RabbitMQ may be abnormal.

Possible Causes

RabbitMQ nodes cannot intercommunicate with each other due to network faults on the internal plane.

Procedure

  1. Possible cause: RabbitMQ nodes cannot intercommunicate with each other due to network failure.

    1. Check whether the Abnormal Network Between Backup Service Nodes alarm is reported.
      • If yes, rectify the network fault by following the handling suggestions for that alarm.
      • If no, go to 1.b.
    2. Use PuTTY to log in to each Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    3. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    4. Run the stop_service --service rabbitmq command to stop all RabbitMQ nodes.
    5. Run the start_service --service rabbitmq command to start all RabbitMQ nodes. If the alarm persists, go to 1.f.
    6. Collect logs from directory /var/log/huawei/dj/services/system/rabbitmq. Then contact technical support for assistance.

Related Information

None

1023099 CPU Usage Exceeds the Threshold

Description

This alarm is generated when the CPU usage exceeds 80%.

Attribute

ID

Alarm Level

Automatically Cleared

1023099

Major

Yes

Impact on the System

The system may run slowly.

Possible Causes

The host is busy and overloaded.

Procedure

  1. Possible cause: The host is busy and overloaded.

    1. Check node-name in the alarm location information to locate the faulty node.

      Search for CSBS_Service1, CSBS_Service2, and CSBS_Service3 on the Tool-generated IP Parameters sheet in the xxx_export_all_EN.xlsm file exported by HUAWEI CLOUD Stack Deploy during software installation to obtain the IP addresses of the karbor1, karbor2, and karbor3 nodes.

    2. Use PuTTY to log in to the corresponding node.

      The default account is djmanager and the default password is CloudService@123!.

    3. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    4. Run the top command to check the processes with high CPU usage and record the PID.
    5. Run the kill PID command to end the process. By default, kill sends SIGTERM, which asks the process to exit.
    6. Wait for several minutes and run the top command again to check whether the CPU usage decreases significantly.
      • If yes, no further action is required.
      • If no, contact technical support for assistance.
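
Finding the busiest processes can also be scripted. The procedure uses top; the Python sketch below swaps in the output of ps -eo pid,pcpu,comm instead, because its three-column layout is easier to parse. The function name top_cpu_processes and the sample rows are illustrative, and the sketch only ranks processes without ending any of them.

```python
# Sketch: rank processes by CPU usage from `ps -eo pid,pcpu,comm` output.
# This uses ps rather than top, as a swapped-in alternative that is easier
# to parse. The sample rows are made up.
from typing import List, Tuple

SAMPLE = """\
  PID %CPU COMMAND
    1  0.0 systemd
 2481 93.5 python
  812  4.2 dockerd
"""


def top_cpu_processes(ps_output: str, limit: int = 5) -> List[Tuple[int, float, str]]:
    """Return up to `limit` (pid, cpu_percent, command) rows, busiest first."""
    rows = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue
        pid, pcpu, command = fields
        rows.append((int(pid), float(pcpu), command.strip()))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


if __name__ == "__main__":
    for pid, cpu, command in top_cpu_processes(SAMPLE, limit=2):
        print(pid, cpu, command)
```

Replacing pcpu with pmem in the ps command gives the equivalent ranking by memory usage.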

Related Information

None

1023098 Memory Usage Exceeds the Threshold

Description

This alarm is generated when memory usage exceeds 80%.

Attribute

ID

Alarm Level

Automatically Cleared

1023098

Major

Yes

Impact on the System

The system may run slowly.

Possible Causes

The host is busy and overloaded.

Procedure

  1. Possible cause: The host is busy and overloaded.

    1. Check node-name in the alarm location information to locate the faulty node.

      Search for CSBS_Service1, CSBS_Service2, and CSBS_Service3 on the Tool-generated IP Parameters sheet in the xxx_export_all_EN.xlsm file exported by HUAWEI CLOUD Stack Deploy during software installation to obtain the IP addresses of the karbor1, karbor2, and karbor3 nodes.

    2. Use PuTTY to log in to the corresponding node.

      The default account is djmanager and the default password is CloudService@123!.

    3. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    4. Run the top command to check the processes with high memory usage and record the PID.
    5. Run the kill PID command to end the process. By default, kill sends SIGTERM, which asks the process to exit.
    6. Wait for several minutes and run the top command again to check whether the memory usage decreases significantly.
      • If yes, no further action is required.
      • If no, contact technical support for assistance.

Related Information

None

1023097 Disk Space Usage Exceeds the Threshold

Description

This alarm is generated when the disk space usage exceeds 80%.

Attribute

ID

Alarm Level

Automatically Cleared

1023097

Major

Yes

Impact on the System

System performance may deteriorate, and new data may not be saved successfully.

Possible Causes

Files stored on the disk occupy too much space.

Procedure

  1. Possible cause: Files stored on the disk occupy too much space.

    1. View alarm information to determine the partitions whose space usage is high.
    2. Check node-name in the alarm location information to locate the faulty node.

      Search for CSBS_Service1, CSBS_Service2, and CSBS_Service3 on the Tool-generated IP Parameters sheet in the xxx_export_all_EN.xlsm file exported by HUAWEI CLOUD Stack Deploy during software installation to obtain the IP addresses of the karbor1, karbor2, and karbor3 nodes.

    3. Use PuTTY to log in to the corresponding node.

      The default account is djmanager and the default password is CloudService@123!.

    4. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    5. Run the following command to access the partitions whose space usage is high:

      cd Directory of a partition whose space usage is high

    6. Run the following command to clear unnecessary files:

      rm -rf Unnecessary file name

    7. Run the following command to view the space usage of the partitions:

      df -h

      After the command is executed, the space usage of each partition is displayed. View column 6 (Mounted on) to find the directory of each partition, and view column 5 (Use%) to check whether the used space of each partition exceeds the alarm threshold.

      • If yes, go to 1.f.
      • If no, go to 1.h.
    8. Clear the partitions based on site requirements. Run the df -h command again to check whether the usage of other partitions exceeds the alarm threshold.
      • If yes, go to 1.e.
      • If no, go to 1.i.
    9. Wait for 1 minute and check whether the alarm is cleared.
      • If yes, no further action is required.
      • If no, contact technical support for assistance.
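
Checking column 5 (Use%) against the alarm threshold can be sketched as follows. The Python sketch assumes that each file system occupies a single line of df -h output (long device names can wrap onto a second line unless the -P option is added) and uses the 80% threshold from the alarm description. The function name partitions_over_threshold and the sample rows are illustrative.

```python
# Sketch: find partitions whose Use% exceeds the alarm threshold.
# Assumption: one line per file system in the df output. The sample rows
# are made up; the 80% threshold comes from the alarm description.
from typing import List, Tuple

SAMPLE = """\
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        50G   45G  5.0G  90% /
/dev/sda2       100G   20G   80G  20% /opt
tmpfs           7.8G     0  7.8G   0% /dev/shm
"""


def partitions_over_threshold(df_output: str, threshold: int = 80) -> List[Tuple[str, int]]:
    """Return (mount_point, use_percent) for partitions above the threshold."""
    over = []
    for line in df_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6 or not fields[4].endswith("%"):
            continue
        try:
            percent = int(fields[4][:-1])
        except ValueError:
            continue
        if percent > threshold:
            # Column 6 onward is the mount point (it may contain spaces).
            over.append((" ".join(fields[5:]), percent))
    return over


if __name__ == "__main__":
    for mount_point, percent in partitions_over_threshold(SAMPLE):
        print(mount_point, percent)
```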

Related Information

None

1020800 Failed to Execute the Replication Policy

Description

Failed to execute the replication policy.

Attribute

ID

Alarm Level

Automatically Cleared

1020800

Major

Yes

Impact on the System

Because the replication failed, no replica is available to provision ECSs in the specified region.

Possible Causes

  • The connection to eBackupWorkFlow is abnormal.
  • eBackup malfunctions.

Procedure

  1. Possible cause 1: The connection to eBackupWorkFlow is abnormal.

    1. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    2. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    3. Run the check_karbor_connect command to check whether the Check_Result value of eBackupWorkFlow is OK.
      • If yes, go to 2.
      • If no, go to 1.d.
    4. Run the docker exec -ti karborapi bash -c "cat /etc/karbor/karbor.conf" | grep ebackup_lb_ip_address command to obtain the IP address of eBackupWorkFlow. Search for workflow_management_float_ip in the xxx_export_all_EN.xlsm file exported by HUAWEI CLOUD Stack Deploy for deploying or expanding the cloud service. Check whether the two values are the same.
      • If yes, contact technical support for assistance.
      • If no, run the set_ebackup_plugin --ebackup_url IP address of eBackupWorkFlow command to reconfigure the IP address of eBackupWorkFlow. The default password is Huawei@CLOUD8!. Run the check_karbor_connect command again. If the Check_Result value of eBackupWorkFlow is not OK, contact technical support for assistance.
    5. Contact the tenant corresponding to domain_name in the alarm's additional information and instruct the tenant to log in to ManageOne Operation Portal using the VDC administrator account, switch to the project corresponding to project_name in the alarm's additional information, go to the Cloud Server Backup Service page, and copy the policy corresponding to plan_name in the alarm's additional information immediately.

  2. Possible cause 2: eBackup malfunctions.

    1. Use the VDC administrator account of the tenant corresponding to domain_name in the alarm's additional information to log in to ManageOne Operation Portal. Switch to the project corresponding to project_name in the alarm's additional information. On the Cloud Server Backup Service page, obtain the failure cause of the corresponding task or replica.
    2. Use PuTTY to log in to each Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    3. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    4. View the log file /var/log/huawei/dj/services/system/karbor/karbor-protection/karbor-protection.log to collect the details of the error logs generated at the time when the failure occurs.
    5. Contact technical support for assistance.

Related Information

None

1020801 Failed to Perform Cross-Region Replication for the ECS

Description

Failed to perform cross-region replication for the ECS.

Attribute

ID

Alarm Level

Automatically Cleared

1020801

Major

No

Impact on the System

Because the replication failed, no replica is available to provision ECSs in the specified region.

Possible Causes

  • The connection to eBackupWorkFlow is abnormal.
  • eBackup malfunctions.

Procedure

  1. Possible cause 1: The connection to eBackupWorkFlow is abnormal.

    1. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    2. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    3. Run the check_karbor_connect command to check whether the Check_Result value of eBackupWorkFlow is OK.
      • If yes, go to 2.
      • If no, go to 1.d.
    4. Run the docker exec -ti karborapi bash -c "cat /etc/karbor/karbor.conf" | grep ebackup_lb_ip_address command to obtain the IP address of eBackupWorkFlow. Search for workflow_management_float_ip in the xxx_export_all_EN.xlsm file exported by HUAWEI CLOUD Stack Deploy for deploying or expanding the cloud service. Check whether the two values are the same.
      • If yes, contact technical support for assistance.
      • If no, run the set_ebackup_plugin --ebackup_url IP address of eBackupWorkFlow command to reconfigure the IP address of eBackupWorkFlow. The default password is Huawei@CLOUD8!. Run the check_karbor_connect command again. If the Check_Result value of eBackupWorkFlow is not OK, contact technical support for assistance.
    5. Contact the tenant corresponding to domain_name in the alarm's additional information and instruct the tenant to log in to ManageOne Operation Portal using the VDC administrator account, switch to the project corresponding to project_name in the alarm's additional information, go to the Cloud Server Backup Service page, and copy the policy corresponding to plan_name in the alarm's additional information immediately.

  2. Possible cause 2: eBackup malfunctions.

    1. Use the VDC administrator account of the tenant corresponding to domain_name in the alarm's additional information to log in to ManageOne Operation Portal. Switch to the project corresponding to project_name in the alarm's additional information. On the Cloud Server Backup Service page, obtain the failure cause of the corresponding task or replica.
    2. Use PuTTY to log in to each Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    3. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    4. View the log file /var/log/huawei/dj/services/system/karbor/karbor-protection/karbor-protection.log to collect the details of the error logs generated at the time when the failure occurs.
    5. Contact technical support for assistance.

Related Information

None

1020803 Automatic Execution Failure of the Cross-Region Replication Policy

Description

Automatic execution of the cross-region replication policy failed.

Attribute

ID

Alarm Level

Automatically Cleared

1020803

Major

Yes

Impact on the System

No cross-region replica is generated. Therefore, there is no replica available for follow-up image creation.

Possible Causes

  • The component is abnormal.
  • The connection to Nova is abnormal. As a result, the VM information cannot be obtained.
  • The connection to Cinder is abnormal. As a result, the volume information cannot be obtained.
  • The replication quota is insufficient.

Procedure

  1. Possible cause 1: Component status is abnormal.

    1. Check whether any alarm indicating an abnormal component is reported.
      • If yes, rectify the component fault based on recommended suggestions.
      • If no, go to 2.
    2. Contact the tenant corresponding to domain_name in the alarm's additional information and instruct the tenant to log in to ManageOne Operation Portal using the VDC administrator account, switch to the project corresponding to project_name in the alarm's additional information, go to the Cloud Server Backup Service page, and manually back up the policy corresponding to backup_policy_name in the alarm location information.

  2. Possible cause 2: The connection to Nova is abnormal. VM information cannot be obtained.

    1. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    2. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    3. Run the check_karbor_connect command to check the Check_Result value of Nova.
      • If the value is OK, go to 3.
      • If the value is SSLError, update the FusionSphere-PKI certificate of FusionSphere by following Certificate Management > Replacing Certificates in a Unified Manner > Updating a Certificate in HUAWEI CLOUD Stack 6.5.0 Security Management Guide. Then repeat 2.c.
      • If the value is Error, run the set_karbor_endpoints --nova_endpoint URL of Nova command to set the endpoint of Nova. You can search for DMK_g_regions:fsp_Cascading.nova in the xxx_export_all_EN.xlsm file exported by HUAWEI CLOUD Stack Deploy to obtain the URL of Nova. Repeat 2.c.
    4. If the fault persists, contact technical support for assistance.
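
The branching on Check_Result in step 2.c (and in the equivalent Cinder step) can be summarized as a small lookup. This is only a sketch of the decision described above; the function name `next_action` is hypothetical, and the returned strings paraphrase the procedure rather than reproduce any tool output.

```python
def next_action(check_result: str) -> str:
    """Map a Check_Result value from check_karbor_connect to the documented action."""
    actions = {
        "OK": "go to the next possible cause",
        "SSLError": "update the FusionSphere-PKI certificate, then rerun check_karbor_connect",
        "Error": "reset the endpoint with set_karbor_endpoints, then rerun check_karbor_connect",
    }
    # Any value the procedure does not cover falls through to technical support.
    return actions.get(check_result.strip(), "contact technical support")
```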

  3. Possible cause 3: The connection to Cinder is abnormal. Volume information cannot be obtained.

    1. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    2. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    3. Run the check_karbor_connect command to check the Check_Result value of Cinder.
      • If the value is OK, go to 4.
      • If the value is SSLError, update the FusionSphere-PKI certificate of FusionSphere by following Certificate Management > Replacing Certificates in a Unified Manner > Updating a Certificate in HUAWEI CLOUD Stack 6.5.0 Security Management Guide. Then repeat 3.c.
      • If the value is Error, run the set_karbor_endpoints --cinder_endpoint URL of Cinder command to set the endpoint of Cinder. You can search for DMK_g_regions:fsp_Cascading.cinder in the xxx_export_all_EN.xlsm file exported by HUAWEI CLOUD Stack Deploy to obtain the URL of Cinder. Repeat 3.c.
    4. If the fault persists, contact technical support for assistance.

  4. Possible cause 4: The replication quota is insufficient.

    1. Check whether the value of fail_code in the additional information is CSBS.9006.
      • If yes, go to 4.b.
      • If no, contact technical support for assistance.
    2. Contact the tenant corresponding to domain_name in the additional information and instruct the tenant to log in to ManageOne Operation Portal using the VDC administrator account and switch to the project corresponding to project_name in the additional information.
    3. Choose Cloud Server Backup Service > Policies. Locate the policy corresponding to backup_policy_name in the location information, and view its project in Target Region under Replication.
    4. Switch to the target project and apply for backup space. For details about how to apply for space, see section Applying for Space in Cloud Server Backup Service (CSBS) of HUAWEI CLOUD Stack 6.5.0 User Guide.

Related Information

None

1020759 Failed to Connect to the ManageOne Operation Platform

Description

This alarm is generated when the connection to the ManageOne operation platform fails.

Attribute

ID

Alarm Level

Automatically Cleared

1020759

Major

Yes

Impact on the System

Backup space and replication space cannot be applied for by CSBS and VBS.

Possible Causes

  • The network between a Karbor node and the ManageOne operation platform is disconnected.
  • The URL of the ManageOne operation platform is incorrect.
  • The ManageOne operation platform is faulty.

Procedure

  1. Possible cause 1: The network between a Karbor node and the ManageOne Operation Portal is disconnected.

    1. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    2. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    3. Run the docker exec -ti karborapi bash -c "cat /etc/karbor/karbor.conf" | grep sc_endpoint command to obtain the URL of ManageOne Operation Portal.
    4. Run the ping command to check whether the network between the Karbor node and ManageOne Operation Portal is connected.
      • If yes, go to 2.
      • If no, contact the network administrator to recover the network.
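
Because ping takes a host name or IP address rather than a full URL, the host has to be taken out of the sc_endpoint value first. A minimal sketch, assuming the grep output is a `key = value` line whose value is a URL with a scheme; the helper name `host_from_conf_line` is hypothetical:

```python
from urllib.parse import urlsplit


def host_from_conf_line(grep_line: str) -> str:
    """Extract the host to ping from a 'key = URL' line of karbor.conf."""
    _, _, value = grep_line.partition("=")
    host = urlsplit(value.strip()).hostname
    if host is None:
        # No '//' authority part was found, so the value is not a usable URL.
        raise ValueError(f"no host found in {value.strip()!r}")
    return host
```

The same helper applies to any other endpoint entry in karbor.conf that holds a URL.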

  2. Possible cause 2: The URL of ManageOne Operation Portal is incorrect.

    1. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    2. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    3. Run the set_karbor_endpoints --sc_endpoint sc_url command to configure the correct URL of ManageOne Operation Portal. You can search for DMK_g_console:silvan.rest_address in the xxx_export_all_EN.xlsm file exported by HUAWEI CLOUD Stack Deploy to obtain the sc_url address.
    4. Run the check_karbor_connect command to check whether the communication between the Karbor node and ManageOne Operation Portal recovers.
      • If yes, no further action is required.
      • If no, go to 3.

  3. Possible cause 3: The ManageOne Operation Portal is faulty.

    1. Contact the ManageOne Operation Portal administrator to ensure that the ManageOne Operation Portal is running properly.

Related Information

None

1020758 Failed to Report Metering Data

Description

This alarm is generated when metering data fails to be reported.

Attribute

ID

Alarm Level

Automatically Cleared

1020758

Major

Yes

Impact on the System

CSBS cannot generate billing information.

Possible Causes

  • The network between a Karbor node and FusionSphere OpenStack is disconnected.
  • The connection to Ceilometer is abnormal.

Procedure

  1. Possible cause 1: The network between a Karbor node and FusionSphere OpenStack is disconnected.

    1. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    2. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    3. Run the docker exec -ti karborapi bash -c "cat /etc/karbor/karbor.conf" | grep ceilometer_endpoint command to obtain the URL of Ceilometer.
    4. Run the ping command against the host in the Ceilometer URL to check whether the network connection to FusionSphere OpenStack Ceilometer is normal.
      • If yes, go to 2.
      • If no, contact technical support for assistance.

  2. Possible cause 2: The connection to Ceilometer is abnormal.

    1. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    2. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    3. Run the check_karbor_connect command to view the Check_Result value of Ceilometer.
      • If the value is OK, contact technical support for assistance.
      • If the value is SSLError, update the FusionSphere-PKI certificate of FusionSphere by following Certificate Management > Replacing Certificates in a Unified Manner > Updating a Certificate in HUAWEI CLOUD Stack 6.5.0 Security Management Guide. Then repeat 2.c.
      • If the value is Error, run the set_karbor_endpoints --cei_endpoint URL of Ceilometer command to set the endpoint of Ceilometer. You can search for DMK_g_regions:fsp_Cascading.metering in the xxx_export_all_EN.xlsm file exported by HUAWEI CLOUD Stack Deploy to obtain the URL of Ceilometer. Repeat 2.c.
    4. If the fault persists, contact technical support for assistance.

Related Information

None

1023093 Abnormal Network Between Backup Service Nodes

Description

This alarm is generated when the network between backup service nodes is disconnected or the packet loss rate is too high.

Attribute

ID

Alarm Level

Automatically Cleared

1023093

Major

Yes

Impact on the System

None

Possible Causes

  • The VM status is abnormal.
  • The IP address does not exist.
  • Access is not allowed according to the firewall policy.
  • The route is set incorrectly.
  • The network device is faulty.

Procedure

  1. Possible cause 1: The VM status is abnormal.

    1. Log in to ManageOne Operation Portal as user admin.
    2. Check Logical Location of the current alarm on the Alarms page.
    3. Go back to the home page and choose ServiceOM > Logical Location > ECS > Compute Instances. Search for the VM by name using Service-CSBS.
    4. Check whether the VM and power supply are running.
      • If no, go to 1.e.
      • If yes, go to 2.
    5. Restore the power supply of the VM and restart the VM.

      After the node system is started, check whether the alarm is cleared.

      • If yes, no further action is required.
      • If no, go to 2.

  2. Possible cause 2: The IP address does not exist.

    1. Click More > VNC Login to log in to the Karbor node.
    2. Run the ifconfig command to check whether the IP address exists.
      • If no, run the service network restart command to restart the network, and then check whether the alarm is cleared.
      • If yes, go to 3.

  3. Possible cause 3: Access is not allowed according to the firewall policy.

    Run the iptables -nL command to check whether the network segment to which DestinationIP belongs is forbidden according to the firewall policy.

    • If yes, run the iptables command to delete the forbidden items. For details about this command, run the iptables -h command.
    • If no, go to 4.
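
Deciding whether DestinationIP falls inside a forbidden network segment is a plain subnet-membership test. The sketch below assumes the operator has already copied the source or destination networks of the DROP or REJECT rules out of the iptables -nL output by hand; it does not parse that output, and the function name `blocking_networks` is hypothetical.

```python
import ipaddress


def blocking_networks(destination_ip, rule_networks):
    """Return the rule networks (CIDR or bare IP strings) that contain destination_ip."""
    ip = ipaddress.ip_address(destination_ip)
    return [
        net
        for net in rule_networks
        # strict=False accepts entries with host bits set, e.g. 192.0.2.5/24.
        if ip in ipaddress.ip_network(net, strict=False)
    ]
```

An empty result suggests that none of the listed rules covers the address, in which case the procedure continues with possible cause 4.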

  4. Possible cause 4: The route is set incorrectly.

    Run the route -n command. Collect the command output before contacting technical support engineers.

  5. Possible cause 5: The network device is faulty.

    Contact technical support for assistance.

Related Information

None

1020756 Failed to Register CSBS-VBS with the Unified Certificate Management Service

Description

This alarm is generated when CSBS-VBS fails to be registered with the unified certificate management service of ManageOne.

Attribute

ID

Alarm Level

Automatically Cleared

1020756

Minor

Yes

Impact on the System

CSBS-VBS certificates cannot be replaced in batches using the unified certificate management service.

Possible Causes

  • The URL of the unified certificate management service is incorrect.
  • The connection between CSBS-VBS and the unified certificate management service is abnormal.

Procedure

  1. Possible cause 1: The URL of the unified certificate management service is incorrect.

    1. Check whether the value of UnifiedCertificateManagementServiceURL in the additional information of the alarm is in the https://console-silvan.{domain_name}:26335 format, and whether {domain_name} matches the planned value (param_value) of global_domain_name in the basic_parameters sheet of the xxx_export_all_EN.xlsm file exported by HUAWEI CLOUD Stack Deploy.
      • If yes, go to 2.
      • If no, go to 1.b.
    2. Use PuTTY to log in to any Service-CSBS node using the IP addresses corresponding to the CSBS_Service field.

      The default account is djmanager. The default password is CloudService@123!.

    3. Run the su root command and enter the root account's password to switch to the root account.

      The default password of the root account is Cloud12#$.

    4. Run the set_karbor_endpoints --cmc_endpoint cmc_url command to configure the URL of the unified certificate management service. In the command, cmc_url is specified based on the format and planned value of global_domain_name mentioned in 1.a. Check whether the message "Set cmc_endpoint successfully." is displayed in the command output and whether the alarm is automatically cleared after 15 minutes.
      • If yes, no further action is required.
      • If no, go to 2.
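
The format check in step 1.a can be expressed as a short validation. This is a sketch under the assumption that the URL carries no path or query beyond the host and port shown in the format; the names `expected_cmc_url` and `cmc_url_matches` are hypothetical.

```python
from urllib.parse import urlsplit

CMC_PORT = 26335  # port given in the documented URL format


def expected_cmc_url(global_domain_name: str) -> str:
    """Build the URL in the documented https://console-silvan.{domain_name}:26335 format."""
    return f"https://console-silvan.{global_domain_name.strip()}:{CMC_PORT}"


def cmc_url_matches(url: str, global_domain_name: str) -> bool:
    """Check scheme, host, and port of the alarm's URL against the planned domain."""
    parts = urlsplit(url.strip())
    try:
        port = parts.port
    except ValueError:
        return False  # port is not a valid number
    # urlsplit lowercases the host, so compare against a lowercased expectation.
    expected_host = f"console-silvan.{global_domain_name.strip()}".lower()
    return (
        parts.scheme == "https"
        and parts.hostname == expected_host
        and port == CMC_PORT
        and parts.path in ("", "/")
    )
```

The value returned by `expected_cmc_url` is what would be passed as cmc_url to set_karbor_endpoints --cmc_endpoint.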

  2. Possible cause 2: The connection between CSBS-VBS and the unified certificate management service is abnormal.

    1. Contact technical support for assistance.

Related Information

None

Updated: 2019-08-30

Document ID: EDOC1100062365
