HUAWEI CLOUD Stack 6.5.0 Alarm and Event Reference 04

OceanStor DJ


ALM-73399 Failed to Archive Operation Logs

Description

This alarm is generated when an exception occurs during a periodic operation log archive.

Attribute

ID      Alarm Severity    Automatically Cleared
73399   Major             Yes

Impact on the System

The space used by the system database increases.

Possible Causes

The directory for storing archived operation logs does not have sufficient space.

Procedure

  1. View alarm information to locate the faulty node, for example, the SFS_DJ01 node. To obtain its IP address, search for SFS_DJ01 in the 2.1 Tool-generated IP Parameters sheet of the xxx_export_all_EN.xlsm file exported after deployment using HUAWEI CLOUD Stack Deploy.
  2. Use PuTTY to log in to the faulty node.

    The default user name is djmanager. The default password is CloudService@123!.

  3. Run the following command to check whether the directory for storing archived operation logs is short of space.

    df -m /var/log/huawei/dj/services/system/dashboard/dump
    • If yes, increase the space of the directory for storing archived operation logs, or delete unnecessary files to release space. Then go to 4.
    • If no, contact technical support for assistance.

  4. Check whether the alarm is cleared.

    • If the alarm is cleared, no further operation is required.
    • If no, contact technical support for assistance.

Related Information

None

ALM-73299 Abnormal Nodes

Description

The system periodically checks the status of all nodes. This alarm is generated when an abnormal node is detected.

Attribute

ID      Alarm Severity    Automatically Cleared
73299   Major             Yes

Impact on the System

The node cannot provide all services.

Possible Causes

  • The device of the node is powered off.
  • The network connection of the node is abnormal.
  • The hardware on the node is faulty.

Procedure

  1. View alarm information to locate the faulty node, for example, the SFS_DJ01 node. To obtain its IP address, search for SFS_DJ01 in the 2.1 Tool-generated IP Parameters sheet of the xxx_export_all_EN.xlsm file exported after deployment using HUAWEI CLOUD Stack Deploy.
  2. Possible cause 1: The device of the node is powered off.

    1. Check whether the device of the node is powered off.
      • If yes, go to 2.2.
      • If no, go to 3.
    2. Power on the device of the node.

      After the device is started, wait 1 minute and check whether the alarm is cleared.

      • If yes, no further action is required.
      • If no, go to 3.

  3. Possible cause 2: The network connection of the node is abnormal.

    1. In IPv4 scenarios, run the ping <node IP address> command. In IPv6 scenarios, run the ping6 <node IP address> command. The node IP address is obtained in 1. Check whether the network connection of the node is normal.
      • If yes, go to 4.
      • If no, go to 3.2.
    2. Contact the equipment room administrator to recover the network connection of the node.

      After the network connection is recovered, wait 1 minute and check whether the alarm is cleared.

      • If yes, no further action is required.
      • If no, go to 4.

  4. Possible cause 3: The hardware on the node is faulty.

    1. Restart the node.

      Check whether the node can be restarted successfully.

      • If yes, go to 4.3.
      • If no, go to 4.2.
    2. Check whether the hardware on the node is faulty.
      1. Check whether any disks on the node are faulty. If any disks are faulty, replace them.
      2. Check whether any other components on the node are faulty. If any components are faulty, replace them. Contact technical support for assistance.
      3. After the hardware fault is rectified, restart the node.
    3. After the node is restarted, wait 1 minute and check whether the alarm is cleared.
      • If yes, no further action is required.
      • If no, contact technical support for assistance.
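
The connectivity check for possible cause 2 can be scripted. The following is a minimal sketch, not part of the product tooling: the function names pick_ping_cmd and check_node are hypothetical, and treating any address that contains a colon as IPv6 is an assumption made for illustration.

```shell
#!/bin/bash
# Hypothetical helper: choose ping or ping6 from the address format.
# Assumption: an address containing ":" is IPv6; anything else is IPv4.
pick_ping_cmd() {
    case "$1" in
        *:*) echo "ping6" ;;
        *)   echo "ping" ;;
    esac
}

# Hypothetical helper: send 4 probes and report whether the node answered.
check_node() {
    local ip="$1"
    if "$(pick_ping_cmd "$ip")" -c 4 "$ip" >/dev/null 2>&1; then
        echo "reachable"
    else
        echo "unreachable"
    fi
}

# Usage (the address is an example, not a value from this document):
#   check_node 192.168.1.10
```

An "unreachable" result corresponds to the "If no" branch of the network check.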

Related Information

None

ALM-73298 Abnormal Components

Description

The system periodically checks component status on each node. This alarm is generated if a node has abnormal components.

Attribute

ID      Alarm Severity    Automatically Cleared
73298   Major             Yes

Impact on the System

If a node has abnormal components, this node cannot carry services.

Possible Causes

Component processes end unexpectedly.

Procedure

  1. View alarm information. Use PuTTY to log in to the node mentioned in the alarm location information, for example, the SFS_DJ01 node. To obtain its IP address, search for SFS_DJ01 in the 2.1 Tool-generated IP Parameters sheet of the xxx_export_all_EN.xlsm file exported after deployment using HUAWEI CLOUD Stack Deploy.

    The default user name is djmanager. The default password is CloudService@123!.

  2. Run the following command to query the component status on this node.

    show_service --node <node name>

    Check whether the command output contains components whose status is fault.

    • If yes, go to 3.
    • If no, go to 7.

  3. Run the following command to stop the faulty components.

    stop_service --service <component name>

    View the command output to check whether the execution result is success.

    • If yes, go to 4.
    • If no, go to 7.

  4. Run the following command to restart the faulty components.

    start_service --service <component name>

    View the command output to check whether the execution result is success.

    • If yes, go to 5.
    • If no, go to 7.

  5. Run the following command to re-query the component status on this node.

    show_service --node <node name>

    Check whether the component status is fault in the command output.

    • If yes, go to 7.
    • If no, go to 6.

  6. Wait 1 minute and check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 7.

  7. Contact technical support for assistance.

Related Information

None

ALM-73297 Service Switchover

Description

This alarm is generated when the standby service changes to the active state. It is used to prompt users to manually clear the alarm.

Attribute

ID      Alarm Severity    Automatically Cleared
73297   Information       No

Impact on the System

This alarm records the event that the standby service changes to the active state. There is no impact on the system.

Possible Causes

  • The active service is abnormal.
  • A maintenance command is run to stop the active service.
  • The node or network where the active service resides is faulty.

Procedure

  1. View alarm information. Use PuTTY to log in to the node mentioned in the alarm location information, for example, the SFS_DJ01 node. To obtain its IP address, search for SFS_DJ01 in the 2.1 Tool-generated IP Parameters sheet of the xxx_export_all_EN.xlsm file exported after deployment using HUAWEI CLOUD Stack Deploy.

    The default user name is djmanager. The default password is CloudService@123!.

  2. Run the following command to check whether the value of status is fault or stopped.

    show_service
    • If yes, contact technical support for assistance.
    • If no, go to 3.

  3. Manually clear the alarm.

Related Information

None

ALM-73296 Abnormal Clock Synchronization Between the System and the External NTP Server

Description

This alarm is generated when the clock synchronization between the system and the external NTP server is abnormal.

Attribute

ID      Alarm Severity    Automatically Cleared
73296   Major             Yes

Impact on the System

The system does not synchronize time with the external clock source.

Possible Causes

  • The network communication between the system and the external NTP server is abnormal.
  • The external NTP server does not run properly.

Procedure

  1. Possible cause 1: The network communication between the system and the external NTP server is abnormal.

    1. Use a browser to log in to the OceanStor DJ administrator GUI.

      The login address is https://<management floating IP address of OceanStor DJ>:8088. The management floating IP address of OceanStor DJ is the value of the SFS_MANAGE_FLOAT_IP parameter in the 2.1 Tool-generated IP Parameters sheet of the xxx_export_all_EN.xlsm file exported after deployment using HUAWEI CLOUD Stack Deploy.

      The default user name is cloud_admin. The default password is CloudService@123!.

    2. Choose System > Region Settings > NTP Server to view the NTP server IP address.
    3. Use PuTTY to log in to any OceanStor DJ management node, for example, the SFS_DJ01 node. To obtain its IP address, search for SFS_DJ01 in the 2.1 Tool-generated IP Parameters sheet of the xxx_export_all_EN.xlsm file exported after deployment using HUAWEI CLOUD Stack Deploy.

      The default user name is djmanager. The default password is CloudService@123!.

    4. Run the ntptrace -n <NTP server ip> command to check whether the following information is displayed:
      DJNode01:/home # ntptrace -n 192.168.1.1
      192.168.1.1: stratum 11, offset 0.000000, synch distance 0.011136
      • If yes, go to 2.
      • If no, go to 1.5.
    5. Contact the network administrator to rectify the network fault. After the system automatically synchronizes time with the external NTP server, check whether the alarm is cleared.
      • If yes, no further action is required.
      • If no, go to 2.

  2. Possible cause 2: The external NTP server does not run properly.

    1. Contact the NTP server maintenance personnel to check whether the external NTP server runs properly.
      • If yes, contact technical support for assistance.
      • If no, go to 2.2.
    2. After the maintenance personnel rectify the fault, check whether the alarm is cleared.
      • If yes, no further action is required.
      • If no, contact technical support for assistance.
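
The ntptrace check for possible cause 1 can be automated by matching the output against the shape of the sample line shown in the procedure. This is a sketch under stated assumptions: the function name ntptrace_ok is hypothetical, and the pattern is inferred only from that single sample line, so other ntptrace versions may print a different format.

```shell
#!/bin/bash
# Hypothetical helper: read ntptrace output on stdin and succeed only if the
# first line has the "<server>: stratum N, offset X, synch distance Y" shape
# shown in the sample output of the procedure.
ntptrace_ok() {
    head -n 1 | grep -Eq '^[^ ]+: stratum [0-9]+, offset -?[0-9.]+, synch distance [0-9.]+'
}

# Usage on a node (the server address is the example from the procedure):
#   ntptrace -n 192.168.1.1 | ntptrace_ok && echo "NTP server answered"
```

If the function fails, the result corresponds to the "If no" branch of the ntptrace step.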

Related Information

None

ALM-73295 Failed to Back Up System Data

Description

This alarm is generated when system data backup fails.

Attribute

ID      Alarm Severity    Automatically Cleared
73295   Major             Yes

Impact on the System

System data fails to be backed up, which affects system reliability.

Possible Causes

  • The disk where the backup directory is located has insufficient space.
  • The network communication between the system and the external backup server is abnormal.

Procedure

  1. View alarm information to locate the faulty node.
  2. Possible cause 1: The disk where the backup directory is located has insufficient space.

    1. Use PuTTY to log in to the faulty node, for example, the SFS_DJ01 node. To obtain its IP address, search for SFS_DJ01 in the 2.1 Tool-generated IP Parameters sheet of the xxx_export_all_EN.xlsm file exported after deployment using HUAWEI CLOUD Stack Deploy.

      The default user name is djmanager. The default password is CloudService@123!.

    2. Run the following command and enter the password of user root to switch to user root:

      su - root

    3. Delete unnecessary files in the /opt/ directory to save disk space.
    4. After the next backup task is complete, check whether the alarm persists.
      • If yes, go to 3.
      • If no, no further action is required.

  3. Possible cause 2: The network communication between the system and the external backup server is abnormal.

    1. In IPv4 scenarios, run the ping <IP address of the external backup server> command. In IPv6 scenarios, run the ping6 <IP address of the external backup server> command. Check whether the physical network connection between the system and the external backup server is normal.
      • If yes, go to 3.2.
      • If no, repair the network connection.
      • If no, repair the network connection.
    2. Check whether the FTPS service on the external backup server is normal.
      For an FTP server, use PuTTY to log in to any SFS node, for example, the SFS_DJ01 node. For details about how to log in, see 2.1. On the SFS node, run the curl -u {ftp_username}:{ftp_password} -k ftp://{ftp_ip}:{ftp_port} --noproxy {ftp_ip} command to check whether any data can be obtained. For an FTPS server, run the curl -u {ftp_username}:{decrypt_pwd} --ftp-ssl-reqd -k ftps://{ftp_ip}:{ftp_port}/ --noproxy {ftp_ip} command to check whether any data can be obtained.
      • If yes, contact technical support for assistance.
      • If no, recover the FTPS service on the external backup server. Then check whether the alarm is cleared.
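
The two curl probes above differ only in the scheme and the --ftp-ssl-reqd flag, so they can be generated by one helper. The following is a sketch only: the function name ftp_probe_cmd is hypothetical, the flags are copied from the commands in the procedure, and an IPv4 server address is assumed. Unlike the commands in the procedure, the password is omitted after -u so that curl prompts for it instead of showing it in the process list.

```shell
#!/bin/bash
# Hypothetical helper: print the curl probe for an FTP or FTPS backup server.
# Arguments: mode (ftp or ftps), user name, server IP address, server port.
ftp_probe_cmd() {
    local mode="$1" user="$2" ip="$3" port="$4"
    case "$mode" in
        ftp)  echo "curl -u $user -k ftp://$ip:$port --noproxy $ip" ;;
        ftps) echo "curl -u $user --ftp-ssl-reqd -k ftps://$ip:$port/ --noproxy $ip" ;;
        *)    echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

# Usage (all values are examples, not values from this document):
#   ftp_probe_cmd ftps backup_user 192.168.1.20 990
```

Run the printed command on the SFS node. If it returns a directory listing, the service is answering.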

Related Information

None

ALM-73294 Failed to Dump Alarms

Description

This alarm is generated when an exception occurs during a periodic alarm dump.

Attribute

ID      Alarm Severity    Automatically Cleared
73294   Major             Yes

Impact on the System

The space used by the system database increases.

Possible Causes

The directory for storing dumped alarms does not have sufficient space.

Procedure

  1. View alarm information to locate the faulty node.
  2. Possible cause: The directory for storing dumped alarms does not have sufficient space.

    1. Use PuTTY to log in to the faulty node, for example, the SFS_DJ01 node. To obtain its IP address, search for SFS_DJ01 in the 2.1 Tool-generated IP Parameters sheet of the xxx_export_all_EN.xlsm file exported after deployment using HUAWEI CLOUD Stack Deploy.

      The default user name is djmanager. The default password is CloudService@123!.

    2. Run the df -h /opt/ command.
      Filesystem      Size  Used Avail Use% Mounted on
      /dev/sda1        80G   13G   68G  16% /opt
      Check whether the value of Use% reaches 100%.
      • If yes, increase the space of the /opt directory for storing dumped alarms, or delete unnecessary files to release space.
      • If no, contact technical support for assistance.

Related Information

None

ALM-73283 Failed to Reset a Password

Description

This alarm is generated when resetting a component's password fails.

Attribute

ID      Alarm Severity    Automatically Cleared
73283   Major             No

Impact on the System

An incorrect password will result in an internal service connection failure, which will affect services.

Possible Causes

An internal error occurs during password resetting.

Procedure

  1. Use a browser to log in to the OceanStor DJ administrator GUI.

    The login address is https://<management floating IP address of OceanStor DJ>:8088. The management floating IP address of OceanStor DJ is the value of the SFS_MANAGE_FLOAT_IP parameter in the 2.1 Tool-generated IP Parameters sheet of the xxx_export_all_EN.xlsm file exported after deployment using HUAWEI CLOUD Stack Deploy.

    The default user name is cloud_admin. The default password is CloudService@123!.

  2. Choose Monitoring > Alarms > Current Alarms. Select the alarm and click Clear.
  3. Reset the password again, and check whether the alarm persists.

    • If yes, contact technical support for assistance.
    • If no, no further action is required.

Related Information

None

ALM-73282 Failed to Verify the FTP Server Certificate

Description

This alarm is generated when the FTP server certificate verification fails.

Attribute

ID      Alarm Severity    Automatically Cleared
73282   Minor             Yes

Impact on the System

The system will access the FTP server using an insecure connection.

Possible Causes

The certificate has expired, has been revoked, or was issued by an untrusted CA.

Procedure

  1. On the maintenance terminal, use WinSCP to upload the CA certificate file of the FTPS server from the customer to the /home/djmanager directory on all nodes. (To obtain the IP addresses, search for SFS_DJ01, SFS_DJ02, and SFS_DJ03 in the 2.1 Tool-generated IP Parameters sheet of the xxx_export_all_EN.xlsm file exported after deployment using HUAWEI CLOUD Stack Deploy.)
  2. Use PuTTY to log in to an OceanStor DJ node as user djmanager. The default password of user djmanager is CloudService@123!. To ensure system security, you are advised to periodically change the password.
  3. Run the cd /opt/huawei/dj/bin/digital_certificate command to go to the directory where the import script is stored.
  4. Run the ./import_ftps_ca.sh -f /home/djmanager/file_name command to import the certificate to the default installation directory (/opt/huawei/dj/DJSecurity/ftps-ca/) of OceanStor DJ.

    ./import_ftps_ca.sh -f /home/djmanager/file_name 
    import_ftps_ca.sh: Import "/home/djmanager/file_name to directory /opt/huawei/dj/DJSecurity/ftps-ca/" OK

    file_name is the name of the CA certificate of the FTPS server.

  5. Use PuTTY to log in to the management floating IP address of OceanStor DJ. (The management floating IP address of OceanStor DJ is the value of the SFS_MANAGE_FLOAT_IP parameter in the 2.1 Tool-generated IP Parameters sheet of the xxx_export_all_EN.xlsm file exported after deployment using HUAWEI CLOUD Stack Deploy.)
  6. Run the /bin/bash /opt/huawei/dj/bin/backup/backup/gaussdb_backup_entry.sh command and check whether the alarm persists.

    • If yes, contact technical support for assistance.
    • If no, no further action is required.

Related Information

None

ALM-73279 System Certificate Will Expire Soon

Description

This alarm is generated when the system certificate will expire soon.

Attribute

ID      Alarm Severity    Automatically Cleared
73279   Major             Yes

Impact on the System

If the system certificate expires, it is not trusted, and system functions may be affected.

Possible Causes

The period from the current time to the certificate expiry time is less than the threshold.

Procedure

  1. Use a browser to log in to the OceanStor DJ administrator GUI.

    The login address is https://<management floating IP address of OceanStor DJ>:8088. The management floating IP address of OceanStor DJ is the value of the SFS_MANAGE_FLOAT_IP parameter in the 2.1 Tool-generated IP Parameters sheet of the xxx_export_all_EN.xlsm file exported after deployment using HUAWEI CLOUD Stack Deploy.

    The default user name is cloud_admin. The default password is CloudService@123!.

  2. Choose Monitoring > Alarms > Current Alarms to view the certificate-related alarm information.
  3. Log in to all management service nodes. The login account is djmanager and its default password is CloudService@123!. (To obtain the IP addresses of the management service nodes, search for SFS_DJ01, SFS_DJ02, and SFS_DJ03 in the 2.1 Tool-generated IP Parameters sheet of the xxx_export_all_EN.xlsm file exported after deployment using HUAWEI CLOUD Stack Deploy.)
  4. Find the certificate based on the certificate path in the alarm location information. Replace the certificate with the new one obtained from the customer. Do not change the certificate name.
  5. Run the /bin/bash /opt/huawei/dj/bin/digital_certificate/check_certificate_date.sh command on the SFS_DJ01 node.
  6. Check whether an alarm indicating that the certificate will expire is displayed on the Current Alarms page.

    • If yes, contact technical support for assistance.
    • If no, no further action is required.
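
Before and after replacing a certificate, its remaining validity can be checked with OpenSSL. The following is a minimal sketch under stated assumptions: the function name cert_expires_within and the 30-day window in the usage line are hypothetical (this document does not state the alarm threshold), and the certificate is assumed to be PEM encoded. The certificate path comes from the alarm location information.

```shell
#!/bin/bash
# Hypothetical helper: succeed (exit 0) if the certificate expires within the
# given number of days, return 1 if it stays valid, 2 if it cannot be read.
cert_expires_within() {
    local cert="$1" days="$2"
    [ -r "$cert" ] || { echo "cannot read $cert" >&2; return 2; }
    # openssl exits 0 when the certificate is still valid after the window.
    if openssl x509 -in "$cert" -noout -checkend "$((days * 86400))" >/dev/null 2>&1; then
        return 1
    fi
    return 0
}

# Usage (path and window are placeholders):
#   cert_expires_within <certificate path from the alarm> 30 && echo "expires soon"
```

Running the check against the new certificate before copying it into place helps confirm that the replacement is not itself close to expiry.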

Related Information

None

ALM-73278 System Certificate Has Expired

Description

This alarm is generated when the system certificate has expired.

Attribute

ID      Alarm Severity    Automatically Cleared
73278   Critical          Yes

Impact on the System

The system certificate is not trusted, and system functions may be affected.

Possible Causes

The system certificate has expired.

Procedure

  1. Use a browser to log in to the OceanStor DJ administrator GUI.

    The login address is https://<management floating IP address of OceanStor DJ>:8088. The management floating IP address of OceanStor DJ is the value of the SFS_MANAGE_FLOAT_IP parameter in the 2.1 Tool-generated IP Parameters sheet of the xxx_export_all_EN.xlsm file exported after deployment using HUAWEI CLOUD Stack Deploy.

    The default user name is cloud_admin. The default password is CloudService@123!.

  2. Choose Monitoring > Alarms > Current Alarms to view the certificate-related alarm information.
  3. Import a new certificate based on the certificate name to replace the certificate used for communication between OceanStor DJ components.
  4. Check whether the alarm is cleared the next day.

    • If yes, no further action is required.
    • If no, contact technical support for assistance.

Related Information

None

ALM-73277 Message Queue Frozen

Description

Part or all of the message queue is frozen, so services cannot produce or consume messages properly.

Attribute

ID      Alarm Severity    Automatically Cleared
73277   Critical          Yes

Impact on the System

Services depending on RabbitMQ may be abnormal.

Possible Causes

After RabbitMQ has been running for a long time, the message queue may freeze.

Procedure

  1. Use PuTTY to log in to any OceanStor DJ management node, for example, the SFS_DJ01 node. To obtain its IP address, search for SFS_DJ01 in the 2.1 Tool-generated IP Parameters sheet of the xxx_export_all_EN.xlsm file exported after deployment using HUAWEI CLOUD Stack Deploy.

    The default user name is djmanager. The default password is CloudService@123!.

  2. Run the stop_service --service rabbitmq command to stop all RabbitMQ nodes.
  3. Run the start_service --service rabbitmq command to start all RabbitMQ nodes.
  4. Check whether the alarm is automatically cleared.

    • If yes, no further action is required.
    • If no, go to 5.

  5. Collect logs from directory /var/log/huawei/dj/services/system/rabbitmq. Then, contact technical support for assistance.

Related Information

None

ALM-73276 Network Partition Occurs in Message Queue

Description

RabbitMQ nodes cannot communicate with each other due to network faults, resulting in data inconsistency and OceanStor DJ service failures.

Attribute

ID      Alarm Severity    Automatically Cleared
73276   Critical          Yes

Impact on the System

Services depending on RabbitMQ may be abnormal.

Possible Causes

RabbitMQ nodes cannot communicate with each other due to network faults on the OceanStor DJ internal plane.

Procedure

  1. Rectify the network faults on the OceanStor DJ internal plane.
  2. Use PuTTY to log in to any OceanStor DJ node, for example, the SFS_DJ01 node. To obtain its IP address, search for SFS_DJ01 in the 2.1 Tool-generated IP Parameters sheet of the xxx_export_all_EN.xlsm file exported after deployment using HUAWEI CLOUD Stack Deploy.

    The default user name is djmanager. The default password is CloudService@123!.

  3. Run the stop_service --service rabbitmq command to stop all RabbitMQ nodes.
  4. Run the start_service --service rabbitmq command to start all RabbitMQ nodes.
  5. Check whether the alarm is automatically cleared.

    • If yes, no further action is required.
    • If no, go to 6.

  6. Collect logs from directory /var/log/huawei/dj/services/system/rabbitmq. Then, contact technical support for assistance.

Related Information

None

ALM-73099 CPU Usage Exceeds the Threshold

Description

This alarm is generated when the CPU usage exceeds 80%.

Attribute

ID      Alarm Severity    Automatically Cleared
73099   Major             Yes

Impact on the System

The system may run slowly.

Possible Causes

The host is busy and overloaded.

Procedure

  1. View the location information of the alarm to locate the faulty node, for example, the SFS_DJ01 node. To obtain its IP address, search for SFS_DJ01 in the 2.1 Tool-generated IP Parameters sheet of the xxx_export_all_EN.xlsm file exported after deployment using HUAWEI CLOUD Stack Deploy.
  2. Use PuTTY to log in to the faulty node. The account is djmanager and its default password is CloudService@123!.
  3. Run the top -c command to check the row where the CPU usage exceeds the threshold. Find the COMMAND value in the row and check the name of the faulty process.
  4. If the process name does not contain manila or filemeter, run the kill -9 <PID> command (PID indicates the process ID) to end the process. Then check whether the CPU usage is restored 5 to 10 minutes later.

    • If yes, no further action is required.
    • If no, contact technical support for assistance.
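
The name check that precedes kill -9 can be expressed as a small guard so that excluded processes are not ended by mistake. This is an illustrative sketch: the function name is_protected_process is hypothetical, and the only rule it encodes is the one stated in the procedure (names containing manila or filemeter are excluded).

```shell
#!/bin/bash
# Hypothetical helper: succeed if the command line contains manila or
# filemeter, the two names that the procedure excludes from kill -9.
is_protected_process() {
    case "$1" in
        *manila*|*filemeter*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage, with pid taken from the PID column of top -c:
#   cmd="$(ps -o args= -p "$pid")"
#   is_protected_process "$cmd" || kill -9 "$pid"
```

The guard matches on the full command line, which is also what top -c shows in the COMMAND column.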

Related Information

None

ALM-73098 Memory Usage Exceeds the Threshold

Description

This alarm is generated when the memory usage exceeds 80%.

Attribute

ID      Alarm Severity    Automatically Cleared
73098   Major             Yes

Impact on the System

The system may run slowly.

Possible Causes

The host is busy and overloaded.

Procedure

  1. View the location information of the alarm to locate the faulty node, for example, the SFS_DJ01 node. To obtain its IP address, search for SFS_DJ01 in the 2.1 Tool-generated IP Parameters sheet of the xxx_export_all_EN.xlsm file exported after deployment using HUAWEI CLOUD Stack Deploy.
  2. Use PuTTY to log in to the faulty node. The account is djmanager and its default password is CloudService@123!.
  3. Run the top -c command to check the row where the memory usage exceeds the threshold. Find the COMMAND value in the row and check the name of the faulty process.
  4. If the process name does not contain manila or filemeter, run the kill -9 <PID> command (PID indicates the process ID) to end the process. Then check whether the memory usage is restored 5 to 10 minutes later.

    • If yes, no further action is required.
    • If no, contact technical support for assistance.

Related Information

None

ALM-73097 Disk Space Usage Exceeds the Threshold

Description

This alarm is generated when the disk space usage exceeds 80%.

Attribute

ID      Alarm Severity    Automatically Cleared
73097   Major             Yes

Impact on the System

System performance may deteriorate, and new data may not be saved successfully.

Possible Causes

Files stored on the disk occupy too much space.

Procedure

  1. View alarm information to determine partitions whose space usage exceeds 80%.
  2. Obtain the name of the faulty node from the alarm information, for example, the SFS_DJ01 node, and use PuTTY to log in to it. To obtain its IP address, search for SFS_DJ01 in the 2.1 Tool-generated IP Parameters sheet of the xxx_export_all_EN.xlsm file exported after deployment using HUAWEI CLOUD Stack Deploy.
  3. Run the following command to access a partition whose space usage exceeds 80%.

    cd <directory of a partition whose space usage exceeds 80%>

  4. Run the following command to clear unnecessary files.

    rm -rf <unnecessary file name>

  5. Run the following command to view the space usage of partitions.

    df

    After the command is executed, the space usage of each partition is displayed. View column 6 (Mounted on) to find the directory of each partition, and view column 5 (Use%) to check whether the used space of each partition exceeds 80%.

    • If yes, go to 4.
    • If no, go to 6.

  6. Wait 1 minute and check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, contact technical support for assistance.
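
Reading columns 5 and 6 of the df output by eye can be replaced with a small filter. The following is a sketch, assuming the one-line-per-file-system layout that df -P produces and mount points without spaces; the function name mounts_over_threshold is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read df -P output on stdin and print the mount point of
# every partition whose Use% is above the given threshold (default 80).
mounts_over_threshold() {
    local limit="${1:-80}"
    awk -v limit="$limit" 'NR > 1 {
        use = $5
        sub("%", "", use)          # strip the percent sign from column 5
        if (use + 0 > limit + 0) print $6
    }'
}

# Usage on a node:
#   df -P | mounts_over_threshold 80
```

An empty result means that no partition exceeds the threshold, which corresponds to the "If no" branch of step 5.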

Related Information

None

ALM-73096 Subhealthy Internal Plane Network

Description

This alarm is generated when the packet loss rate or latency on the internal plane network is excessively high.

Attribute

ID      Alarm Severity    Automatically Cleared
73096   Major             Yes

Impact on the System

System responses may time out, and services may even be affected.

Possible Causes

  • The network cable is loosely inserted.
  • The network cable is of poor quality.
  • The network port is faulty.

Procedure

  1. View alarm information to locate the network port that reported the alarm.
  2. Possible cause 1: The network cable is loosely inserted.

    1. Check whether the network cable is loosely inserted.
      • If yes, go to 2.2.
      • If no, go to 3.
    2. Remove and then reinsert the network cable. Ensure that the cable is securely inserted at both ends. Check whether the connection indicator on the network port is steady on.
      • If yes, no further action is required.
      • If no, go to 3.

  3. Possible cause 2: The network cable is faulty.

    1. Check whether the network cable is faulty.
      • If yes, go to 3.2.
      • If no, go to 4.
    2. Replace the network cable. Check whether the alarm is cleared.
      • If yes, no further action is required.
      • If no, go to 4.

  4. Possible cause 3: The network port is faulty.

    Replace the network adapter. Check whether the alarm is cleared.
    • If yes, no further action is required.
    • If no, contact technical support for assistance.
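
Packet loss on the affected plane can be measured with ping before and after a cable or network adapter is replaced. This is an assumption about how to verify the repair, not the detection method that the product uses; the function name ping_loss_percent is hypothetical, and the summary-line format is the one printed by common Linux ping implementations.

```shell
#!/bin/bash
# Hypothetical helper: read ping output on stdin and print the packet loss
# percentage from the summary line (for example, 25 for "25% packet loss").
ping_loss_percent() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Usage on a node (the peer address is an example, not from this document):
#   ping -c 20 192.168.1.11 | ping_loss_percent
```

The same measurement applies to the management plane and tenant plane alarms that follow, using a peer address on the corresponding network.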

Related Information

None

ALM-73095 Subhealthy Management Plane Network

Description

This alarm is generated when the packet loss rate or latency on the management plane network is excessively high.

Attribute

ID      Alarm Severity    Automatically Cleared
73095   Major             Yes

Impact on the System

System responses may time out, and services may even be affected.

Possible Causes

  • The network cable is loosely inserted.
  • The network cable is of poor quality.
  • The network port is faulty.

Procedure

  1. View alarm information to locate the network port that reported the alarm.
  2. Possible cause 1: The network cable is loosely inserted.

    1. Check whether the network cable is loosely inserted.
      • If yes, go to 2.2.
      • If no, go to 3.
    2. Remove and then reinsert the network cable. Ensure that the cable is securely inserted at both ends. Check whether the connection indicator on the network port is steady on.
      • If yes, no further action is required.
      • If no, go to 3.

  3. Possible cause 2: The network cable is faulty.

    1. Check whether the network cable is faulty.
      • If yes, go to 3.2.
      • If no, go to 4.
    2. Replace the network cable. Check whether the alarm is cleared.
      • If yes, no further action is required.
      • If no, go to 4.

  4. Possible cause 3: The network port is faulty.

    Replace the network adapter. Check whether the alarm is cleared.
    • If yes, no further action is required.
    • If no, contact technical support for assistance.

Related Information

None

ALM-73094 Subhealthy Tenant Plane Network

Description

This alarm is generated when the packet loss rate or latency on the tenant plane network is excessively high.

Attribute

ID      Alarm Severity    Automatically Cleared
73094   Major             Yes

Impact on the System

System responses may time out, and services may even be affected.

Possible Causes

  • The network cable is loosely inserted.
  • The network cable is of poor quality.
  • The network port is faulty.

Procedure

  1. View alarm information to locate the network port that reported the alarm.
  2. Possible cause 1: The network cable is loosely inserted.

    1. Check whether the network cable is loosely inserted.
      • If yes, go to 2.2.
      • If no, go to 3.
    2. Remove and then reinsert the network cable. Ensure that the cable is securely inserted at both ends. Check whether the connection indicator on the network port is steady on.
      • If yes, no further action is required.
      • If no, go to 3.

  3. Possible cause 2: The network cable is faulty.

    1. Check whether the network cable is faulty.
      • If yes, go to 3.2.
      • If no, go to 4.
    2. Replace the network cable. Check whether the alarm is cleared.
      • If yes, no further action is required.
      • If no, go to 4.

  4. Possible cause 3: The network port is faulty.

    Replace the network adapter. Check whether the alarm is cleared.
    • If yes, no further action is required.
    • If no, contact technical support for assistance.
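The alarm condition (high packet loss or latency) can also be checked by hand from a node on the tenant plane. The following is a minimal sketch, not the product's own detection logic: it parses the summary that the Linux iputils `ping` command prints and compares it with limits. The function names and the limits (1% loss, 10 ms average latency) are assumptions chosen for illustration; the thresholds that OceanStor DJ actually uses are not stated in this document.

```python
import re

# Matches the summary line printed by Linux iputils ping, for example:
# "10 packets transmitted, 9 received, 10% packet loss, time 9012ms"
_SUMMARY = re.compile(
    r"(\d+) packets transmitted, (\d+) (?:packets )?received.*?([\d.]+)% packet loss"
)
# Matches the round-trip line: "rtt min/avg/max/mdev = 0.321/0.402/0.611/0.092 ms"
_RTT = re.compile(r"= ([\d.]+)/([\d.]+)/([\d.]+)")


def parse_ping_summary(output):
    """Extract packet loss (%) and average latency (ms) from ping output."""
    summary = _SUMMARY.search(output)
    if summary is None:
        raise ValueError("no ping summary line found")
    rtt = _RTT.search(output)
    return {
        "sent": int(summary.group(1)),
        "received": int(summary.group(2)),
        "loss_percent": float(summary.group(3)),
        # The rtt line is absent when every packet is lost.
        "avg_ms": float(rtt.group(2)) if rtt else None,
    }


def is_subhealthy(stats, max_loss_percent=1.0, max_avg_ms=10.0):
    """Apply hypothetical limits; the product's real thresholds are not documented here."""
    if stats["loss_percent"] > max_loss_percent:
        return True
    return stats["avg_ms"] is None or stats["avg_ms"] > max_avg_ms


# Sample output in iputils format (192.0.2.10 is a documentation-only address).
SAMPLE = """\
--- 192.0.2.10 ping statistics ---
10 packets transmitted, 9 received, 10% packet loss, time 9012ms
rtt min/avg/max/mdev = 0.321/0.402/0.611/0.092 ms
"""
```

Passing `SAMPLE` to `parse_ping_summary` and the result to `is_subhealthy` flags the link because one of the ten packets was lost. Repeating the measurement after reinserting or replacing the cable shows whether the step helped.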

Related Information

None

ALM-70399 Back-End Storage Device Connection Error

Description

This alarm is generated when a back-end storage device connection error occurs.

Attribute

ID       Alarm Severity   Automatically Cleared
70399    Critical         Yes

Impact on the System

Operations on the storage device, storage pools, volumes, and file systems will fail.

Possible Causes

  • The user name or password is incorrect.
  • The network is faulty.

Procedure

  1. Possible cause 1: The user name or password is incorrect.

    Log in to the storage device as an administrator. Check whether the login is successful.
    • If yes, go to 6.
    • If no, contact the administrator to obtain the correct user name and password and go to 2.

  2. Log in to the OceanStor DJ administrator GUI.

    The login address is https://<Management floating IP address of OceanStor DJ>:8088. The management floating IP address of OceanStor DJ is the value of the SFS_MANAGE_FLOAT_IP parameter in the 2.1 Tool-generated IP Parameters sheet of the xxx_export_all_EN.xlsm file exported after deployment using HUAWEI CLOUD Stack Deploy.

    The default user name is cloud_admin. The default password is CloudService@123!.

  3. Choose Infrastructure > Storage Device > Enterprise Storage or Infrastructure > Storage Device > Distributed Storage.
  4. Select a storage device and choose More > Modify Access Info.
  5. Modify the access information of the device.
  6. Possible cause 2: The network is faulty.

    Contact the network administrator and check whether the network connection of the storage device is normal.
    • If yes, contact technical support for assistance.
    • If no, repair the network connection.
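Before involving the network administrator for possible cause 2, a quick TCP reachability test from an OceanStor DJ node can help narrow the fault down. The sketch below is one possible way to do this with the Python standard library; `can_connect` is a hypothetical helper, and the management address and port of the storage device are assumptions that must be replaced with the values used in your deployment.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


# Example call (hypothetical address and port of the storage device):
#   can_connect("192.0.2.20", 8088)
```

A result of `False` points to the network path or the device's management service, whereas `True` with a failing login points back to the user name or password (possible cause 1).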

Related Information

None

ALM-70391 Failed to Verify a Device Certificate

Description

This alarm is generated when a device certificate fails to be verified.

Attribute

ID       Alarm Severity   Automatically Cleared
70391    Minor            Yes

Impact on the System

The device identity cannot be verified. As a result, operations on the device may not be performed properly.

Possible Causes

  • The device certificate has expired or has not taken effect.
  • The device certificate is not trusted.

Procedure

  1. Possible cause 1: The device certificate has expired or has not taken effect.

    1. Locate the target device based on the device IP address in the alarm information.
    2. Log in to the target device and check whether the current time is within the validity period of the device certificate.
      • If yes, go to 2.
      • If no, update the device certificate. For details, see the user guide of the target device.

  2. Possible cause 2: The device certificate is not trusted.

    1. Use a browser to log in to the OceanStor DJ administrator GUI.

      The login address is https://<Management floating IP address of OceanStor DJ>:8088. The management floating IP address of OceanStor DJ is the value of the SFS_MANAGE_FLOAT_IP parameter in the 2.1 Tool-generated IP Parameters sheet of the xxx_export_all_EN.xlsm file exported after deployment using HUAWEI CLOUD Stack Deploy.

      The default user name is cloud_admin. The default password is CloudService@123!.

    2. Choose System > Region Settings > Certificate, and import the CA certificate corresponding to the device certificate.
    3. Wait 1 minute and check whether the alarm is cleared.
      • If yes, no further action is required.
      • If no, contact technical support for assistance.
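The validity check for possible cause 1 amounts to comparing the current time with the certificate's notBefore and notAfter dates. The following sketch illustrates that comparison under the assumption that the two dates are available as text in the OpenSSL format (for example, `Jan  5 09:34:43 2018 GMT`). The function name `validity_status` is hypothetical; how the dates are read from a given device is described in that device's user guide.

```python
import ssl
import time


def validity_status(not_before, not_after, now=None):
    """Classify a certificate as 'not_yet_valid', 'valid', or 'expired'.

    not_before and not_after are strings in the OpenSSL text format,
    for example "Jan  5 09:34:43 2018 GMT". now is seconds since the
    epoch and defaults to the current system time.
    """
    now = time.time() if now is None else now
    start = ssl.cert_time_to_seconds(not_before)
    end = ssl.cert_time_to_seconds(not_after)
    if now < start:
        return "not_yet_valid"  # the certificate has not taken effect
    if now > end:
        return "expired"
    return "valid"
```

Both `not_yet_valid` and `expired` correspond to the "If no" branch above, where the device certificate must be updated. Because the result depends on the system clock, an incorrect time on the node running the check can produce a misleading answer.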

Related Information

None

ALM-70384 Insufficient Capacity of a File Storage Service Level

Description

This alarm is generated when the capacity of a file storage service level is insufficient.

Attribute

ID       Alarm Severity   Automatically Cleared
70384    Major            Yes

Impact on the System

Creating a large-capacity file system may fail.

Possible Causes

The capacity usage of the file storage service level exceeds the threshold.

Procedure

  1. Use a browser to log in to the OceanStor DJ administrator GUI.

    The login address is https://<Management floating IP address of OceanStor DJ>:8088. The management floating IP address of OceanStor DJ is the value of the SFS_MANAGE_FLOAT_IP parameter in the 2.1 Tool-generated IP Parameters sheet of the xxx_export_all_EN.xlsm file exported after deployment using HUAWEI CLOUD Stack Deploy.

    The default user name is cloud_admin. The default password is CloudService@123!.

  2. Choose Resources > Service Level > File Storage Service Level and click the desired file storage service level. On the Summary tab page, check whether the capacity usage of the file storage service level exceeds the threshold.

    • If yes, go to 3.
    • If no, contact technical support for assistance.

  3. Select Storage Pool Group and click Add Storage Pool.
  4. Select one or more storage pools and click OK.
  5. Check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, contact technical support for assistance.
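When deciding how many storage pools to add in steps 3 and 4, it can help to estimate the capacity usage that results. The sketch below shows the arithmetic only. The function names are hypothetical, and the 80% threshold is an assumed value for illustration; use the threshold configured for the service level in your environment.

```python
def capacity_usage_percent(used_gb, total_gb):
    """Return used capacity as a percentage of total capacity."""
    if total_gb <= 0:
        raise ValueError("total capacity must be positive")
    return 100.0 * used_gb / total_gb


def exceeds_threshold(used_gb, total_gb, threshold_percent=80.0):
    """True if usage is above the threshold (80% is an assumed example value)."""
    return capacity_usage_percent(used_gb, total_gb) > threshold_percent


def usage_after_adding_pools(used_gb, total_gb, new_pool_sizes_gb):
    """Estimate usage after adding storage pools of the given sizes."""
    return capacity_usage_percent(used_gb, total_gb + sum(new_pool_sizes_gb))
```

For example, a service level with 850 GB used out of 1000 GB is at 85% usage. Adding pools of 500 GB and 200 GB raises the total to 1700 GB and lowers usage to 50%.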

Related Information

None

ALM-70895 Failed To Send Operation Logs To ManageOne log service

Description

This alarm is generated when tenant operation logs fail to be sent to ManageOne log service because of a network or ManageOne log service exception.

Attribute

ID       Alarm Severity   Automatically Cleared
70895    Major            Yes

Impact on the System

Operation logs of tenants cannot be uploaded in a timely manner. If the fault persists for a long time, operation logs may be lost.

Possible Causes

  1. The network is faulty. As a result, OceanStor DJ cannot communicate with ManageOne log service.
  2. Connection to ManageOne log service fails. As a result, operation logs cannot be received.

Procedure

  1. Possible cause 1: The network is faulty.

    1. Use PuTTY to log in to any OceanStor DJ management node, for example, the SFS_DJ01 node. To obtain its IP address, search for SFS_DJ01 in the 2.1 Tool-generated IP Parameters sheet of the xxx_export_all_EN.xlsm file exported after deployment using HUAWEI CLOUD Stack Deploy.

      The default user name is djmanager. The default password is CloudService@123!.

    2. Run the following command to check whether the network connection between the node and ManageOne log service is normal:

      ping <ManageOne log service floating IP address>

      To obtain the floating IP address of ManageOne log service, search for haproxy_Haproxy_CTS_Listen_IP in the 2.2 Tool-generated Other Parameters sheet of the xxx_export_all_EN.xlsm file exported after deployment using HUAWEI CLOUD Stack Deploy.
      • If yes, go to 2.
      • If no, contact technical support for assistance.

  2. Possible cause 2: Connection to ManageOne log service fails.

    1. Check whether connection to ManageOne log service succeeds. For details, see STaaS Solution 6.5.0 SFS Software Installation Guide (Private Cloud Scenario for HUAWEI CLOUD Stack 6.5.0) and choose Installing the Nth Region > Connection to Other Systems > Connecting to ManageOne log service.
      • If yes, contact technical support for assistance.
      • If no, go to 2.b.
    2. Connect to ManageOne log service again. For details, see STaaS Solution 6.5.0 SFS Software Installation Guide (Private Cloud Scenario for HUAWEI CLOUD Stack 6.5.0) and choose Installing the Nth Region > Connection to Other Systems > Connecting to ManageOne log service.
    3. Check whether the alarm is cleared.
      • If yes, no further action is required.
      • If no, contact technical support for assistance.
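The ping check for possible cause 1 can be scripted if it has to be repeated on several management nodes. The following is a minimal sketch, assuming a Linux node with the iputils `ping` command, where exit code 0 means at least one reply was received. The helper name `is_reachable` and its `run` parameter (which exists only so the command runner can be substituted) are hypothetical and not part of any product tool.

```python
import subprocess


def is_reachable(ip, count=4, timeout_s=2, run=subprocess.run):
    """Return True if ping exits with code 0 for the given address."""
    # -c: number of echo requests to send
    # -W: time to wait for each reply, in seconds (Linux iputils ping)
    cmd = ["ping", "-c", str(count), "-W", str(timeout_s), ip]
    result = run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


# Example call; replace the address with the value of
# haproxy_Haproxy_CTS_Listen_IP from the exported parameter file:
#   is_reachable("192.0.2.30")
```

A `True` result corresponds to the "If yes" branch of the ping step, so troubleshooting continues with possible cause 2.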

Related Information

None

ALM-73698 Certificate Has Expired

Description

This alarm is generated when a certificate imported to the system has expired.

Attribute

ID       Alarm Severity   Automatically Cleared
73698    Minor            Yes

Impact on the System

The certificate issued by a CA certificate is not trusted and device identity cannot be verified.

Possible Causes

The certificate has expired.

Procedure

  1. Use a browser to log in to the OceanStor DJ administrator GUI.

    The login address is https://<Management floating IP address of OceanStor DJ>:8088. The management floating IP address of OceanStor DJ is the value of the SFS_MANAGE_FLOAT_IP parameter in the 2.1 Tool-generated IP Parameters sheet of the xxx_export_all_EN.xlsm file exported after deployment using HUAWEI CLOUD Stack Deploy.

    The default user name is cloud_admin. The default password is CloudService@123!.

  2. Choose System > Region Settings > Certificate. Based on the certificate ID in the alarm, locate and delete the expired certificate.
  3. Import a valid certificate.
  4. Check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, contact technical support for assistance.

Related Information

None

ALM-73699 Certificate Is About to Expire

Description

This alarm is generated when a certificate imported to the system will expire soon.

Attribute

ID       Alarm Severity   Automatically Cleared
73699    Information      Yes

Impact on the System

After the certificate expires, certificates issued by the CA certificate are not trusted and device identity cannot be verified.

Possible Causes

The period from the current time to the certificate expiry time is less than the threshold.

Procedure

  1. Use a browser to log in to the OceanStor DJ administrator GUI.

    The login address is https://<Management floating IP address of OceanStor DJ>:8088. The management floating IP address of OceanStor DJ is the value of the SFS_MANAGE_FLOAT_IP parameter in the 2.1 Tool-generated IP Parameters sheet of the xxx_export_all_EN.xlsm file exported after deployment using HUAWEI CLOUD Stack Deploy.

    The default user name is cloud_admin. The default password is CloudService@123!.

  2. Choose System > Region Settings > Certificate. Based on the certificate ID in the alarm, locate and delete the certificate that is about to expire.
  3. Import a valid certificate.
  4. Check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, contact technical support for assistance.
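The cause of this alarm is a comparison between the time remaining until expiry and a threshold. The sketch below illustrates that comparison so that replacement certificates can be checked before they are imported. The function names and the 30-day threshold are assumptions for illustration; the threshold that OceanStor DJ applies is not stated in this document. The expiry date is expected in the OpenSSL text format, for example `Jan 11 00:00:00 2021 GMT`.

```python
import ssl
import time

SECONDS_PER_DAY = 86400


def days_until_expiry(not_after, now=None):
    """Return the number of days (possibly negative) until the notAfter date."""
    now = time.time() if now is None else now
    return (ssl.cert_time_to_seconds(not_after) - now) / SECONDS_PER_DAY


def expiry_state(not_after, threshold_days=30, now=None):
    """Return 'expired', 'about_to_expire', or 'ok' (threshold is an assumed value)."""
    remaining = days_until_expiry(not_after, now)
    if remaining <= 0:
        return "expired"          # corresponds to ALM-73698
    if remaining < threshold_days:
        return "about_to_expire"  # corresponds to ALM-73699
    return "ok"
```

A certificate that returns `ok` with a comfortable margin is a suitable candidate for step 3; one that already returns `about_to_expire` is likely to trigger the alarm again soon after import.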

Related Information

None

Updated: 2019-08-30

Document ID: EDOC1100062365
