FusionAccess 6.5 Alarm Handling 05

1000033 HA Active/Standby Heartbeat Fault

Description

After the high availability (HA) active/standby function is configured, the HA service checks every 2 minutes whether the heartbeat of the peer node is normal. This alarm is generated when the HA service detects that the heartbeat of the peer node is abnormal. It is automatically cleared when the HA service detects that the heartbeat is normal again.

Attribute

Alarm ID    Alarm Severity    Auto Clear
1000033     Critical          Yes

Parameters

Name

Meaning

Alarm ID

Identifies an alarm. Each alarm is uniquely identified by an alarm ID and an alarm name.

Alarm Severity

Indicates the severity of an alarm. Value:

  • Critical: indicates that a fault affecting services provided by the system occurs. You need to rectify the fault immediately. If a device or resource is faulty, rectify it immediately even if the fault occurs during non-working hours.
  • Major: indicates that a fault affecting the service quality of the system occurs. You need to rectify the fault immediately. If the service quality of a device or resource is degraded, rectify it immediately during working hours.
  • Minor: indicates a fault that does not affect service quality. To prevent more serious faults, this type of alarm needs to be observed or handled if necessary.
  • Warning: indicates a fault that may affect service quality. This type of alarm must be handled based on the error type.

Alarm Name

Identifies an alarm. Each alarm is uniquely identified by an alarm ID and an alarm name.

Object Type

Specifies the type of the object for which the alarm is generated.

Alarm Object Name

Specifies the name of the object for which the alarm is generated.

Generation Time

Specifies the time when the alarm is generated.

Clear Time

Specifies the time when the alarm is cleared.

Clear Mode

Specifies whether the alarm is manually or automatically cleared.

Operation

Specifies the operation that can be performed on the alarm.

Value: Manually Clear Alarm

Impact on the System

  • System reliability deteriorates.
  • Both HA nodes may become active at the same time, which makes the system unreliable.

Possible Causes

  • An IP address conflict exists.
  • The HA service is abnormal, or the server where the HA service runs is abnormal.
  • The HA active/standby configuration is incorrect.

Procedure

Check whether an IP address conflict exists, or whether the HA service or the server where the HA service runs is abnormal.

  1. Log in to the server where the alarm is generated using an administrator account and run the following command to check whether an IP address conflict exists:

      arping -c 3 -f -D -I eth0 <IP address of the server where the alarm is generated>

      If information similar to the following is displayed, no IP address conflict exists:

      ARPING 192.168.162.11 from 0.0.0.0 eth0
      Sent 3 probes (3 broadcast(s))
      Received 0 response(s)
      (Note: The IP addresses are only examples. Use the actual IP addresses.)

      If information similar to the following is displayed, an IP address conflict exists:

      ARPING 192.168.162.11 from 0.0.0.0 eth0
      Unicast reply from 192.168.162.11 [12:6E:D4:AB:CD:EF]  1.022ms
      Sent 1 probes (1 broadcast(s))
      Received 1 response(s)
      (Note: The preceding IP addresses and MAC addresses are only examples. Use the actual IP addresses and MAC addresses.)

    • If an IP address conflict exists, go to Step 2.
    • If no IP address conflict exists, go to Step 4.

  2. Log in to the server that causes the IP address conflict, and either shut it down or change its IP address. Then, on the server where the alarm is generated, run the arping command from Step 1 again to check whether the IP address conflict persists.

    • If yes, contact Huawei technical support.
    • If no, go to Step 3.

  3. Choose FusionAccess > Alarm to check whether the alarm still exists.

    • If yes, go to Step 4.
    • If no, no further operation is required.

  4. Log in as user gandalf to the server at the peer IP address provided in the alarm information. Check whether the login is successful.

    • If yes, go to Step 6.
    • If no, go to Step 5.

  5. Restart the server on FusionCompute. Check whether the server restarts successfully.

    • If yes, go to Step 4.
    • If no, contact Huawei technical support.

  6. Run the following command to restart the HA service:

    sudo service ha restart

  7. Wait 2 minutes and choose FusionAccess > Alarm to check whether the alarm still exists.

    • If yes, go to Step 8.
    • If no, no further operation is required.
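
The output check in Step 1 can be scripted. The following is a minimal sketch and not part of the product: the function name classify_arping is hypothetical, and it assumes that the arping output contains a "Received N response(s)" line as in the samples above.

```shell
#!/bin/sh
# classify_arping (hypothetical helper): reads arping output on stdin.
# Prints "conflict" if at least one reply was received,
# "no-conflict" if zero replies were received,
# and "unknown" if no "Received N response(s)" line is found.
classify_arping() {
    awk '
        /^[ \t]*Received [0-9]+ response/ {
            found = 1
            # $2 is the reply count on the "Received N response(s)" line
            result = ($2 > 0) ? "conflict" : "no-conflict"
        }
        END { print (found ? result : "unknown") }
    '
}
```

Example usage on the server where the alarm is generated (the IP address is only an example): arping -c 3 -f -D -I eth0 192.168.162.11 | classify_arping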

Check whether the HA active/standby configuration is incorrect.

  8. Log in as user root to the server at the peer IP address provided in the alarm information, and run the following command to view the configured peer IP address and check whether the HA active/standby configuration is incorrect:

      sh /opt/HA/module/hacom/script/config_ha.sh -a

      Information similar to the following is displayed. In this example, 192.168.6.26 is the local IP address and 192.168.6.27 is the peer IP address.

      HaMode:       double

      HaLocalName:  HA192220626(active)
      HaPeerName: HA192220627(standby)

      HaArbLk:      192.168.6.26:1234  --  192.168.6.27:1234
                    192.168.6.26:1236  --  192.168.6.27:1236

      HaSyncLk:     192.168.6.26:1235  --  192.168.6.27:1235
                    192.168.6.26:1237  --  192.168.6.27:1237

      HaRpcLk:      127.0.0.1:61806

      HaArpLk:      192.168.6.31

      HaGwLk:       192.168.6.1

    • If the configuration is incorrect, go to Step 9.
    • If the configuration is correct, contact Huawei technical support.

  9. Configure HA again using startTools.

    • If the server where HA runs is a GaussDB server, run startTools, choose Software > GaussDB > Configure GaussDB > Configure HA, and configure HA again.
    • If the server where HA runs is a vLB server, run startTools, choose Software > Custom Install > vLB > Configure vLB > Configure HA, and configure HA again.

  10. Wait 2 minutes and choose FusionAccess > Alarm to check whether the alarm still exists.

    • If yes, contact Huawei technical support.
    • If no, no further operation is required.
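
When comparing the configured addresses with the expected ones, it can help to pull the local and peer IP addresses out of the config_ha.sh -a output. The following is a minimal sketch, not a documented tool: the function name parse_ha_arb_link is hypothetical, and it assumes that the first HaArbLk line lists the local endpoint on the left of "--" and the peer endpoint on the right, as in the sample output above.

```shell
#!/bin/sh
# parse_ha_arb_link (hypothetical helper): reads the output of
# "config_ha.sh -a" on stdin and prints "local=<ip> peer=<ip>"
# taken from the first HaArbLk line. Prints nothing if no such line exists.
parse_ha_arb_link() {
    awk '
        $1 == "HaArbLk:" && !seen {
            split($2, l, ":")   # local endpoint, for example 192.168.6.26:1234
            split($4, p, ":")   # peer endpoint, for example 192.168.6.27:1234
            print "local=" l[1] " peer=" p[1]
            seen = 1
        }
    '
}
```

Example usage as user root: sh /opt/HA/module/hacom/script/config_ha.sh -a | parse_ha_arb_link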

Related Information

None

Updated: 2019-12-13

Document ID: EDOC1100061083
