eSight V300R010C00 Maintenance Guide 07

ALM-316010198 Data Replication Failure in the OMMHA System

Description

This alarm is generated when eSight detects that data replication between the active and standby nodes in the OMMHA system failed. eSight queries the OMMHA system status every 5 minutes.

The alarm can also be triggered by a temporary network fault. In that case, the alarm is cleared after the network recovers.

When the fault is eliminated, the system will automatically clear the alarm. Manual clearing is not required.

Alarm Attribute

Alarm ID     Alarm Severity   Alarm Type
316010198    Major            Environmental alarm

Impact on the System

If data cannot be synchronized between the active and standby databases, data may be lost after an active/standby switchover.

Possible Causes

The database on the standby server fails to replicate data from the active server.

Procedure

  1. Check the network between the two nodes of the cluster. If the network is abnormal, rectify the network fault.

    1. Log in to the active eSight server as the ossuser user.
    2. Ensure that the standby server is running properly and is reachable over the network. You can run the ping command against the system IP address and the heartbeat IP address of the standby server.
      ossuser@eSightServer:~> ping -c 4 10.137.63.225
      PING 10.137.63.225 (10.137.63.225) 56(84) bytes of data.
      64 bytes from 10.137.63.225: icmp_seq=1 ttl=64 time=0.477 ms
      64 bytes from 10.137.63.225: icmp_seq=2 ttl=64 time=0.439 ms
      64 bytes from 10.137.63.225: icmp_seq=3 ttl=64 time=0.437 ms
      64 bytes from 10.137.63.225: icmp_seq=4 ttl=64 time=0.384 ms
      
      --- 10.137.63.225 ping statistics ---
      4 packets transmitted, 4 received, 0% packet loss, time 2999ms
      rtt min/avg/max/mdev = 0.384/0.434/0.477/0.036 ms
      ossuser@eSightServer:~> ping -c 4 192.168.122.1
      PING 192.168.122.1 (192.168.122.1) 56(84) bytes of data.
      64 bytes from 192.168.122.1: icmp_seq=1 ttl=64 time=0.018 ms
      64 bytes from 192.168.122.1: icmp_seq=2 ttl=64 time=0.016 ms
      64 bytes from 192.168.122.1: icmp_seq=3 ttl=64 time=0.015 ms
      64 bytes from 192.168.122.1: icmp_seq=4 ttl=64 time=0.014 ms
      
      --- 192.168.122.1 ping statistics ---
      4 packets transmitted, 4 received, 0% packet loss, time 2998ms
      rtt min/avg/max/mdev = 0.014/0.015/0.018/0.005 ms

  2. Wait 10 minutes and check whether the alarm is cleared.

    • If yes, the environment has been recovered and no manual rectification is required.
    • If no, the environment is abnormal. You need to perform the following steps to manually rectify the fault.

  3. Verify that file synchronization in the OMMHA two-node cluster is normal. That is, check that eSight has not reported the "File Synchronization Failure in the OMMHA System" alarm.
  4. Log in to the active server as the ossuser user and run the following commands to restore the data replication:

    NOTE:

    If data synchronization has failed, query the service data based on the run logs. You are advised to use the eSight server with the latest service data as the active server.

    1. To query run logs, log in to the active and standby servers as the ossuser user and run the following commands:

      grep "NMSServer.*>normal" /opt/ommha/ha/var/ha/runlog/ha.log

      gzip -dc /opt/ommha/ha/var/ha/runlog/*.gz | grep "NMSServer.*>normal"

      The command output shows when the eSight resource NMSServer came online. Query the eSight service data based on these online times.

    2. To query service data, log in to eSight and query key service information, including NE configurations, performance data, and alarms. Then manually perform an active/standby switchover and log in to eSight again to query the same information.

    > cd /opt/eSight/mttools/tools

    > ./hadatasyncrecover.sh

    When the following information is displayed, enter the password of the database administrator user and press Enter:

    Please input super password of Database:

    If the following information is displayed, the database is successfully restored. Otherwise, contact Huawei technical support.

    Database recover success
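
The connectivity check in step 1 and the run-log query in the NOTE of step 4 can be wrapped in small shell functions so that they are repeatable. The following is a minimal sketch, not part of eSight or OMMHA: the function names check_standby_reachable and query_nmsserver_online and the PING_BIN variable are hypothetical, and the IP addresses in the usage note are the example addresses from the ping output above. Only the ping, grep, and gzip invocations and the run-log path are taken from the procedure.

```shell
#!/bin/sh
# Hypothetical helpers for the manual checks in this procedure.
# PING_BIN is an assumption of this sketch: it lets the ping binary be
# overridden (for example for a dry run) and defaults to "ping".
PING_BIN="${PING_BIN:-ping}"

# Ping every address given as an argument with "ping -c 4", as in step 1.
# Prints one OK/FAIL line per address; returns 1 if any address failed.
check_standby_reachable() {
    failed=0
    for addr in "$@"; do
        if "$PING_BIN" -c 4 "$addr" >/dev/null 2>&1; then
            echo "OK   $addr"
        else
            echo "FAIL $addr"
            failed=1
        fi
    done
    return "$failed"
}

# Print the NMSServer online records from the current run log and from the
# rotated (.gz) run logs, using the same pattern as the grep commands above.
# The directory defaults to the run-log path given in the procedure.
query_nmsserver_online() {
    dir="${1:-/opt/ommha/ha/var/ha/runlog}"
    pattern='NMSServer.*>normal'
    if [ -f "$dir/ha.log" ]; then
        grep "$pattern" "$dir/ha.log" || :
    fi
    for gz in "$dir"/*.gz; do
        if [ -f "$gz" ]; then
            gzip -dc "$gz" | grep "$pattern" || :
        fi
    done
    return 0
}
```

For example, after sourcing the file as the ossuser user, run check_standby_reachable 10.137.63.225 192.168.122.1 with the system and heartbeat IP addresses of your own standby server, and run query_nmsserver_online on both the active and standby servers to compare the NMSServer online times.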

Clearing

After the fault is rectified, the system automatically clears the alarm.

Updated: 2019-06-30

Document ID: EDOC1100044373