HUAWEI CLOUD Stack 6.5.0 Alarm and Event Reference 04

ALM-37003 Asynchronous or Disconnected Active and Standby GTM Instances

Description

This alarm is generated when the active GTM instance is disconnected from or asynchronous with the standby GTM instance.

Attribute

  Alarm ID    Alarm Severity    Auto Clear
  37003       Major             Yes

Parameters

  Name           Meaning
  ServiceName    Identifies the service for which the alarm is generated.
  RoleName       Identifies the role for which the alarm is generated.
  HostName       Identifies the host for which the alarm is generated.
  Instance       Identifies the instance for which the alarm is generated.

Impact on the System

If the active GTM instance is disconnected from the standby GTM instance while working in synchronous mode, the system is unavailable for 120 seconds. After detecting the fault, the system sets the active GTM instance to HA mode and then recovers. If the active GTM instance is already working in HA mode, the system continues to work correctly.

NOTE:

When the cluster is working correctly, the active GTM instance works in synchronous mode and synchronizes received tasks to the standby instance in real time, which keeps the active and standby instances consistent. If the standby instance becomes faulty and cannot recover, the active instance stops synchronizing tasks to it and works in HA mode.

Possible Causes

The active GTM instance is disconnected from the standby GTM instance.

Procedure

Locate the alarm cause.

  1. Log in to the FusionInsight Manager.

    1. Log in to the ManageOne OM plane using a browser, then choose Alarms.
      • Login address: https://<URL for the homepage of the ManageOne OM plane>:31943. Example: https://oc.type.com:31943.
      • Default username: admin, default password: Huawei12#$.
    2. In the alarm list, locate and click the target alarm name in the Name column. The Alarm Details and Handling Recommendations dialog box is displayed.
    3. Locate the value in the IP Address/URL/Domain Name column, which is the floating IP address of FusionInsight Manager.
    4. Log in to FusionInsight Manager using a browser.
      • Login address: https://<floating IP address of FusionInsight Manager>:28443/web. Example: https://10.10.192.100:28443/web.
      • Default username: admin, default password: obtain it from the system administrator.

  2. On FusionInsight Manager, click Alarms. In the alarm list, locate the alarm. In the Alarm Details area, obtain from Location the node and instance for which the alarm is generated.
  3. Log in to the node where the alarm is generated as the omm user and run the following command to check whether the active and standby GTM instances of the cluster are faulty.

    Default user: omm, default password: Bigdata123@.

    source ${BIGDATA_HOME}/mppdb/.mppdbgs_profile

    gs_om -t status --detail

    [     GTM State     ]
    node     node_ip         instance                           state                    sync_state
    ---------------------------------------------------------------------------------------------------
    2  host2 10.7.66.183    1001 /opt/huawei/Bigdata/mppdb/gtm P Primary Connection ok  Sync
    3  host3 10.7.66.245    1002 /opt/huawei/Bigdata/mppdb/gtm S Standby Connection ok  Sync
    • If they are, fix them. For details, see section "Rectifying an MPPDBServer Instance" in the Product Documentation. Then go to 4.
    • If they are not, go to 4.

  4. Check whether the network of the servers running the active and standby GTM instances is normal. For example, if the server running the active or standby GTM instance uses the NIC eth0, run the following command to check the network adapter:

    /sbin/ifconfig eth0

    • If the network adapter is normal, go to 5.
    • If the network adapter is abnormal, contact hardware engineers to rectify the network adapter fault and go to 5.

  5. Check whether the alarm persists.

    • If yes, go to 6.
    • If no, no further action is required.
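
The status and NIC checks above can be scripted so that they give a pass/fail result. The following is a minimal sketch, not part of the product: the function names gtm_unhealthy and nic_is_up are hypothetical, the parser assumes the [ GTM State ] layout shown in the sample output above, and treating a NIC as "normal" when its flags include both UP and RUNNING is an assumption.

```shell
# Sketch only: gtm_unhealthy and nic_is_up are hypothetical helpers, not gs_om features.

# Read `gs_om -t status --detail` output on stdin and print every row of the
# [ GTM State ] section that is not "Connection ok" with sync_state "Sync".
# Returns 1 if at least one such row is found, 0 otherwise.
gtm_unhealthy() {
    awk '
        # A "[ ... ]" line starts a new section; remember whether it is the GTM one.
        /^[ \t]*\[/ { in_gtm = ($0 ~ /GTM State/); next }
        # Data rows start with a numeric node ID; the last field is sync_state.
        in_gtm && $1 ~ /^[0-9]+$/ {
            if ($0 !~ /Connection ok/ || $NF != "Sync") { print; bad = 1 }
        }
        END { exit bad }
    '
}

# Read `/sbin/ifconfig <nic>` output on stdin and succeed only if the flags
# contain both UP and RUNNING (assumed meaning of a "normal" adapter).
# Matches both "UP BROADCAST RUNNING ..." and "flags=4163<UP,BROADCAST,RUNNING,...>".
nic_is_up() {
    flags=$(cat)
    printf '%s\n' "$flags" | grep -qw 'UP' &&
        printf '%s\n' "$flags" | grep -qw 'RUNNING'
}

# Example usage on the node, as user omm:
#   source ${BIGDATA_HOME}/mppdb/.mppdbgs_profile
#   gs_om -t status --detail | gtm_unhealthy || echo "GTM instances need attention"
#   /sbin/ifconfig eth0 | nic_is_up || echo "eth0 is not UP and RUNNING"
```

Both helpers read from standard input, so they can be tried against saved command output before being used on a live node.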

Collect fault information.

  6. On FusionInsight Manager, choose System > Log Download.
  7. Select MPPDB from the Services drop-down list box and click OK.
  8. Set Start Time for log collection to 1 hour before the alarm generation time and End Time to 1 hour after the alarm generation time, and click Download.
  9. Contact Technical Support and send the collected logs.
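
The Start Time and End Time for log collection can be worked out from the alarm generation time. The following is a minimal sketch that assumes GNU date is available. The function name log_window and the example timestamp are hypothetical. The input is treated as UTC only to keep the arithmetic free of daylight-saving shifts, so the output is in the same clock as the input.

```shell
# Sketch only: log_window is a hypothetical helper and requires GNU date (-d option).
# Usage: log_window "YYYY-MM-DD HH:MM:SS"   (the alarm generation time)
log_window() {
    # Convert the alarm time to seconds since the epoch.
    alarm_epoch=$(date -u -d "$1" +%s) || return 1
    # One hour (3600 seconds) before and after the alarm time.
    date -u -d "@$((alarm_epoch - 3600))" '+Start Time: %Y-%m-%d %H:%M:%S'
    date -u -d "@$((alarm_epoch + 3600))" '+End Time:   %Y-%m-%d %H:%M:%S'
}

log_window "2019-08-30 10:15:00"
# prints:
#   Start Time: 2019-08-30 09:15:00
#   End Time:   2019-08-30 11:15:00
```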

Alarm Clearing

After the fault is rectified, the system automatically clears this alarm.

Related Information

None

Updated: 2019-08-30

Document ID: EDOC1100062365
