HUAWEI CLOUD Stack 6.5.0 Alarm and Event Reference 04

ALM-37012 Dual-Host Monitoring Socket of a MPPDBServer Instance Is Abnormal

Description

This alarm is generated when other processes of the operating system occupy the dual-host monitoring port.

Attribute

Alarm ID    Alarm Severity    Auto Clear
37012       Major             Yes

Parameters

Name           Meaning
ServiceName    Identifies the service for which the alarm is generated.
RoleName       Identifies the role for which the alarm is generated.
HostName       Identifies the host for which the alarm is generated.
Instance       Identifies the instance for which the alarm is generated.

Impact on the System

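While the dual-host port is occupied, the GaussDB process cannot start and the system is unavailable. If the port remains occupied for more than 120 seconds, the cluster fails over to the standby DataNode instance and the system becomes available again.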

System Processing

  • If the dual-host port is occupied, the GaussDB process cannot be started. The cluster first tries to restart the GaussDB process. The system is unavailable during the restart.
  • If the node cannot be started within 120 seconds, the cluster promotes the standby DataNode instance to active. The system is then available again.

Possible Causes

Other processes of the operating system occupy the dual-host monitoring port.

Procedure

Locate the alarm cause.

  1. Log in to the FusionInsight Manager.

    1. Log in to the ManageOne OM plane using a browser, then choose Alarms.
      • Login address: https://URL for the homepage of the ManageOne OM plane:31943. Example: https://oc.type.com:31943.
      • Default username: admin, default password: Huawei12#$.
    2. In the alarm list, locate and click the target alarm name in the Name column. The Alarm Details and Handling Recommendations dialog box is displayed.
    3. Locate the value in the IP Address/URL/Domain Name column, which is the floating IP address of the FusionInsight Manager.
    4. Log in to the FusionInsight Manager using a browser.
      • Login address: https://floating IP address of the FusionInsight Manager:28443/web. Example: https://10.10.192.100:28443/web.
      • Default username: admin, default password: obtain it from the system administrator.

  2. Find the data directory of the DN instance that generated the alarm.

    1. On FusionInsight Manager, click Alarms. On the alarm list, locate the alarm and obtain the information about the node and instance for which the alarm is generated from Location in the Alarm Details area.
    2. Log in to the node where the alarm is generated as the omm user. Default user: omm, default password: Bigdata123@.
    3. Initialize the environment variables.

      source ${BIGDATA_HOME}/mppdb/.mppdbgs_profile

    4. Run the gs_om -t status --detail command.

      Information similar to the following is displayed:

      [  CMServer State   ]
      
      node     node_ip         instance                                    state
      ----------------------------------------------------------------------------
      1  host0 10.0.0.1    1    /opt/huawei/Bigdata/mppdb/cm/cm_server Primary
      
      [   Cluster State   ]
      
      cluster_state   : Normal
      redistributing  : No
      balanced        : Yes
      
      [ Coordinator State ]
      
      node     node_ip         instance                                  state
      --------------------------------------------------------------------------
      1  host0 10.0.0.1    5001 /srv/BigData/mppdb/data1/coordinator Normal
      
      [ Central Coordinator State ]
      
      node     node_ip         instance                                    state
      ----------------------------------------------------------------------------
      1  host0 10.0.0.1    5001 /srv/BigData/mppdb/data1/coordinator Normal
      
      [     GTM State     ]
      
      node     node_ip         instance                           state     
      ----------------------------------------------------------------------
      1  host0 10.0.0.1    1001 /opt/huawei/Bigdata/mppdb/gtm P Primary
      
      [  Datanode State   ]
      
      node     node_ip         instance                                     state
      -----------------------------------------------------------------------------------------
      1  host0 10.0.0.1    6001 /srv/BigData/mppdb/data1/master1 P Primary Normal
      1  host0 10.0.0.1    6002 /srv/BigData/mppdb/data2/master2 P Primary Normal
      1  host0 10.0.0.1    6003 /srv/BigData/mppdb/data3/master3 P Primary Normal

    In this example, /srv/BigData/mppdb/data1/master1 is the data directory of DN instance 6001.
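
When the cluster has many DN instances, it may help to pull the instance IDs and data directories out of the status output instead of reading it by eye. The following is a minimal sketch, not part of the product tooling. The function name list_dn_dirs is hypothetical, and the parsing assumes the column layout matches the sample output shown above.

```shell
#!/usr/bin/env bash
# Sketch: print "instance_id data_directory" for each DN instance listed in the
# [ Datanode State ] section of `gs_om -t status --detail` output (read on stdin).
# Assumption: columns are node, host name, IP, instance ID, data directory, as in
# the sample output above. list_dn_dirs is a hypothetical helper name.
list_dn_dirs() {
  awk '
    # A line starting with "[" opens a new section; remember whether it is the DN one.
    /^[ \t]*\[/ { in_dn = ($0 ~ /Datanode State/); next }
    # Data rows start with a numeric node ID; header and separator rows do not.
    in_dn && $1 ~ /^[0-9]+$/ { print $4, $5 }
  '
}

# Example use on the node, after sourcing the environment profile:
#   gs_om -t status --detail | list_dn_dirs
```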

  3. If the data directory of the instance is /srv/BigData/mppdb/data1/master1, run the following command to open the postgresql.conf file:

    vi /srv/BigData/mppdb/data1/master1/postgresql.conf

    Locate the replconninfo1 parameter. The localport value defined in this parameter is the dual-host monitoring port. Run the following command to check whether the port is occupied by another process. This example assumes that the port number is 10000.

    netstat -anp | grep 10000

    If the port is occupied, check whether it is occupied by a key process.

    • If yes, go to Collect fault information.
    • If no, go to 4.
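
The port lookup and the occupancy check in step 3 can be sketched as two small shell helpers. This is an illustration under stated assumptions, not a supported tool. The names extract_localport and pids_on_port are hypothetical. The sketch assumes that replconninfo1 is written on a single line containing localport=<number>, and that netstat -anp prints the local address in column 4 and a PID/Program name field later on the line. Matching the port at the end of the local address avoids the false hits that a plain grep 10000 can produce, for example port 110000 or a remote port.

```shell
#!/usr/bin/env bash
# Sketch only; both helper names are hypothetical.

# Print the localport value from the first uncommented replconninfo1 line on stdin.
# Assumption: the line looks like  replconninfo1 = '... localport=10000 ...'
extract_localport() {
  sed -n '/^[[:space:]]*replconninfo1[[:space:]]*=/{s/.*localport=\([0-9][0-9]*\).*/\1/p;q;}'
}

# Read `netstat -anp` output on stdin and print the unique PIDs of sockets whose
# local address (column 4) ends in ":PORT".
pids_on_port() {
  awk -v port="$1" '
    $4 ~ (":" port "$") {
      # The PID/Program name field is the first later field of the form 123/name.
      for (i = 5; i <= NF; i++) {
        if ($i ~ /^[0-9]+\//) { split($i, a, "/"); print a[1]; break }
      }
    }
  ' | sort -u
}

# Example use, assuming the data directory found in step 2:
#   port=$(extract_localport < /srv/BigData/mppdb/data1/master1/postgresql.conf)
#   netstat -anp | pids_on_port "$port"
```

Each PID printed can then be inspected, for example with ps -p <pid> -o pid,user,args, before deciding whether it belongs to a key process.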

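  4. Run the following command to stop the process, where pid is the ID of the process that occupies the port, as shown in the PID/Program name column of the netstat output.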

    kill -9 pid

  5. Check whether the alarm persists.

    • If yes, go to Collect fault information.
    • If no, no further action is required.

Collect fault information.

  1. On FusionInsight Manager, choose System > Log Download.
  2. Select MPPDB from the Services drop-down list box and click OK.
  3. Set Start Time for log collection to 1 hour before the alarm generation time and End Time to 1 hour after the alarm generation time, and click Download.
  4. Contact Technical Support and send the collected logs.
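
The log collection window in step 3 is the alarm generation time minus and plus one hour. As a convenience, the two values can be computed on any Linux host with GNU date. The sketch below is illustrative only. The function name log_window and the timestamp format are assumptions, and the result is expressed in the time zone of the shell that runs it.

```shell
#!/usr/bin/env bash
# Sketch: print the Start Time and End Time for log collection, one hour either
# side of the alarm generation time. Requires GNU date (the -d option).
# log_window is a hypothetical helper name; the input format is an assumption.
log_window() {
  local alarm_epoch
  # Convert the alarm time to seconds since the epoch; fail if it cannot be parsed.
  alarm_epoch=$(date -d "$1" +%s) || return 1
  echo "Start Time: $(date -d "@$((alarm_epoch - 3600))" '+%Y-%m-%d %H:%M:%S')"
  echo "End Time:   $(date -d "@$((alarm_epoch + 3600))" '+%Y-%m-%d %H:%M:%S')"
}

# Example use:
#   log_window "2019-08-30 10:00:00"
```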

Alarm Clearing

After the fault is rectified, the system automatically clears this alarm.

Related Information

None

Updated: 2019-08-30

Document ID: EDOC1100062365