Failed to Isolate a Host of FusionInsight LibrA C80 Using the Low-Level Command

Publication Date: 2019-04-12

Issue Description

Running the cm_ctl stop -n num command to stop an instance of FusionInsight LibrA C80 returns an error indicating that the instance failed to stop, and the cluster becomes unavailable, as shown in the following figure.


Handling Process


1.     Check the CM run logs on the isolated host.

2.     The logs show that the shutdown command is executed repeatedly in a loop. As a result, the stop command times out.

3.     Check the cluster status. It is found that the standby GTM has not been promoted to the active status. As a result, the cluster is unavailable.

4.     Locate the log according to the error message displayed in Step 2.

5.     It is found that the remote connection cannot be stopped.

6.     Query the GTM running status.

It is found that the GTM process is still running.

7.     Manually kill the GTM process. The cluster status is then restored to Degraded.

8.     Rectify the faulty instance.

9.     Start the restored host.

10.   Reconfigure the load balancing status.

11.   It is found that the cluster is running properly.

Root Cause

The cm_ctl stop command cannot stop a host where the active GTM is located. Specifically, the logic of this command does not handle the case of a host that carries the active GTM. The GTM can be stopped only when no other connections to it remain, but in this case the GTM is not the last instance that the command stops. Therefore, the active GTM process cannot be stopped completely and the standby GTM is not promoted to active. As a result, the host where the active GTM process resides fails to be isolated.

Solution

Manually kill the GTM process and restore the instance. For details, see Step 6 to Step 11 in the preceding handling process.
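
Steps 6 and 7 (check whether the GTM process is still running, then kill it) can be sketched with standard ps and kill. This is a minimal illustration and not a procedure from the product documentation. The process name gtm, the helper names pids_by_name and check_or_kill, and the use of the default SIGTERM signal are all assumptions; confirm the actual process name on the host before terminating anything.

```shell
#!/bin/sh
# Sketch only: the process name "gtm" is an assumption. Confirm the real
# name on your host (for example with `ps -ef`) before killing anything.

# Read `ps -eo pid=,comm=`-style lines (PID, then command name) from stdin
# and print the PIDs whose command name equals $1.
pids_by_name() {
    awk -v name="$1" '$2 == name { print $1 }'
}

# Report whether the named process is still running (Step 6). Only when the
# second argument is "kill" is the process actually terminated (Step 7).
check_or_kill() {
    name="$1"
    action="${2:-check}"
    pids=$(ps -eo pid=,comm= | pids_by_name "$name")
    if [ -z "$pids" ]; then
        echo "$name is not running"
        return 0
    fi
    echo "$name is still running with PID(s): $pids"
    if [ "$action" = "kill" ]; then
        # Sends SIGTERM (the default); escalating to another signal is a
        # manual decision that this sketch does not make.
        kill $pids
    fi
}
```

After sourcing the file, check_or_kill gtm only reports the status, and check_or_kill gtm kill terminates the matching processes. Keeping the kill behind an explicit argument avoids terminating a process by accident while you are still diagnosing.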

Suggestions

Fix the logic defect in the cm_ctl stop command so that it handles hosts where the active GTM is located. If the host to be isolated is not a management node, isolate it through the FusionInsight GUI.
