FusionManager Monitoring Scripts Trigger Host Restart

Publication Date:  2015-01-26
Issue Description
If FusionManager is used to monitor physical servers on which Computing Node Agent (CNA), database, and storage nodes are deployed, the servers may automatically restart at the following sites:
  • FusionCube V100R002C01 sites where FusionManager V100R003C00SPC300, V100R003C00SPC301, or V100R003C00SPC302 is used.
  • FusionSphere V100R003C00 sites where FusionManager V100R003C00SPC300, V100R003C00SPC301, or V100R003C00SPC302 is used and hosts are connected to the FusionManager system.
Alarm Information
None.
Handling Process
  • Sites using FusionSphere V100R003C00: Upgrade FusionCompute and FusionManager in FusionSphere V100R003C00SPC202 to V100R003C00SPC302 and V100R003C00SPC303, respectively, or later versions.
  • Sites using FusionCube V100R002C01: Upgrade FusionCompute and FusionManager in FusionCube V100R002C01SPC100 to V100R003C00SPC302 and V100R003C00SPC303, respectively, or later versions.
  • Sites using ManageOne: Delete all servers connected to the FusionManager system on the FusionManager portal, or upgrade FusionManager to V100R003C00SPC232.
Root Cause
After servers are connected to the FusionManager system, FusionManager distributes monitoring scripts to the servers and periodically executes the scripts to monitor them. The monitoring protection program in each monitoring script periodically checks the status of processes. If the program detects that a process is suspended, it automatically terminates the process. However, the filter conditions used to decide whether a process is suspended are inappropriately configured in the script. As a result, some system processes and watchdog processes may be stopped unexpectedly, which causes hosts in the system to restart unexpectedly.
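The effect of an overly broad filter can be illustrated with a minimal sketch. The actual filter conditions in the FusionManager scripts are not published here, so everything below is an assumption for illustration only: the Proc record, the process names, the ps-style state letters, and the MONITOR_OWNED set are all hypothetical. The sketch contrasts a filter that matches any process in a stopped or uninterruptible state with one that is limited to processes the monitoring script itself owns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proc:
    """Hypothetical process record; state uses ps-style letters (R, S, D, T)."""
    pid: int
    name: str
    state: str


# Hypothetical name of a process started by the monitoring script itself.
MONITOR_OWNED = {"fm_monitor_collect"}


def too_broad(procs):
    """Assumed faulty filter: any stopped (T) or uninterruptible (D) process
    is treated as suspended, regardless of who owns it."""
    return [p for p in procs if p.state in {"T", "D"}]


def scoped(procs):
    """Assumed safer filter: only stopped processes owned by the monitor."""
    return [p for p in procs if p.state == "T" and p.name in MONITOR_OWNED]


# Illustrative process table, not taken from a real host.
SAMPLE = [
    Proc(1, "init", "S"),
    Proc(412, "watchdog", "D"),
    Proc(977, "fm_monitor_collect", "T"),
    Proc(1200, "sshd", "S"),
]

if __name__ == "__main__":
    print("too broad:", [p.name for p in too_broad(SAMPLE)])
    print("scoped:   ", [p.name for p in scoped(SAMPLE)])
```

In this sketch the broad filter also selects the watchdog process, which matches the failure described above: terminating a watchdog process can cause the host to restart.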
Suggestions
Perform the preceding operations to rectify the fault at all sites that use FusionManager V100R003C00SPC300, V100R003C00SPC301, V100R003C00SPC302, or V100R003C10SPC231.
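When auditing many sites, the version check can be scripted. The following is a small sketch, assuming the FusionManager version string for each site has already been collected by some other means; the function name is_affected is hypothetical, and the version list is copied from the sentence above.

```python
# FusionManager versions named in this bulletin as requiring rectification.
AFFECTED_VERSIONS = frozenset({
    "V100R003C00SPC300",
    "V100R003C00SPC301",
    "V100R003C00SPC302",
    "V100R003C10SPC231",
})


def is_affected(version: str) -> bool:
    """Return True if the given version string is one of the affected versions.

    Surrounding whitespace and letter case are ignored.
    """
    return version.strip().upper() in AFFECTED_VERSIONS


if __name__ == "__main__":
    for v in ("V100R003C00SPC301", "V100R003C00SPC303"):
        print(v, "affected" if is_affected(v) else "not listed as affected")
```

A version that is not in the list is only "not listed as affected" by this bulletin; the sketch does not compare version numbers or infer anything about later releases.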
