Cluster Maintenance
Common Maintenance Commands
Starting a Cluster
Run the following command to start a cluster:
smit clstart
Then, configure the following parameters:
- Start Cluster Services on these nodes
Specifies the cluster nodes on which to start cluster services. The two nodes can be started one at a time or simultaneously.
- Startup Cluster Information Daemon
Specifies whether to start the clinfoES daemon when the cluster starts. If this parameter is set to false, you cannot run the /usr/sbin/cluster/clstat -a command to view the cluster running status.
Check whether any errors occur during the cluster startup.
If any errors occur, stop the cluster first and then rectify faults based on error messages. Then, start the cluster again.
Stopping a Cluster
Run the following command to stop a cluster:
smit clstop
Then, configure the following parameters:
- Stop Cluster Services on these nodes
Indicates the cluster nodes to be stopped.
- Select an Action on Resource Groups
Indicates the mode to stop a cluster.
Possible values are bring resource groups offline (graceful), move resource groups (takeover), and unmanage resource groups (forced). The meaning of each value is as follows:
- bring resource groups offline
Stops the cluster on this node. The peer node is not affected.
- move resource groups
Stops the cluster on this node. The peer node takes resources from this node.
- unmanage resource groups
Forcibly stops HA on this node without releasing any resources. The peer node is not affected.
Checking Cluster Status
The cluster status includes the cluster process status and cluster service status. Perform the following steps to check the cluster status:
- Check the cluster process status on nodes. The command syntax is as follows:
lssrc -g cluster
The command output is as follows:
Figure 10-1 Cluster process status
- Check the cluster service status on nodes. The command syntax is as follows:
/usr/sbin/cluster/clstat -r 2 -a
In this command, -r 2 refreshes the status display every 2 seconds.
The command output is as follows:
Figure 10-2 Cluster service status
In the output, the service IP address and the resource group of the cluster are on node ibm31 and the node is online, which indicates that the cluster is in the normal state.
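The process status check can also be scripted. The following is a minimal sketch, not part of the product, that lists the subsystems reported by lssrc -g cluster whose status is not active. The function name inactive_subsystems is hypothetical. The sketch assumes the usual lssrc layout (a header line beginning with Subsystem, then one row per subsystem with the status in the last column); verify this against the output on your own nodes before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: print the names of subsystems that are not "active".
# Reads lssrc-style output on standard input.
# Assumption: the status is the last whitespace-separated field of each row,
# and the header row starts with the word "Subsystem".
inactive_subsystems() {
    awk '$1 != "Subsystem" && NF >= 2 && $NF != "active" { print $1 }'
}

# Example use on a cluster node (not executed here):
#   lssrc -g cluster | inactive_subsystems
```

Empty output means that every listed subsystem reports active. If Startup Cluster Information Daemon was set to false, clinfoES is expected to appear in the list.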
Cluster Switchover
Perform the following steps to switch services between two nodes:
Run the smit hacmp command on the host and then choose:
System Management (C-SPOC) > HACMP Resource Group and Application Management > Move a Resource Group to Another Node
Cluster Log Analysis
Cluster logs are used in fault diagnosis when clusters encounter problems.
HACMP clusters have the following logs:
- /var/hacmp/adm/cluster.log
The main HACMP log file. It contains all HACMP errors and events in chronological order.
- /var/hacmp/log/cspoc.log
Contains all messages generated by C-SPOC commands. This log file resides on the node that invokes C-SPOC commands. All messages in this log file are recorded in chronological order.
- /var/hacmp/log/hacmp.out
Contains all output generated by the configuration and startup scripts. This log file supplements /var/hacmp/adm/cluster.log. When an anomaly occurs on a cluster, check /var/hacmp/log/hacmp.out first.
For other logs, see the /var/hacmp/log directory.
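When checking a log during fault diagnosis, it can help to pull out only the most recent suspicious lines. The following is a minimal sketch under stated assumptions: the function name recent_log_matches is hypothetical, and the default search pattern ERROR|FAILED is an illustrative guess, not a documented HACMP message format. Adjust the pattern to the messages that actually appear in your logs.

```shell
#!/bin/sh
# Hypothetical helper: show the most recent lines of a log that match a pattern.
#   $1  log file to scan
#   $2  extended regular expression (assumed default: ERROR|FAILED)
#   $3  maximum number of matching lines to show (default: 20)
recent_log_matches() {
    log_file=$1
    pattern=$2
    max_lines=$3
    [ -n "$pattern" ] || pattern='ERROR|FAILED'
    [ -n "$max_lines" ] || max_lines=20

    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 1
    fi

    # -n prefixes each match with its line number in the log file.
    grep -E -n "$pattern" "$log_file" | tail -n "$max_lines"
}

# Example use on a cluster node (not executed here):
#   recent_log_matches /var/hacmp/log/hacmp.out
```

The line numbers in the output make it easier to open the log at the matching position and read the surrounding events.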