Failure of a single node to start in an S2600 storage iSCSI cluster environment prevents the cluster from starting normally

Publication Date: 2012-10-17
Issue Description
At one site (see the attached networking diagram), an S2600 storage device runs with a single controller. The controller software version is 1.04.01.205.T01 and the SES version is S021. The servers run AIX 6100-01 with HACMP cluster software version 5.4.1.0. Two nodes are connected to controller A and are mapped two private LUNs, and the two hosts share two LUNs as cluster storage. When the private LUNs are active and the cluster is started, one node fails to start.
Alarm Information
The primary node starts normally, but the other node fails to start.
Analysis of the storage log shows that a reservation command was issued:
Oct 14 23:25:56 AK-I kernel: [372919227]Reserve (6)[16] command for Host LUN 0, Device Lun 8  @ [jif=372919227] SCSI_PrintDebugInfo : 1382
The log also shows the command that clears the reservation:
Oct 14 23:25:56 AK-I kernel: [372919957]SCSI_ClearReserveExec
Oct 14 23:25:56 AK-I kernel: [372919957]  @ [jif=372919957] SCSI_ClearReserveExec : 2200
Oct 14 23:25:56 AK-I kernel: [372919957]This is the master controller
Oct 14 23:25:56 AK-I kernel: [372919957]  @ [jif=372919957] SCSI_ClearReserveExec : 2207
Oct 14 23:25:56 AK-I kernel: [372919957]Enter SCSI_ClearReserve
Oct 14 23:25:56 AK-I kernel: [372919957]  @ [jif=372919957] SCSI_ClearReserve : 2286
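When reviewing a longer storage log, the reserve and clear-reservation entries can be picked out with a small script. The following is a minimal sketch in Python; the regular expressions are derived only from the log lines quoted above, and the function name `scan_reservations` is hypothetical. Other firmware versions may word these entries differently.

```python
import re

# Patterns based solely on the log excerpts quoted in this case (assumption:
# other controller software versions may format these lines differently).
RESERVE_RE = re.compile(
    r"Reserve \(6\)\[\d+\] command for Host LUN (\d+), Device Lun (\d+)"
)
CLEAR_RE = re.compile(r"Enter SCSI_ClearReserve\b")


def scan_reservations(lines):
    """Return (reserved, cleared).

    reserved: list of (host_lun, device_lun) tuples, one per reserve entry.
    cleared:  number of "Enter SCSI_ClearReserve" entries seen.
    """
    reserved = []
    cleared = 0
    for line in lines:
        match = RESERVE_RE.search(line)
        if match:
            reserved.append((int(match.group(1)), int(match.group(2))))
        elif CLEAR_RE.search(line):
            cleared += 1
    return reserved, cleared


sample = [
    "Oct 14 23:25:56 AK-I kernel: [372919227]Reserve (6)[16] command for "
    "Host LUN 0, Device Lun 8  @ [jif=372919227] SCSI_PrintDebugInfo : 1382",
    "Oct 14 23:25:56 AK-I kernel: [372919957]SCSI_ClearReserveExec",
    "Oct 14 23:25:56 AK-I kernel: [372919957]Enter SCSI_ClearReserve",
]
reserved, cleared = scan_reservations(sample)
print(reserved, cleared)
```

A reserve entry with no later clear entry for the same period would be consistent with the failure described in this case.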
Handling Process
We determined that the problem lies in the order in which HA and the private LUN are started and stopped on a node, so the following procedure can be used:
1. Startup order: start HA first, then activate the private LUN's volume group with the varyonvg command.
2. Shutdown order: deactivate the private LUN's volume group with the varyoffvg command, then stop HA.
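The ordering above can be written down as a small helper so that it is not reversed by accident. This is only a sketch: the volume group name `privatevg` is hypothetical, the HA start and stop steps are left as descriptive placeholders rather than real commands, and the shutdown step assumes AIX's `varyoffvg` is used to deactivate the volume group.

```python
def startup_steps(private_vg):
    """Steps for bringing a node up: HA first, private volume group second."""
    return [
        "start HA (HACMP cluster services)",  # placeholder, not a real command
        f"varyonvg {private_vg}",             # activate the private volume group
    ]


def shutdown_steps(private_vg):
    """Steps for taking a node down: private volume group first, HA second."""
    return [
        f"varyoffvg {private_vg}",            # deactivate the private volume group
        "stop HA (HACMP cluster services)",   # placeholder, not a real command
    ]


# "privatevg" is a hypothetical volume group name used for illustration.
for step in startup_steps("privatevg") + shutdown_steps("privatevg"):
    print(step)
```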
Root Cause
When no private LUN is involved, the node that starts first issues a reservation when the cluster starts, then issues a LOGOUT to clear that reservation. The other node behaves the same way.
If the reservation is not cleared, the other node cannot start normally.
A cluster node clears its reservation through the LOGOUT command, which takes down the whole session. On the primary node, however, the private LUN and the shared disks use the same session, as two connections within one session. When the private LUN is active, LOGOUT therefore cannot take down the session, and the reservation is not cleared. This is why the second node fails to start.
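The mechanism described above can be illustrated with a toy model. It is a deliberately simplified sketch, not the S2600 firmware logic: the class and function names are hypothetical, and the only rule it encodes is the one stated in this case, namely that the reservation is released only when the whole session goes down.

```python
class Session:
    """One session from a host; it may carry several connections."""

    def __init__(self, host):
        self.host = host
        self.active_connections = set()


class StorageModel:
    """Tracks which host, if any, holds the reservation on the shared LUN."""

    def __init__(self):
        self.reserved_by = None

    def reserve(self, session):
        # A reservation succeeds only if the LUN is free or already ours.
        if self.reserved_by not in (None, session.host):
            return False
        self.reserved_by = session.host
        return True

    def logout(self, session, connection):
        session.active_connections.discard(connection)
        # The reservation is cleared only once the whole session is down.
        if not session.active_connections and self.reserved_by == session.host:
            self.reserved_by = None


def second_node_can_start(private_lun_active):
    storage = StorageModel()
    node_a = Session("nodeA")
    node_a.active_connections.add("shared")
    if private_lun_active:
        # Private LUN traffic rides in the same session as the shared disks.
        node_a.active_connections.add("private")
    storage.reserve(node_a)
    storage.logout(node_a, "shared")

    node_b = Session("nodeB")
    node_b.active_connections.add("shared")
    return storage.reserve(node_b)


print(second_node_can_start(False))  # True: session went down, reservation cleared
print(second_node_can_start(True))   # False: private connection keeps session up
```

In the model, keeping the private LUN inactive until HA has started lets the first node's logout take the session down, which matches the startup order recommended in this case.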
Suggestions
Pay attention to the order in which HA and the private LUN's volume group are started and stopped.

END