Inaccessible Databases Due to the Ping-pong Effect of Multipathing

Publication Date:  2012-07-22 Views:  58 Downloads:  0
Issue Description

In a project running an Oracle RAC database service, two hosts share storage resources through multipathing software. A LUN (LUN0) on the storage device is mapped to both hosts, but one of the hosts is inaccessible.

Product and version information:
  • S5000 series
  • The hosts are Huawei ATAE blade servers
  • The OS is SUSE 9 SP3
  • The multipathing software is UltraPath for Windows V100R002C01
The networking is as shown in Figure 1.
Figure 1 Networking of the Ping-pong effect
 
Alarm Information
None
Handling Process
  1. Run the upadm show option command to check whether the failover function is on or off.

     

    # upadm show option

    The detailed information is as follows:
    maxlun = 256
    maxpath = 4
    maxcontroller = 8
    maxarray = 30
    failback_interval = 60
    optimal_path_check_interval = 60
    failed_path_check_interval = 30
    iopolicy = round_robin
    lbcontroller = off
    failover = on
    maxtargetid = 512
    

     

  2. If failover is set to on, run the upadm set failover=off command to disable the failover function.

     

    # upadm set failover=off

     

  3. Run the command upadm start updateimage to update the multipath configuration.

     

    # upadm start updateimage

     

  4. Run the upadm show option command again to check that the failover function is now off.

     

    # upadm show option

    The detailed information is as follows:
    maxlun = 256
    maxpath = 4
    maxcontroller = 8
    maxarray = 30
    failback_interval = 60
    optimal_path_check_interval = 60
    failed_path_check_interval = 30
    iopolicy = round_robin
    lbcontroller = off
    failover = off
    maxtargetid = 512
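
The check in steps 1 and 4 can be scripted. The following is a minimal sketch, assuming the output of upadm show option keeps the "key = value" layout shown above. The function name failover_state is a hypothetical helper, not part of UltraPath. It reads the command output from standard input and prints the value of the failover option.

```shell
# Hypothetical helper: print the value of the "failover" option found in
# `upadm show option` output supplied on standard input.
# Prints nothing if the option is not present.
failover_state() {
    awk -F'=' '
        {
            key = $1; val = $2
            gsub(/[ \t\r]/, "", key)   # strip whitespace around the key
            gsub(/[ \t\r]/, "", val)   # strip whitespace around the value
            if (key == "failover") { print val; exit }
        }
    '
}
```

On a host where UltraPath is installed, this could be used as `upadm show option | failover_state`, which should print off after step 4 if the change took effect.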
    

     

Root Cause
  1. Log analysis shows that the multipathing software on both hosts switched the access path for LUN0 a large number of times.
  2. Later log entries show that the link between DB1 and controller A is in the Link Down state.
  3. Logs collected again for analysis show that the link between DB2 and controller A is also in the Link Down state. At this point the storage device is switching over the primary controller for LUN0, and the database log shows I/O timeouts.

Conclusion:

  • The Ping-pong effect of multipathing caused repeated switchover of the primary controller for the LUN, which made the database inaccessible.
Suggestions
  • Do not map a LUN to two or more hosts simultaneously.
  • If an application scenario requires a LUN to be mapped to both hosts simultaneously, install cluster software and configure cluster reservation on the hosts.

 

The Ping-pong effect is the repeated switchover of the primary controller for a LUN: the primary controller keeps switching back and forth between controllers A and B. The Ping-pong effect can degrade LUN access performance and cause I/O timeouts on hosts.
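
To illustrate the definition, the following minimal sketch counts how many times the primary controller changes in a sequence of observations. The input format (one controller name per line, in time order) and the function name count_switchovers are assumptions made for illustration. They do not reflect the UltraPath or storage log format, so real logs would first need to be reduced to such a sequence.

```shell
# Hypothetical helper: count primary-controller switchovers.
# Input: one controller name per line (for example A or B), in time order.
# Output: the number of times the name differs from the previous line.
count_switchovers() {
    awk '
        NF == 0 { next }                  # ignore blank lines
        seen && $1 != prev { n++ }        # owner changed since the last observation
        { prev = $1; seen = 1 }
        END { print n + 0 }
    '
}
```

A stable LUN yields a count at or near zero over an observation window, while a steadily growing count for a single LUN is consistent with the Ping-pong effect described above.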
