Unbalanced performance on OceanStor 6800V3 HyperMetro solution

Publication Date: 2017-11-08
Issue Description

Fault Symptom: The customer has two 6800V3 dual-controller AFA storage systems in a HyperMetro solution, with 4 thick LUNs (owned by different controllers) mapped to VMware. When the customer tests performance with several workers in VMware I/O Analyzer, the two controllers show different results: CTE.0B consistently delivers higher performance than CTE.0A, and one LUN consistently outperforms the others.

Viewed by controller:

Viewed by LUN:

Networking topology:


Version information: V300R003C20SPC200

Test cases:


Handling Process

1. Check the configuration of the 4 LUNs. They are working in AA mode; 2 of them belong to controller A and the other 2 belong to controller B. On the VMware side, the customer created striped VMFS datastores on the LUNs and then created VMs on the datastores.

Normally, controller A and controller B should deliver equal performance test results, and so should the individual LUNs.
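As a reference (not recorded in the original case), the LUN ownership and path layout seen by the host can be confirmed with standard esxcli commands; the device identifier below is a placeholder:

 # Show all SCSI devices and which multipathing plugin claims them
 esxcli storage core device list

 # Show every path and its current state for one LUN
 # (replace naa.xxxxxxxxxxxxxxxx with the actual device identifier)
 esxcli storage core path list -d naa.xxxxxxxxxxxxxxxx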

2. Analyze the storage log. We can find a lot of multipath failover events, as below:

 [2017-10-18 19:50:47][Mode select send failover event ret(0), dev LUN(17), ctrl id(0), context(ffff88031f0513e0).][SCSI][scsiModeSelectMasterFailover,735][CSD_30]
 [2017-10-18 19:50:47][Mode select send failover event ret(0), dev LUN(0), ctrl id(0), context(ffff88031f37a6d0).][SCSI][scsiModeSelectMasterFailover,735][CSD_29]
 [2017-10-18 19:51:48][Mode select send failover event ret(0), dev LUN(14), ctrl id(0), context(ffff88031f2817a0).][SCSI][scsiModeSelectMasterFailover,735][CSD_20]
 [2017-10-18 20:06:27][Mode select send failover event ret(0), dev LUN(12), ctrl id(0), context(ffff88031f1f2108).][SCSI][scsiModeSelectMasterFailover,735][CSD_35]
 [2017-10-18 20:06:27][Mode select send failover event ret(0), dev LUN(14), ctrl id(0), context(ffff88031f8f75a0).][SCSI][scsiModeSelectMasterFailover,735][CSD_36]
 [2017-10-18 20:19:40][Mode select send failover event ret(0), dev LUN(12), ctrl id(0), context(ffff88031e9dac68).][SCSI][scsiModeSelectMasterFailover,735][CSD_15]

These failover commands were sent by the multipath software on ESXi. So we collected a vm-support bundle to analyze why VMware ESXi needed to fail over the paths.
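For reference (a sketch, not the exact steps used in this case), a vm-support bundle can be generated from the ESXi shell, or exported from the vSphere Client via "Export System Logs":

 # Generate a diagnostic bundle on the ESXi host; the resulting .tgz
 # is written to a local temporary directory by default
 vm-support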

3. Analyze the VMware kernel log (/var/run/log/vmkernel.log). We found the following messages:

2017-11-01T13:23:07.961Z cpu2:33571)UPA-log: INTF_KLS_Log:1231: [19401][INFO][EVENT][LPM_INTF_PingPrintIOLatencyEvent][7149][3][2102350XCS10H6000002][12][223]High I/O latency {37}ms, path {3}, disk {11}, Host Lun ID {2}.
2017-11-01T13:23:07.961Z cpu2:33571)UPA-log: INTF_KLS_Log:1231: [19402][INFO][EVENT][LPM_INTF_PingPrintIOLatencyEvent][7149][3][2102350XCS10H6000002][12][224]High I/O latency {37}ms, path {0}, disk {11}, Host Lun ID {2}.
2017-11-01T13:23:07.961Z cpu2:33571)UPA-log: INTF_KLS_Log:1231: [19403][INFO][EVENT][LPM_INTF_PingPrintIOLatencyEvent][7149][3][2102350XCS10H6000003][12][225]High I/O latency {25}ms, path {1}, disk {14}, Host Lun ID {2}.
2017-11-01T13:23:07.961Z cpu2:33571)UPA-log: INTF_KLS_Log:1231: [19404][INFO][EVENT][LPM_INTF_PingPrintIOLatencyEvent][7149][3][2102350XCS10H6000003][12][226]High I/O latency {26}ms, path {7}, disk {14}, Host Lun ID {2}.
2017-11-01T13:25:12.183Z cpu4:33571)UPA-log: INTF_KLS_Log:1231: [19405][WARN][LPM][LPM_ReliabilityIsPhyPathHighLatency][3706]Skip Phypath{2} once because the divisor number is zero HostCount{5} ArrayCount{0}.
2017-11-01T13:25:12.184Z cpu4:33571)UPA-log: INTF_KLS_Log:1231: [19406][WARN][LPM][LPM_ReliabilityIsPhyPathHighLatency][3706]Skip Phypath{4} once because the divisor number is zero HostCount{10} ArrayCount{0}.

This means the storage I/O latency was too high, so the multipath software set the path status to degraded and failed over I/O to other paths.
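As a quick sketch (assuming the log file path above), these events can be inspected and counted directly in the collected vmkernel log with grep:

 # List the most recent high-latency events reported by the multipath plugin
 grep "High I/O latency" /var/run/log/vmkernel.log | tail -n 20

 # Count how many such events were logged
 grep -c "High I/O latency" /var/run/log/vmkernel.log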

4. Analyze the storage performance log again. We do find a large write I/O response time on the controller, as below:

However, at the disk domain layer, the write response time is normal.


5. So this performance issue is narrowed down to the software, the replication links, and the mirror link between the controllers. We analyzed the performance log at the remote site and found a similar result.

6. First, we analyze the replication links between the HyperMetro sites. Check the replication links in "Other\rss_info", as below:

diagnose>epl showarray
Total Link up Clasp count:3,succ:3,Link Down:2
=================== PhysicalArray(0): ======================
ID WWN   EBC LiveLink Status Ref Compatible OLUSTA
0 0x2100446a2efa96f4 404 0x00020002 READY 4 2  0
ARRAY(HWS1-MER2), WWN:0x2100446a2efa96f4, SN:2102350XCS10H6000002, OLUSTA:0, TYPE:3
CAPS: DIF(T), BST(T), FILE(T), Attr(0x3d), AttrCfg(0xd), Version(Max:2,Min:2,Use:2)
PHYSIC LINK 2 on ctrlmap 3h:
Ctrl 0's link:
  * LinkId  , LinkType, lCtrl, lPort   , rCtrl, rPort   , sdev    , status, flag, bandwidth, REF , RPORTS      .
    0       , iSCSI   , 0    , 0x20402 , 0    , 0x20402 , 35392   , READY , 3   , 10000000 , 2   , CTE0.R4.IOM0.P2.
    localIp (10.45.51.12                                       ), INI: iqn.2006-08.com.huawei:oceanstor:2100446a2e91e4e0::0.
    remoteIp(10.45.51.15                                       ), TGT: iqn.2006-08.com.huawei:oceanstor:2100446a2efa96f4::20402:10.45.51.15.
    HB: --
    Q : --
    Bandwidth Adapter: --

-------------------------------------------------------------------
Ctrl 1's link:
  * LinkId  , LinkType, lCtrl, lPort   , rCtrl, rPort   , sdev    , status, flag, bandwidth, REF , RPORTS      .
    256     , iSCSI   , 1    , 0x20402 , 1    , 0x20402 , 38016   , READY , 3   , 10000000 , 2   , CTE0.L4.IOM0.P2.
    localIp (10.45.52.12                                       ), INI: iqn.2006-08.com.huawei:oceanstor:2100446a2e91e4e0::1.
    remoteIp(10.45.52.15                                       ), TGT: iqn.2006-08.com.huawei:oceanstor:2100446a2efa96f4::1020402:10.45.52.15.
    HB: Norm, Keep(3019919), Lost:--. 5527421s+0ms 5527422s+0ms 5527423s+0ms 5527424s+0ms 5527425s+0ms
    Q : link(100h, U), sdev(38016, depth:128) enqueMax:417.(cC,mC,mJ). sQ(0,128,90). wQ(0,2,1)(0,216,10)(0,238,54)(0,0,0)
    Bandwidth Adapter: good/bad/total(0/0/0), level(100), time(45530037 jif).

There are only two replication links, one on each controller, and each link's bandwidth is 10 Gbit/s. So the problem is very clear: the performance bottleneck is the replication link.

 

Root Cause

1. The speed of each replication link is 10 Gbit/s, which means a maximum of about 1200 MB/s of replication bandwidth per controller.

2. Since the storage is an all-flash array, the maximum write bandwidth of each LUN is around 800 MB/s. With 2 LUNs per controller, each controller can theoretically reach 1600 MB/s, but the replication link can only carry about 1200 MB/s. As a result, many I/Os queue at the replication link, which makes the I/O response time higher than usual.
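The mismatch in figures (800 MB/s per LUN is the value observed in this case, not a general specification):

 Expected write load per controller : 2 LUNs x 800 MB/s = 1600 MB/s
 Replication link limit             : 10 Gbit/s / 8     = 1250 MB/s (~1200 MB/s usable)
 Excess queued at the link          : 1600 - 1200       = ~400 MB/s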

3. At first, controller 0A encountered the issue, and VMware failed over one path from controller A to controller B. After that, one LUN was working on controller A while 3 LUNs were working on controller B.

So the customer found that the LUN working on controller A reached 800 MB/s of write bandwidth, while the three LUNs working on controller B "shared" the 1200 MB/s link bandwidth, about 400 MB/s each.

Finally, the total bandwidth on controller A is 800 MB/s, while on controller B it is 1200 MB/s.
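In figures (based on the values above):

 Controller A: 1 LUN  x 800 MB/s                  =  800 MB/s total
 Controller B: 3 LUNs sharing the ~1200 MB/s link = ~400 MB/s per LUN, 1200 MB/s total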

Solution

Add another 2 replication links between the 2 HyperMetro storage systems.
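A quick sanity check of the fix, assuming the two new links are also 10 Gbit/s and are added one per controller:

 Per-controller replication bandwidth after the fix: 2 links x ~1200 MB/s = ~2400 MB/s
 Required per controller                           : 2 LUNs  x  800 MB/s =  1600 MB/s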

Suggestions

When we design a storage solution, from the performance perspective we need to check the front end, the back end, and the interconnection between the storage systems, to make sure there is no obvious performance bottleneck.

END