Performance issue on NFS share in 5500V3 storage

Publication Date: 2016-12-30
Issue Description

Fault symptom: During a performance comparison test on an NFS shared folder containing about 340,000 small files, the customer obtained the results below. Listing the directory is extremely slow on the V3 storage.

On NetApp storage
$  time ls -lU | wc -l
343689

real    13m25.535s
user    0m5.447s
sys     0m26.711s

On local disks
$ time ls -lU | wc -l
341158

real    0m21.986s
user    0m0.841s
sys     0m4.649s

On 5500V3 storage
$ time ls -lU | wc -l
343708

real    921m38.447s
user    0m1.131s
sys     4m38.775s
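
To put the three measurements on one scale, the "real" times above can be converted to seconds and compared. The following is only a minimal sketch: the helper name parse_real_time is hypothetical, and the only inputs are the wall-clock values copied from the outputs above. On these numbers, the 5500V3 listing takes roughly 69 times as long as the NetApp listing.

```python
import re


def parse_real_time(text: str) -> float:
    """Convert a `time` value such as '921m38.447s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized time format: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Wall-clock ("real") times copied from the three test runs above.
real_times = {
    "NetApp": "13m25.535s",
    "local disks": "0m21.986s",
    "5500V3": "921m38.447s",
}

seconds = {name: parse_real_time(value) for name, value in real_times.items()}
slowest = seconds["5500V3"]

for name, value in seconds.items():
    if name == "5500V3":
        continue
    # Ratio of the 5500V3 listing time to the other system's listing time.
    print(f"5500V3 vs {name}: {slowest / value:.1f}x slower")
```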

Version information: V300R003C10SPC100 

NFS Client version: CentOS 7

Network overview: 1 Gbps network


Storage configuration: disk domain of 20 x 900 GB SAS disks; RAID 6 (16D+2P)

Alarm Information
None
Handling Process

1. Check the mount options. They are correct.

172.22.200.3:/statist   /vol/statist    nfs     _netdev,vers=3,rw,hard,intr,auto        0       0

2. Use the ping command to test the network. It works fine.

3. Check the NFSv3 operation statistics in the CLI.


According to the operation statistics, the number of READDIRPLUS operations is abnormal. Normally, one READDIRPLUS operation returns about 38 files, so listing 340,000 files needs only about 9,000 to 10,000 READDIRPLUS operations. The much higher count means that some files are being read repeatedly. There are three possible causes: the NFS client does not have enough memory for its read cache; cached entries expire before all files have been listed; or a large number of write operations forces cached file information to be refreshed.
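
The expected operation count can be worked out with a few lines of Python. This is a rough sketch under the assumptions stated in the text (about 38 entries per READDIRPLUS reply); the function names are hypothetical, and the observed count must be taken from the storage CLI statistics, which are not reproduced here.

```python
import math

# Entries returned per READDIRPLUS reply, as quoted in the text (an approximation).
ENTRIES_PER_CALL = 38


def expected_readdirplus_calls(file_count: int,
                               entries_per_call: int = ENTRIES_PER_CALL) -> int:
    """Minimum number of READDIRPLUS calls needed to list file_count entries once."""
    return math.ceil(file_count / entries_per_call)


def reread_factor(observed_calls: int, file_count: int) -> float:
    """How many times more calls were issued than a single clean pass needs."""
    return observed_calls / expected_readdirplus_calls(file_count)


print(expected_readdirplus_calls(340_000))
```

A reread factor well above 1 indicates that directory entries are being fetched more than once during a single listing.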

4. Check the NFS client configuration. CentOS has about 4 GB of free memory, and the number of write operations increases very slowly. This makes the first and third possibilities unlikely.

5. Use tcpdump to capture network packets.


The average service response time (SRT) of READDIRPLUS is about 11 ms. This confirms that the issue is caused by cached entries expiring during the listing.

Root Cause
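Under normal conditions, listing 340,000 files should take only about 100 s: 340,000 / 38 is about 8,950 READDIRPLUS operations, and at about 11 ms (0.011 s) each this gives roughly 98 s (the case estimate was 106 s). The default validity period of the read cache is 3 to 60 s, so either figure exceeds the upper bound. As a result, some cached entries expire and are dropped before the listing completes, and they have to be read again.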
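
The comparison between listing time and cache lifetime can be sketched as follows. It assumes the figures from the text (38 entries per READDIRPLUS call, an average SRT of about 11 ms, a 60 s upper bound on cache validity); the function names are hypothetical. The result lands close to the estimate quoted above and well past 60 s.

```python
import math


def estimated_listing_seconds(file_count: int,
                              entries_per_call: int = 38,
                              srt_seconds: float = 0.011) -> float:
    """Time for one clean pass: number of READDIRPLUS calls times the average SRT."""
    calls = math.ceil(file_count / entries_per_call)
    return calls * srt_seconds


def cache_outlives_listing(listing_seconds: float,
                           cache_timeout_seconds: float) -> bool:
    """True if cached entries stay valid for longer than the whole listing takes."""
    return cache_timeout_seconds > listing_seconds


listing = estimated_listing_seconds(340_000)
for timeout in (60, 300):  # default upper bound vs. a longer, user-chosen timeout
    print(f"timeout {timeout} s covers the listing: "
          f"{cache_outlives_listing(listing, timeout)}")
```

With a 60 s timeout the cache cannot cover a single pass, whereas a timeout of several minutes can, provided that stale cached data is acceptable for the workload.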
Solution

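1. Set the actimeo option when mounting the NFS share, for example actimeo=300, so that cached attributes stay valid for longer than the listing takes.

2. Place the small files on the high-performance tier so that they can be traversed within 60 s.

3. Store these small files on a local disk of the NFS client.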
