
Performance Monitoring Guide

OceanStor Dorado V3 Series V300R002

This document describes performance monitoring of storage systems, including monitoring methods, indicator planning, monitoring configuration, and problem diagnosis.
Locating Problems

A performance problem may occur anywhere along the I/O path. For example, the host application may have an error, the network where the host resides may be congested, or the back-end storage system may have an error. It is therefore important to examine every segment of the path to locate the issue before eliminating it.

Context

A common method of locating a performance problem is to compare the average latency of a host with that of a storage system and then determine whether the problem resides on the host side, the network link layer, or the storage side.

  • If the latency is large on both the host and storage sides and the difference between the two is small, the problem probably resides in the storage system. Common causes are as follows: the disk performance has reached its upper limit; the mirror bandwidth has reached its upper limit; a short-term performance deterioration has occurred due to inconsistency between the owning controller and the working controller.
NOTE:

The preceding problems relate to read latency. Write latency also includes the time spent transmitting data from the host to the storage system. Therefore, a large write latency is not necessarily caused by the storage system; check every part of the path that may contribute to it.

  • If the latency on the host side is much larger than that on the storage side, the host configuration may be inappropriate, or there may be a problem on the network link. Common causes are as follows: I/Os are queuing up because the concurrency capability of the block device or HBA is insufficient; the CPU usage of the host has reached 100%; the bandwidth has hit a bottleneck; a switch is configured improperly; the multipathing software selects paths incorrectly.
  • Once the problem is located, analyze and troubleshoot it accordingly.

Procedure

  1. Check the latency on the host side.

    • On a Linux host, use the following tools to query the host latency.
      • The AWR report facility of Oracle, a performance statistics function of the service software.
      • iostat, a Linux disk I/O query tool.

        Run iostat -kx 1.

        In the command output, await indicates the average time spent processing each I/O request, namely, the I/O response time, expressed in milliseconds.
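
        For example, an illustrative excerpt of the output (the values are samples only, and columns vary with the sysstat version; newer versions report r_await and w_await separately):

          # iostat -kx 1
          Device:  rrqm/s wrqm/s    r/s    w/s   rkB/s   wkB/s avgrq-sz avgqu-sz  await  svctm  %util
          sda        0.00   2.00 120.00  35.00 1920.00  560.00    32.00     0.45   2.90   0.60   9.30

        In this excerpt, each I/O request on sda took about 2.9 ms on average.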

      • Vdbench, a Linux performance test tool.

        In the command output, resp indicates the average time spent processing each I/O request, namely, the I/O response time, expressed in milliseconds.
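
        For example, a trimmed, illustrative excerpt of a Vdbench interval report (the actual output contains additional columns, such as the response-time maximum and standard deviation):

          interval    i/o rate   MB/sec   read pct   resp time   read resp   write resp
                 1      5000.0    19.53      70.00       1.250       1.100        1.600

        In this excerpt, resp time shows an average I/O response time of about 1.25 ms.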

    • On a Windows host, use the following tools to query the host latency.
      • The performance statistics function of service software.
      • IOmeter, a performance test tool commonly used in Windows.

      • Windows Performance Monitor, a performance monitoring tool integrated with Windows.
        • On the Windows desktop, choose Start > Run. In the Run dialog box, enter perfmon and click OK. The Performance Monitor window is displayed.
        • In the navigation tree on the left, choose Monitoring Tools > Performance Monitor, and then click the Add button (the green plus icon) to add performance items.
        • In the Add Counters window, select PhysicalDisk, select the performance items that you want to monitor, and click Add and then OK. Windows Performance Monitor starts monitoring disk performance.
          Table 4-1 describes the performance items related to latency.

          Table 4-1 Disk performance items related to latency

          Indicator           Subitem                  Description
          Latency indicator   Avg. Disk sec/Transfer   Average time taken to complete each I/O, as measured on the host. The counter is expressed in seconds (multiply by 1000 for milliseconds).
                              Avg. Disk sec/Read       Average time taken to complete each read I/O.
                              Avg. Disk sec/Write      Average time taken to complete each write I/O.
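
      The same counters can also be sampled from a command line with PowerShell's Get-Counter cmdlet (a minimal sketch; the counter paths assume an English-locale system, and the values are reported in seconds):

        # Sample the per-I/O latency of all physical disks once per second, 10 times.
        Get-Counter -Counter "\PhysicalDisk(_Total)\Avg. Disk sec/Transfer",
                             "\PhysicalDisk(_Total)\Avg. Disk sec/Read",
                             "\PhysicalDisk(_Total)\Avg. Disk sec/Write" `
            -SampleInterval 1 -MaxSamples 10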

    • Checking AIX host latency

      Run iostat to check AIX host latency.
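
      For example, iostat -D prints extended per-disk statistics, including average read and write service times (an illustrative excerpt; the exact fields vary with the AIX release, and the avgserv values are reported in milliseconds by default):

        # iostat -D hdisk0 1
        hdisk0      xfer:  %tm_act      bps      tps     bread     bwrtn
                              10.0     1.2M    150.0    900.0K    300.0K
                    read:      rps  avgserv  minserv  maxserv
                             100.0      2.5      0.1     40.2
                   write:      wps  avgserv  minserv  maxserv
                              50.0      1.8      0.2     35.7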

    • Checking VMware ESXi host latency
      1. Run esxtop and press Enter. CPU performance statistics are displayed first.

      2. Press u to switch to storage device (LUN) performance statistics.

      3. Press f to open the field selection page. Fields marked with an asterisk are currently monitored. Press J and K to select the storage latency fields, and then press Enter to return to the main page.

      4. Press d to switch to storage adapter (HBA) performance statistics.

      5. Press f to open the field selection page, and press H and I to select the HBA latency fields.

      6. Press v to switch to virtual machine performance statistics.

      7. Press f to open the field selection page, and press G and H to select the virtual machine latency fields.
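
      For longer captures, esxtop can also run non-interactively in batch mode and write all counters to a CSV file for offline analysis, for example:

        # Capture 60 samples at a 5-second interval into a CSV file.
        esxtop -b -d 5 -n 60 > /tmp/esxtop_perf.csv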

  2. Check the latency on the storage side.

    • Use SystemReporter.

      Operation path: Monitoring > Real-Time Monitoring > Controller

    • Run the CLI command to query the latency on the storage side.

      Log in to the CLI of the storage system and run show performance controller to query the average I/O response time of the specified controller, namely, Average I/O Latency(us).

      admin:/>show performance controller controller_id=0A 
      0.Max. Bandwidth(MB/s) 
      1.Queue Length 
      2.Bandwidth(MB/s) / Block Bandwidth(MB/s) 
      3.Throughput(IOPS)(IO/s) 
      4.Read Bandwidth(MB/s) 
      5.Average Read I/O Size(KB) 
      6.Read Throughput(IOPS)(IO/s) 
      7.Write Bandwidth(MB/s) 
      8.Average Write I/O Size(KB) 
      9.Write Throughput(IOPS)(IO/s) 
      10.Read I/O Granularity Distribution: [0K,1K)(%) 
      11.Read I/O Granularity Distribution: [1K,2K)(%) 
      12.Read I/O Granularity Distribution: [2K,4K)(%) 
      13.Read I/O Granularity Distribution: [4K,8K)(%) 
      14.Read I/O Granularity Distribution: [8K,16K)(%) 
      15.Read I/O Granularity Distribution: [16K,32K)(%) 
      16.Read I/O Granularity Distribution: [32K,64K)(%) 
      17.Read I/O Granularity Distribution: [64K,128K)(%) 
      18.Read I/O Granularity Distribution: [128K,256K)(%) 
      19.Read I/O Granularity Distribution: [256K,512K)(%) 
      20.Read I/O Granularity Distribution: >= 512K(%) 
      21.Write I/O Granularity Distribution: [0K,1K)(%) 
      22.Write I/O Granularity Distribution: [1K,2K)(%) 
      23.Write I/O Granularity Distribution: [2K,4K)(%) 
      24.Write I/O Granularity Distribution: [4K,8K)(%) 
      25.Write I/O Granularity Distribution: [8K,16K)(%) 
      26.Write I/O Granularity Distribution: [16K,32K)(%) 
      27.Write I/O Granularity Distribution: [32K,64K)(%) 
      28.Write I/O Granularity Distribution: [64K,128K)(%) 
      29.Write I/O Granularity Distribution: [128K,256K)(%) 
      30.Write I/O Granularity Distribution: [256K,512K)(%) 
      31.Write I/O Granularity Distribution: >= 512K(%) 
      32.CPU Usage(%) 
      33.Memory Usage(%) 
      34.Percentage of Cache Flushes to Write Requests(%) 
      35.Cache Flushing Bandwidth(MB/s) 
      36.Read Cache Hit Ratio(%) 
      37.Write Cache Hit Ratio 
      38.Cache Read Usage(%) 
      39.Cache Write Usage(%) 
      40.Average IO Size(KB) 
      41.% Read 
      42.% Write 
      43.Max IOPS(IO/s) 
      44.Service Time(Excluding Queue Time)(us) 
      45.Average I/O Latency(us) 
      46.Max. I/O Latency(us) 
      47.Average Read I/O Latency(us) 
      48.Average Write I/O Latency(us) 
      49.Max. Read I/O Size(KB) 
      50.Max. Write I/O Size(KB) 
      51.Max. I/O Size(KB) 
      52.The cumulative count of I/Os 
      53.The cumulative count of data transferred in Kbytes 
      54.The cumulative elapsed I/O time(ms) 
      55.The cumulative count of all reads 
      56.The cumulative count of all read cache hits(Reads from Cache) 
      57.The cumulative count of data read in Kbytes(1024bytes = 1KByte) 
      58.The cumulative count of all writes 
      59.The cumulative count of Write Cache Hits (Writes that went directly to Cache) 
      60.The cumulative count of data written in Kbytes 
      61.Cache page utilization(%) 
      62.Cache chunk utilization(%) 
      Input item(s) number seperated by comma:6 
      Read Throughput(IOPS) (IO/s) : 26
    NOTE:

    For details about how to log in to the CLI of a storage system and how to use CLI commands, see the Command Reference of the corresponding version. Interfaces and CLI command outputs vary between versions; the actual interfaces and outputs prevail.

    You can set read and write latency thresholds for the controllers of a storage system. By default, the average read I/O response time threshold is 50000 μs and the average write I/O response time threshold is 20000 μs. If the read or write latency of a controller exceeds the threshold, the system starts to collect information and saves it to /OSM/coffer_data/product/OMM/. The total size of the files in this directory cannot exceed 14 MB; when the limit is reached, existing files are overwritten. The collected information is used for subsequent performance tuning or problem locating and analysis.

    On the CLI, run show performance threshold to view the read and write latency thresholds of the storage system. If the thresholds do not meet service requirements, run change performance threshold to modify them.
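
    For example, to check the current thresholds and then modify them (a hypothetical session; the parameter names shown are illustrative only, so confirm the exact syntax in the Command Reference for your version):

      admin:/>show performance threshold
      admin:/>change performance threshold read_io_latency=30000 write_io_latency=15000

    The parameter names above are placeholders; the actual command syntax may differ.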

    NOTE:

    You can also use show performance threshold and change performance threshold to view and modify the read and write latency thresholds of LUNs and file systems.

    • LUN: By default, the average read I/O response time threshold is 50000 μs and the average write I/O response time threshold is 20000 μs.
    • File system: By default, the average read OPS response time threshold is 50000 μs and the average write OPS response time threshold is 25000 μs.
