Working Principles
This section describes I/O traffic control and hierarchical management. SmartQoS manages I/O queues based on LUNs to ensure the performance of critical applications.
I/O Traffic Control
I/O traffic control is implemented by LUN-based I/O queue management, token allocation, and dequeue management.
LUN-based I/O queue management uses a token mechanism to allocate storage resources. SmartQoS determines the amount of storage resources to allocate to the I/O queue of a LUN by counting the tokens owned by that queue. The more tokens an I/O queue owns, the more resources are allocated to it, and the I/Os in that queue are processed preferentially. Figure 1-1 shows how SmartQoS manages I/O queues.
The process of managing I/O queues is described as follows:
- After application servers send I/O requests, the storage system delivers the requests to corresponding I/O queues.
- The storage system adjusts the number of tokens owned by LUN I/O queues based on the priorities of LUNs. By reducing the number of tokens owned by queues that have a low priority, the storage system ensures that sufficient resources are available for high-priority LUNs so that I/Os to these LUNs can be preferentially processed.
- The storage system processes the I/Os in queues by priority.
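The three steps above can be sketched in a few lines of Python. This is only an illustration, not the SmartQoS implementation: the `LunQueue` class, the `run_round` function, and the rule that one I/O costs one token are all assumptions made for the example.

```python
from collections import deque


class LunQueue:
    """Hypothetical per-LUN I/O queue with a token budget per scheduling round."""

    def __init__(self, name, tokens):
        self.name = name
        self.tokens = tokens      # tokens granted to this queue for each round
        self.pending = deque()    # queued I/O requests, oldest first

    def enqueue(self, io):
        self.pending.append(io)


def run_round(queues):
    """Dispatch I/Os for one round. Assumption: each I/O consumes one token."""
    dispatched = []
    # Queues that own more tokens are served first and may dispatch more I/Os.
    for q in sorted(queues, key=lambda q: q.tokens, reverse=True):
        budget = q.tokens
        while budget > 0 and q.pending:
            dispatched.append((q.name, q.pending.popleft()))
            budget -= 1
    return dispatched


high = LunQueue("LUN 001", tokens=3)   # high-priority LUN keeps more tokens
low = LunQueue("LUN 002", tokens=1)    # low-priority LUN has its tokens reduced
for i in range(4):
    high.enqueue(f"io-{i}")
    low.enqueue(f"io-{i}")

round1 = run_round([low, high])
print(round1)
```

In this round the high-priority queue dispatches three I/Os and the low-priority queue one, so the remaining low-priority I/Os wait for later rounds.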
Example 1: Traffic control of LUNs in different SmartQoS policies
SmartQoS manages LUN 001 and LUN 002. Different SmartQoS policies are configured for LUN 001 and LUN 002. Table 1-1 lists the performance objectives.
| LUN Name | Performance Objective (a) |
|---|---|
| LUN 001 | Bandwidth: 300 MB/s |
| LUN 002 | Bandwidth: 200 MB/s |

a: Measurable performance objectives include bandwidth and IOPS. Performance objectives must be set reasonably to match the actual performance characteristics of the application.
The storage system translates performance objectives into the number of tokens needed. Specifically, the performance objective of LUN 001 requires 300 tokens, whereas that of LUN 002 requires 200 tokens. If system resources are insufficient, the storage system limits the resources used by LUN 002 because LUN 002 has fewer tokens. The storage system provides more system resources for LUN 001, thereby delivering better performance for LUN 001.
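The arithmetic of Example 1 can be reproduced with a short sketch. The conversion rate of one token per MB/s follows from the figures in the example; the allocation rule used when resources run short (serve the queue with more tokens first, then give the rest to the other queue) is an assumption, because the text does not state how the limit is calculated. The names `to_tokens` and `allocate` are hypothetical.

```python
def to_tokens(bandwidth_mbps, mbps_per_token=1):
    """Translate a bandwidth objective into tokens (1 token per MB/s, as in Example 1)."""
    return bandwidth_mbps // mbps_per_token


def allocate(capacity_tokens, demands):
    """Grant tokens in descending order of demand until capacity runs out.

    Assumption: strict ordering by token count. The product may use a
    different rule for dividing scarce resources.
    """
    grants = {}
    remaining = capacity_tokens
    for lun, need in sorted(demands.items(), key=lambda kv: kv[1], reverse=True):
        grants[lun] = min(need, remaining)
        remaining -= grants[lun]
    return grants


demands = {"LUN 001": to_tokens(300), "LUN 002": to_tokens(200)}
print(allocate(500, demands))  # enough resources: both objectives are met
print(allocate(400, demands))  # shortage: only LUN 002 is limited
```

With 400 tokens available, LUN 001 still receives its 300 tokens while LUN 002 is limited to 100.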
Hierarchical Management
SmartQoS supports both common and hierarchical policies.
- Common policy: controls the traffic from a single type of application to LUNs.
- Hierarchical policy: controls the traffic from various types of applications to LUNs. Common policies can be added to hierarchical policies.
Figure 1-2 shows their relationship.
Burst Traffic Control Management
You can allow latency-sensitive services to exceed the upper limit for a specific period of time. SmartQoS supports burst traffic control management, which lets you specify the burst IOPS, bandwidth, and duration for LUNs, LUN groups, or hosts.
The system accumulates unused resources during off-peak hours and consumes them during traffic bursts, so that traffic can exceed the upper limit for a short period of time. For this to work, the long-term average traffic of the service must stay below the upper limit.
- If the traffic of a LUN, LUN group, or host stays below the upper limit for several seconds, its traffic can exceed the upper limit for the next few seconds when burst traffic occurs. The maximum duration of a burst is configurable.
- Burst traffic control is implemented by accumulating burst durations. For each second in which the traffic of a LUN, LUN group, or host is below the upper limit, the system adds that second to the accumulated burst duration. When the service load surges, performance can exceed the upper limit and reach the specified burst limit for the accumulated duration, which never exceeds the specified maximum duration.
- When the accumulated duration or the specified maximum duration is reached, the performance drops below the upper limit.
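The accumulation rule described in the list above can be modeled as a per-second credit counter. The sketch below is a simplified model under stated assumptions: the class name `BurstController`, its parameters, and the one-second granularity of the decision are hypothetical and do not describe the product's internal design.

```python
class BurstController:
    """Hypothetical model of burst duration accumulation for one object."""

    def __init__(self, upper_limit, burst_limit, max_burst_seconds):
        self.upper_limit = upper_limit              # normal ceiling, e.g. IOPS
        self.burst_limit = burst_limit              # ceiling while bursting
        self.max_burst_seconds = max_burst_seconds  # configured maximum duration
        self.credit_seconds = 0                     # burst duration accumulated so far

    def admit(self, demand):
        """Return the traffic admitted during one second of offered demand."""
        if demand < self.upper_limit:
            # A second spent below the upper limit is banked, up to the maximum.
            self.credit_seconds = min(self.credit_seconds + 1, self.max_burst_seconds)
            return demand
        if demand > self.upper_limit and self.credit_seconds > 0:
            # Spend one banked second to exceed the upper limit.
            self.credit_seconds -= 1
            return min(demand, self.burst_limit)
        # No credit left (or demand exactly at the limit): hold at the upper limit.
        return self.upper_limit


ctl = BurstController(upper_limit=1000, burst_limit=1500, max_burst_seconds=2)
offered = [500, 500, 500, 2000, 2000, 2000, 800]
admitted = [ctl.admit(d) for d in offered]
print(admitted)
```

Three quiet seconds bank only two seconds of credit because of the configured maximum, so the surge is admitted at the burst limit for two seconds and then falls back to the upper limit.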
Minimum Performance Assurance
SmartQoS suppresses the performance of LUNs that have not been configured with the minimum performance assurance policy to guarantee the performance of LUNs that have been configured with this policy. The system implements minimum performance assurance as follows:
- Allocates the minimum performance assurance objectives configured in the SmartQoS policy to each LUN in the policy based on performance requirements.
- Sets the default minimum performance assurance objective for LUNs that are not added to the minimum performance assurance SmartQoS policy to ensure that services on the LUNs are not interrupted.
- Calculates the performance gap and margin based on the minimum performance assurance objective and current performance of each LUN.
  - The performance margin is the amount by which the actual performance of a LUN exceeds its minimum performance assurance objective.
  - The performance gap is the amount by which the actual performance of a LUN falls short of its minimum performance assurance objective.
- Suppresses the performance of the top 10 LUNs with the largest performance margin until no performance gap exists in the system.
- In scenarios where the service pressure is unstable (that is, the overall IOPS and bandwidth of services fluctuate dramatically), it is not advised to configure any minimum performance assurance policy. With unstable service pressure, SmartQoS repeatedly re-adjusts the resource allocation between services with the minimum performance assurance policy and services without such a policy, which may cause service performance to fluctuate even more dramatically.
- If the minimum performance assurance policy is configured, SmartQoS will limit resources allocated to services to which the policy is not applicable. As a result, the performance of these services will deteriorate and latency will increase. It is advised to evaluate whether the impact is acceptable before setting the minimum performance assurance policy.
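The calculation and suppression steps in the list above can be sketched as a single planning function. The sketch uses the neutral names `shortfall` (a LUN running below its minimum objective) and `surplus` (a LUN running above it) for the two quantities that the system calculates. The function name, the data layout, and the default objective of 500 for LUNs outside the policy are assumptions made for illustration.

```python
def plan_suppression(luns, top_n=10):
    """Return the LUNs to throttle next, or an empty list if no LUN falls short.

    luns maps a LUN name to (minimum_objective, actual_performance),
    both in the same unit (for example, IOPS).
    """
    # LUNs whose actual performance is below their minimum objective.
    shortfall = {n: obj - act for n, (obj, act) in luns.items() if act < obj}
    # LUNs whose actual performance is above their minimum objective.
    surplus = {n: act - obj for n, (obj, act) in luns.items() if act > obj}
    if not shortfall:
        return []  # every objective is met, nothing to suppress
    ranked = sorted(surplus, key=surplus.get, reverse=True)
    return ranked[:top_n]


luns = {
    "db": (5000, 4200),     # in the policy, 800 below its objective
    "web": (2000, 2000),    # in the policy, exactly at its objective
    "backup": (500, 3000),  # default objective (assumed value), 2500 above it
    "test": (500, 1200),    # default objective (assumed value), 700 above it
}
print(plan_suppression(luns))
print(plan_suppression(luns, top_n=1))
```

A real controller would repeat this evaluation periodically and stop throttling once the returned list is empty.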
Objective Distribution
All objects in a SmartQoS policy share the upper limit objectives. SmartQoS periodically collects the performance and requirement statistics of all LUNs or hosts in a traffic control policy, and distributes the traffic control objective to each LUN or host.
Currently, an optimized weighted max-min fairness algorithm is used for objective distribution. It determines the traffic control objective for each object (that is, each LUN or host) based on the policy's overall objective and each object's resource requirement. The algorithm first meets each object's resource requirement, and then distributes the remaining resources to the objects based on their weights. In addition, it uses a filtering mechanism to keep the objective of each object relatively stable.
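For reference, the following is a minimal sketch of textbook weighted max-min fairness, extended so that any resources left after all requirements are met are split by weight, as the paragraph above describes. It is not the optimized algorithm used by SmartQoS, and it omits the filtering mechanism. The function name and the sample requirements and weights are assumptions, and all weights are assumed to be positive.

```python
def weighted_max_min(total, demands, weights):
    """Split `total` among objects by weighted max-min fairness.

    demands and weights are dicts keyed by object name; weights must be > 0.
    """
    alloc = {k: 0.0 for k in demands}
    active = {k for k in demands if demands[k] > 0}
    remaining = float(total)

    while active and remaining > 1e-9:
        wsum = sum(weights[k] for k in active)
        share = {k: remaining * weights[k] / wsum for k in active}
        # Objects whose requirement is fully covered by their weighted share.
        satisfied = {k for k in active if alloc[k] + share[k] >= demands[k]}
        if not satisfied:
            # Nobody can be fully served: hand out the weighted shares and stop.
            for k in active:
                alloc[k] += share[k]
            remaining = 0.0
            break
        for k in satisfied:
            remaining -= demands[k] - alloc[k]
            alloc[k] = float(demands[k])
        active -= satisfied

    if remaining > 1e-9:
        # All requirements are met: distribute what is left by weight.
        wsum = sum(weights.values())
        for k in alloc:
            alloc[k] += remaining * weights[k] / wsum
    return alloc


demands = {"lun_a": 200, "lun_b": 600, "lun_c": 600}
weights = {"lun_a": 1, "lun_b": 1, "lun_c": 2}
tight = weighted_max_min(1000, demands, weights)
loose = weighted_max_min(2000, demands, weights)
print(tight)
print(loose)
```

When the policy objective is 1000, `lun_a` receives its full requirement and the other two objects share the rest at a 1:2 ratio. When the objective is 2000, every requirement is met and the 600 left over is divided by weight.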
You can add each LUN to a traffic control policy separately, or add LUNs to a LUN group and then add the LUN group to a traffic control policy. When a LUN is added to multiple traffic control policies, the smallest value among the upper limits takes effect for the LUN.