Configuring Low-Latency Networks
<HUAWEI> system-view
[~HUAWEI] low-latency fabric
Info: Please save the configuration and reboot the system to enable the operation.
[*HUAWEI-low-latency-fabric] quit
[*HUAWEI] commit
[~HUAWEI] quit
<HUAWEI> save
Warning: The current configuration will be written to the device. Continue? [Y/N]: y
<HUAWEI> reboot
Warning: The system will reboot. Continue? [Y/N]: y
Verifying the VIQ Configuration
<HUAWEI> display qos buffer ingress-usage interface 100GE1/0/1
Ingress Buffer Usage (KBytes) on lossless priority: (Current/Total)
*: Dynamic threshold
--------------------------------------------------------------------------
Interface       Priority    Guaranteed    PFC-Xoff    Headroom
--------------------------------------------------------------------------
100GE1/0/1      3           0/2           0/4*        0/67
--------------------------------------------------------------------------
Service Pool: 12/30699    Headroom Pool: 0/4095
--------------------------------------------------------------------------
# Run the display dcb pfc buffer command to check the PFC threshold of lossless queues. The threshold has been automatically adjusted and the dynamic PFC threshold is configured for XOFF.
<HUAWEI> display dcb pfc buffer
Xon: PFC backpressure stop threshold
Xoff: PFC backpressure threshold
Hdrm: Headroom buffer threshold
Guaranteed: PFC guaranteed buffer threshold
The actual PFC backpressure stop threshold is the higher value between the value of xon and the difference between the value of xoff and the value of xon-offset.
C:cells B:bytes K:kilobytes M:megabytes D:dynamic alpha
--------------------------------------------------------------------------------
Interface     Queue    Guaranteed    Xon       Xon-Offset    Xoff    Hdrm
--------------------------------------------------------------------------------
100GE1/0/1    3        10(C)         100(C)    20(C)         4(D)    250(C)
--------------------------------------------------------------------------------
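The stop-threshold rule stated in the output legend can be worked through with a small calculation. The Python sketch below is illustrative only: the function name is hypothetical, and because Xoff appears as a dynamic alpha value (4(D)) in the sample output, the resolved Xoff cell counts used here are assumed values, not readings from the device.

```python
def actual_pfc_stop_threshold(xon_cells: int, xoff_cells: int, xon_offset_cells: int) -> int:
    """Return the effective PFC backpressure stop threshold in cells.

    Follows the rule in the command output legend: the higher of xon
    and (xoff - xon-offset).
    """
    return max(xon_cells, xoff_cells - xon_offset_cells)


# Xon = 100 cells and Xon-Offset = 20 cells come from the sample output.
# The Xoff values are ASSUMED for illustration (the sample shows a dynamic alpha).
print(actual_pfc_stop_threshold(100, 150, 20))  # 130: xoff - xon-offset dominates
print(actual_pfc_stop_threshold(100, 110, 20))  # 100: xon dominates
```

With a dynamic Xoff, the effective stop threshold can therefore move as the resolved Xoff value changes, while never dropping below the configured Xon.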
<HUAWEI> display qos buffer egress-usage interface 100GE1/0/1
Egress Buffer Usage (KBytes) on single queue: (Current/Total)
*: Dynamic threshold
------------------------------------------------------------
Interface     Queue    Type        Guaranteed    Shared
------------------------------------------------------------
100GE1/0/1    0        Lossy       0/1           0/5*
              1        Lossy       0/1           0/5*
              2        Lossy       0/1           0/5*
              3        Lossless    0/1           0/10156
              4        Lossy       0/1           0/5*
              5        Lossy       0/1           0/5*
              6        Lossy       0/1           0/5*
              7        Lossy       0/1           0/5*
------------------------------------------------------------
Lossless Service Pool (cells): 0/0
Lossy Service Pool (cells): 0/151136
------------------------------------------------------------
Configuring Dynamic ECN
If the device does not support the AI ECN function, you can configure the dynamic ECN function on the device. After the low-latency network function is enabled, the dynamic ECN threshold function is enabled by default.
<HUAWEI> system-view
[~HUAWEI] low-latency fabric
Info: Please save the configuration and reboot the system to enable the operation.
[*HUAWEI-low-latency-fabric] quit
[*HUAWEI] commit
[~HUAWEI] quit
<HUAWEI> save
Warning: The current configuration will be written to the device. Continue? [Y/N]: y
<HUAWEI> reboot
Warning: The system will reboot. Continue? [Y/N]: y
In most cases, you do not need to perform any configuration. However, in the following situations, you can adjust the dynamic ECN threshold parameters based on the service performance counters:
- PFC is frequently triggered, and flow control is frequently performed for RoCEv2 traffic. As a result, the throughput of service flows decreases.
- PFC is not triggered, but the throughput of RoCEv2 service flows does not meet service requirements.
# Check the numbers of sent and received PFC frames to determine whether PFC is triggered frequently.
<HUAWEI> display dcb pfc interface 100ge 1/0/1
-----------------------------------------------------------------------------------------
Interface     Queue    Received(Frames)       ReceivedRate(pps)       DeadlockNum
                       Transmitted(Frames)    TransmittedRate(pps)    RecoveryNum
-----------------------------------------------------------------------------------------
100GE1/0/1    3        148990                 99                      0
                       0                      0                       0
-----------------------------------------------------------------------------------------
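When these counters are collected from many interfaces, the check can be scripted. The following Python sketch parses one data row of the output above and flags frequent PFC activity. The class and function names are hypothetical, the column order is assumed to match the header shown in the sample output, and the 50 pps cutoff is an arbitrary illustrative value, not a vendor recommendation.

```python
from dataclasses import dataclass


@dataclass
class PfcCounters:
    interface: str
    queue: int
    received_frames: int
    received_rate_pps: int
    deadlock_num: int
    transmitted_frames: int
    transmitted_rate_pps: int
    recovery_num: int


def parse_pfc_row(row_text: str) -> PfcCounters:
    """Parse the data row(s) for one queue, whether printed on one line or wrapped over two."""
    tokens = row_text.split()
    if len(tokens) != 8:
        raise ValueError(f"expected 8 fields, got {len(tokens)}")
    return PfcCounters(tokens[0], *(int(t) for t in tokens[1:]))


def pfc_is_frequent(counters: PfcCounters, rate_threshold_pps: int = 50) -> bool:
    """Treat PFC as frequent when either frame rate reaches the (assumed) threshold."""
    rate = max(counters.received_rate_pps, counters.transmitted_rate_pps)
    return rate >= rate_threshold_pps


counters = parse_pfc_row("100GE1/0/1 3 148990 99 0 0 0 0")
print(counters.received_frames, pfc_is_frequent(counters))
```

Comparing two samples taken some time apart is more reliable than a single reading, because the frame counters are cumulative.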
The qos dynamic-ecn-threshold command adjusts the expected forwarding delay and ECN marking probability of all lossless queues on which the dynamic ECN threshold function is enabled. By default, the expected forwarding delay is 50 microseconds and the ECN marking probability is 1.
Adjusting the expected forwarding delay with the qos dynamic-ecn-threshold command in effect adjusts the ECN threshold. With the ECN marking probability unchanged, decreasing the expected forwarding delay of RoCEv2 packets requires a shallow buffer depth, so the algorithm lowers the ECN threshold and packets are ECN-marked earlier. Increasing the expected forwarding delay requires a deeper buffer, so the ECN threshold is raised and fewer packets are ECN-marked. In the two situations described above, you are advised to adjust the parameters as follows:
- If PFC is triggered frequently and the throughput of service flows is affected, run the qos dynamic-ecn-threshold command to decrease the expected forwarding delay of the queue, which lowers the ECN threshold. This reduces or even eliminates PFC triggering.
- If PFC is not triggered but the throughput of service flows is still low, run the qos dynamic-ecn-threshold command to increase the expected forwarding delay of the queue. This increases the ECN threshold to maintain a large buffer depth, enhancing the queue's capability to accommodate burst traffic and improving the throughput of service flows.
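The two tuning rules above can be summarized as a small decision function. This Python sketch is only an illustration of the guidance in this section: the enum, the function name, and the boolean inputs are hypothetical, and the case of occasional PFC with low throughput is not covered by the text, so the sketch leaves the setting unchanged there.

```python
from enum import Enum


class Adjustment(Enum):
    DECREASE_TARGET_DELAY = "decrease expected forwarding delay"
    INCREASE_TARGET_DELAY = "increase expected forwarding delay"
    KEEP = "keep current setting"


def suggest_adjustment(pfc_triggered: bool, pfc_frequent: bool, throughput_ok: bool) -> Adjustment:
    """Map observed PFC behavior and throughput to a tuning direction."""
    if throughput_ok:
        # In most cases no configuration is needed.
        return Adjustment.KEEP
    if pfc_frequent:
        # Lower delay -> lower ECN threshold -> earlier marking, less PFC.
        return Adjustment.DECREASE_TARGET_DELAY
    if not pfc_triggered:
        # Higher delay -> higher ECN threshold -> deeper buffer for bursts.
        return Adjustment.INCREASE_TARGET_DELAY
    # Occasional PFC with low throughput: not covered by the guidance above.
    return Adjustment.KEEP


print(suggest_adjustment(pfc_triggered=True, pfc_frequent=True, throughput_ok=False).value)
print(suggest_adjustment(pfc_triggered=False, pfc_frequent=False, throughput_ok=False).value)
```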
Recommended settings:
<HUAWEI> system-view
[~HUAWEI] low-latency fabric
[~HUAWEI-low-latency-fabric] qos dynamic-ecn-threshold target-delay 30 mark-percentage 25
Application Scenario | Leaf Device Model | Spine Device Model | Expected Delay | Maximum Drop Probability (Maximum Marking Probability) |
---|---|---|---|---|
Distributed storage | CE6865-48S8CQ-EI | CE8850-64CQ-EI, CloudEngine 16800 | 30 μs | 25% |
HPC | CE8850-64CQ-EI | CE8850-64CQ-EI | 30 μs | 25% |
AI GPU | CE8861-4C-EI | CE8850-64CQ-EI | 30 μs | 25% |
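For scripted provisioning, the recommended values can be kept in a lookup table and rendered into the command shown above. The Python sketch below is a hypothetical helper, not a vendor tool. It assumes that the Expected Delay column maps to target-delay (in microseconds) and the Maximum Marking Probability column maps to mark-percentage, as in the sample command.

```python
# Recommended settings per application scenario: (target delay in us, mark percentage).
RECOMMENDED_SETTINGS = {
    "Distributed storage": (30, 25),
    "HPC": (30, 25),
    "AI GPU": (30, 25),
}


def render_dynamic_ecn_command(scenario: str) -> str:
    """Build the qos dynamic-ecn-threshold command line for a scenario in the table."""
    try:
        target_delay_us, mark_percentage = RECOMMENDED_SETTINGS[scenario]
    except KeyError:
        raise ValueError(f"no recommended setting for scenario: {scenario!r}") from None
    return (
        f"qos dynamic-ecn-threshold target-delay {target_delay_us} "
        f"mark-percentage {mark_percentage}"
    )


print(render_dynamic_ecn_command("HPC"))
```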
Precautions:
- The AI ECN function and the dynamic ECN threshold function are mutually exclusive.
- If the static ECN threshold or WRED function has been enabled for a lossless queue, the dynamic ECN threshold function cannot be enabled for lossless queues on the device.