
# Understanding Microburst


## Introduction

This document defines microbursts and describes their impact, how to detect them, and how to mitigate them.

## Definition of a Microburst

A microburst is a large amount of burst data received within milliseconds, so that the burst data rate is tens or hundreds of times higher than the average rate and may even exceed the port bandwidth.

The NMS or network performance monitoring software calculates the real-time network bandwidth at intervals of seconds to minutes. At that granularity, the network traffic appears stable, as shown in Figure 1-1. At a finer granularity, for example, milliseconds, the traffic rate exhibits a sawtooth pattern, as shown in Figure 1-2: the actual traffic contains many microbursts.

An extreme example of a microburst is as follows: a 10GE link carries traffic at an average rate of 1 Gbit/s, but within a given second the traffic is sent at 10 Gbit/s during the first 100 milliseconds and no traffic is sent during the remaining 900 milliseconds. The 100 milliseconds of transmission at 10 Gbit/s is a microburst.
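The arithmetic of this extreme example can be checked with a short sketch. It is an illustration only: the 1 ms slot size and the function name `rates_per_window` are assumptions made for the example, not something a switch or NMS exposes.

```python
def rates_per_window(bits_per_ms, window_ms):
    """Average rate in Gbit/s over each consecutive window of window_ms milliseconds."""
    rates = []
    for start in range(0, len(bits_per_ms), window_ms):
        window = bits_per_ms[start:start + window_ms]
        # bits in the window -> bits per second -> Gbit/s
        rates.append(sum(window) * 1000 / len(window) / 1e9)
    return rates


# One second of traffic in 1 ms slots: line rate for 100 ms, then idle for 900 ms.
LINE_RATE_BPS = 10_000_000_000                  # 10GE
BITS_PER_MS_AT_LINE_RATE = LINE_RATE_BPS // 1000
traffic = [BITS_PER_MS_AT_LINE_RATE] * 100 + [0] * 900

per_second = rates_per_window(traffic, 1000)    # what a 1 s sample would report
per_100_ms = rates_per_window(traffic, 100)     # a finer view of the same second

print("1 s average (Gbit/s):", per_second)
print("peak 100 ms average (Gbit/s):", max(per_100_ms))
```

The one-second view reports the 1 Gbit/s average, while the 100 ms view exposes the window in which the link ran at line rate.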

Figure 1-1 Traffic statistics at a higher level of granularity
Figure 1-2 Traffic statistics at a lower level of granularity

## Common Misunderstandings of a Microburst

Common misunderstandings about microbursts include:
1. Why are microbursts not reflected in the Input peak rate value in the port statistics of a switch?

On a switch, the packet rate is the total number of packets received in a specific period divided by the length of that period. For CE series switches, the values of Input peak rate and Last 300 seconds input rate are calculated over a period of 300 seconds by default. The value of Input peak rate is the highest average rate among all historical statistics periods. The interval for collecting statistics on a port is configurable, but the minimum interval is 10 seconds.

Therefore, microbursts that occur within milliseconds are not reflected in the value of Input peak rate.
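A rough sketch of why interval averaging hides a burst is shown below. The baseline rate, burst rate, and burst length are hypothetical values chosen for illustration, and `interval_average_gbps` is a made-up helper, not a switch counter.

```python
def interval_average_gbps(baseline_gbps, burst_gbps, burst_ms, interval_s):
    """Average rate over one statistics interval that contains a single burst."""
    interval_ms = interval_s * 1000
    total = baseline_gbps * (interval_ms - burst_ms) + burst_gbps * burst_ms
    return total / interval_ms


# Hypothetical traffic: 1 Gbit/s baseline with one 10 ms burst at 10 Gbit/s.
avg_10_s = interval_average_gbps(1, 10, burst_ms=10, interval_s=10)
avg_300_s = interval_average_gbps(1, 10, burst_ms=10, interval_s=300)

print(f"10 s interval average:  {avg_10_s:.4f} Gbit/s")
print(f"300 s interval average: {avg_300_s:.4f} Gbit/s")
```

Even at the minimum 10-second interval, a 10 ms line-rate burst raises the reported average by less than 1 percent in this example, so a peak value built from such averages barely moves.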

Why can switches not support a collection interval of 10 ms? With such an interval, microbursts would be visible in the port traffic statistics.

The reason is that the CPU must poll the forwarding chips to collect packet statistics for each port. A switch has a large number of ports, especially when switches are stacked or 40GE/100GE ports are split. Polling at an interval of 10 ms is CPU-intensive, so the switch would respond slowly or even stop responding.

2. Why is no alarm reported for microbursts upon buffer exhaustion?

Similarly, the CPU queries the buffer usage by polling. If the CPU polls the buffer usage too frequently, the CPU may be overloaded, and the switch will respond slowly or even stop responding.

3. The traffic curve looks smooth on the NMS, but microbursts have actually occurred. Why does the NMS fail to detect the microbursts even though it monitors the port traffic around the clock?

Both the data reported by CE series switches and the interval at which the NMS monitors data have a granularity of seconds. A traffic graph with second-level granularity cannot show microbursts that occur within milliseconds.

4. The port rate and utilization are currently low. Therefore, microbursts will seldom occur.

This is a misunderstanding. There is no linear relationship between the average rate and the burst rate. A low port rate or utilization does not mean a low burst rate.

5. Switches record the number of packets discarded due to congestion. Therefore, switches cause microbursts or packet loss.

This is a misunderstanding. Burst traffic is generated by service terminals. Apart from a small number of protocol packets, switches do not generate traffic. However, a burst may be aggravated on a switch. For example, if multiple ports send data to a single port concurrently, the convergence ratio is too high and the burst peak is amplified. Therefore, the source of bursts needs to be located based on the traffic source and networking.

6. Service traffic is random on servers. Therefore, servers will not encounter heavy traffic bursts.

This is a misunderstanding. Service traffic is either sent continuously at a stable rate or sent intermittently at a high speed with severe bursts.

The randomness of services is not equivalent to the randomness of TCP packets. As long as the sending window is not full, traditional TCP tries to occupy more bandwidth and send packets out as soon as possible. Therefore, bursts are more likely to occur with traditional TCP. The higher the bandwidth of a server port, the more severe the bursts.

## Impact of a Microburst

When a microburst exceeds the forwarding capability of the switch, the switch buffers the burst data for later transmission. If the switch does not have sufficient buffer space, the excess data can only be discarded, causing congestion and packet loss.

Figure 1-3 Impact of microbursts

Figure 1-3 shows a typical millisecond-level microburst scenario. Assume that Port1 and Port2 each send 5 MB of data to Port3 at a line rate of 10 Gbit/s, so the total arrival rate is 20 Gbit/s. Port3 supports only 10 Gbit/s, which is half of the total arrival rate. During the burst it sends out only 5 MB of data and must buffer the other 5 MB for later transmission. However, the switch has only 1 MB of buffer space, so 4 MB of data is discarded due to insufficient buffer space. Ignoring overhead such as the inter-frame gap, preamble, frame checksum, and packet header, the microburst lasts 4 ms (5 MB/10 Gbit/s).

Typically, a switch has 1 MB to 20 MB of buffer space. In the preceding scenario where two ports transmit data at 10 Gbit/s, the longest microburst that can be buffered without packet loss is therefore shorter than 16 ms (20 MB/10 Gbit/s). On an actual network, traffic may be sent from more than two ports to one port. In this case, more buffer space is consumed, and microbursts cause more severe congestion and packet loss.
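The buffer arithmetic above can be reproduced with a small sketch. It assumes 1 MB = 10^6 bytes, identical port rates, no framing overhead, and a buffer that is empty when the burst starts; the function names are hypothetical.

```python
def burst_outcome(senders, data_mb_each, port_gbps, buffer_mb):
    """Return (burst_ms, buffered_mb, dropped_mb) for senders bursting into one egress port."""
    burst_ms = data_mb_each * 8 / port_gbps      # MB * 8 = Mbit; Mbit / (Gbit/s) = ms
    arrived_mb = senders * data_mb_each
    forwarded_mb = port_gbps * burst_ms / 8      # what the egress port drains during the burst
    excess_mb = arrived_mb - forwarded_mb
    buffered_mb = min(excess_mb, buffer_mb)
    return burst_ms, buffered_mb, excess_mb - buffered_mb


def max_lossless_burst_ms(senders, port_gbps, buffer_mb):
    """Longest burst the buffer can absorb before it overflows."""
    excess_gbps = (senders - 1) * port_gbps      # arrival rate minus egress rate
    return buffer_mb * 8 / excess_gbps


print(burst_outcome(senders=2, data_mb_each=5, port_gbps=10, buffer_mb=1))
print(max_lossless_burst_ms(senders=2, port_gbps=10, buffer_mb=20))
```

With the values from Figure 1-3, the first call yields a 4 ms burst with 1 MB buffered and 4 MB dropped, and the second call yields the 16 ms bound for a 20 MB buffer. Raising `senders` shows how quickly the bound shrinks when more ports converge on one port.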

## Detection Method of a Microburst

You can use a packet capture tool and Wireshark to detect microbursts.

In Wireshark, open a file of captured packets and choose Statistics > I/O Graph to view the traffic graph, as shown in Figure 1-4.

Figure 1-4 Traffic graph in Wireshark

To view the millisecond-level burst traffic in the IO graph, change the unit for the y-axis to Bits and the interval to 1 ms, as shown in Figure 1-5.

Figure 1-5 Wireshark IO graph
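The same 1 ms binning can be sketched offline in a few lines. This is an assumption-laden illustration, not a Wireshark feature: it expects packets as (timestamp in integer microseconds, frame length in bytes) pairs, however the capture was exported, and the 90 percent threshold and function names are arbitrary choices for the example.

```python
from collections import Counter


def bits_per_bin(packets, bin_ms=1):
    """Sum captured bits into bins of bin_ms milliseconds, keyed by bin index."""
    bins = Counter()
    for timestamp_us, length_bytes in packets:
        bins[timestamp_us // (bin_ms * 1000)] += length_bytes * 8
    return bins


def find_microbursts(packets, link_bps, bin_ms=1, threshold=0.9):
    """Return start times (ms) of bins whose load reaches threshold * link capacity."""
    capacity_bits = link_bps * bin_ms / 1000
    return sorted(
        index * bin_ms
        for index, bits in bits_per_bin(packets, bin_ms).items()
        if bits >= threshold * capacity_bits
    )


# Synthetic capture on a 1 Gbit/s link: one 1500-byte frame per millisecond,
# plus 80 extra frames squeezed into the millisecond that starts at 5 ms.
background = [(ms * 1000, 1500) for ms in range(10)]
burst = [(5000 + i * 10, 1500) for i in range(80)]
print(find_microbursts(background + burst, link_bps=1_000_000_000))
```

On this synthetic input only the bin starting at 5 ms is reported, even though the average rate over the whole 10 ms capture is far below the link rate.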

## Countermeasures of a Microburst

You can take the following countermeasures to mitigate microbursts:
1. Traditional TCP congestion control mechanisms suffer from severe bursts, intensive buffer utilization, poor performance on lossy lines, and large delay and jitter. To address these problems, use improvement technologies commonly adopted in the industry to minimize the possibility of microbursts.
2. During network service traffic planning, avoid scenarios with an excessively high convergence ratio (traffic is transmitted from multiple ports to one port), and expand the capacity in a timely manner for ports with severe bursts to eliminate burst bottlenecks.
3. On the CE12800E, CE5850EI, CE5855EI, CE5850HI, and CE5880EI switches and on the CE6800, CE7800, and CE8800 series switches, run the qos burst-mode enhanced command in the interface view to set the burst traffic buffer mode to enhanced. This helps mitigate network congestion.
4. If the delay is controllable and the buffer is sufficient, run the qos queue queue-index shaping { percent cir cir-percent-value [ pir pir-percent-value ] | cir cir-value [ kbps | mbps | gbps ] [ cbs cbs-value [ bytes | kbytes | mbytes ] | pir pir-value [ kbps | mbps | gbps ] [ cbs cbs-value [ bytes | kbytes | mbytes ] pbs pbs-value [ bytes | kbytes | mbytes ] ] ] } command to enable the traffic shaping function on the downlink ports of the upstream switch connected to the switch on which congestion occurs. This reduces the instantaneous peak of the traffic and controls the burst. Be aware that this solution will increase the packet forwarding delay.
5. Run the dcb pfc enable [ pfc-profile-name ] [ mode { auto | manual } ] command to enable the flow control function on interfaces of all devices on the network. When congestion occurs on a device, the device instructs the upstream device to reduce the packet sending rate or even stop sending packets. After the congestion is eliminated, packets can be sent normally. Be aware that this solution will increase the packet forwarding delay.
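The trade-off noted for traffic shaping (a lower instantaneous peak at the cost of extra delay) can be illustrated with a simplified model. The sketch below drains a queue at a fixed rate in 1 ms slots; it deliberately ignores the CBS and PBS burst sizes and does not represent how the switch implements shaping. The 2 Gbit/s shaping rate is a hypothetical value.

```python
def shape(arrivals_bits, cir_bps, slot_ms=1):
    """Drain a queue at cir_bps; return (bits sent per slot, worst queuing delay in ms)."""
    budget = cir_bps * slot_ms // 1000           # bits the shaper may send per slot
    if budget <= 0:
        raise ValueError("cir_bps must allow at least one bit per slot")
    pending = list(arrivals_bits)
    backlog = 0
    worst_backlog = 0
    sent = []
    while pending or backlog:
        backlog += pending.pop(0) if pending else 0
        out = min(backlog, budget)
        backlog -= out
        sent.append(out)
        worst_backlog = max(worst_backlog, backlog)
    return sent, worst_backlog * 1000 / cir_bps  # time needed to drain the worst backlog


# A 4 ms burst at 10 Gbit/s (10,000,000 bits per ms) shaped to 2 Gbit/s.
sent, delay_ms = shape([10_000_000] * 4, cir_bps=2_000_000_000)
print("peak bits per ms after shaping:", max(sent))
print("slots needed to drain:", len(sent))
print("worst added delay (ms):", delay_ms)
```

In this model the downstream port never sees more than 2,000,000 bits per millisecond, but the 4 ms burst takes 20 ms to drain, and the shaper must hold up to 4 MB of backlog, which is why the approach assumes a sufficient buffer and a tolerable delay.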
