Customers found that BFD on NE40s flapped frequently during peak hours (21:00-23:00) every day, and as a result VRRP switchovers occurred repeatedly. The following figure shows the networking: NE40Es functioned as PEs, and NE40s functioned as CEs to connect the UMG. Transmission equipment was deployed between the NE40Es and NE40s, and BFD was configured.
Huawei performed the following operations to identify the cause and resolve the issue:
1. Suspected that the issue was caused by abnormal handling of BFD packets, because the issue occurred only during peak hours even though the transmission bandwidth was sufficient.
2. Checked BFD flapping logs of NE40s and found that BFD flapping occurred when the BFD session was detected down.
3. Monitored the port traffic on NE40s during peak hours and found that the traffic on the board in slot 2 had already reached 2.2 Gbit/s. This indicated that burst traffic would exceed the board's forwarding capacity, causing packet loss.
4. Kept monitoring traffic on the board for several days and found that the traffic stayed at about 2.3 Gbit/s during the peak hours each day.
5. Distributed the traffic to other boards. The issue was resolved.
The traffic to be forwarded exceeded the forwarding capacity of the related board on the NE40. Therefore, some packets were discarded randomly. As a result, some BFD packets were discarded and the BFD session went down. Note: If the traffic to be forwarded was less than the forwarding capacity, packets were discarded based on their priorities.
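The link between random discards and BFD flapping can be illustrated with a rough calculation. A BFD session is declared down when no BFD packet arrives within the detection time, that is, when a run of consecutive packets equal to the detection multiplier is lost. The sketch below assumes that drops are independent, and the loss rate, detection multiplier, and transmit interval are hypothetical values, not figures taken from this case.

```python
def prob_detect_timeout(loss_rate: float, detect_mult: int = 3) -> float:
    """Probability that detect_mult consecutive BFD packets are all dropped.

    Assumes each packet is dropped independently with probability loss_rate.
    """
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    if detect_mult < 1:
        raise ValueError("detect_mult must be at least 1")
    return loss_rate ** detect_mult


def expected_timeouts(loss_rate: float, detect_mult: int,
                      tx_interval_ms: float, window_s: float) -> float:
    """Rough estimate of how many detection timeouts occur in a time window.

    Every received-packet slot is treated as a possible start of a losing run,
    so this is an approximation rather than an exact count.
    """
    packets_in_window = window_s * 1000.0 / tx_interval_ms
    return packets_in_window * prob_detect_timeout(loss_rate, detect_mult)


if __name__ == "__main__":
    # Hypothetical values: 1% random loss, multiplier 3, 10 ms interval,
    # over a two-hour peak window (21:00-23:00).
    estimate = expected_timeouts(0.01, 3, 10.0, 2 * 3600)
    print(f"Estimated BFD timeouts in the peak window: {estimate:.2f}")
```

Under these assumptions, even a low random loss rate can be enough to bring a BFD session down during a long peak window, which is consistent with the flapping pattern described above.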
An NE40 board supports a forwarding capacity of 2.5 Gbit/s, whereas the actual traffic load often exceeds 2.5 Gbit/s when small packets, which are separated by inter-packet gaps, are forwarded. In the case of a traffic burst, the 2.5 Gbit/s forwarding capacity is insufficient. Generally, when the traffic on a board reaches 2 Gbit/s or higher, it is recommended that you expand the board capacity.
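The effect of small packets can be shown with a short calculation. The sketch below assumes Ethernet interfaces, where each frame carries 20 bytes of extra overhead on the wire (8 bytes of preamble and start-of-frame delimiter plus a 12-byte inter-frame gap), and assumes that the monitored rate counts frame bytes only. Whether the board's 2.5 Gbit/s figure is measured this way is an assumption; the function names are hypothetical.

```python
ETHERNET_OVERHEAD_BYTES = 20   # preamble + SFD (8) and inter-frame gap (12)
BOARD_CAPACITY_GBPS = 2.5      # forwarding capacity stated for the NE40 board
EXPANSION_THRESHOLD_GBPS = 2.0 # recommended expansion threshold from the text


def wire_rate_gbps(measured_gbps: float, frame_bytes: int,
                   overhead_bytes: int = ETHERNET_OVERHEAD_BYTES) -> float:
    """Scale a frame-byte rate up to the rate actually occupied on the wire."""
    if frame_bytes <= 0:
        raise ValueError("frame_bytes must be positive")
    return measured_gbps * (frame_bytes + overhead_bytes) / frame_bytes


def needs_expansion(measured_gbps: float,
                    threshold_gbps: float = EXPANSION_THRESHOLD_GBPS) -> bool:
    """Return True when the measured traffic reaches the expansion threshold."""
    return measured_gbps >= threshold_gbps


if __name__ == "__main__":
    for frame_bytes in (64, 512, 1518):
        rate = wire_rate_gbps(2.0, frame_bytes)
        status = "over" if rate > BOARD_CAPACITY_GBPS else "within"
        print(f"{frame_bytes}-byte frames: {rate:.3f} Gbit/s on the wire "
              f"({status} {BOARD_CAPACITY_GBPS} Gbit/s)")
    print("Expansion recommended at 2.2 Gbit/s:", needs_expansion(2.2))
```

With 64-byte frames, a measured 2.0 Gbit/s already corresponds to more than 2.5 Gbit/s on the wire under these assumptions, which suggests why the recommended threshold sits well below the nominal capacity.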