A Large Number of TC Packets Received on an S6700 Switch Causes High CPU Usage and Protocol Flapping

Publication Date: 2015-11-11
Issue Description
Two S6700 switches have the Rapid Spanning Tree Protocol (RSTP) enabled globally, and Open Shortest Path First (OSPF) and the Virtual Router Redundancy Protocol (VRRP) run between them. When the master switch receives a large number of topology change (TC) packets, the OSPF and VRRP status flaps on the switch. Both switches record logs about high CPU usage, and the CPU usage displayed on the network management system (NMS) exceeds 90% multiple times. The logs also show that many ARP packets are dropped because the ARP packet rate exceeds the control plane committed access rate (CPCAR) limit.

The following figure shows the CPU usage data on the NMS.

Figure: CPU usage data on the NMS

Log Information:

1. The switches have recorded logs about high CPU usage.

S6700-1 %%01VOSCPU/4/CPU_USAGE_HIGH(l)[31]:The CPU is overloaded(CpuUsage=96%, Threshold=95%), and the tasks with top three CPU occupancy are:
FTS  total      : 18%
SRMT  total      : 11%
SOCK  total      : 8%
S6700-1 %%01VOSCPU/4/CPU_USAGE_HIGH(l)[60]:The CPU is overloaded(CpuUsage=100%, Threshold=95%), and the tasks with top three CPU occupancy are:
PPI   total      : 41%
SRMT  total      : 10%
FTS  total      : 8%

2. Logs also indicate that a large number of ARP packets have been discarded because the packet rate exceeded the CPCAR limit.

S6700-1 %%01DEFD/4/CPCAR_DROP_MPU(l)[56]:Rate of packets to cpu exceeded the CPCAR limit on the MPU. (Protocol=arp-miss, ExceededPacketCount=016956)
S6700-1 %%01DEFD/4/CPCAR_DROP_MPU(l)[57]:Rate of packets to cpu exceeded the CPCAR limit on the MPU. (Protocol=arp-reply, ExceededPacketCount=020699)
S6700-1 %%01DEFD/4/CPCAR_DROP_MPU(l)[58]:Rate of packets to cpu exceeded the CPCAR limit on the MPU. (Protocol=arp-request, ExceededPacketCount=0574)
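
When many such logs are collected, totaling the dropped packets per protocol makes it easier to see which ARP packet types are hitting the CPCAR limit. The following Python sketch assumes the logs have been exported as plain text in the format shown above. The function name sum_cpcar_drops and the regular expression are illustrative and are not part of any Huawei tool.

```python
import re
from collections import Counter

# Matches the CPCAR_DROP_MPU log lines shown above (format assumed from those samples).
CPCAR_RE = re.compile(
    r"CPCAR_DROP_MPU.*?Protocol=(?P<protocol>[\w-]+),\s*"
    r"ExceededPacketCount=(?P<count>\d+)"
)


def sum_cpcar_drops(log_text):
    """Return the total ExceededPacketCount per protocol found in log_text."""
    totals = Counter()
    for line in log_text.splitlines():
        match = CPCAR_RE.search(line)
        if match:
            # int() ignores the leading zeros used in the log, e.g. "016956".
            totals[match.group("protocol")] += int(match.group("count"))
    return dict(totals)


sample_logs = """\
S6700-1 %%01DEFD/4/CPCAR_DROP_MPU(l)[56]:Rate of packets to cpu exceeded the CPCAR limit on the MPU. (Protocol=arp-miss, ExceededPacketCount=016956)
S6700-1 %%01DEFD/4/CPCAR_DROP_MPU(l)[57]:Rate of packets to cpu exceeded the CPCAR limit on the MPU. (Protocol=arp-reply, ExceededPacketCount=020699)
S6700-1 %%01DEFD/4/CPCAR_DROP_MPU(l)[58]:Rate of packets to cpu exceeded the CPCAR limit on the MPU. (Protocol=arp-request, ExceededPacketCount=0574)
"""

print(sum_cpcar_drops(sample_logs))
# {'arp-miss': 16956, 'arp-reply': 20699, 'arp-request': 574}
```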

3. Check statistics about TC packets sent and received on RSTP-enabled interfaces.

The number of received TC packets keeps increasing on all RSTP-enabled interfaces.

<S6700> display stp tc-bpdu statistics
-------------------------- STP TC/TCN information --------------------------
MSTID Port                        TC(Send/Receive)      TCN(Send/Receive)
0     GigabitEthernet0/0/1               19319/3271            0/0 
0     GigabitEthernet0/0/2               29761/676             0/0 
0     GigabitEthernet0/0/3               128/4                 0/0  
0     GigabitEthernet0/0/4               24615/1016            0/0
0     GigabitEthernet0/0/5               30697/98              0/0
0     GigabitEthernet0/0/6               25447/317             0/0
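
To confirm that the received TC count keeps increasing, the command output can be captured twice and compared. The Python sketch below assumes the table layout shown above. The helper names parse_tc_received and growing_ports are hypothetical, and the second snapshot contains made-up values used only for illustration.

```python
import re

# One data row of the "display stp tc-bpdu statistics" table:
# MSTID, port name, TC sent/received, TCN sent/received.
ROW_RE = re.compile(
    r"^\s*(?P<mstid>\d+)\s+(?P<port>\S+)\s+"
    r"(?P<tc_tx>\d+)/(?P<tc_rx>\d+)\s+(?P<tcn_tx>\d+)/(?P<tcn_rx>\d+)\s*$"
)


def parse_tc_received(output):
    """Map each port name to its received TC count; non-data lines are skipped."""
    received = {}
    for line in output.splitlines():
        match = ROW_RE.match(line)
        if match:
            received[match.group("port")] = int(match.group("tc_rx"))
    return received


def growing_ports(before, after):
    """Return {port: increase} for ports whose received TC count grew."""
    old = parse_tc_received(before)
    new = parse_tc_received(after)
    return {port: new[port] - old[port]
            for port in new if port in old and new[port] > old[port]}


# First snapshot: two rows copied from the output above.
snapshot_1 = """\
MSTID Port                        TC(Send/Receive)      TCN(Send/Receive)
0     GigabitEthernet0/0/1               19319/3271            0/0
0     GigabitEthernet0/0/3               128/4                 0/0
"""

# Second snapshot: hypothetical values, not taken from the case.
snapshot_2 = """\
MSTID Port                        TC(Send/Receive)      TCN(Send/Receive)
0     GigabitEthernet0/0/1               19320/3300            0/0
0     GigabitEthernet0/0/3               128/4                 0/0
"""

print(growing_ports(snapshot_1, snapshot_2))
# {'GigabitEthernet0/0/1': 29}
```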
Handling Process
Step 1 Run the stp tc-protection command in the system view.

This command ensures that the switch updates MAC and ARP entries at most once every 2 seconds when receiving a large number of TC packets. This configuration prevents high CPU usage caused by frequent updates of MAC and ARP entries.
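
The effect of limiting updates to one every 2 seconds can be illustrated with a toy model. The Python sketch below is not the switch's actual algorithm. It assumes only that, with protection on, a table update is allowed when at least 2 seconds have passed since the previous one, and that other TC packets in between trigger no extra update. The function name count_table_updates and the arrival pattern are hypothetical.

```python
def count_table_updates(tc_arrivals_ms, interval_ms=None):
    """Count MAC/ARP table updates triggered by TC packets.

    tc_arrivals_ms: arrival times of TC packets, in milliseconds.
    interval_ms: minimum gap between updates; None models no TC protection,
    where every TC packet triggers an update.
    """
    updates = 0
    last_update = None
    for arrival in sorted(tc_arrivals_ms):
        if (interval_ms is None or last_update is None
                or arrival - last_update >= interval_ms):
            updates += 1
            last_update = arrival
    return updates


# Hypothetical burst: one TC packet every 100 ms for 10 seconds (100 packets).
arrivals = list(range(0, 10_000, 100))

print(count_table_updates(arrivals))                    # 100
print(count_table_updates(arrivals, interval_ms=2000))  # 5
```

In this model, the same burst causes 100 table updates without the limit and 5 with it, which is why the CPU load from entry updates drops sharply.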

Step 2 Run the arp topology-change disable and mac-address update arp commands in the system view.

By default, a switch deletes MAC address entries and ages out ARP entries after receiving TC packets. If there are many ARP entries on the switch, relearning them triggers a large number of ARP packets on the network. After the arp topology-change disable and mac-address update arp commands are configured, the switch responds to a network topology change by updating the outbound interfaces in ARP entries to match the changed outbound interfaces in the MAC address entries. The commands prevent unnecessary updates of ARP entries.

Note: The mac-address update arp command is supported in V100R006 and later versions, and the arp topology-change disable command is supported in V200R001 and later versions.
Root Cause
TC packet statistics show that the RSTP-enabled interfaces have received a large number of TC packets and that the number keeps increasing. Each time the switches receive TC packets, they delete MAC address entries and update ARP entries. The resulting ARP relearning triggers a large number of ARP-Miss, ARP Request, and ARP Reply packets, which drives up CPU usage. Because OSPF Hello packets and VRRP heartbeat packets cannot be processed in time, the OSPF and VRRP status flaps.
Suggestions
When deploying a spanning tree protocol, you are advised to enable TC protection and configure all interfaces connected to terminals as edge ports. These measures prevent state changes of some interfaces from causing flapping and re-convergence on the entire network.
