
S9300 STP service failure caused by high number of STP TC BPDU frames

Publication Date:  2019-07-17  |   Views:  105  |   Downloads:  0  |   Author:  Oleg Khanachivskyi  |   Document ID:  EKB0000478119

Issue Description

Software version: S9300 V100R002C00SPC200.
The network consists of two S9312 switches interconnected by three ports running STP. Some time after a traffic policy containing about 3700 ACL rules was applied globally in the inbound direction on switch 2 (KSVR), STP on that switch failed, which caused STP flapping.
After the resulting broadcast storm, the first switch (KSVL) started to drop traffic and its management IP address (10.99.99.1, VLANIF 99) became unreachable.
See the attachments for the topology, logs, and configurations.

Alarm Information

Jun 25 2010 12:27:26 swc-ksvr %%01SDKE/3/ERR(D):-Slot=2; Slot 2 layer DRV module AV level ERR: 
 bm_mac_sw_mac_delete:bm_mac_sw_mac_delete ulRet=0x2.
Jun 25 2010 12:27:28 Quidway %%01VMON/5/VMON(D): V100R002C00(Slot= 4): 
Cpu usage is over threshold,Collected infomation is
Slot ID:                4
Task Name:              ACL
Task State:             running
Record Time:            2010-06-25  12:27:23
Record Tick:            0x2b46c(CPU Tick High) 0x3985fcb0(CPU Tick Low)
Last Tick:              0x0(CPU Tick High) 0x18c0f(CPU Tick Low)
CallStack:                  StackAddr                     FuncAddr      
                            0x029ef150                    0x00a55d08
                            0x029ef150                    0x008186e4
                            0x029ef1b0                    0x0091e374
                            0x029ef1c0                    0x006d5ef0
                            0x029ef380                    0x004dfc08
                            0x029ef5b0                    0x004e008c
                            0x029ef650                    0x004f6ccc
                            0x029ef6a0                    0x004f6ddc
                            0x029ef750                    0x004a16cc
                            0x029ef780                    0x00751288
                            0x029ef7b0                    0x00753af0
                            0x029ef810                    0x007559b4
                            0x029ef870                    0x0064d494
                            0x029ef8f0                    0x0064e56c
                            0x029ef980                    0x00d8beec
                            0x029ef9b0                    0x00d7fb08
                            0x029f0280                    0x00b2893c
                            0x029f02b0                    0x00b41a24
                            0x029f0500                    0x00b420b4
                            0x029f09f0                    0x00b42674
Jun 25 2010 14:23:35 swc-ksvl %%01MSTP/5/SET_PORT_STATE(l): Instance 2's port GigabitEthernet5/0/19 has been set to FORWARD.
Jun 25 2010 14:23:35 swc-ksvl %%01MSTP/6/SET_PORT_FORWARDING(l): In MSTP process 0 instance 2,MSTP set port GigabitEthernet5/0/19 state as forwarding.
Jun 25 2010 14:23:35 swc-ksvl %%01MSTP/6/RECEIVE_MSTITC(l): MSTP received BPDU with TC, MSTP process 0 instance 1, port name is GigabitEthernet5/0/19.
Jun 25 2010 14:02:55 swc-ksvl %%01IPC/4/SENDQUEFUL(D): Couldn't send IPC message, IPC queue was overflow. (SourceChannelId=180, DestinationChannelId=49332, DestinationNode=3, QueueId=7)
Jun 25 2010 14:02:55 swc-ksvl %%01IPC/4/SENDQUEFUL(D): Couldn't send IPC message, IPC queue was overflow. (SourceChannelId=180, DestinationChannelId=49332, DestinationNode=4, QueueId=7)
Jun 25 2010 14:02:55 swc-ksvl %%01IPC/4/SENDQUEFUL(D): Couldn't send IPC message, IPC queue was overflow. (SourceChannelId=180, DestinationChannelId=49332, DestinationNode=5, QueueId=7)
Jun 25 2010 14:02:55 swc-ksvl %%01IPC/4/SENDQUEFUL(D): Couldn't send IPC message, IPC queue was overflow. (SourceChannelId=180, DestinationChannelId=49332, DestinationNode=2, QueueId=7)
Jun 25 2010 14:02:55 swc-ksvl %%01IPC/4/SENDQUEFUL(D): Couldn't send IPC message, IPC queue was overflow. (SourceChannelId=180, DestinationChannelId=49332, DestinationNode=3, QueueId=7)
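
When the attached logs are long, it can help to count how often each event appears per host, for example TC BPDU receptions (MSTP RECEIVE_MSTITC) against IPC queue overflows (IPC SENDQUEFUL). The following is a minimal sketch in Python, not a Huawei tool; the regular expression is an assumption derived only from the log lines shown above, and the names `LOG_LINE` and `count_events` are hypothetical.

```python
import re
from collections import Counter

# Assumed line layout, inferred from the alarms above:
# "<Mon DD YYYY HH:MM:SS> <host> %%01<MODULE>/<severity>/<EVENT>(<flag>): <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\w{3} +\d+ \d{4} \d{2}:\d{2}:\d{2}) (?P<host>\S+) "
    r"%%01(?P<module>\w+)/(?P<sev>\d)/(?P<event>\w+)\(\w\):\s*(?P<msg>.*)$"
)


def count_events(lines):
    """Count log entries per (host, module, event); lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            counts[(match["host"], match["module"], match["event"])] += 1
    return counts


# Three entries copied from the alarm information above.
SAMPLE = """\
Jun 25 2010 14:23:35 swc-ksvl %%01MSTP/6/RECEIVE_MSTITC(l): MSTP received BPDU with TC, MSTP process 0 instance 1, port name is GigabitEthernet5/0/19.
Jun 25 2010 14:02:55 swc-ksvl %%01IPC/4/SENDQUEFUL(D): Couldn't send IPC message, IPC queue was overflow. (SourceChannelId=180, DestinationChannelId=49332, DestinationNode=3, QueueId=7)
Jun 25 2010 14:02:55 swc-ksvl %%01IPC/4/SENDQUEFUL(D): Couldn't send IPC message, IPC queue was overflow. (SourceChannelId=180, DestinationChannelId=49332, DestinationNode=4, QueueId=7)
"""

counts = count_events(SAMPLE.splitlines())
for key, number in sorted(counts.items()):
    print(key, number)
```

Continuation lines such as the call stack dump do not match the pattern and are ignored, so the full log file can be passed in unchanged.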

Handling Process

1. Check whether the SRU CPU on KSVR is in a normal state.
2. Check whether the LPU CPU on KSVR is normal: it is overloaded.
3. Check whether STP is working normally: it is flapping.
4. Submit the problem and the logs to HQ.
5. Analysis showed that the LPU CPU of KSVR was overloaded because the customer deployed a large ACL on the VLAN rather than on the interface.
 As a result, STP on KSVR started to flap and generated a large number of TC BPDUs. This caused the STP failure on KSVL: stp tc-protect
threshold 3 was configured, and the G48SC LPU board with 9M expanded TCAM can transmit only 3 TC messages through the IPC channel in 2 seconds.
The IPC channel became overloaded, STP flapped on the KSVL switch, and the network then failed.
6. Solutions for this problem are:
1) Deploy ACLs in the interface view first, then in the VLANIF view.
2) Configure stp tc-protect threshold 1.
3) Install patch S9300 V100R002SPH009, which fixes high SRU CPU usage caused by the INFO task and a VRRP flapping problem that clears the
 ARP tables and forces ARP relearning, which increases CPU usage.
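
The effect of lowering the stp tc-protect threshold can be illustrated with a simplified model: within each time window, at most `threshold` TC BPDUs are handled immediately and the rest are not. This is a sketch under stated assumptions, not the VRP implementation. The 2-second window is borrowed from the "3 TC in 2 seconds" figure above, the arrival pattern is hypothetical, and the function name `handled_tc_per_window` is invented for illustration.

```python
from collections import Counter


def handled_tc_per_window(arrivals_ms, threshold, window_ms=2000):
    """Return {window index: TC BPDUs handled immediately} under the simplified model."""
    handled = Counter()
    for t in arrivals_ms:
        window = t // window_ms
        # Only the first `threshold` TC BPDUs in a window are handled at once.
        if handled[window] < threshold:
            handled[window] += 1
    return dict(handled)


# Hypothetical flapping neighbour: one TC BPDU every 200 ms for 4 seconds.
arrivals_ms = list(range(0, 4000, 200))

print(handled_tc_per_window(arrivals_ms, threshold=3))  # {0: 3, 1: 3}
print(handled_tc_per_window(arrivals_ms, threshold=1))  # {0: 1, 1: 1}
```

With threshold 3, the modelled load sits right at the 3 TC per 2 seconds that the board can pass over IPC, leaving no headroom; with threshold 1 it stays well below that figure.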

Root Cause

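The LPU CPU on KSVR was overloaded because a large ACL was deployed on the VLAN rather than on the interface. STP on KSVR then flapped and generated a large number of TC BPDUs.
With stp tc-protect threshold 3 configured, the IPC channel on KSVL was overloaded, because the G48SC LPU board with 9M expanded TCAM can transmit only 3 TC messages through the IPC channel in 2 seconds. STP on KSVL flapped and the network failed.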

Suggestions

Update the documentation to list the possible causes of IPC channel overload.
Update the command reference for the stp tc-protect threshold command: add remarks on the TCAM of the different boards and on which threshold values each board supports.