On July 16, 2010, services under the NE40E were interrupted after VLAN 680 was added to the configuration, and were restored after the related configuration was deleted.
The topology is shown in the attachment Topology.jpg.
1. Analysis of the NE40E log shows an L2 loop, as follows:
First, VLAN 680 was added to interface Eth-Trunk10:
Jul 16 2010 12:46:53 IPBB-LA-NE40E-02 %%01SHELL/5/CMDRECORD(l):Record command information. (Task=VT0 , Ip=10.0.74.66, User=admin, Command="interface Eth-Trunk10")
Jul 16 2010 12:47:10 IPBB-LA-NE40E-02 %%01SHELL/5/CMDRECORD(l):Record command information. (Task=VT0 , Ip=10.0.74.66, User=admin, Command="port trunk allow-pass vlan 680")
The L2 loop then occurred and the VRRP status became abnormal:
Jul 16 2010 12:47:13 IPBB-LA-NE40E-02 %%01VRRP/4/STATEWARNING(l): Virtual Router state BACKUP changed to MASTER, because of protocol timer expired. (Interface=GigabitEthernet3/0/30, VrId=41)
Jul 16 2010 12:47:13 IPBB-LA-NE40E-02 %%01VRRP/4/STATEWARNING(l):Virtual Router state BACKUP changed to MASTER, because of protocol timer expired. (Interface=GigabitEthernet3/0/20.2, VrId=26)
The VRRP status returned to normal after the VLAN 680 configuration was deleted:
Jul 16 2010 13:28:22 IPBB-LA-NE40E-02 %%01SHELL/5/CMDRECORD(l):Record command information. (Task=VT0 , Ip=10.0.74.66, User=admin, Command="interface Eth-Trunk10")
Jul 16 2010 13:28:45 IPBB-LA-NE40E-02 %%01SHELL/5/CMDRECORD(l):Record command information. (Task=VT0 , Ip=10.0.74.66, User=admin, Command="undo port trunk allow-pass vlan 680")
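The timing relationship between the command and the VRRP state change can be checked with a short script. The following is a minimal Python sketch, not a Huawei tool. The regular expression, the 60-second window, and the function names are assumptions based only on the log lines quoted above.

```python
import re
from datetime import datetime

# Log lines copied from the NE40E log quoted above.
LOG_LINES = [
    'Jul 16 2010 12:46:53 IPBB-LA-NE40E-02 %%01SHELL/5/CMDRECORD(l):Record command information. (Task=VT0 , Ip=10.0.74.66, User=admin, Command="interface Eth-Trunk10")',
    'Jul 16 2010 12:47:10 IPBB-LA-NE40E-02 %%01SHELL/5/CMDRECORD(l):Record command information. (Task=VT0 , Ip=10.0.74.66, User=admin, Command="port trunk allow-pass vlan 680")',
    'Jul 16 2010 12:47:13 IPBB-LA-NE40E-02 %%01VRRP/4/STATEWARNING(l): Virtual Router state BACKUP changed to MASTER, because of protocol timer expired. (Interface=GigabitEthernet3/0/30, VrId=41)',
    'Jul 16 2010 12:47:13 IPBB-LA-NE40E-02 %%01VRRP/4/STATEWARNING(l):Virtual Router state BACKUP changed to MASTER, because of protocol timer expired. (Interface=GigabitEthernet3/0/20.2, VrId=26)',
    'Jul 16 2010 13:28:22 IPBB-LA-NE40E-02 %%01SHELL/5/CMDRECORD(l):Record command information. (Task=VT0 , Ip=10.0.74.66, User=admin, Command="interface Eth-Trunk10")',
    'Jul 16 2010 13:28:45 IPBB-LA-NE40E-02 %%01SHELL/5/CMDRECORD(l):Record command information. (Task=VT0 , Ip=10.0.74.66, User=admin, Command="undo port trunk allow-pass vlan 680")',
]

# Assumed line layout: "<Mon DD YYYY HH:MM:SS> <host> %%01<MODULE>/<sev>/<NAME>(l):<body>"
LINE_RE = re.compile(
    r"^(\w{3} \d{1,2} \d{4} \d{2}:\d{2}:\d{2}) (\S+) %%01(\w+)/\d/\w+\(l\):\s*(.*)$"
)
CMD_RE = re.compile(r'Command="([^"]*)"')


def parse(line):
    """Return a dict with time, host, module and body, or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    # %b expects English month abbreviations (default C locale).
    ts = datetime.strptime(m.group(1), "%b %d %Y %H:%M:%S")
    return {"time": ts, "host": m.group(2), "module": m.group(3), "body": m.group(4)}


def commands_before_vrrp_changes(lines, window_seconds=60):
    """Pair each VRRP warning with the last recorded command entered shortly before it."""
    events = sorted((e for e in map(parse, lines) if e), key=lambda e: e["time"])
    findings = []
    last_cmd = None
    for e in events:
        if e["module"] == "SHELL":
            cmd = CMD_RE.search(e["body"])
            if cmd:
                last_cmd = (e["time"], cmd.group(1))
        elif e["module"] == "VRRP" and last_cmd is not None:
            delta = (e["time"] - last_cmd[0]).total_seconds()
            if delta <= window_seconds:
                findings.append((last_cmd[1], delta, e["body"]))
    return findings


if __name__ == "__main__":
    for command, delta, body in commands_before_vrrp_changes(LOG_LINES):
        print(f"{delta:.0f}s after '{command}': {body}")
```

For the quoted lines, both VRRP warnings are paired with `port trunk allow-pass vlan 680`, which was entered 3 seconds earlier. A short delay like this suggests, but does not by itself prove, that the command triggered the state change.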
2. The counters on interface GigabitEthernet3/0/16, which connects to the switch, show an excessive number of broadcast packets. This indicates that heavy broadcast traffic passed through this interface and that a loop occurred behind it.
GigabitEthernet3/0/16 current state : UP
Line protocol current state : UP
Description:connect to LA-S5600(M)-OCS
Switch Port,The Maximum Transmit Unit is 1500
Internet protocol processing : disabled
IP Sending Frames' Format is PKTFMT_ETHNT_2, Hardware address is 0025-9e34-b62b
Media type: twisted-pair ,Link type: auto negotiation
Loopback:none, Maximal BW:1G, Current BW:1G,full-duplex mode, negotiation: enable, Pause Flowcontrol:Receive Enable and Send Enable
Last physical up time : 2010-07-13 15:42:41
Last physical down time : 2010-07-13 15:42:15
Statistics last cleared:never
Last 300 seconds input rate: 2880 bits/sec, 4 packets/sec
Last 300 seconds output rate: 0 bits/sec, 0 packets/sec
Input: 197604103910 bytes, 3064390745 packets
Output: 190510508047 bytes, 2937670649 packets
Unicast: 383454704 packets, Multicast: 102357980 packets
Broadcast: 2578578061 packets, JumboOctets: 93 packets
CRC: 0 packets, Symbol: 0 packets
Overrun: 0 packets, InRangeLength: 0 packets
LongPacket: 0 packets, Jabber: 0 packets, Alignment: 0 packets
Fragment: 0 packets, Undersized Frame: 0 packets
RxPause: 0 packets
Unicast: 676173254 packets, Multicast: 15647411 packets
Broadcast: 2245849984 packets, JumboOctets: 1353976 packets
Lost: 0 packets, Overflow: 0 packets, Underrun: 0 packets
System: 0 packets, Overruns: 0 packets
TxPause: 0 packets
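The share of broadcast packets in the counters above can be worked out as follows. This is a minimal Python sketch using only the numbers in the output. It assumes that the first Unicast/Multicast/Broadcast group belongs to the input direction and the second to the output direction; this fits the totals, because each group sums exactly to the corresponding packet count.

```python
# Packet counters copied from the GigabitEthernet3/0/16 output above.
# Assumption: first Unicast/Multicast/Broadcast group = input, second = output.
COUNTERS = {
    "input": {
        "unicast": 383454704,
        "multicast": 102357980,
        "broadcast": 2578578061,
        "total": 3064390745,
    },
    "output": {
        "unicast": 676173254,
        "multicast": 15647411,
        "broadcast": 2245849984,
        "total": 2937670649,
    },
}


def broadcast_share(direction):
    """Return the fraction of packets in one direction that were broadcast."""
    c = COUNTERS[direction]
    return c["broadcast"] / c["total"]


if __name__ == "__main__":
    for direction in ("input", "output"):
        print(f"{direction}: {broadcast_share(direction):.1%} broadcast")
```

Broadcast accounts for roughly 84% of input packets and roughly 76% of output packets. Because the statistics were never cleared, these are cumulative values since the counters started, so they support the loop diagnosis but do not isolate the incident window.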
The conclusion so far is as follows:
After VLAN 680 was configured, an L2 loop formed between the NE40Es and the switches on this site; this can be seen from the configuration. VRRP VRID 1 was also configured under Vlanif 680, so the VRRP advertisement packets of VRID 1 generated a broadcast storm in the L2 loop. The CPU became busy because so many VRRP packets were sent to it, and the VRRP state on both NE40Es became Master, so services were affected.
The issue was caused by the newly added configuration, so the VLAN 680 configuration should be checked first. From the operation script, I found that there may be a loop in VLAN 680 between the NE40Es and switches SW1 and SW2.
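A check of this kind can be done before a change is applied by listing the links that would carry the VLAN and testing whether they form a cycle. The following is a minimal Python sketch of that idea. The link list is hypothetical, because the actual connections are only shown in Topology.jpg, and the function name is an assumption. The check looks at the link topology only and ignores any loop-prevention protocol such as STP.

```python
def find_loop_closing_links(links):
    """Return the links that close a cycle, given (device_a, device_b) pairs.

    Uses union-find: a link whose two ends are already connected
    through earlier links closes a Layer 2 loop.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    closing = []
    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            closing.append((a, b))
        else:
            parent[root_a] = root_b
    return closing


if __name__ == "__main__":
    # Hypothetical links assumed to allow VLAN 680 before the change.
    before = [("NE40E-01", "SW1"), ("SW1", "SW2"), ("SW2", "NE40E-02")]
    # Hypothetical: the change also allows VLAN 680 between the two NE40Es.
    after = before + [("NE40E-01", "NE40E-02")]
    print("before:", find_loop_closing_links(before))
    print("after: ", find_loop_closing_links(after))
```

With these assumed links, the list before the change is loop-free, and the added link between the two NE40Es is reported as the one that closes the loop.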
Be careful when operating on a live network.