Service goes down when a Cisco switch port configured with UDLD connects to a Huawei switch

Publication Date:  2017-12-27 Views:  1240 Downloads:  17
Issue Description

The topology is as follows: the Cisco core switch is the gateway, the CE12800 is the aggregation switch, and the CE6851 is the access switch. The CE12800 connects to the Cisco core switch through Eth-Trunk501, which has two member ports.

The problem is that service across the network suddenly went down completely, then recovered on its own after a few minutes.

Handling Process

1. The CE12800 log shows that the STP root bridge changed, which indicates a topology change in the network.

Mar 7 2017 10:34:33+03:00 CE12800 %%01MSTP/4/MSTPLOG_PROROOT_CHANGED(l):CID=0x80542723;The root bridge of MSTP process changed. (ProcessID=0, InstanceID=0, RootPortName=-, PreviousRootBridgeID=6c50-4dae-7d40, NewRootBridgeID=34a2-a2f5-8d21), RootPwName=-)


2. The Cisco core switch log shows that ports Gi2/1/5 and Gi1/1/16 were put into the err-disable state because UDLD detected a neighbor mismatch (UDLD is a Cisco proprietary protocol). UDLD err-disabled all the ports connecting to the CE12800, which interrupted service.

Mar  7 2017 10:34:40.782 GMT: %UDLD-SW2-4-UDLD_PORT_DISABLED: UDLD disabled

Mar  7 2017 10:34:41.754 GMT: %UDLD-SW2-4-UDLD_PORT_DISABLED: UDLD disabled interface Gi1/1/16, neighbor mismatch detected.

Mar 7 2017 10:34:40.782 GMT: %PM-SW2-4-ERR_DISABLE: udld error detected on Gi2/1/5, putting Gi2/1/5 in err-disable state

Mar 7 2017 10:34:41.754 GMT: %PM-SW2-4-ERR_DISABLE: udld error detected on Gi1/1/16, putting Gi1/1/16 in err-disable state


3. About 10 minutes later, the UDLD err-disable state timed out and the err-disabled ports were re-enabled. Eth-Trunk501 then recovered and service resumed.

Mar  7 2017 10:44:41.814 GMT: %PM-SW2-4-ERR_RECOVER: Attempting to recover from udld err-disable state on Gi1/1/16

Mar  7 2017 10:44:41.926 GMT: %PM-SW1_STBY-4-ERR_RECOVER: Attempting to recover from udld err-disable state on Gi1/1/16

Mar  7 2017 10:43:30+03:00 BIA-HW-DC-Core-CSS %%01IFNET/2/linkDown_clear(l):CID=0x807a0405-alarmID=0x08520003-clearType=service_resume;The interface status changes. (ifName=10GE1/2/0/0, AdminStatus=UP, OperStatus=UP, Reason=The link protocol is up, mainIfname=Eth-Trunk501)
Mar  7 2017 10:43:30+03:00 BIA-HW-DC-Core-CSS %%01IFNET/2/linkDown_clear(l):CID=0x807a0405-alarmID=0x08520003-clearType=service_resume;The interface status changes. (ifName=Eth-Trunk501, AdminStatus=UP, OperStatus=UP, Reason=Interface physical link is up, mainIfname=Eth-Trunk501)

4. From the above, we can conclude the following.
Both member ports of Eth-Trunk501 (10GE1/1/0/0 and 10GE1/2/0/0, the uplink ports to the Cisco core switch) went down because UDLD on the Cisco core switch err-disabled the peer ports. With all member ports physically down, Eth-Trunk501 itself went down and the uplink was lost completely. The original STP root was therefore lost and a new STP root was elected, which is normal STP behavior.
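
The "about 10 minutes" figure can be checked directly against the Cisco log excerpts. The following is a minimal Python sketch, not a tool used in the original case: the names `CISCO_LOG`, `LINE_RE`, `parse_ts`, and `err_disable_windows` are made up for illustration, and the regular expression is only an assumption fitted to the three `%PM-...-ERR_DISABLE` / `ERR_RECOVER` lines quoted above (the duplicate recovery line from the standby unit is left out). It uses Cisco timestamps only, so it does not assume the clocks of the two devices are aligned.

```python
import re
from datetime import datetime

# Cisco PM log lines copied from the excerpts above.
CISCO_LOG = """\
Mar  7 2017 10:34:40.782 GMT: %PM-SW2-4-ERR_DISABLE: udld error detected on Gi2/1/5, putting Gi2/1/5 in err-disable state
Mar  7 2017 10:34:41.754 GMT: %PM-SW2-4-ERR_DISABLE: udld error detected on Gi1/1/16, putting Gi1/1/16 in err-disable state
Mar  7 2017 10:44:41.814 GMT: %PM-SW2-4-ERR_RECOVER: Attempting to recover from udld err-disable state on Gi1/1/16
"""

# Hypothetical pattern: timestamp, event type, and the port named after "on".
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{4}\s+[\d:.]+)\s+GMT:\s+"
    r"%PM-\S+-ERR_(?P<event>DISABLE|RECOVER):.*?\bon\s+(?P<port>[A-Za-z]+[\d/]+)"
)


def parse_ts(text):
    """Parse 'Mar  7 2017 10:34:40.782' after collapsing repeated spaces."""
    return datetime.strptime(" ".join(text.split()), "%b %d %Y %H:%M:%S.%f")


def err_disable_windows(log_text):
    """Map each port to the seconds between ERR_DISABLE and the next
    ERR_RECOVER, or None if no recovery line appears in the excerpt."""
    disabled_at = {}
    windows = {}
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        ts, port = parse_ts(m["ts"]), m["port"]
        if m["event"] == "DISABLE":
            disabled_at[port] = ts
            windows[port] = None
        elif port in disabled_at:
            windows[port] = (ts - disabled_at.pop(port)).total_seconds()
    return windows


windows = err_disable_windows(CISCO_LOG)
for port, seconds in windows.items():
    print(port, seconds)
```

For Gi1/1/16 the gap between the err-disable and the recovery attempt comes out at 600.06 seconds, which matches the 10 minutes noted in step 3. Gi2/1/5 maps to `None` only because the excerpt contains no recovery line for that port.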
Root Cause

Cisco UDLD is configured on the ports that connect to the CE12800 (Eth-Trunk501). This is not recommended because UDLD is a Cisco proprietary protocol; it can cause abnormal link status and, as a result, service interruption.

Solution

Remove the UDLD configuration on the Cisco core switch side.

Suggestions

The CE12800 CSS is meant to be dual-homed to the Cisco core switches, but CE12800-2 currently has no links to them. We suggest deploying these links.
