Frequent Active/Standby Switchovers on the Media Interface of the Directly Connected UMG Are Caused by a Layer 2 Loop in Another Service Domain on the Switches Attached to the NE40

Publication Date: 2012-07-27
Issue Description
Version of the NE40: Version 5.30.
Networking description: For the detailed networking diagram, refer to the attachment. The UMG and MSC are directly connected to two NE40s, which act as the gateways of the UMG and MSC. VRRP is not run between the two NE40s to provide the virtual NM for the UMG and MSC. Instead, backup protection is implemented by the active/standby backup function enabled on the UMG and MSC side. In addition, the two NE40s are connected to two switches that aggregate maintenance terminals. VRRP (VLAN 14 and VLAN 220) runs between the two NE40s to provide gateways for the two switches in the IN.
Fault symptom: Under normal conditions, the UMG and MSC each send an ARP request every 3 seconds to check the physical connectivity to the gateway. If the detection fails three times, the UMG performs an abnormal switchover and traps are generated on the MSC side. On the live network, ARP detection failures cause frequent active/standby switchovers on the UMG side, and a large number of ARP gateway resolution failure traps are generated on the MSC side.
Pinging the directly connected gateway address from the UMG and MSC shows a packet loss ratio of 1% to 3%. Pinging the gateway address of the UMG and MSC on the NE40 shows a packet loss ratio of more than 1%.
The VRRP group that the NE40s run for the IN service domain switches state frequently: the backup NE40 repeatedly becomes the master and then returns to backup.
The logs of the NE40s contain records of IP address conflicts detected in ARP packets.
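
The detection rule described above (one ARP request every 3 seconds, switchover after three failed detections) can be sketched as a small counter. This is an illustrative model only, not the actual UMG implementation. The function name `switchover_times`, the input format, and the assumption that the three failures must be consecutive are all hypothetical.

```python
from typing import Iterable, List

PROBE_INTERVAL_S = 3    # from the case: one ARP request every 3 seconds
FAILURE_THRESHOLD = 3   # from the case: three failed detections trigger a switchover


def switchover_times(replies: Iterable[bool]) -> List[int]:
    """Return the probe times (seconds from start) at which a switchover fires.

    `replies` holds one boolean per probe: True if the gateway answered the
    ARP request, False if it did not (hypothetical input format).
    """
    times = []
    consecutive_failures = 0
    for index, answered in enumerate(replies):
        if answered:
            consecutive_failures = 0
            continue
        consecutive_failures += 1
        if consecutive_failures == FAILURE_THRESHOLD:
            times.append(index * PROBE_INTERVAL_S)
            # Assumption: the counter restarts after a switchover.
            consecutive_failures = 0
    return times
```

In this model an isolated lost reply never triggers a switchover. Only a run of three unanswered probes does, which means the gateway stayed silent for roughly 6 to 9 seconds.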
 
Alarm Information
%Apr 4 16:55:49 2008 RPR-NE40-AR-A SRM/5/ARP_DUPLICATE_IPADDR:Slot3 Receive an ARP packet with duplicate ip address 10.157.2.170 from Vlanif220, source MAC is 0018-822f-e17e
%Apr 5 02:43:48 2008 RPR-NE40-AR-B SRM/5/ARP_DUPLICATE_IPADDR:Slot1 Receive an ARP packet with duplicate ip address 10.157.2.171 from Vlanif220, source MAC is 0018-822f-e17a
%Apr 6 00:50:15 2008 RPR-NE40-AR-B VRRP/5/StateWarning:
Vlanif14 | Virtual Router 14 : BACKUP --> MASTER
%Apr 6 00:50:27 2008 RPR-NE40-AR-B VRRP/5/StateWarning:
Vlanif14 | Virtual Router 14 : MASTER --> BACKUP 
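
When many such records accumulate, it helps to group the duplicate-address alarms by device, interface, IP address, and source MAC. The following is a minimal sketch that assumes the log lines have exactly the layout shown above. The regular expression and the function name `summarize_duplicate_arp` are illustrative and are not part of any NE40 tooling.

```python
import re
from collections import Counter

# Pattern derived from the two ARP_DUPLICATE_IPADDR lines in this case.
DUP_ARP = re.compile(
    r"^%(?P<time>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}\s+\d{4})\s+(?P<host>\S+)\s+"
    r"SRM/5/ARP_DUPLICATE_IPADDR:(?P<slot>Slot\d+) .*?"
    r"duplicate ip address (?P<ip>\d+(?:\.\d+){3}) from (?P<interface>[^,\s]+), "
    r"source MAC is (?P<mac>[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4})"
)


def summarize_duplicate_arp(lines):
    """Count duplicate-IP alarms per (host, interface, ip, mac) tuple."""
    counts = Counter()
    for line in lines:
        match = DUP_ARP.match(line.strip())
        if match:
            key = (match["host"], match["interface"], match["ip"], match["mac"])
            counts[key] += 1
    return counts
```

Both duplicate-address records in this case arrive on Vlanif220, so grouping by interface points directly at the VLAN that carries the conflicting ARP packets.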
 
Handling Process
1. Mirror and capture packets on the router interfaces that connect to the UMG and MSC. The capture shows that when an ARP detection failure triggers a switchover on the UMG side, the NE40 receives the ARP requests but does not respond to them. Therefore, the UMG and MSC are sending packets correctly, and the problem is not on their side.
2. Unplug one optical fiber from the trunk links between the two NE40s and capture packets on the other interface to observe how the VRRP heartbeat packets are sent and received.
The traffic on the link is heavy and consists mainly of broadcast ARP request packets.
The trunk links allow VLAN 220 packets to pass, which creates a Layer 2 loop. The flood of ARP request packets crowds out the VRRP heartbeat packets, causing frequent switchovers of the VRRP group and a heavy load on the main board of the NE40. As a result, the main board responds abnormally, or not at all, to ARP requests and ping packets, which causes the ARP detection failures and the ping packet loss between the directly connected devices.
3. Run the undo trunk allow pass vlan 220 command to remove the Layer 2 loop caused by VLAN 220 passing over the trunk links. All the faults are then rectified.
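
Why lost heartbeat packets flip the VRRP state can be illustrated with the standard VRRP timer defined in RFC 3768: a backup router declares the master down after 3 × Advertisement_Interval + Skew_Time, where Skew_Time = (256 − Priority) / 256 seconds. The sketch below assumes that the NE40 follows this standard timer, and it uses the RFC default advertisement interval of 1 second and a priority of 100. The values actually configured on these NE40s are not stated in this case, and the function names are hypothetical.

```python
from typing import Sequence


def master_down_interval(advert_interval_s: float = 1.0, priority: int = 100) -> float:
    """Master_Down_Interval per RFC 3768, in seconds."""
    if not 1 <= priority <= 254:
        raise ValueError("backup priority must be in the range 1..254")
    skew_time = (256 - priority) / 256
    return 3 * advert_interval_s + skew_time


def backup_takes_over(arrival_times_s: Sequence[float],
                      advert_interval_s: float = 1.0,
                      priority: int = 100) -> bool:
    """Return True if any gap between received advertisements exceeds the timer.

    `arrival_times_s` lists the times at which the backup actually received
    an advertisement; advertisements dropped on the link are simply absent.
    """
    limit = master_down_interval(advert_interval_s, priority)
    return any(later - earlier > limit
               for earlier, later in zip(arrival_times_s, arrival_times_s[1:]))
```

With these assumed values the timer is about 3.6 seconds, so losing four advertisements in a row is enough for the backup to become the master. Once advertisements get through again, the backup steps down, which matches the BACKUP --> MASTER and MASTER --> BACKUP transitions 12 seconds apart in the alarm log.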
 
Root Cause
1. Slot 3 of the NE40-B, which provides the heartbeat connection and the incoming interface for the two devices, has a problem receiving and sending packets. The board in slot 3 on the NE40-A restarted abnormally and was replaced, so the board of the same type on the NE40-B may have the same problem.
2. Packet loss and ARP request failures are caused by the poor quality of the lines from the UMG and MSC to the NE40. VRRP heartbeat packets are lost because of the lines between the two NE40s.
3. The problem may lie in the ARP detection mechanism or in packet sending on the UMG and MSC sides.
4. The IP address conflicts recorded in the logs may be caused by the Layer 2 loop.
 
Suggestions
Avoid Layer 2 loops in network planning. Otherwise, unpredictable faults may occur.
When multiple fault symptoms appear at the same time, check the status of the devices carefully, for example for abnormal traffic on an interface, to locate the fault.
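
One way to spot abnormal traffic on an interface is to compare two readings of its packet counters and check what share of the traffic in between was broadcast. The following is a minimal sketch. The `CounterSample` structure, the function names, and the 50% threshold are assumptions chosen for illustration and should be tuned to the network.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """One reading of an interface's packet counters (hypothetical structure)."""
    timestamp_s: float
    broadcast_packets: int
    total_packets: int


def broadcast_share(earlier: CounterSample, later: CounterSample) -> float:
    """Fraction of packets between the two samples that were broadcast."""
    delta_total = later.total_packets - earlier.total_packets
    delta_broadcast = later.broadcast_packets - earlier.broadcast_packets
    if delta_total <= 0:
        return 0.0  # no traffic, or counters were reset between the samples
    return delta_broadcast / delta_total


def looks_like_broadcast_storm(earlier: CounterSample, later: CounterSample,
                               share_threshold: float = 0.5) -> bool:
    """Flag the interval if broadcasts make up at least `share_threshold` of it."""
    return broadcast_share(earlier, later) >= share_threshold
```

A link on which most packets are broadcast ARP requests, as observed on the trunk between the two NE40s in this case, would be flagged by such a check.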
 
