A directly connected eBGP peer cannot be established because a route with a 32-bit mask is learned from another peer

Publication Date: 2014-12-31
Issue Description

A directly connected eBGP peer is configured between RouterA and RouterC, but the session cannot be established. The peer state keeps changing between OpenConfirm and Idle for a long time.

 

Handling Process

Analysis approach:

1. Confirm that the interface is up and that the remote address can be pinged successfully.

2. Analyze the BGP peer finite state machine to determine whether the problem occurs during TCP connection setup, BGP negotiation, or route advertisement.

3. Confirm whether BGP packets are being dropped by a policy or by transmission equipment.

4. Based on the symptoms, compare the packets that are received with those that are not, and analyze the difference in depth.

 

 

Analysis process:

1. According to the customer's description, the interface on RouterA is up, the local address is 13.1.1.1/30, the remote address is 13.1.1.2/30, and the configured BGP peer address is the same as the remote interface address. Long-running pings and pings with large packets to the remote address all succeed.

 

<RouterA>display current-configuration interface Ethernet1/0/1

#

interface Ethernet1/0/1

 undo shutdown

 ip address 13.1.1.1 255.255.255.252

#

 

2. Check the counts of received and sent messages. RouterA can send an OPEN message to RouterC, enters the OpenConfirm state after receiving the OPEN message from RouterC, and replies with KeepAlive messages. However, RouterA never receives a KeepAlive message from RouterC, so the Hold Timer expires after 3 minutes.

 

<RouterA>display bgp peer 13.1.1.2 verbose

 

         BGP Peer is 13.1.1.2,  remote AS 200

         Type: EBGP link

         BGP version 4, Remote router ID 1.1.1.3

         Update-group ID: 2

         BGP current state: OpenConfirm

         BGP current event: KATimerExpired

         BGP last state: OpenSent

         BGP Peer Up count: 1

         Port:  Local - 0        Remote - 0

 Received: Total 1 messages

                  Update messages                0

                  Open messages                  1

                  KeepAlive messages             0

                  Notification messages          0

                  Refresh messages               0

 Sent: Total 4 messages

                  Update messages                0

                  Open messages                  1

                  KeepAlive messages             3

                  Notification messages          0

                  Refresh messages               0

RouterA sends an OPEN message and receives one from RouterC, which means the TCP connection is up. The problem occurs during BGP negotiation.
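The counters above are consistent with a peer that is stuck in OpenConfirm until the hold timer fires. The following is a minimal sketch of only the state transitions relevant to this case, loosely following the BGP-4 finite state machine. The event names and the `run` helper are illustrative assumptions, not VRP identifiers, and the 60-second keepalive interval (one third of the 180-second hold time) is assumed rather than taken from the output.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"
    OPEN_SENT = "OpenSent"
    OPEN_CONFIRM = "OpenConfirm"
    ESTABLISHED = "Established"


# (current state, event) -> next state. Only the transitions that matter
# for this case are modelled; event names are hypothetical labels.
TRANSITIONS = {
    (State.OPEN_SENT, "recv_open"): State.OPEN_CONFIRM,
    (State.OPEN_CONFIRM, "keepalive_timer_expired"): State.OPEN_CONFIRM,
    (State.OPEN_CONFIRM, "recv_keepalive"): State.ESTABLISHED,
    (State.OPEN_CONFIRM, "hold_timer_expired"): State.IDLE,
}

# Events after which the local router sends a KeepAlive message.
SENDS_KEEPALIVE = {"recv_open", "keepalive_timer_expired"}


def run(events, state=State.OPEN_SENT):
    """Replay events and return (final state, KeepAlive messages sent)."""
    keepalives_sent = 0
    for event in events:
        state = TRANSITIONS[(state, event)]
        if event in SENDS_KEEPALIVE:
            keepalives_sent += 1
    return state, keepalives_sent


# Assumed timeline on RouterA: OPEN received at t=0, keepalive timer fires
# at t=60 and t=120, hold timer fires at t=180 with no KeepAlive received.
observed = [
    "recv_open",
    "keepalive_timer_expired",
    "keepalive_timer_expired",
    "hold_timer_expired",
]
print(run(observed))
```

Under these assumptions the replay ends in Idle with three KeepAlive messages sent, which lines up with the "KeepAlive messages 3" counter in the Sent section. A single received KeepAlive in OpenConfirm would have moved the peer to Established instead.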

According to the customer, RouterA had been upgraded from V3R3 to V6R1. One difference between these two versions is that V3R3 does not support the 4-byte AS capability, whereas V6R1 does.

So we enabled debugging and checked the capability information sent by RouterC. The OPEN message from RouterC carries the 4-byte AS capability, so we ruled out this suspicion.

 

Dec  9 2014 10:13:19.480.14 RouterA RM/6/RMDEBUG:

         BGP.Public: Recv OPEN MSG from peer 13.1.1.2 Length: 45

         Version: 4, Remote AS: 200, HoldTime : 180,

         Router ID: 1.1.1.3, TotOptLen: 16

 

         OPT Type:   2 (Capability)     OPT Len: 14

         CAP Type:   1 (Multiprotocol)  CAP Len:  4

                                        IPv4-UNC (1/1)

         CAP Type:   2 (RouteRefresh)   CAP Len:  0

         CAP Type:  65 (4-byte-as)      CAP Len:  4   AS number: 200

 

Dec  9 2014 10:16:18.550.9 RouterA RM/6/RMDEBUG:

 BGP_TIMER: HOLD Timer Expired for Peer 13.1.1.2

 

3. Check the configuration on RouterA: there is no policy that restricts BGP packets. RouterC is another vendor's device, so we cannot confirm whether it has a policy that restricts BGP packets, nor whether packets are being dropped on the transmission path.

The long-running pings and large-packet pings show no packet loss, so a transmission path problem can basically be ruled out. The fact that the TCP connection comes up normally each time supports this as well.

 

4. Compare the packets that are received with those that are not, and analyze the difference in depth.

When RouterC receives an OPEN message from RouterA, it replies with a KeepAlive message.

RouterA receives the OPEN message but not the KeepAlive message, which is smaller than the OPEN message. Since the transmission equipment does not drop it and no policy on RouterA drops it, the only remaining possibility is that RouterC never receives the OPEN message from RouterA. Suspicion then focuses on MTU and TTL.

 

Because the OPEN message is only a few dozen bytes long and pings with large packets succeed, MTU can be ruled out.

 

By default, eBGP packets are sent with a TTL of 1, so every eBGP peer that is not directly connected needs multihop to be configured. In the live network, however, the configured BGP peer address is the same as the remote interface address, which is one hop away, so multihop should not be needed.
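The effect of a TTL of 1 can be illustrated with a small simulation. This is a sketch under simple assumptions: each transit router decrements the TTL by one and discards the packet when it reaches zero. The function name `forward` and the router names are hypothetical.

```python
def forward(ttl, transit_routers):
    """Simulate hop-by-hop forwarding of one packet.

    ttl             -- TTL value set by the sender
    transit_routers -- routers between sender and destination, in order
                       (empty for a directly connected destination)
    """
    for router in transit_routers:
        ttl -= 1  # every transit router decrements the TTL
        if ttl == 0:
            return f"dropped at {router}"
    return "delivered"


# Directly connected peer: no transit router, so TTL 1 is enough.
print(forward(1, []))
# One or more transit routers: TTL 1 expires at the first one.
print(forward(1, ["transit-1", "transit-2"]))
# A larger TTL (what a multihop setting provides) survives the path.
print(forward(3, ["transit-1", "transit-2"]))
```

The point of the sketch is that a TTL of 1 only works when no router sits between the two peers, regardless of which addresses are configured.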

 

At this point all the suspicions seem to have been excluded, and the analysis is stuck.

 

In fact, we overlooked an important question: is this peer really directly connected? Whether it is actually a multihop peer has to be verified.

 

The traceroute result is unexpected: the packets reach the directly connected interface address over multiple hops.

 

<RouterA>tracert -a 13.1.1.1 13.1.1.2

 traceroute to  13.1.1.2(13.1.1.2), max hops: 30 ,packet length: 40

 1 12.1.1.2 50 ms  50 ms  50 ms

 2 24.1.1.4 60 ms  40 ms  50 ms

 3 34.1.1.3 130 ms  80 ms  50 ms

 

Checking the routing table, we find that 13.1.1.2 matches a BGP route with a 32-bit mask learned from RouterB (1.1.1.2), not the direct route.

<RouterA>display ip routing-table 13.1.1.2

Route Flags: R - relay, D - download to fib

------------------------------------------------------------------------------

Routing Table : Public

Summary Count : 1

Destination/Mask    Proto  Pre  Cost       Flags NextHop         Interface

 

       13.1.1.2/32  BGP    255  0           RD   1.1.1.2         Ethernet1/0/0

 

 

The reason is now clear. Normally, 13.1.1.2 should match the directly connected route 13.1.1.0/30. However, a route 13.1.1.2/32 also exists, and because packets follow the route with the longest mask, they are forwarded through RouterB. Since multihop is not configured for the eBGP peer, the eBGP packets carry the default TTL of 1 and are dropped on the path through RouterB.
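The longest-match behaviour described here can be reproduced with Python's standard `ipaddress` module. This is a simplified sketch, not how the router implements its FIB: the route list is copied from the outputs in this case, and the `lookup` helper is a hypothetical name.

```python
import ipaddress

# (prefix, protocol, outgoing interface), taken from the routing-table
# outputs shown in this case.
ROUTES = [
    ("13.1.1.0/30", "Direct", "Ethernet1/0/1"),
    ("13.1.1.1/32", "Direct", "InLoopBack0"),
    ("13.1.1.2/32", "BGP", "Ethernet1/0/0"),
]


def lookup(destination, routes):
    """Return the matching route with the longest prefix, or None."""
    address = ipaddress.ip_address(destination)
    matches = []
    for prefix, protocol, interface in routes:
        network = ipaddress.ip_network(prefix)
        if address in network:
            matches.append((network, protocol, interface))
    if not matches:
        return None
    # Mask length is compared first; route preference only decides between
    # routes to the same prefix, so it plays no role here.
    return max(matches, key=lambda route: route[0].prefixlen)


print(lookup("13.1.1.2", ROUTES))       # the /32 BGP route wins
print(lookup("13.1.1.2", ROUTES[:2]))   # without it, the /30 direct route is used
```

Although the direct route has preference 0 and the BGP route has preference 255, the /32 is still chosen because the two routes are for different prefixes.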

<RouterA>display ip routing-table protocol direct

Destination/Mask    Proto  Pre  Cost       Flags NextHop         Interface

       13.1.1.0/30  Direct 0    0            D   13.1.1.1        Ethernet1/0/1

       13.1.1.1/32  Direct 0    0            D   127.0.0.1       InLoopBack0

 

Checking the route on RouterB shows that it comes from RouterD (1.1.1.4), which is in the same AS as RouterC.

Normally, the 32-bit host route of an interface is not advertised to other routers, so the administrator of RouterD needs to confirm where this route with a 32-bit mask comes from.

<RouterB>display bgp routing-table 13.1.1.2

 

 BGP local router ID : 1.1.1.2

 Local AS number : 100

 Paths:   1 available, 1 best, 1 select

 BGP routing table entry information of 13.1.1.2/32:

 From: 24.1.1.4 (1.1.1.4)

 Route Duration: 01h55m49s

 Direct Out-interface: Ethernet1/0/2

 Original nexthop: 24.1.1.4

 Qos information : 0x0

 AS-path 200, origin igp, MED 0, pref-val 0, valid, external, best, select, active, pre 255

 Advertised to such 1 peers:

    1.1.1.1

 

The customer confirms that RouterD has a static route 13.1.1.2/32 configured for backup purposes, and that it was redistributed to the BGP peer RouterB by mistake.

 

 

Root Cause

RouterA learns an erroneous route with a 32-bit mask from RouterB, so packets to the peer match this route instead of the directly connected route. Because multihop is not configured for the eBGP peer in the live network, the eBGP packets have a TTL of 1 and are dropped on the path through RouterB.

Solution
Workaround: configure peer 13.1.1.2 ebgp-max-hop 3 on RouterA. Note that the packets still travel through RouterB rather than over the directly connected interface.
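The hop count of 3 corresponds to the three lines of the traceroute output shown earlier. As a rough sketch, the value can be derived by parsing that output. The regular expression and the helper names are assumptions for illustration, and the sketch assumes the last hop line is the destination itself.

```python
import re

# Traceroute output copied from this case.
TRACERT_OUTPUT = """\
 traceroute to  13.1.1.2(13.1.1.2), max hops: 30 ,packet length: 40
 1 12.1.1.2 50 ms  50 ms  50 ms
 2 24.1.1.4 60 ms  40 ms  50 ms
 3 34.1.1.3 130 ms  80 ms  50 ms
"""

# A hop line starts with the hop number followed by an IPv4 address.
HOP_LINE = re.compile(r"^[ \t]*(\d+)[ \t]+(\d+\.\d+\.\d+\.\d+)[ \t]", re.MULTILINE)


def parse_hops(output):
    """Return a list of (hop number, responding address) tuples."""
    return [(int(number), address) for number, address in HOP_LINE.findall(output)]


def minimum_max_hop(output):
    """Smallest hop limit that lets a packet reach the last listed hop."""
    hops = parse_hops(output)
    return max(number for number, _ in hops) if hops else None


print(parse_hops(TRACERT_OUTPUT))
print(minimum_max_hop(TRACERT_OUTPUT))
```

A result greater than 1 for an address that is supposed to be on a directly connected subnet is itself a sign that the traffic is not using the direct link.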

 

Solution: have RouterC redistribute its directly connected route into BGP or an IGP so that the route is sent to RouterD for backup. RouterD does not need a static route with a longer mask than the directly connected route of RouterC's interface. Such a route is very likely to disrupt normal traffic once other routers learn it.
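One way to catch this kind of route before it causes trouble is to compare learned prefixes against the directly connected subnets. The following is a minimal sketch of such a check, assuming the two prefix lists have already been collected from the routing table. The function name `shadowing_routes` is hypothetical, and 24.1.1.0/24 is a made-up prefix used only as a harmless example.

```python
import ipaddress


def shadowing_routes(connected, learned):
    """Find learned prefixes that are more specific than a connected subnet.

    connected -- prefixes of directly connected subnets
    learned   -- prefixes learned from routing protocols
    Returns a list of (learned prefix, connected subnet it falls inside).
    """
    connected_networks = [ipaddress.ip_network(prefix) for prefix in connected]
    flagged = []
    for prefix in learned:
        network = ipaddress.ip_network(prefix)
        for subnet in connected_networks:
            # A strictly more specific prefix inside a connected subnet
            # overrides the direct route for the addresses it covers.
            if network != subnet and network.subnet_of(subnet):
                flagged.append((prefix, str(subnet)))
    return flagged


print(shadowing_routes(["13.1.1.0/30"], ["13.1.1.2/32", "24.1.1.0/24"]))
```

Applied to RouterA in this case, the check flags 13.1.1.2/32 as falling inside the connected subnet 13.1.1.0/30.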

Suggestions
Once the root cause is found, the problem is not complex. We might have noticed an abnormal TTL when pinging the peer address at the beginning of the analysis.

 

Why did we examine so many points and still get stuck after every suspicion had been excluded? Because the peer address is the same as the remote address of the directly connected interface, we habitually assumed from the start that this was a directly connected peer and that it would use the directly connected interface.

 

An unverified assumption can mislead the analysis. This was a simple problem, but the wrong assumption caused a lot of trouble.
