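NE20E Router OSPF Neighbor Stuck in the ExStart State

Published: 2016-06-16
Problem Description

An NE20E-S router fails to establish an OSPF neighbor relationship with a peer router from vendor C. The neighbor state stays stuck in ExStart.

Alarm Information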

Area 0.0.0.0 interface 10.255.13.22 (GE0/1/2)'s neighbors
Router ID: 3.3.3.3          Address: 10.255.13.21    
   State: ExStart        Mode:Nbr is Slave      Priority: 1
   DR: 10.255.13.21      BDR: 10.255.13.22      MTU: 0
   Dead timer due in  36  sec
   Retrans timer interval: 5
   Neighbor is up for 00h00m00s
   Authentication Sequence: [ 0 ]

Log of erroneous OSPF packets being dropped:

Apr 20 2016 20:29:28 BeiJing_NE20E-S4-1 %%01OSPF/3/RCV_ERR_PACKET(l):VS=Admin-VS-CID=0x80830444;OSPFv2 received error packet and dropped it.(ProcessId=1, PktType=4, ErrPktCnt=441, LocalComp=0x80830444, PeerComp=0x820430, IfName=GigabitEthernet0/1/2)
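
The key=value fields at the end of this log line can be extracted programmatically, for example to check whether ErrPktCnt keeps rising on the same interface. The following is a minimal Python sketch, not a vendor tool: the function name parse_log_fields is hypothetical, and the parsing assumes the fields sit in the final parenthesized group, separated by commas, as in the sample above.

```python
import re


def parse_log_fields(line: str) -> dict:
    """Return the key=value fields from the trailing parenthesized group."""
    # Match the last "(...)" group that contains at least one "=".
    match = re.search(r"\(([^()]*=[^()]*)\)\s*$", line)
    if match is None:
        return {}
    fields = {}
    for item in match.group(1).split(","):
        key, sep, value = item.strip().partition("=")
        if sep:  # keep only well-formed key=value items
            fields[key] = value
    return fields


# The log line from this case, split across string literals for readability.
SAMPLE = (
    "Apr 20 2016 20:29:28 BeiJing_NE20E-S4-1 %%01OSPF/3/RCV_ERR_PACKET(l):"
    "VS=Admin-VS-CID=0x80830444;OSPFv2 received error packet and dropped it."
    "(ProcessId=1, PktType=4, ErrPktCnt=441, LocalComp=0x80830444, "
    "PeerComp=0x820430, IfName=GigabitEthernet0/1/2)"
)

fields = parse_log_fields(SAMPLE)
print(fields["IfName"], fields["PktType"], fields["ErrPktCnt"])
```

Comparing ErrPktCnt across successive log entries for the same IfName shows whether packets are still being dropped.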

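Handling Process

1. When an OSPF neighbor is stuck in ExStart, the first suspect is an MTU mismatch between the two sides. Both interfaces were confirmed to have MTU = 1500, so this cause was ruled out.

2. Enabled debugging ospf packet interface GigabitEthernet 0/1/2 on the NE20E-S router. The output confirmed that the OSPF interface could send and receive Hello and Link-State Update packets but received no DB Description (DD) packets from the peer device. The preliminary conclusion was that the local side was dropping the DD packets sent by the peer, which is consistent with the log message "OSPFv2 received error packet and dropped it".

3. Rechecked the NE20E-S router configuration and found that nat server was configured on GE0/1/2, the interface facing vendor C's router. DD packets from the peer hit the NAT policy and had their address translated, so the received OSPF packets were treated as invalid and dropped. The configuration is as follows: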

#
service-location 1
 location slot 5
#
service-instance-group group1
 service-location 1
#
service-instance-group group2
 service-location 1
#
nat instance nat1 id 1
 service-instance-group group1
 nat alg all
 redirect ip-nexthop zzz.zzz.zzz.zzz outbound
 nat address-group address-group1 group-id 1 yyy.yyy.yyy.yyy yyy.yyy.yyy.yyy
 nat server global xxx.xxx.xxx.xxx inside 10.255.13.21
#
acl number 3001
 rule 101 permit ip destination xxx.xxx.xxx.xxx 0
 rule 102 permit ip source 10.255.13.21 0   //includes the interconnect IP address
#
traffic classifier classifier1 operator or
 if-match acl 3001
#
traffic behavior behavior1
 nat bind instance nat1
#
traffic policy policy1
 share-mode
 classifier classifier1 behavior behavior1 precedence 1
#
interface GigabitEthernet0/1/2
 undo shutdown
 ip address 10.255.13.22 255.255.255.252
 undo dcn
 traffic-policy policy1 inbound
#

 

Root Cause

nat server was configured on interface GE0/1/2 of the NE20E router, and ACL 3001 included 10.255.13.21, the peer device's IP address on the link to the NE20E router.
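
This kind of overlap can be caught before deployment by checking whether any source rule in the NAT ACL covers the interface's own link subnet. The following is a rough Python sketch under stated assumptions: the names RULE_RE, wildcard_to_network, and rules_hitting_link are hypothetical, only "permit ip source <address> <wildcard>" rules are parsed, a wildcard of 0 is taken to mean a single-host match, and wildcard bits are assumed to be contiguous.

```python
import ipaddress
import re

# Only "rule <id> permit ip source <addr> <wildcard>" lines are recognized.
RULE_RE = re.compile(r"rule\s+(\d+)\s+permit\s+ip\s+source\s+(\S+)\s+(\S+)")


def wildcard_to_network(addr: str, wildcard: str) -> ipaddress.IPv4Network:
    """Convert an address plus wildcard mask into a network object."""
    # Assumption: "0" is shorthand for wildcard 0.0.0.0 (host match).
    wild = int(ipaddress.IPv4Address("0.0.0.0" if wildcard == "0" else wildcard))
    # Assumption: wildcard bits are contiguous, so counting them gives the host bits.
    prefix = 32 - bin(wild).count("1")
    return ipaddress.ip_network(f"{addr}/{prefix}", strict=False)


def rules_hitting_link(acl_lines, interface_cidr):
    """Return IDs of source rules that overlap the interface's link subnet."""
    link = ipaddress.ip_interface(interface_cidr).network
    hits = []
    for line in acl_lines:
        m = RULE_RE.search(line)
        if m is None:
            continue  # e.g. destination rules or masked addresses are skipped
        rule_id, addr, wildcard = m.groups()
        if wildcard_to_network(addr, wildcard).overlaps(link):
            hits.append(int(rule_id))
    return hits


ACL_3001 = [
    "rule 101 permit ip destination xxx.xxx.xxx.xxx 0",
    "rule 102 permit ip source 10.255.13.21 0",
]
print(rules_hitting_link(ACL_3001, "10.255.13.22/30"))  # prints [102]
```

Here rule 102 is flagged because 10.255.13.21 falls inside 10.255.13.20/30, the subnet of GE0/1/2.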

Solution

After rule 102 was deleted from ACL 3001, the OSPF neighbor relationship was established normally.

#
>system-view
]acl number 3001
]undo rule 102
]quit
#

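Suggestions and Summary

During OSPF neighbor establishment, unicast packets matched the NAT traffic policy and were forwarded after NAT processing, so the neighbor relationship could not be established. Once a neighbor relationship is established, it is maintained by OSPF Hello packets. Hello packets are multicast, and multicast packets are sent to the CPU before traffic policy matching, so they never match the NAT traffic policy and the relationship is maintained. DD packets, however, are unicast and were hit by NAT, which caused the exchange to fail.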
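
The difference between the two packet types can be illustrated with a toy model of the inbound path. The following Python sketch only restates the behavior described in this case and is not actual device logic: handle_inbound is a hypothetical name, rule 101 is left out because its address is masked in the configuration, and 224.0.0.5 is the standard OSPF AllSPFRouters multicast address used here as the Hello destination.

```python
import ipaddress


def handle_inbound(src: str, dst: str, nat_acl_sources) -> str:
    """Toy model: decide whether an inbound packet reaches the CPU or NAT."""
    # Per this case, multicast is sent to the CPU before policy matching.
    if ipaddress.ip_address(dst).is_multicast:
        return "cpu"
    # Unicast packets are checked against the NAT ACL source rules.
    source = ipaddress.ip_address(src)
    if any(source in ipaddress.ip_network(net) for net in nat_acl_sources):
        return "nat"
    return "cpu"


PEER, LOCAL, ALL_SPF_ROUTERS = "10.255.13.21", "10.255.13.22", "224.0.0.5"
with_rule_102 = ["10.255.13.21/32"]   # ACL 3001 as originally configured
without_rule_102 = []                 # ACL 3001 after "undo rule 102"

print(handle_inbound(PEER, ALL_SPF_ROUTERS, with_rule_102))  # Hello: cpu
print(handle_inbound(PEER, LOCAL, with_rule_102))            # DD: nat
print(handle_inbound(PEER, LOCAL, without_rule_102))         # DD: cpu
```

In this model the Hello packet is unaffected by the policy either way, while the unicast DD packet reaches the CPU only once rule 102 is removed.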
