A large number of OSPF protocol packets cannot be processed on the S9300 because of the CPCAR limit, which leaves OSPF peers in an abnormal state.

Publication Date: 2012-09-18
Issue Description
Version information: S9300, V100R003C00SPC200+s9300v100r003sph011.pat
 Networking summary:



 Fault phenomenon:
1. Network services are interrupted. A check of the routing table shows that most of the network's routes are missing.
2. A check of the OSPF peers shows that most of them stay in the Exchange or ExStart state:
0.0.0.14        Vlanif283                  10.17.213.190    Exchange 
....
0.0.0.14        Vlanif287                  10.11.116.6      Exchange 
0.0.0.14        Vlanif359                  10.243.93.222    Full     
0.0.0.14        Vlanif407                  10.243.96.107    ExStart   
......
0.0.0.14        Vlanif449                  10.243.96.149    ExStart  
0.0.0.14        Vlanif466                  10.243.67.26     Exchange
3. After the S9300 is restarted, all peers are established successfully and the fault disappears.
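To see at a glance how many peers are stuck, the peer listing above can be tallied by state. The following is a minimal sketch, assuming each row has exactly four whitespace-separated columns (area, interface, neighbor, state) as in the excerpt; the function name `count_peer_states` is illustrative and not part of any Huawei tool.

```python
from collections import Counter


def count_peer_states(lines):
    """Count OSPF neighbor states in 'area interface neighbor state' rows."""
    states = Counter()
    for line in lines:
        fields = line.split()
        # Skip elision markers ("....") and anything that is not a 4-column row.
        if len(fields) != 4:
            continue
        states[fields[3]] += 1
    return states


# Rows copied from the excerpt above, including the elided parts.
sample = """\
0.0.0.14        Vlanif283                  10.17.213.190    Exchange
....
0.0.0.14        Vlanif287                  10.11.116.6      Exchange
0.0.0.14        Vlanif359                  10.243.93.222    Full
0.0.0.14        Vlanif407                  10.243.96.107    ExStart
......
0.0.0.14        Vlanif449                  10.243.96.149    ExStart
0.0.0.14        Vlanif466                  10.243.67.26     Exchange
""".splitlines()

if __name__ == "__main__":
    for state, count in count_peer_states(sample).items():
        print(state, count)
```

A healthy device would show nearly all peers in the Full state; a large share in Exchange or ExStart points to database exchange packets not getting through.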



Alarm Information
The collected diagnostic information contains the following logs:
Jan  8 2012 00:09:22 S9300 %%01QOSE/4/CPCAR_DROP_LPU(l): Some packets are dropped by cpcar on the LPU in slot 6. (Protocol=ospf, Drop-Count=0881)
Jan  8 2012 00:19:22 S9300 %%01QOSE/4/CPCAR_DROP_LPU(l): Some packets are dropped by cpcar on the LPU in slot 1. (Protocol=ospf, Drop-Count=03048)
Jan  8 2012 00:19:22 S9300 %%01QOSE/4/CPCAR_DROP_LPU(l): Some packets are dropped by cpcar on the LPU in slot 6. (Protocol=ospf, Drop-Count=0111899)
Jan  8 2012 00:13:17 S9300 %%01OSPF/6/NBR_DOWN_REASON(l): Neighbor state leaves full or changed to Down. (ProcessId=10, NeighborRouterId=10.243.99.250, NeighborAreaId=0, NeighborInterface=Vlanif801,NeighborDownImmediate reason=Neighbor Down Due to Inactivity, NeighborDownPrimeReason=Hello Not Seen, NeighborChangeTime=[2012/01/08] 00:13:17)
Handling Process
1. Check the logs from the time of the fault. They show that a large number of OSPF packets were dropped by CPCAR:
Jan  8 2012 00:09:22 S9300 %%01QOSE/4/CPCAR_DROP_LPU(l): Some packets are dropped by cpcar on the LPU in slot 6. (Protocol=ospf, Drop-Count=0881)
Jan  8 2012 00:19:22 S9300 %%01QOSE/4/CPCAR_DROP_LPU(l): Some packets are dropped by cpcar on the LPU in slot 1. (Protocol=ospf, Drop-Count=03048)
Jan  8 2012 00:19:22 S9300 %%01QOSE/4/CPCAR_DROP_LPU(l): Some packets are dropped by cpcar on the LPU in slot 6. (Protocol=ospf, Drop-Count=0111899)
2. Check why the OSPF neighbors went Down. In almost every case the neighbor timed out because no Hello packet was seen from the peer:
Jan  8 2012 00:13:17 S9300 %%01OSPF/6/NBR_DOWN_REASON(l): Neighbor state leaves full or changed to Down. (ProcessId=10, NeighborRouterId=10.243.99.250, NeighborAreaId=0, NeighborInterface=Vlanif801,NeighborDownImmediate reason=Neighbor Down Due to Inactivity, NeighborDownPrimeReason=Hello Not Seen, NeighborChangeTime=[2012/01/08] 00:13:17)
The fault is cleared by restarting the S9300. During the restart, the VLANIF interfaces come up gradually, which to some extent reduces the number of OSPF packets sent concurrently, so the peers are re-established step by step.
The analysis above shows that, because there are many OSPF sessions, there are correspondingly many OSPF protocol packets. Hello packets are rate-limited by CAR at the lower layer and cannot be sent up to the protocol stack for normal processing, which causes the fault. The network can be stabilized by configuring the corresponding optimization commands.
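When many such log lines have to be reviewed, the drop counts can be pulled out per slot with a short script. This is a minimal sketch, assuming the log format is exactly the one shown in the CPCAR_DROP_LPU lines above; the function name `drops_by_slot` is hypothetical. The counts are listed rather than summed, because the source does not say whether Drop-Count is cumulative or per reporting interval.

```python
import re
from collections import defaultdict

# Matches the CPCAR_DROP_LPU log lines quoted in this case.
CPCAR_DROP = re.compile(
    r"CPCAR_DROP_LPU\(l\): .* in slot (?P<slot>\d+)\. "
    r"\(Protocol=(?P<proto>\w+), Drop-Count=(?P<count>\d+)\)"
)


def drops_by_slot(log_lines, protocol="ospf"):
    """Return {slot: [drop counts in log order]} for one protocol."""
    drops = defaultdict(list)
    for line in log_lines:
        match = CPCAR_DROP.search(line)
        if match and match.group("proto") == protocol:
            drops[int(match.group("slot"))].append(int(match.group("count")))
    return dict(drops)


logs = [
    "Jan  8 2012 00:09:22 S9300 %%01QOSE/4/CPCAR_DROP_LPU(l): Some packets are dropped by cpcar on the LPU in slot 6. (Protocol=ospf, Drop-Count=0881)",
    "Jan  8 2012 00:19:22 S9300 %%01QOSE/4/CPCAR_DROP_LPU(l): Some packets are dropped by cpcar on the LPU in slot 1. (Protocol=ospf, Drop-Count=03048)",
    "Jan  8 2012 00:19:22 S9300 %%01QOSE/4/CPCAR_DROP_LPU(l): Some packets are dropped by cpcar on the LPU in slot 6. (Protocol=ospf, Drop-Count=0111899)",
]

if __name__ == "__main__":
    print(drops_by_slot(logs))
```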
Root Cause
1. Observation of the OSPF peer establishment process shows that most peers stay in the Exchange state. Some reach the Full state but go Down immediately, and then flap between the Exchange, Full, and Down states.
2. The S9300 continuously drops OSPF packets because of CPCAR, so OSPF Hello packets cannot be processed by the protocol stack.
3. The network has almost 2000 OSPF routes, and a single update can generate traffic of up to 2400 kbps, whereas the default CPCAR values for the interface board and the main control board are 256 kbps and 512 kbps respectively. The default values therefore cannot meet the needs of this network, and the CPCAR value must be increased. The command takes effect immediately after it is configured, and configuring it does not affect network services.
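The gap between the offered OSPF traffic and the default limits can be made concrete with a little arithmetic. The sketch below is a simplified model that assumes a sustained offered rate and ignores any burst allowance of the rate limiter; the function name `excess_share` is illustrative only.

```python
def excess_share(offered_kbps, cir_kbps):
    """Fraction of a sustained offered rate that exceeds the CIR."""
    if offered_kbps <= cir_kbps:
        return 0.0
    return (offered_kbps - cir_kbps) / offered_kbps


OFFERED_KBPS = 2400  # peak OSPF update traffic quoted in the root cause

DEFAULT_CIRS = [
    ("interface board default", 256),
    ("main control board default", 512),
]

if __name__ == "__main__":
    for name, cir in DEFAULT_CIRS:
        ratio = OFFERED_KBPS / cir
        share = excess_share(OFFERED_KBPS, cir)
        print(f"{name}: CIR {cir} kbps, offered/CIR = {ratio:.1f}, "
              f"share above CIR = {share:.0%}")
```

Under this model most of a peak update exceeds either default limit, which is consistent with the large Drop-Count values in the logs.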
Suggestions
1. Adjust the configuration to match the scale of the network so that the OSPF neighbor relationships can be established. Because the number of neighbors and the amount of routing information cannot be changed, adjust the CPCAR value instead.
2. The S9300 has two stages of CAR protection, so the policy must also be applied to the main control board:
#
cpu-defend policy 1 
car packet-type ospf cir 768
#
cpu-defend-policy 1 global   //apply to all service boards
cpu-defend-policy 1  //apply to the SRU main control board
#
3. When the OSPF deployment is large and there are many peers, the best approach in this scenario is to let every OSPF packet act as a keepalive.
Configure sham-hello enable in the OSPF process so that the device maintains the neighbor state whenever it receives any OSPF packet from the peer, instead of depending only on OSPF Hello packets. The command takes effect once it is configured and does not affect existing services.
#
ospf  1
sham-hello enable
#
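The effect of sham-hello can be illustrated with a toy model of the neighbor dead timer. This is a simplified sketch and not Huawei's implementation: it assumes the standard OSPF dead interval of 40 seconds on broadcast networks, and the function name `neighbor_survives` and the event list are hypothetical.

```python
DEAD_INTERVAL = 40  # seconds; OSPF default on broadcast networks (4 x 10 s Hello)


def neighbor_survives(events, end_time, sham_hello, dead_interval=DEAD_INTERVAL):
    """Return True if the neighbor never exceeds the dead interval.

    events: (time_s, packet_type) tuples for the OSPF packets that actually
    reach the protocol stack. Without sham-hello only "hello" refreshes the
    timer; with sham-hello any OSPF packet does.
    """
    last_refresh = 0
    for time_s, packet_type in sorted(events):
        if time_s - last_refresh > dead_interval:
            return False  # timer expired before this packet arrived
        if packet_type == "hello" or sham_hello:
            last_refresh = time_s
    return end_time - last_refresh <= dead_interval


# Hypothetical trace: Hellos between t=0 and t=100 were dropped by CPCAR,
# but some LSU and LSAck packets still got through.
events = [(0, "hello"), (20, "lsu"), (50, "lsu"), (80, "lsack"), (100, "hello")]

if __name__ == "__main__":
    print("without sham-hello:", neighbor_survives(events, 110, sham_hello=False))
    print("with sham-hello:   ", neighbor_survives(events, 110, sham_hello=True))
```

In this trace the neighbor times out when only Hello packets count, but stays up when every received OSPF packet refreshes the timer.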

END