Troubleshooting Guide

CloudEngine 16800, 12800, 12800E, 8800, 7800, 6800, and 5800 Series Switches

OSPF Neighbor Flapping Occurs

Fault Symptom

  • The OSPF interface flaps.
  • A BFD session flaps.
  • The local device discards received Hello packets.
  • The peer device does not receive Hello packets from the local device within the specified period (1-way Hello).
  • The kill neighbor event is generated on the neighbor state machine.

Common Causes

This fault is commonly caused by the following:

  • The device link fails.
  • A BFD session fails.
  • Packets sent to the CPU are discarded because the CAR threshold is exceeded.
  • The LDM module discards packets.
  • The SOCKET module discards packets.
  • The peer device is faulty.
  • The CPU usage is too high.

Continuous OSPF neighbor flapping often occurs when a large number of services are deployed in the system.

Troubleshooting Flowchart

When OSPF neighbor flapping occurs continuously, rectify the fault according to Figure 6-6.

Figure 6-6 Troubleshooting flowchart for OSPF neighbor flapping

Troubleshooting Procedure

Continuous OSPF neighbor flapping often occurs when a large number of services are deployed in the system. In this situation, contact technical support personnel to ensure that services can be restored rapidly.

Procedure

  1. Check the cause of OSPF neighbor flapping.

    Run the display ospf peer last-nbr-down command in any view of the device. In the command output, the Immediate Reason field indicates the direct cause of the neighbor going Down, and the Primary Reason field indicates the root cause. Use these two fields to determine why the OSPF neighbor is flapping.

    <HUAWEI> display ospf peer last-nbr-down
    
              OSPF Process 1 with Router ID 192.168.2.200                                                                                         
                                                                                                                                        
                             Last Down OSPF Peer                                                                                        
                                                                                                                                        
     ...
     Immediate Reason : Neighbor Down Due to Kill Neighbor
     Primary Reason   : Link Fault or Interface Configuration Change
    ...
    • If the Immediate Reason field displays Neighbor Down Due to LL Down, the link may be faulty. Go to step 2.
    • If the Primary Reason field displays BFD Session Down, the BFD session may be faulty. Go to step 3.
    • If the Immediate Reason field displays Neighbor Down Due to Inactivity, Hello packets are not received within the specified period. Go to step 4.
    • If the Immediate Reason field displays Neighbor Down Due to 1-Wayhello, the peer device does not receive Hello packets from the local device. Go to step 5.
    • If the Immediate Reason field displays Neighbor Down Due to Kill Neighbor, the configuration of the local device has changed. Go to step 6.
    • If the Immediate Reason field displays any other value, go to step 7.
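The decision rules in step 1 can be sketched as a small lookup. This is only an illustration: it assumes the command output has been captured as text and that each field appears on its own line as `Field : value`. The function names `parse_reasons` and `next_step` are hypothetical and are not part of any Huawei tool.

```python
import re

# Reason strings and target steps are taken from the bullets in step 1.
IMMEDIATE_REASON_STEPS = {
    "Neighbor Down Due to Inactivity": 4,
    "Neighbor Down Due to 1-Wayhello": 5,
    "Neighbor Down Due to Kill Neighbor": 6,
}


def parse_reasons(output: str) -> dict:
    """Extract the Immediate Reason and Primary Reason fields.

    Assumes a "Field : value" layout, one field per line.
    """
    reasons = {}
    for line in output.splitlines():
        match = re.match(r"\s*(Immediate Reason|Primary Reason)\s*:\s*(.+?)\s*$", line)
        if match:
            reasons[match.group(1)] = match.group(2)
    return reasons


def next_step(output: str) -> int:
    """Return the troubleshooting step to go to, checked in the order listed."""
    reasons = parse_reasons(output)
    immediate = reasons.get("Immediate Reason")
    primary = reasons.get("Primary Reason")
    if immediate == "Neighbor Down Due to LL Down":
        return 2
    if primary == "BFD Session Down":
        return 3
    # Anything not listed in step 1 falls through to step 7.
    return IMMEDIATE_REASON_STEPS.get(immediate, 7)
```

Checking the link-down reason before the BFD reason simply mirrors the order of the bullets; the guide does not say which takes priority when both apply.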

  2. Check the status of the OSPF interface on which neighbor flapping occurs.

    Run the display ospf interface command several times in any view to check the OSPF interface status.

    <HUAWEI> display ospf interface
    
             OSPF Process 1 with Router ID 192.168.2.200                                                                                         
                                                                                                                                        
     Area: 0.0.0.0          MPLS TE not enabled                                                                                         
                                                                                                                                        
     Interface             IP Address      Type         State    Cost    Pri
     Eth-Trunk255          192.168.0.101   Broadcast    Down     65535   1                                                              
     Loop0                 192.168.2.200   P2P          P-2-P    0       1                                                              
     Vlanif200             192.168.2.2     Broadcast    Down     1       1
    • If the State field never displays Down, the OSPF interface is not flapping. Go to step 3.
    • If the State field intermittently displays Down, the OSPF interface is flapping. Check whether the device link is faulty and prevent the interface from going Down. If the fault persists, go to step 7.
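Comparing several captures of the command output by hand is error-prone, so the check in step 2 can be sketched in code. The sketch below assumes each data row has six whitespace-separated columns (interface, IP address, type, state, cost, priority) and treats an interface as flapping when it is Down in some captures but not in all of them. Both the column layout and that definition are assumptions, and `parse_states` and `flapping_interfaces` are hypothetical names.

```python
def parse_states(output: str) -> dict[str, str]:
    """Map each interface name to its State column for one capture."""
    states = {}
    for line in output.splitlines():
        fields = line.split()
        # A data row has six columns: an IPv4-looking address in column 2
        # and numeric cost and priority in columns 5 and 6.
        if (
            len(fields) == 6
            and fields[1].count(".") == 3
            and fields[4].isdigit()
            and fields[5].isdigit()
        ):
            states[fields[0]] = fields[3]
    return states


def flapping_interfaces(snapshots: list[str]) -> set[str]:
    """Return interfaces whose State was Down in some captures but not all."""
    seen: dict[str, set[str]] = {}
    for snapshot in snapshots:
        for name, state in parse_states(snapshot).items():
            seen.setdefault(name, set()).add(state)
    return {name for name, states in seen.items() if "Down" in states and len(states) > 1}
```

An interface that is Down in every capture is not reported here, because it is continuously down rather than flapping; it still points to a link fault.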

  3. Check whether a BFD session flaps.

    Check whether the OSPF neighbor has BFD enabled. For example, run the display this command in the views of the OSPF process and OSPF interface in which neighbor flapping occurs to check whether BFD is enabled (an OSPF process is used as an example).

    [~HUAWEI] ospf 1     
    [~HUAWEI-ospf-1] display this
    #
    ospf 1
     bfd all-interfaces enable
     area 0.0.0.0
      network 10.10.10.10 0.0.0.0
      network 192.168.1.0 0.0.0.255
    #
    return
    [~HUAWEI] interface vlanif 100               
    [~HUAWEI-Vlanif100] display this
    #
    interface Vlanif100
     ip address 10.1.3.1 255.255.255.0
     
    #
    return
    • If BFD is not enabled in the OSPF process view, run the bfd all-interfaces enable command in the OSPF process view to enable BFD.
    • If BFD is enabled in the interface view or OSPF process view, run the display ospf bfd session all command several times in the user view and check the BFDState field to determine whether a BFD session is flapping.
      <HUAWEI> display ospf bfd session all
                OSPF Process 1 with Router ID 192.168.2.200                                                                               
                                                                                                                                          
        Area 0.0.0.0 interface 10.0.0.2 (Vlanif4000)'s BFD Sessions                                                                       
                                                                                                                                          
       NeighborId:10.1.1.3           AreaId:0.0.0.0           Interface:Vlanif4000                                                         
       BFDState:Unknown             rx    :14929             tx       :14929
      ...
      • If BFDState remains Up, no BFD session is flapping.
      • If BFDState is intermittently Down or remains Unknown, a BFD session is flapping. Go to step 7.
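The BFDState rule in step 3 can be written as a short classifier over values collected from repeated runs of the command. This is a sketch under assumptions: the values are read from a `BFDState:<value>` token in the output, and any combination the guide does not describe (for example, a mix of Up and Unknown) is reported as undetermined. `bfd_states` and `classify_bfd` are hypothetical names.

```python
import re


def bfd_states(output: str) -> list[str]:
    """Collect every BFDState value found in one capture of the output."""
    return re.findall(r"BFDState\s*:\s*(\w+)", output)


def classify_bfd(samples: list[str]) -> str:
    """Classify BFDState values sampled over time for one session."""
    if samples and all(state == "Up" for state in samples):
        return "stable"  # BFDState remains Up
    if "Down" in samples or (samples and all(state == "Unknown" for state in samples)):
        return "flapping"  # sometimes Down, or remains Unknown: go to step 7
    return "undetermined"  # case not covered by the guide
```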

  4. Check whether the local device discards received Hello packets.

    • Run the display cpu-defend statistics packet-type packet-type slot slot-id command in the user view to check statistics about packets discarded by the specified LPU. If the Total Dropped field keeps increasing, the LPU may be receiving a large number of packets and cannot send the received Hello packets to the LDM module in time. As a result, OSPF cannot receive the Hello packets within the specified period.

      <HUAWEI> display cpu-defend statistics packet-type ospf slot 2

      Statistics(packets) on slot 2 :
      --------------------------------------------------------------------------------
      PacketType               Total Passed        Total Dropped   Last Dropping Time
                          Last 5 Min Passed   Last 5 Min Dropped
      --------------------------------------------------------------------------------                                                    
      ospf                                0                    0                    -                                                   
                                          0                    0                                                                          
      --------------------------------------------------------------------------------

      Run the car packet-type packet-type pps pps-value command to adjust the CAR threshold for packets of this type sent to the CPU.

    • Run the display ldm innerdata packet-box-receive slot slot-id command in the diagnostic view to check statistics about packets discarded by the LDM module. If the TotalDropNum field increases rapidly, the LDM module continuously discards packets, possibly because it receives a large number of packets from the LPU and cannot process these packets in time. Subsequently, OSPF cannot receive the Hello packets within the specified period.

      [~HUAWEI] diagnose
      [~HUAWEI-diagnose] display ldm innerdata packet-box-receive slot 1
      ...
      TotalDropNum   : 0
      TotalDropBytes : 0
      --------------------------------
      ...

      Wait for a period and then check whether the LDM module is stable. If the TotalDropNum value does not stabilize, go to step 7.

    • Run the display ospf socket interface interface-type interface-number command in the diagnostic view to calculate the difference between the From LDM and To APP fields. If the difference keeps increasing, the SOCKET module discards packets received from the LDM module, possibly because a large number of services are deployed and the CPU usage remains high for a long period. Subsequently, OSPF cannot receive the Hello packets within the specified period.

      [~HUAWEI-diagnose] display ospf socket interface Vlanif 200
      
           OSPF 1 Socket Information                                                                                                 
       ...                                                                                 
      Packet Statistics:                                                                                                                  
          From LDM: 1895 Pkt   129568 Byte
          From APP: 2106 Pkt    92420 Byte
          From IPV4Lib: 1895 Pkt    129568 Byte
          To LDM: 2002 Pkt    132044 Byte
          To APP: 1895 Pkt    129568 Byte
          Flow Control To App: 8    Long Cong Time:0                                                                                      
      ...

      Wait for a period and then check whether the system CPU is stable. If the CPU cannot become stable and the preceding modules keep discarding packets, the OSPF neighbor relationship cannot become stable. Go to step 7.

      Continuous OSPF neighbor flapping may be caused by continuous packet loss. If permitted, shut down all interfaces during an upgrade. After the CPU becomes stable, bring up the OSPF neighbor relationships in batches of 20 until all of them are established. This prevents the system from having to process a large number of packets when many neighbor relationships are established simultaneously.
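All three checks in step 4 come down to sampling a counter several times and asking whether it keeps growing. A minimal sketch, assuming the counter values have already been read from repeated command runs; "keeps increasing" is interpreted here as strictly greater on every successive sample, which is an assumption, and the function names are hypothetical.

```python
def keeps_increasing(samples: list[int]) -> bool:
    """True if each sample is strictly greater than the one before it."""
    return len(samples) >= 2 and all(
        later > earlier for earlier, later in zip(samples, samples[1:])
    )


def socket_backlog(from_ldm: list[int], to_app: list[int]) -> list[int]:
    """Difference between the From LDM and To APP packet counts per sample."""
    return [received - delivered for received, delivered in zip(from_ldm, to_app)]
```

For example, `keeps_increasing(socket_backlog([1895, 1950, 2010], [1895, 1940, 1990]))` evaluates the SOCKET check, while passing successive drop-counter readings directly to `keeps_increasing` covers the CPU-defend and LDM checks.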

  5. Check whether the peer device receives Hello packets from the local device.

    1. Check whether the local device sends Hello packets.

      Run the debugging ospf packet hello interface interface-type interface-number command in the user view to enable OSPF debugging to check whether the local device sends Hello packets.

      <HUAWEI> debugging ospf packet hello interface Vlanif200
      <HUAWEI> terminal monitor
      Info: Current terminal monitor is on.
      <HUAWEI> terminal debugging
      Info: Current terminal debugging is on.
      <HUAWEI>
      Jul 15 2015 14:42:37.221 128_14.60 %%01OSPF/6/OSPF_DEBUG(d):CID=0x808204d5;                                                         
      FileID: 0x13 Line: 1012 Level: 0x5                                                                                                  
        OSPFv2 1 SEND Packet, Interface: Vlanif4000
        ...

      If SEND information is displayed, the OSPF module has sent Hello packets. If SEND information is not displayed, the OSPF module does not send Hello packets. Go to step 7.

    2. If the OSPF module has sent Hello packets, enable LDM debugging to check whether the LDM module sends packets.
      [~HUAWEI-diagnose] debugging ldm packet send ipv4 protocol ospf number 1
      [~HUAWEI-diagnose]                                                                                                          
      ...
      3 2015 09:47:59.288 PE2 %%01LDM/6/LDM_PKT(d):CID=0x8078275b; LDM send pkt to FE ret=0
      ...

      If LDM send pkt to FE is displayed, the LDM module has sent packets. If it is not displayed, go to step 7.

    3. If the local device sends Hello packets, log in to the remote device and run the debugging ospf packet hello interface interface-type interface-number command in the user view to enable Hello debugging and check whether the remote device receives the Hello packets sent by the local device.
      <HUAWEI> debugging ospf packet hello interface Vlanif200
      <HUAWEI> terminal monitor
      Info: Current terminal monitor is on.
      <HUAWEI> terminal debugging
      Info: Current terminal debugging is on.
      <HUAWEI>
      15 2015 14:41:01.203 128_14.60 %%01OSPF/6/OSPF_DEBUG(d):CID=0x808204d5;                                                         
      FileID: 0x1d Line: 1085 Level: 0x5                                                                                                  
        OSPFv2 1 RECV Packet, Interface: Vlanif4000
      ...

      If RECV is displayed, the OSPF module on the remote device has received the Hello packets. If RECV is not displayed even though the local device has sent the Hello packets, check whether congestion occurs on the intermediate link. If the fault persists, go to step 7.

  6. Check whether the OSPF configuration of the local device changes.

    Run the display current-configuration | include ospf command to check the configuration of OSPF.

    Running any of the following commands in the OSPF interface view disconnects OSPF neighbors:

    • ospf network-type type, which changes the network type of the OSPF interface.
    • ospf authentication-mode, which changes the OSPF authentication mode.
    • ospf timer hello, if it makes the local device's Hello interval differ from the peer device's Hello interval.
    • ospf timer dead, if it makes the local device's dead interval differ from the peer device's dead interval.

    Running any of the following commands in the OSPF process view or OSPF area view disconnects OSPF neighbors:

    • silent-interface in the OSPF process view.
    • opaque-capability in the OSPF process view.
    • stub or nssa in the OSPF area view, which changes the OSPF area type.

    Check whether OSPF neighbor flapping is caused by frequent modification of the local device's OSPF configuration. If so, prevent frequent OSPF configuration modification.

    If the fault persists, go to step 7.
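Several of the settings listed in step 6 must agree on both ends of the link for the neighbor relationship to hold. A minimal sketch of such a comparison follows, assuming the values have already been read from each device's configuration. The `OspfInterfaceSettings` type and its field names are hypothetical, and only the authentication mode and the two timers are compared.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class OspfInterfaceSettings:
    """Hypothetical container for per-interface OSPF settings that must match."""
    authentication_mode: str
    hello_interval: int  # seconds
    dead_interval: int   # seconds


def mismatches(local: OspfInterfaceSettings, peer: OspfInterfaceSettings) -> list[str]:
    """Return the names of the settings that differ between the two ends."""
    return [
        field.name
        for field in fields(local)
        if getattr(local, field.name) != getattr(peer, field.name)
    ]
```

An empty result means none of the compared settings explains the flapping, so the remaining checks in this step still apply.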

  7. Collect the following information and contact technical support personnel.

    • Results of the preceding troubleshooting procedure.
    • Configuration file, logs, and alarms of the device.

Updated: 2020-01-07

Document ID: EDOC1000060766