Troubleshooting Guide

CloudEngine 16800, 12800, 12800E, 8800, 7800, 6800, and 5800 Series Switches

CPU Usage Is High

Common Causes

CPU usage is the percentage of a specified time period that the CPU spends executing code. It is an important performance indicator of a device.

To check whether CPU usage is high, run the display cpu [ slot slot-id ] command. The System CPU Using Percentage field in the command output indicates CPU usage. If the value exceeds 70%, for example, CPU usage is high. If the SYSTEM_1.3.6.1.4.1.2011.5.25.129.2.4.1 hwCPUUtilizationRisingAlarm alarm is generated, CPU usage has exceeded the alarm threshold (by default, 95% in V100R005C00 and earlier versions or 90% in V100R005C10 and later versions). High CPU usage causes service faults, for example, BGP route flapping, frequent VRRP active/standby switchovers, card resets, and even device login failures.
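The thresholds above can be sketched as a small check. This is an illustrative Python helper, not part of the device software: the function name classify_cpu_usage and its parameters are assumptions, the 70% mark is the example value from the text, and the alarm threshold defaults to 90% (pass 95 for V100R005C00 and earlier versions).

```python
def classify_cpu_usage(usage_percent, high_mark=70, alarm_threshold=90):
    """Classify a System CPU Using Percentage value.

    high_mark is the example 70% mark from the text; alarm_threshold is the
    default alarm threshold (90, or 95 on older versions). Both are parameters
    because they can differ per device.
    """
    if not 0 <= usage_percent <= 100:
        raise ValueError("usage_percent must be between 0 and 100")
    if usage_percent > alarm_threshold:
        return "alarm"
    if usage_percent > high_mark:
        return "high"
    return "normal"


print(classify_cpu_usage(11))                      # normal
print(classify_cpu_usage(74))                      # high
print(classify_cpu_usage(93, alarm_threshold=95))  # high
print(classify_cpu_usage(96, alarm_threshold=95))  # alarm
```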

High CPU usage on a device is caused by tasks that occupy a large share of CPU resources. High CPU usage by a task is commonly caused by one of the following:
  • A large number of packets are sent to the CPU due to a loop, DoS attack, or other reasons.

  • The device receives a large number of topology change (TC) BPDUs due to frequent topology changes on the Spanning Tree Protocol (STP) network. As a result, the device frequently deletes MAC address entries and ARP entries.

  • The device generates a large number of logs, consuming a lot of CPU resources.

Troubleshooting Flowchart

Rectify the fault according to Figure 6-2.
Figure 6-2 Troubleshooting flowchart for a high CPU usage

Troubleshooting Procedure

  • Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide technical support personnel.

  • The following procedures can be performed in any sequence.

  • The command outputs in the following procedures vary according to the device model and version. The following procedures provide methods to view related information.

  1. Check for the task with a high CPU usage.

    • Run the display cpu command to check the CPU usage of each task on the active MPU.

    • Run the display cpu slot slot-id command to check the CPU usage of each task on the specified MPU or LPU.

    Record the names of the three tasks with the highest CPU usage.

    <HUAWEI> display cpu
    CPU utilization statistics at 2015-10-14 03:05:45 801 ms
    System CPU Using Percentage :  11%
    CPU utilization for five seconds: 7%, one minute: 6%, five minutes: 6%.
    Max CPU Usage :                74%
    Max CPU Usage Stat. Time : 2015-10-12 01:53:10 697 ms
    State: Non-overload
    Overload threshold:  90%, Overload clear threshold:  75%, Duration:     480s
    ---------------------------
      
    ---------------------------
    SYSTEM           11%
    AAA               0%
    ARP               0%
    CMF               0%
    DEVICE            0%
    EUM               0%
    FEA               0%
    FIBRESM           0%
    FEC               0%
    IFM               0%
    IP STACK          0%
    LOCAL PKT         0%
    MSTP              0%
    ND                0%
    NETSTREAM         0%
    OAM               0%
    PEM               0%
    PNP               0%
    RGM               0%
    RM                0%
    SLA               0%
    SMLK              0%
    STACKMNG          0%
    SVRO              0%
    TNLM              0%
    TUNNEL            0%
    VLAN              0%
    ---------------------------
    CPU Usage Details
    ----------------------------------------------------------------
    CPU     Current  FiveSec   OneMin  FiveMin  Max MaxTime
    ----------------------------------------------------------------
    cpu0        16%      11%       9%      10%  81% 2015-10-12 01:53:09
    cpu1        15%       9%       6%       6%  61% 2015-10-12 01:53:09
    cpu2         8%       4%       5%       5%  76% 2015-10-12 01:53:09
    cpu3         5%       4%       4%       5%  78% 2015-10-12 01:53:09
    ----------------------------------------------------------------
    
    The following table lists the CPU-intensive tasks.
    Table 6-1 CPU-intensive tasks

    Task Name | Description
    IP STACK | IP protocol stack task. This task has a high CPU usage if the CPU is receiving and processing a large number of protocol packets. When this occurs, the device may be undergoing an IP packet attack. If this task has a high CPU usage and an LPU also has a high CPU usage, an attack may have occurred. Check the LPU to analyze the cause.
    RM | Route management task. This task has a high CPU usage if the system is learning a large number of routes or many routes flap. When this occurs, check routing information to determine whether the route management module is faulty.
    FEA | Forwarding engine management task. This task has a high CPU usage if the system is adding or deleting a large number of chip entries or the chips are sending and receiving a large number of packets. When this occurs, check whether many services are configured, many entries are being learned or flapping, or an attack has occurred.
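If the display cpu output is saved to a file, picking the three busiest entries can be automated. The following Python sketch is one possible approach: the function name top_tasks and the regular expression are assumptions based on the row layout shown above ("NAME   N%"), and the sample values for RM and IP STACK are made up for illustration. The SYSTEM row is treated like any other entry.

```python
import re

# A task row is a name, at least two spaces, and a single percentage,
# for example "IP STACK          0%". Header lines that contain ":" and
# the per-core "cpu0 ..." rows do not match this pattern.
TASK_ROW = re.compile(r"^\s*([A-Za-z][A-Za-z0-9 _-]*?)\s{2,}(\d+)%\s*$")


def top_tasks(display_cpu_output, count=3):
    """Return the `count` (name, percent) rows with the highest usage."""
    tasks = []
    for line in display_cpu_output.splitlines():
        match = TASK_ROW.match(line)
        if match:
            tasks.append((match.group(1), int(match.group(2))))
    # sorted() is stable, so ties keep their order of appearance.
    return sorted(tasks, key=lambda task: task[1], reverse=True)[:count]


# Shortened sample; the RM and IP STACK values are illustrative only.
sample = """\
System CPU Using Percentage :  11%
Max CPU Usage :                74%
SYSTEM           11%
AAA               0%
IP STACK          3%
RM                7%
cpu0        16%      11%       9%      10%  81% 2015-10-12 01:53:09
"""

print(top_tasks(sample))  # [('SYSTEM', 11), ('RM', 7), ('IP STACK', 3)]
```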

  2. Check whether a large number of packets are sent to the CPU.

    Run the display cpu-defend statistics all command to check statistics about the packets sent to the CPU, and focus on the number of dropped packets for each packet type.

    <HUAWEI> display cpu-defend statistics all
    Statistics(packets) on slot 4 :
    --------------------------------------------------------------------------------
    PacketType                          Last Dropping Time
                           
    --------------------------------------------------------------------------------
    aaa                                 0                    0   -
                                        0                    0
    arp                                59                    0   -
                                        0                    0
    arp-miss                            0                    0   -
                                        0                    0
    bfd                                 0                    0   -
                                        0                    0
    bgp                                 0                    0   -
                                        0                    0
    bpdu-tunnel                         0                    0   -
                                        0                    0
    dhcp                                0                    0   -
                                        0                    0
    dldp                                0                    0   -
                                        0                    0
    dns                                 0                    0   -
                                        0                    0
    dot1x                               0                    0   -
                                        0                    0
    efm                                 0                    0   -
                                        0                    0
    erps                                0                    0   -
                                        0                    0
    fcoe                                0                    0   -
                                        0                    0
    fib-hit                             0                    0   -
                                        0                    0
    ftp                                 0                    0   -
                                        0                    0
    gmac                                0                    0   -
                                        0                    0
    gre                                 0                    0   -
                                        0                    0
    icmp                                0                    0   -
                                        0                    0
    ipsec                               0                    0   -
                                        0                    0
    isis                                0                    0   -
                                        0                    0
    lacp                                0                    0   -
                                        0                    0
    ldt                                 0                    0   -
                                        0                    0
    lldp                                0                    0   -
                                        0                    0
    mpls                                0                    0   -
                                        0                    0
    mtu                                 0                    0   -
                                        0                    0
    multicast                           0                    0   -
                                        0                    0
    nd                                  0                    0   -
                                        0                    0
    netstream                           0                    0   -
                                        0                    0
    ntp                                 0                    0   -
                                        0                    0
    openflow                            0                    0   -
                                        0                    0
    ospf                                0                    0   -
                                        0                    0
    rip                                 0                    0   -
                                        0                    0
    smart-link                          0                    0   -
                                        0                    0
    snmp                                0                    0   -
                                        0                    0
    stp                            265808                    0   -
                                      423                    0
    telnet                              0                    0   -
                                        0                    0
    trill                               0                    0   -
                                        0                    0
    trill-management                    0                    0   -
                                        0                    0
    ttl-expired                         0                    0   -
                                        0                    0
    udp-helper                          0                    0   -
                                        0                    0
    unknown-multicast                   0                    0   -
                                        0                    0
    vrrp                                0                    0   -
                                        0                    0
    vxlan-detect                        0                    0   -
                                        0                    0
    --------------------------------------------------------------------------------
    
    • If the Dropped value for a packet type is large and the corresponding task has a high CPU usage in step 1, a packet attack has occurred. Go to step 5.

    • If the number of packets sent to the CPU is within a normal range, go to step 3.
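Scanning the long statistics output for non-zero drop counters can be scripted. The sketch below is a Python illustration under stated assumptions: the column headers are missing from the output shown above, so the code assumes that the first row of each packet type holds the passed count, the dropped count, and the last dropping time, and that the second row holds more recent passed and dropped counts. Verify this against the headers on your device. The function names and the arp drop values in the sample are made up.

```python
import re

# First row: packet type, two counters, last dropping time ("-" if none).
FIRST_ROW = re.compile(r"^\s*([a-z][a-z0-9-]*)\s+(\d+)\s+(\d+)\s+(\S.*?)\s*$")
# Second row: two counters only (assumed to cover a recent interval).
SECOND_ROW = re.compile(r"^\s+(\d+)\s+(\d+)\s*$")


def parse_cpu_defend(output):
    """Parse saved command output into {packet_type: counters}."""
    stats = {}
    current = None
    for line in output.splitlines():
        first = FIRST_ROW.match(line)
        if first:
            current = first.group(1)
            stats[current] = {
                "passed": int(first.group(2)),
                "dropped": int(first.group(3)),
                "last_drop_time": first.group(4),
                "recent_passed": None,
                "recent_dropped": None,
            }
            continue
        second = SECOND_ROW.match(line)
        if second and current:
            stats[current]["recent_passed"] = int(second.group(1))
            stats[current]["recent_dropped"] = int(second.group(2))
            current = None
    return stats


def dropped_types(stats, min_dropped=1):
    """Return packet types whose dropped counter reaches min_dropped."""
    return [name for name, row in stats.items() if row["dropped"] >= min_dropped]


# The stp row is from the output above; the arp drop values are made up.
sample = """\
stp                            265808                    0   -
                                  423                    0
arp                                59                 1200   2015-10-14 03:05:45
                                    0                  300
"""

stats = parse_cpu_defend(sample)
print(dropped_types(stats))  # ['arp']
```

Compare any flagged packet type with the tasks recorded in step 1 before concluding that an attack has occurred.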

  3. Check whether the device has received a large number of TC BPDUs.

    If STP is enabled on the device, the device deletes MAC address entries and ARP entries when receiving TC BPDUs. If an attacker sends pseudo TC BPDUs to the device, the device receives a large number of TC BPDUs within a short time and frequently deletes MAC address entries and ARP entries. As a result, the CPU usage of the device becomes high.

    Run the display stp tc-bpdu statistics command to check statistics about the received TC and topology change notification (TCN) BPDUs.

    <HUAWEI> display stp tc-bpdu statistics
     -------------------------- STP TC/TCN information --------------------------
     MSTID Port                              
     0     10GE1/0/3                   2/3                   0/0
     1     10GE1/0/5                   1/0                   -/-
    
    • If a large number of TC and TCN BPDUs are received, run the stp tc-protection command in the system view to suppress TC/TCN BPDUs. After this command is used, the device processes TC BPDUs only once every 2s. To change the number of TC/TCN BPDUs that the device processes, use either of the following methods:
      • Run the stp tc-protection threshold threshold command to set the maximum number of TC/TCN BPDUs that can be processed in a specified period.
      • Run the stp timer hello hello-time command to change the Hello timer length.

      The Hello timer length configured using the stp timer hello hello-time command is expressed in centiseconds.

      <HUAWEI> system-view
      [~HUAWEI] stp tc-protection
      [*HUAWEI] stp tc-protection threshold 15
      [*HUAWEI] stp timer hello 400
      [*HUAWEI] commit
      
    • If only a small number of TC BPDUs are received, go to step 4.
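Because the Hello timer is entered in centiseconds, it is easy to mistype the value. The following Python sketch builds the command sequence shown in step 3 from a threshold and a Hello time given in seconds. The function name is hypothetical, and the valid ranges of both values are not stated here, so the sketch only rejects non-positive input and leaves range checking to the device.

```python
def tc_protection_commands(threshold, hello_seconds):
    """Build the TC protection command sequence from step 3.

    hello_seconds is converted to centiseconds (1 s = 100 cs) because the
    stp timer hello command expects centiseconds.
    """
    if threshold < 1 or hello_seconds <= 0:
        raise ValueError("threshold and hello_seconds must be positive")
    hello_centiseconds = round(hello_seconds * 100)
    return [
        "system-view",
        "stp tc-protection",
        f"stp tc-protection threshold {threshold}",
        f"stp timer hello {hello_centiseconds}",
        "commit",
    ]


for command in tc_protection_commands(threshold=15, hello_seconds=4):
    print(command)
```

With threshold=15 and hello_seconds=4, the generated lines match the example in step 3, where stp timer hello 400 corresponds to 4 seconds.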
  4. Check whether there are loops on the network.

    If a loop exists in a VLAN on the device, packets are circulated among some interfaces, resulting in a high CPU usage.

    Run the display current-configuration command to check whether MAC address flapping detection is enabled on the device. By default, MAC address flapping detection is enabled.
    • If MAC address flapping detection is not enabled, run the mac-address flapping detection command to enable this function.

      With MAC address flapping detection configured, the device can generate the alarm FEI_COMM_1.3.6.1.4.1.2011.5.25.160.3.13 hwMflpVlanLoopAlarm when two interfaces learn the same MAC address due to a loop. For example:
      Oct 13 2015 22:58:24 HUAWEI %%01FEI_COMM/4/hwMflpVlanLoopAlarm(t):CID=0x807f0447-OID=1.3.6.1.4.1.2011.5.25.160.3.13;MAC flapping detected, VlanId = 1, Original-Port = 10GE4/0/6, Flapping port 1 = 10GE4/0/4, port 2 = -. Check the network connected to the interface learning a flapping MAC address : f84a-bff0-cac2.
      You can also run the display mac-address flapping command to check MAC address flapping records.
      <HUAWEI> display mac-address flapping
      MAC Address Flapping Configurations :
      -------------------------------------------------------------------------------
        Flapping detection          : Enable
        Aging  time(s)              : 300
        Quit-VLAN Recover time(m)   : --
        Exclude VLAN-list           : --
        Security level              : Middle
        Exclude BD-list             : --
      -------------------------------------------------------------------------------
      S  : start time    E  : end time    (D) : error down
      -------------------------------------------------------------------------------
      Time                  VLAN MAC Address    Original-Port  Move-Ports     MoveNum
                            /BD
      -------------------------------------------------------------------------------
      S:2014-12-11 11:00:08 3    0000-08cc-2206 10GE1/0/1      10GE1/0/2      120
      E:2014-12-11 11:33:13 /-
      
      -------------------------------------------------------------------------------
      Total items on slot 1: 1
      
      Check the interface connection and networking according to the alarm:
      • If the ring network is not required, shut down one of the two interfaces according to the networking diagram.

      • If the ring network is required, disable MAC address flapping detection and enable a loop prevention protocol, such as STP.

    • If MAC address flapping detection is enabled but there is no MAC address flapping alarm or record, go to step 5.
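When many hwMflpVlanLoopAlarm entries are collected, extracting the VLAN, interfaces, and MAC address from each one helps locate the loop. The following Python sketch parses the alarm text shown in step 4. The regular expression is derived only from that one sample, so other software versions may format the message differently, and the function name is an assumption.

```python
import re

# Pattern derived from the single sample alarm in step 4.
LOOP_ALARM = re.compile(
    r"VlanId = (?P<vlan>\d+), "
    r"Original-Port = (?P<original>\S+), "
    r"Flapping port 1 = (?P<port1>\S+), "
    r"port 2 = (?P<port2>\S+)\. "
    r".*MAC address : (?P<mac>[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4})"
)


def parse_loop_alarm(log_text):
    """Return the loop details from one alarm, or None if it does not match."""
    # Collapse line wraps and repeated spaces before matching.
    match = LOOP_ALARM.search(" ".join(log_text.split()))
    if not match:
        return None
    ports = [match.group("port1"), match.group("port2")]
    return {
        "vlan": int(match.group("vlan")),
        "original_port": match.group("original"),
        # "-" means that no second flapping port was recorded.
        "flapping_ports": [port for port in ports if port != "-"],
        "mac": match.group("mac"),
    }


alarm = (
    "Oct 13 2015 22:58:24 HUAWEI %%01FEI_COMM/4/hwMflpVlanLoopAlarm(t):"
    "CID=0x807f0447-OID=1.3.6.1.4.1.2011.5.25.160.3.13;MAC flapping detected, "
    "VlanId = 1, Original-Port = 10GE4/0/6, Flapping port 1 = 10GE4/0/4, "
    "port 2 = -. Check the network connected to the interface learning a "
    "flapping MAC address : f84a-bff0-cac2."
)

print(parse_loop_alarm(alarm))
```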

  5. Check whether a large number of logs are generated on the device.

    The device generates diagnostic information or logs continuously in some conditions, for example, when attacks, errors, or frequent interface status transitions occur. In these conditions, the system frequently reads and writes data in the storage device, causing a high CPU usage.

    Run the display logbuffer command to check whether a large number of abnormal logs are displayed. For example, a large number of the same logs are generated continuously. If so, go to step 6.
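Spotting the same log repeated many times is easier with a count per log type. The following Python sketch tallies entries in saved display logbuffer output by module and log name. It assumes each entry contains an identifier of the form %%01MODULE/severity/LogName, as in the alarm example in step 4. The function name, the min_count value, and the DEMO/exampleLog line in the sample are made up for illustration.

```python
import re
from collections import Counter

# Identifier such as "%%01FEI_COMM/4/hwMflpVlanLoopAlarm".
LOG_ID = re.compile(r"%%\d{2}([A-Z0-9_]+)/(\d)/(\w+)")


def repeated_logs(logbuffer_output, min_count=3):
    """Return (module/log name, count) pairs seen at least min_count times."""
    counts = Counter()
    for line in logbuffer_output.splitlines():
        match = LOG_ID.search(line)
        if match:
            counts[f"{match.group(1)}/{match.group(3)}"] += 1
    return [(name, n) for name, n in counts.most_common() if n >= min_count]


# Illustrative sample; the DEMO/exampleLog entry is hypothetical.
sample = """\
Oct 13 2015 22:58:24 HUAWEI %%01FEI_COMM/4/hwMflpVlanLoopAlarm(t):CID=0x807f0447;...
Oct 13 2015 22:58:34 HUAWEI %%01FEI_COMM/4/hwMflpVlanLoopAlarm(t):CID=0x807f0447;...
Oct 13 2015 22:58:44 HUAWEI %%01FEI_COMM/4/hwMflpVlanLoopAlarm(t):CID=0x807f0447;...
Oct 13 2015 22:59:01 HUAWEI %%01DEMO/6/exampleLog(l):CID=0x00000000;...
"""

print(repeated_logs(sample))  # [('FEI_COMM/hwMflpVlanLoopAlarm', 3)]
```

What counts as "a large number" depends on the device and the time window, so treat min_count as a starting point rather than a fixed rule.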

  6. Collect the following information and contact technical support personnel:

    • Results of the preceding troubleshooting procedure
    • Configuration file, logs, and alarms of the device

Relevant Alarms and Logs

Relevant Alarms

  • SYSTEM_1.3.6.1.4.1.2011.5.25.129.2.4.1 hwCPUUtilizationRisingAlarm

Relevant Logs

None

Updated: 2020-01-07

Document ID: EDOC1000060766
