High CPU Utilization on S Series Switches

High CPU Usage on S Series Switches


Introduction

This document describes the CPU and CPU usage on Huawei S series switches and explains how to locate and rectify faults when CPU usage is high. It also provides typical examples and references for maintenance engineers.

Prerequisites

The functions and commands supported by different models may be different. This document uses V200R007 as an example. For the functions and commands used on your switch, see the related switch documents.

Knowledge About High CPU Usage

This section provides background on high CPU usage on switches: its impact, its causes, how to locate faults, how to lower CPU usage, and how to prevent high CPU usage.

CPU and CPU Usage Overview

CPU - The Core of a Switch

A switch uses a distributed architecture that consists of a forwarding plane and a control plane. The forwarding plane implements Layer 2 and Layer 3 forwarding; the control plane controls forwarding.

As shown in Figure 1-1, the control plane uses a general-purpose embedded CPU and the forwarding plane uses a forwarding chip:

  • The forwarding chip implements Layer 2 and Layer 3 forwarding, for example, updating the MAC address table for Layer 2 forwarding and the Layer 3 forwarding table for IP forwarding. The forwarding chip forwards data with high throughput.
  • The CPU maintains software entries, such as routing and ARP entries, and configures the hardware Layer 3 forwarding table in the chip based on the software forwarding entries. The CPU can also provide software-based Layer 3 forwarding, but its processing capability is low.
Figure 1-1 Distributed architecture

Packets on a network can be classified into control packets and data packets depending on their functions. If a switch does not have any hardware forwarding entry, the first packet reaching the switch is forwarded by the CPU and a Layer 3 forwarding hardware entry is created. The follow-up packets enter the forwarding chip through the inbound interface. Figure 1-2 shows this process.

Figure 1-2 Processing non-initial packets
  • Flow 1 (data packets) is sent out by the forwarding chip and does not pass through the CPU. Processing this flow does not consume CPU resources.
  • Flow 2 (control packets and some data packets) is sent to the CPU through the forwarding chip. The CPU determines whether to send the flow out or terminate it. Flow 2 consumes CPU resources and cannot be forwarded at high speed.

The Layer 2 and Layer 3 hardware entries in the forwarding chip determine whether a switch can implement high-speed forwarding; however, the hardware entries in the forwarding chip are created based on the software entries maintained in the CPU. Therefore, the CPU is the core of a switch.
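The first-packet behavior described above can be sketched as a small model. This is only an illustration under simplifying assumptions: the class name, the `lookup_route` helper, and the interface naming are hypothetical, and a real switch keys its hardware entries on far more than a destination address.

```python
class SwitchSketch:
    """Toy model: the CPU installs a hardware entry on the first miss."""

    def __init__(self):
        self.hw_entries = {}   # destination -> outbound interface (forwarding chip)
        self.cpu_packets = 0   # packets that the CPU had to handle

    def lookup_route(self, dst):
        # Hypothetical stand-in for the software routing/ARP entries kept by the CPU.
        return f"port-for-{dst}"

    def forward(self, dst):
        """Return which component forwarded the packet and the outbound interface."""
        if dst in self.hw_entries:
            # Hardware entry exists: the forwarding chip handles the packet.
            return ("chip", self.hw_entries[dst])
        # No hardware entry: the CPU forwards the packet and creates the entry.
        self.cpu_packets += 1
        out_if = self.lookup_route(dst)
        self.hw_entries[dst] = out_if
        return ("cpu", out_if)


switch = SwitchSketch()
paths = [switch.forward("10.1.1.2")[0] for _ in range(5)]
print(paths)               # ['cpu', 'chip', 'chip', 'chip', 'chip']
print(switch.cpu_packets)  # 1
```

Only the first packet consumes CPU resources in this model; the remaining packets are served from the hardware entry.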

CPU Usage

After a switch starts, the CPU runs more than 200 active tasks to manage the switch and monitor Layer 3 entry learning. The number of tasks varies by switch model. In addition, the more features that are configured on a switch, the more tasks run in the system.

CPU usage is the percentage of time that a CPU spends running non-idle tasks. It has the following characteristics:

  • Constantly changing: A switch's CPU usage keeps changing with system operations and changes of the environment.
  • Non-real-time: CPU usage data reflects CPU usage within a statistical period.
  • Entity-relevant: CPU usage is calculated per physical CPU. Generally, each service card on a switch has its own physical CPU, so the CPU usage of each card is calculated separately.

CPU usage reflects how much of a given period tasks keep the CPU busy. In Figure 1-3, task A occupies the CPU for 10 ms, task B occupies it for 30 ms, and the CPU is then idle for 60 ms. The same pattern then repeats. Over this period, tasks occupy the CPU for 40 ms out of every 100 ms, so the CPU usage is 40%. A high CPU usage indicates that the switch is running many tasks.

Figure 1-3 Tasks occupy CPU resources
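The arithmetic behind the Figure 1-3 example can be written out as a short sketch. The function name and the dictionary layout are illustrative choices, not part of the switch software.

```python
def cpu_usage_percent(busy_ms_by_task, period_ms):
    """Percentage of the statistical period spent running non-idle tasks."""
    busy_ms = sum(busy_ms_by_task.values())
    if period_ms <= 0 or busy_ms > period_ms:
        raise ValueError("busy time must fit within a positive period")
    return 100.0 * busy_ms / period_ms


# One cycle from the example: A runs 10 ms, B runs 30 ms, then 60 ms idle.
one_cycle = cpu_usage_percent({"task A": 10, "task B": 30}, period_ms=100)

# Two consecutive cycles give the same ratio over a 200 ms period.
two_cycles = cpu_usage_percent({"task A": 20, "task B": 60}, period_ms=200)

print(one_cycle, two_cycles)  # 40.0 40.0
```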

Because CPU usage measures how much of the CPU's time is consumed, it is directly related to CPU performance and is a key indicator of switch performance.

CPU and CPU Usage Working Mechanism

How Does a CPU Process Packets (Modular Switch)

Huawei switches forward data packets through the forwarding chip without involving the CPU. The following packets will be sent to the CPU for processing on a switch:

  • Protocol packets to be terminated by the switch

    All packets destined for the switch, including:

    • Control packets of protocols, such as STP, LLDP, LNP, LACP, VCMP, DLDP, EFM, GVRP, and VRRP
    • Route update packets of routing protocols, such as RIP, OSPF, BGP, and IS-IS
    • SNMP, Telnet, and SSH packets
    • ARP and ND reply packets
  • Packets requiring special processing
    • ICMP packets carrying options
    • IPv6 packets with hop-by-hop option
    • IPv4/IPv6 packets with a TTL value less than or equal to 1
    • Packets with the switch's local IP address as the destination address
    • ARP/ND/FIB Miss packets
  • Packets sent to the CPU because they match an ACL
    • Packets discarded by the deny action in ACL rules after the logging function is enabled
    • Packets redirected to the CPU by traffic policies
  • Multicast-related packets
    • PIM, IGMP, MLD, and MSDP protocol packets
    • Unknown IP multicast packets
  • Packets related to other features
    • DHCP packets
    • ARP and ND broadcast request packets
    • Layer 2 protocol packets forwarded through software by L2PT (Devices on two ends of a tunnel forward Layer 2 protocol packets through software, and intermediate devices forward these packets through chip.)

In Figure 1-4, packets sent to the CPU of an MPU pass through multiple rate limiting stages. For example, both the forwarding chips and the SFU chips limit the rate. This rate limiting protects the MPU's CPU.

Figure 1-4 Rate limiting for packets on a modular switch

In Figure 1-5, rate limiting on each chip or logic includes protocol-based, queue-based, and port-based rate limiting. The following tables list the default CPU rate limiting configuration on non-X1E LPUs of the S9300 running V200R007. To check the default CPU rate limiting configuration on other switch models and versions, run the display cpu-defend configuration all command.

Figure 1-5 Rate limiting types for packets to be sent to the CPU

Table 1-1 Protocol-based rate limits on the S9300

| Packet Type | Rate Limit on LPU (kbit/s) | Rate Limit on MPU (kbit/s) |
| --- | --- | --- |
| 802.1x, arp-miss, mpls-ping, nd, nd-miss, loopbacktest, nd-redirect | 64 | 64 |
| smart-link, lacp, lldp, dldp, ttl-expired, mpls-ttl-expired, ntp, hw-tacacs, fib-miss, hgmp-bc, smlk-rrpp, hotlimit, mpls-vccv-ping, arp-request, arp-reply, arp-mff, vpls-arp | 64 | 128 |
| eoam-3ah, mpls-one-label | 64 | 256 |
| vpls-igmp, mpls-rsvp, ipmc-invalid, bpdu | 64 | 512 |
| vrrp, bgp4plus, vrrp6, hvrp, ssh, ftp, snmp, gvrp, eoam-1ag-lblt, pppoe, hopbyhop, hgmp-mc, hgmp-uc, nac-nd, nd-snp-rs, nd-snp-rans, nd-snp-na, mad, nac-arp | 128 | 128 |
| mpls-oam, igmp, pim, rip, telnet, tcp, fib-hit, rrpp, udp-helper | 128 | 256 |
| stp, mld, unknown-multicast, bpdu-tunnel, ipmc-miss | 128 | 512 |
| fib6-hit, mpls-fib-hit | 128 | 1024 |
| icmp | 192 | 256 |
| http, pimv6, icmpv6, easy-operation, eoam-1ag, heart-packet | 256 | 256 |
| isis, ospf, ospf-hello, bgp, bfd, mpls-ldp, ripng, ospfv3, nac-dhcp, vpls-dhcp-request, vpls-dhcp-reply, nac-dhcpv6, ospfv3-uc | 256 | 512 |
| dhcp-client, dhcpv6-request, dhcpv6-reply, radius, y1731 | 512 | 512 |
| dhcp-server | 512 | 1024 |
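To get a feel for what the protocol-based limits in Table 1-1 mean in practice, the following sketch converts a limit in kbit/s into an approximate packet rate. It assumes that 1 kbit equals 1000 bits and that the limit applies to the full packet size with no extra overhead; both are assumptions made here for illustration, as is the 64-byte packet size. The function name is hypothetical.

```python
def max_packets_per_second(rate_kbps: int, packet_bytes: int) -> int:
    """Upper bound on whole packets per second that fit within a rate limit."""
    if rate_kbps < 0 or packet_bytes <= 0:
        raise ValueError("rate must be >= 0 and packet size must be > 0")
    bits_per_second = rate_kbps * 1000  # assumption: 1 kbit = 1000 bits
    return bits_per_second // (packet_bytes * 8)


# Default limits copied from Table 1-1 as (LPU, MPU) in kbit/s.
DEFAULT_LIMITS = {
    "arp-miss": (64, 64),
    "icmp": (192, 256),
    "dhcp-server": (512, 1024),
}

for packet_type, (lpu, mpu) in DEFAULT_LIMITS.items():
    # 64-byte packets are an illustrative size, not a value from this document.
    print(packet_type, max_packets_per_second(lpu, 64), max_packets_per_second(mpu, 64))
```

Under these assumptions, a 64 kbit/s limit admits roughly 125 minimum-size packets per second, which shows how small the CPU-bound budget for each protocol is compared with line-rate forwarding.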

Table 1-2 CPU queues for different packets on an LPU (a larger queue ID indicates a higher forwarding priority)

| Queue ID on an LPU | Packet Type | Description |
| --- | --- | --- |
| 7 | lacp | Fast protocol packets (fast protocols have fast responses in interaction, for example, the response time of BFD is within 100 ms. The loss of a few packets will cause protocol flapping.) |
| 6 | vp (VRRP packets are moved from queue 5 to queue 6 in V200R010.) | Packets sent from an LPU's CPU to the MPU's CPU |
| 5 | stp, smart-link, ldt, lldp, dldp, vrrp, mpls-oam, isis, pim, rip, ospf, ospf-hello, bgp, bfd, mpls-rsvp, mpls-ldp, mpls-ttl-expired, ntp, ripng, ospfv3, bgp4plus, pimv6, vrrp6, hvrp, telnet, ssh, mpls-ping, gvrp, bpdu-tunnel, rrpp, eoam-3ah, eoam-1ag, eoam-1ag-lblt, nd, y1731, mpls-one-label, loopbacktest, bpdu, nap, hgmp-mc, hgmp-uc, hgmp-bc, nd-redirect, nd-snp-rs, nd-snp-rans, nd-snp-na, mad, smlk-rrpp, ospfv3-uc | Important control plane protocol packets |
| 4 | other | - |
| 3 | arp-request, arp-reply, dhcp-client, dhcp-server, gmp, vpls-igmp, icmp, 8021x, http, dhcpv6-request, dhcpv6-reply, icmpv6, mld, ftp, snmp, radius, hw-tacacs, tcp, easy-operation, fib-hit, fib-miss, arp-miss, unknown-packet, udp-helper, arp-mff, pppoe, hopbyhop, mpls-vccv-ping, fib6-hit, nd-miss, nac-dhcp, vpls-arp, vpls-dhcp-request, vpls-dhcp-reply, nac-arp, icmp-ttl-expired, mpls-fib-hit, nac-nd, nac-dhcpv6, heart-packet | Important control plane protocol packets |
| 2 | ttl-expired, hotlimit | Secondary control plane protocol packets |
| 1 | unknown-multicast, ipmc-invalid, ipmc-miss | Secondary control plane protocol packets |
| 0 | other | - |

Table 1-3 CPU queues for different packets on an MPU (a larger queue ID indicates a higher forwarding priority)

| Queue ID on an MPU | Packet Type | Description |
| --- | --- | --- |
| 7 | lacp | Fast protocol packets (fast protocols have fast responses in interaction, for example, the response time of BFD is within 100 ms. The loss of a few packets will cause protocol flapping.) |
| 6 | vp (VP packets are the same as those in the original protocol packet queue in V200R003 and later versions. VRRP packets are moved from queue 5 to queue 6 in V200R010.) | Packets sent from an LPU's CPU to the MPU's CPU |
| 5 | stp, smart-link, ldt, lldp, dldp, vrrp, mpls-oam, isis, pim, rip, ospf, ospf-hello, bgp, bfd, mpls-rsvp, mpls-ldp, mpls-ttl-expired, ntp, ripng, ospfv3, bgp4plus, pimv6, vrrp6, hvrp, telnet, ssh, mpls-ping, gvrp, bpdu-tunnel, rrpp, eoam-3ah, eoam-1ag, eoam-1ag-lblt, nd, y1731, loopbacktest, bpdu, nap, hgmp-mc, hgmp-uc, hgmp-bc, nd-redirect, nd-snp-rs, nd-snp-rans, nd-snp-na, mad, smlk-rrpp, ospfv3-uc | Important control plane protocol packets |
| 4 | other | - |
| 3 | arp-request, arp-reply, dhcp-client, dhcp-server, gmp, vpls-igmp, icmp, 8021x, http, dhcpv6-request, dhcpv6-reply, icmpv6, mld, ftp, snmp, radius, hw-tacacs, tcp, easy-operation, fib-hit, fib-miss, arp-miss, unknown-packet, udp-helper, arp-mff, pppoe, hopbyhop, mpls-vccv-ping, fib6-hit, nd-miss, nac-dhcp, mpls-one-label, vpls-arp, vpls-dhcp-request, vpls-dhcp-reply, nac-arp, icmp-ttl-expired, mpls-fib-hit, nac-nd, nac-dhcpv6, heart-packet | Important control plane protocol packets |
| 2 | ttl-expired, hotlimit | Secondary control plane protocol packets |
| 1 | unknown-multicast, ipmc-invalid, ipmc-miss | Secondary control plane protocol packets |
| 0 | sFlow, NetStream | Data packets or messages |

A switch determines into which CPU queues packets will be placed based on the packets' importance and plane (management, control, or forwarding plane). A CPU queue has a priority. For example, when both the Telnet management packets and dhcp-client protocol packets are sent to the CPU, the CPU first processes the Telnet management packets in queue 5. This mechanism ensures device stability and manageability under a heavy CPU load. The CPU can use a weighting mechanism to ensure that packets in low-priority queues can be processed. On a stable network, the number of packets sent to the CPU is limited within a specified range, and therefore the CPU usage remains within a proper range. If a large number of packets are sent to the CPU within a short period, the CPU is busy processing these packets, resulting in a high CPU usage.

How Does a CPU Process Packets (Fixed Switch)

Huawei switches forward data packets through hardware without involving the CPU. The following packets will be sent to the CPU for processing on a switch:

  • Protocol packets to be terminated by the switch

    All packets destined for the switch, including:

    • Control packets of protocols, such as STP, LLDP, LNP, LACP, VCMP, DLDP, EFM, GVRP, and VRRP
    • Route update packets of routing protocols, such as RIP, OSPF, BGP, and IS-IS
    • SNMP, Telnet, and SSH packets
    • ARP and ND reply packets
  • Packets requiring special processing
    • ICMP packets carrying options
    • IPv6 packets with hop-by-hop option
    • IPv4/IPv6 packets with a TTL value less than or equal to 1
    • Packets with the switch's local IP address as the destination address
    • ARP/ND/FIB Miss packets
  • Packets processed using ACLs
    • Packets discarded by the deny action in ACL rules after the logging function is enabled
    • Packets redirected to the CPU by traffic policies
  • Multicast
    • PIM, IGMP, MLD, and MSDP protocol packets
    • Unknown IP multicast packets
  • Other features
    • DHCP packets
    • ARP and ND broadcast request packets as well as the ARP packets sent when dynamic ARP inspection (DAI) is configured on a Layer 2 switch
    • Layer 2 protocol packets forwarded through software by L2PT (Devices on two ends of a tunnel forward Layer 2 protocol packets through software, and intermediate devices forward these packets through hardware.)
    • In N:1 VLAN mapping, the first packet is sent to the CPU, and other packets are forwarded by hardware.

A switch uses QoS mechanisms to prioritize packets sent to the CPU and ensure that important packets are processed first. The switch groups the packets sent to the CPU into eight queues by priority. The types of packets sent to the CPU may vary by switch model. Table 1-4 and Figure 1-6 list typical packets that are sent to the CPU on the S5700LI. A larger queue ID indicates a higher priority.

Table 1-4 Queues for different packets sent to the CPU

| Queue ID | Packet Type | Description |
| --- | --- | --- |
| 7 | IPC, RPC, LACP | Internal management packets |
| 6 | VP (VP packets are the same as those in the original protocol packet queue in V200R003 and later versions.) | Internally forwarded protocol packets |
| 5 | Telnet, SSH, LNP, DHCP | Management plane protocol packets |
| 4 | ARP Request | Important control plane protocol packets |
| 3 | STP, SMLK, EOAM, VCMP | Important control plane protocol packets |
| 2 | LBDT, LLDP, DLDP, IGMP, ICMP, NTP, 802.1x, GVRP, L2PT, ARP Miss, FTP, SNMP | Control plane protocol packets |
| 1 | Other | - |
| 0 | sFlow, NetStream | Data packets or messages |

Figure 1-6 Allocating packets of different types to CPU queues

A switch determines into which CPU queues packets will be placed based on the packets' importance and plane (management, control, or forwarding plane). A CPU queue has a priority. For example, when Telnet management packets and Layer 2 protocol packets transparently forwarded through L2PT are sent to the CPU, the CPU first processes the Telnet management packets in queue 5. This mechanism ensures device stability and manageability under a heavy CPU load. The CPU can use a weighting mechanism to ensure that packets in low-priority queues can be processed. On a stable network, the number of packets sent to the CPU is limited within a specified range, and therefore the CPU usage remains within a proper range. If a large number of packets are sent to the CPU within a short period, the CPU is busy processing these packets, resulting in a high CPU usage.
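The queue assignment in Table 1-4 and the "higher queue first" rule can be sketched as follows. This is a simplified model under stated assumptions: it uses strict priority only and ignores the weighting mechanism mentioned above, it places any unlisted packet type in queue 1 ("Other"), and the function names are hypothetical.

```python
import heapq
from itertools import count

# Queue membership copied from Table 1-4 (S5700LI).
QUEUE_MEMBERS = {
    7: ["IPC", "RPC", "LACP"],
    6: ["VP"],
    5: ["Telnet", "SSH", "LNP", "DHCP"],
    4: ["ARP Request"],
    3: ["STP", "SMLK", "EOAM", "VCMP"],
    2: ["LBDT", "LLDP", "DLDP", "IGMP", "ICMP", "NTP", "802.1x",
        "GVRP", "L2PT", "ARP Miss", "FTP", "SNMP"],
    0: ["sFlow", "NetStream"],
}
QUEUE_OF = {ptype: q for q, members in QUEUE_MEMBERS.items() for ptype in members}


def queue_for(packet_type):
    """Queue ID for a packet type; unlisted types fall into queue 1 (Other)."""
    return QUEUE_OF.get(packet_type, 1)


def processing_order(arrivals):
    """Order in which a strict-priority CPU would process the arrivals."""
    seq = count()  # tie-breaker: keep arrival order within one queue
    heap = [(-queue_for(p), next(seq), p) for p in arrivals]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


order = processing_order(["L2PT", "Telnet", "sFlow", "LACP", "SNMP"])
print(order)  # ['LACP', 'Telnet', 'L2PT', 'SNMP', 'sFlow']
```

As in the example in the text, the Telnet packet in queue 5 is handled before the L2PT packet in queue 2 even though the L2PT packet arrived first.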

Impact of High CPU Usage

The CPU on a switch will be overloaded if the forwarding plane sends packets to the CPU at high speeds (for example, the CPU receives a large number of packets within a short time due to a loop on the network) or a task consumes CPU resources for a long time. When this occurs, the CPU may be unable to process other tasks in a timely manner, which may cause exceptions in services.

High CPU usage adversely affects the system processing capability and may result in the following network problems:

  • Nonresponse to management requests
    • Failure to set up a Telnet or SSH session with the switch (so the switch cannot be managed), slow response of the switch, or delayed command execution
    • SNMP timeout
    • Long delay or even timeout of MAC/IP ping operations
  • DHCP or 802.1X service failures caused by the switch's failure to forward or respond to requests from clients
  • Changes in the STP topology or even loops

    A switch maintains root and alternate ports based on the BPDUs periodically received on its CPU. If the upstream device cannot send BPDUs in a timely manner because its CPU is busy or the switch's CPU is too busy to process received BPDUs, the switch considers the original path to the root bridge to have failed and selects a new root port, causing network reconvergence. If the switch also has an alternate port, the switch uses the alternate port as the new root port. In this situation, a loop may occur on the network.

  • Changes in the routing topology

    Hello packets of dynamic routing protocols are processed by the CPU. If the CPU is too busy to process the received Hello packets or send Hello packets, route flapping occurs. For example, OSPF flapping, BGP flapping, or VRRP flapping may occur in this situation.

  • Flapping of reliability detection protocols

    The CPU is responsible for keepalive of detection protocols such as 802.3ah, 802.1ag, DLDP, BFD, and MPLS OAM. If a busy CPU cannot transmit or receive protocol packets promptly, protocol flapping occurs, which affects service traffic forwarding.

  • LACP Eth-Trunk link flapping

    LACP packets are processed by the CPU. If the CPU is too busy to receive and send LACP packets, the Eth-Trunk link will flap between Up and Down states.

  • Dropping of software-forwarded packets or increased delay in forwarding such packets
  • Increased memory usage on the switch

Normal High CPU Usage Situations

High CPU usage can cause service faults, for example, Border Gateway Protocol (BGP) route flapping, frequent Virtual Router Redundancy Protocol (VRRP) switchovers, or even user login failures. In some situations, however, high CPU usage does not affect the network. For example, when a switch is reading optical transceiver information or traffic is bursting, the CPU usage may increase sharply. This is normal and acceptable, so high CPU usage is not necessarily caused by a fault. If a switch cannot process services for a long time, check whether a fault has occurred.

High CPU usage resulting from the following events is normal. If the CPU usage returns to the normal range on its own, you do not need to perform any operations.

  • Traffic bursts.
  • A card starts.
  • The switch reads information about multiple optical transceivers simultaneously.
  • The switch is calculating the spanning tree.

    On a device running the Multiple Spanning Tree Protocol (MSTP), the CPU usage is proportional to the number of instances and active ports. On a device running VLAN-based Spanning Tree (VBST), each VLAN runs an independent instance. Therefore, VBST uses more CPU resources than MSTP when they have the same number of VLANs and active ports.

  • The switch updates the routing table on a large scale after receiving route update messages.

    When a switch receives a route update message, the switch updates routing information and delivers it to the control plane, which consumes CPU resources. In a cluster/stack system, the switch also needs to synchronize routing information to other member switches.

    During routing table update, the following factors affect the CPU usage:

    • Number of entries in the routing table
    • Update frequency
    • Number of routing processes receiving the update messages
    • Number of member switches in a cluster/stack
  • The switch is running the copy cfcard:/ command or outputting a large amount of debugging information.
  • The NMS performs operations on the switch frequently.
  • Other events
    • Fast MAC address learning on a port running the sticky MAC function
    • Many ports are added to many VLANs (For example, a user performs configuration in a port group to add many ports to many VLANs or change link types of the ports.)
    • The switch frequently receives a large number of IGMP request messages.
    • The switch processes a large number of concurrent DHCP requests (For example, a switch that functions as a DHCP server restores connections with a large number of users.)
    • ARP broadcast storm.
    • Ethernet broadcast storm.
    • Software forwarding of a large number of concurrent protocol packets (For example, L2PT transparently transmits a large number of BPDUs or the DHCP relay/snooping module forwards a large number of DHCP packets within a short time.)
    • A large number of data packets cannot be forwarded through the forwarding chip and are sent to the CPU (such as ARP Miss).
    • Ports alternate between Up and Down.

How to Locate the High CPU Usage Problem

  • When the network access speed of a user is slow or the video service is intermittently interrupted, determine whether the problem was caused by a high CPU usage according to Figure 1-7.
  • You can also check the CPU usage according to Figure 1-7 during routine operation.
Figure 1-7 Determining a high CPU usage

Checking the Switch and Version Information

Run the display version and display device commands to check the switch version and component types. Record the information for follow-up operations.

  1. Run the display version command to view the switch software version.

    # Run the display version command.

    <HUAWEI> display version
    Huawei Versatile Routing Platform Software
    VRP (R) software, Version 5.160 (S7700 V200R007C00)
    Copyright (C) 2000-2013 HUAWEI TECH CO., LTD
    Quidway S7703 Terabit Routing Switch uptime is 0 week, 0 day, 1 hour, 3 minutes
    BKP 0 version information:
    1. PCB      Version  : LE02BAKB VER.A
    2. Supporting PoE    : No
    3. Board    Type     : ES0B017712P0
    4. MPU Slot Quantity : 2
    5. LPU Slot Quantity : 3
    ……

    The VRP (R) software, Version 5.160 field indicates that this is an S7700 switch running V200R007.

  2. Run the display device command to check the switch model, whether the switch is in a cluster/stack, and LPUs (only on modular switches).

    # Run the display device command to check the component types and status.

    <HUAWEI> display device
    S7712's Device status:  
    Slot  Sub Type         Online    Power      Register       Status     Role       
    -------------------------------------------------------------------------------  
    6     -   ES0D0X4UXC00 Present   PowerOn    Registered     Normal     NA         
    8     -   ES0D0F48TC00 Present   PowerOn    Registered     Normal     NA         
    9     -   ES0D0G24SC00 Present   PowerOn    Registered     Normal     NA         
    10    -   -            Present   PowerOff   Unregistered   -          NA         
    14    -   ES0D00SRUA00 Present   PowerOn    Registered     Normal     Master     
    PWR1  -   -            Present   PowerOn    Registered     Normal     NA         
    CMU1  -   LE0DCMUA0000 Present   PowerOn    Registered     Normal     Master     
    FAN1  -   -            Present   PowerOn    Registered     Normal     NA         
    FAN2  -   -            Present   PowerOn    Registered     Normal     NA         
    FAN3  -   -            Present   PowerOn    Registered     Normal     NA         
    FAN4  -   -            Present   PowerOn    Registered     Normal     NA    

    The preceding information shows that this is a stand-alone S7712, with the ES0D00SRUA00 (MPU), LE0DCMUA0000 (CMU), and ES0D0X4UXC00/ES0D0F48TC00/ES0D0G24SC00 (LPUs) installed.
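When these checks are scripted, the two outputs above can be reduced to the fields that matter for follow-up. The sketch below assumes the output layout shown in the samples (it may differ on other models and versions), and the function names are hypothetical.

```python
import re

# Matches a line such as: VRP (R) software, Version 5.160 (S7700 V200R007C00)
VERSION_LINE = re.compile(
    r"VRP \(R\) software, Version (?P<vrp>[\d.]+) "
    r"\((?P<model>\S+) (?P<release>V\d+R\d+\w*)\)"
)


def parse_version(output):
    """Return VRP version, model, and release from display version output."""
    match = VERSION_LINE.search(output)
    return match.groupdict() if match else None


def unregistered_slots(output):
    """Return slot names whose Register column reads Unregistered."""
    slots = []
    for line in output.splitlines():
        fields = line.split()
        # Rows in the sample have 8 columns; Register is the sixth.
        if len(fields) == 8 and fields[5] == "Unregistered":
            slots.append(fields[0])
    return slots


VERSION_SAMPLE = """\
Huawei Versatile Routing Platform Software
VRP (R) software, Version 5.160 (S7700 V200R007C00)
"""
DEVICE_SAMPLE = """\
Slot  Sub Type         Online    Power      Register       Status     Role
8     -   ES0D0F48TC00 Present   PowerOn    Registered     Normal     NA
10    -   -            Present   PowerOff   Unregistered   -          NA
14    -   ES0D00SRUA00 Present   PowerOn    Registered     Normal     Master
"""

info = parse_version(VERSION_SAMPLE)
print(info)                               # {'vrp': '5.160', 'model': 'S7700', 'release': 'V200R007C00'}
print(unregistered_slots(DEVICE_SAMPLE))  # ['10']
```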

Checking the CPU Usage

Check the CPU usage as follows:
  • Run the display cpu-usage command to view the CPU usage.

    After several seconds, run the display cpu-usage command again to verify the CPU Usage field.

    NOTE:

    A switch is considered running normally if its long-term average CPU usage does not exceed 80% and its highest temporary CPU usage does not exceed 95%.

    | Command | Command Description for Modular Switches | Command Description for Fixed Switches |
    | --- | --- | --- |
    | display cpu-usage | Displays the CPU usage of the active MPU. NOTE: Generally, the CPU usage of a standby MPU will not be high, so it is not displayed. | Displays the CPU usage of the switch. |
    | display cpu-usage slot slot-id | Non-cluster: displays the CPU usage of the specified interface card. Cluster: displays the CPU usage of the cluster. | Non-stack: displays the CPU usage of the switch when the slot-id value is 0. Stack: displays the CPU usage of the switch specified by slot-id. |

    # Check the CPU usage of a non-cluster modular switch.

    <HUAWEI> display cpu-usage
    CPU Usage Stat. Cycle: 10 (Second)
    CPU Usage         : 88% Max: 92%
    CPU Usage Stat. Time : 2010-12-18  15:35:56
    CPU utilization for five seconds: 68%: one minute: 60%: five minutes: 55%.
    Max CPU Usage Stat. Time : 2015-01-27 10:08:10. 
    
    TaskName        CPU  Runtime(CPU Tick High/Tick Low)  Task Explanation           
    VIDL                 82%         8/ 4c8b1ff       DOPRA IDLE                     
    OS                   12%         1/2c684bff       Operation System  
    ...

    The preceding information shows that the CPU usage of the switch reaches 88%.

    Follow-up: Identify the tasks with high CPU usage and focus on the top 3 tasks (in V200R005 and later versions, the tasks are listed in descending order of CPU usage). For details, see Determining Fault Causes According to CPU Usages of Tasks (Modular Switches) and Determining Fault Causes According to CPU Usages of Tasks (Fixed Switches).

  • Check whether related alarms have been reported on the NMS.

    When a switch connects to an NMS, check whether there is a high CPU usage alarm on the NMS.

    When the CPU usage exceeds the alarm threshold (80% by default; set with the set cpu-usage threshold command in the system view), the switch reports the following alarms to the NMS. Use the alarm messages to obtain details about the high CPU usage.

    • hwCPUUtilizationRising
    • hwCPUUtilizationRisingAlarm

    For details about the alarms, see Alarm Information.

  • Check whether the log records a high CPU usage.

    View the system log files or run the display logbuffer command to check whether the system has recorded logs about high CPU usage.

    The system log may include the current or historical high CPU usage records.

    Related log: VOSCPU/4/CPU_USAGE_HIGH. For details about this log, see Log Information.
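The checks above can be combined in a small script that reads the header of the display cpu-usage output and compares it with the figures given in this section. This is a sketch under assumptions: it treats the five-minute utilization as the long-term average and the Max field as the highest temporary usage, which is one plausible reading of the NOTE rather than a documented rule, and the function name is hypothetical.

```python
import re


def summarize_cpu_usage(output, avg_limit=80, peak_limit=95, alarm_threshold=80):
    """Extract usage figures and compare them with the limits in this section."""
    usage = re.search(r"CPU Usage\s*:\s*(\d+)%\s*Max:\s*(\d+)%", output)
    five_min = re.search(r"five minutes:\s*(\d+)%", output)
    if usage is None or five_min is None:
        raise ValueError("unrecognized display cpu-usage output")
    current, peak = int(usage.group(1)), int(usage.group(2))
    average = int(five_min.group(1))
    return {
        "current": current,
        "max": peak,
        "five_minutes": average,
        "average_ok": average <= avg_limit,        # assumption: 5-minute value as average
        "peak_ok": peak <= peak_limit,             # assumption: Max field as peak
        "above_alarm_threshold": current > alarm_threshold,
    }


SAMPLE = """\
CPU Usage Stat. Cycle: 10 (Second)
CPU Usage         : 88% Max: 92%
CPU Usage Stat. Time : 2010-12-18  15:35:56
CPU utilization for five seconds: 68%: one minute: 60%: five minutes: 55%.
"""

report = summarize_cpu_usage(SAMPLE)
print(report)
```

For the sample output, the current usage of 88% is above the default 80% alarm threshold, while the five-minute figure and the maximum stay within the 80% and 95% limits.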

Determining Fault Causes According to CPU Usages of Tasks (Modular Switches)

Run the display cpu-usage command to view the top 3 tasks occupying high CPU usage (in V200R005 and later versions, the tasks are listed in a descending order of CPU usage).

Use Table 1-5 to identify the cause of the high CPU usage and the corresponding solution.
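Picking the top tasks out of the display cpu-usage output can also be scripted. The sketch below assumes the task row layout shown in the earlier sample output and skips the VIDL task, on the assumption that it is the idle task (its explanation reads DOPRA IDLE). The function name and the regular expression are illustrative.

```python
import re

# Task row: name, CPU percentage, runtime as hex "high/low" ticks, explanation.
TASK_ROW = re.compile(
    r"^\s*(?P<name>\S+)\s+(?P<cpu>\d+)%\s+"
    r"[0-9a-fA-F]+/\s*[0-9a-fA-F]+\s+(?P<desc>.+?)\s*$"
)


def top_tasks(output, n=3, idle_names=("VIDL",)):
    """Return up to n (task name, CPU %) pairs, highest usage first."""
    tasks = []
    for line in output.splitlines():
        match = TASK_ROW.match(line)
        if match and match.group("name") not in idle_names:
            tasks.append((match.group("name"), int(match.group("cpu"))))
    # Sort explicitly so the result does not depend on the software version.
    tasks.sort(key=lambda task: task[1], reverse=True)
    return tasks[:n]


SAMPLE = """\
TaskName        CPU  Runtime(CPU Tick High/Tick Low)  Task Explanation
VIDL                 82%         8/ 4c8b1ff       DOPRA IDLE
OS                   12%         1/2c684bff       Operation System
"""

busiest = top_tasks(SAMPLE)
print(busiest)  # [('OS', 12)]
```

The task names returned by such a script can then be looked up in Table 1-5.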

Table 1-5 Common tasks with high CPU usages and solutions

Task Name

Description

Reason for High CPU Usage

Solution

AGNT

Implements the IPv4 SNMP protocol stack and processes SNMP connection between the NMS and switch.

NMS operations are frequently performed.

Work out a solution based on the network management events. Lower the rate at which the NMS sends requests or block the requests from the NMS.

AGT6

Implements the IPv6 SNMP protocol stack and processes SNMP connection between the NMS and switch.

ARP

Implements the ARP protocol stack, manages the ARP state machine, and maintains the ARP database.

  • The CAR for packets sent to the CPU is too large, and a large number of ARP packets are received.
  • The aging time is too short.

Adjust the CAR for packets sent to the CPU and aging time.

bcmRx/bcmT/FTS/FBUF/VP/VPR/VPS/SOCK/ARPA

Packet receiving/sending task

When many protocol packets are sent to the CPU, the CPU usage of this task significantly increases.

This is a major cause for high system CPU usage.

The reasons why many protocol packets are sent to the CPU include:

  • The CPU is attacked.
  • A network loop occurs.
  • Service traffic is heavy.

  1. For details, see Checking Whether the Problem Is Caused by a Network Attack and Checking Whether the Problem Is Caused by Network Loop.
  2. Confirm with Huawei switch resellers whether service traffic is heavy and ask for help.

bcmDPC

Reports interrupts when chip failures occur.

  • There are unrecoverable soft failure entries on cards, and interrupts are not suppressed.
  • A large number of TC packets are received. As a result, MAC address entries are frequently deleted.

  • Upgrade the patch and restart the device.
  • Solve the problem of TC packets.

bcmL2MOD.0

Chip 0 MAC address entry learning task

MAC address flapping or a hash conflict occurs.

bcmL2MOD.2

Chip 2 MAC address entry learning task

bmLINK.0

Chip 0 linkscan task, which scans interface status and notifies the application modules of interface status changes

A large number of link interruptions are reported, or MIIM access is time-consuming. Link interruptions are caused by loss of signal (LOS) on optical modules. Non-certified (non-standard) optical modules and optical module failures lead to many abnormal interruptions.

Replace the optical modules with Huawei-certified optical modules.

bmLINK.1

Chip 1 linkscan task, which scans interface status and notifies the application modules of interface status changes

bmLINK.2

Chip 2 linkscan task, which scans interface status and notifies the application modules of interface status changes

CFM

Configuration management task, which restores MPU configuration and interface configuration

Configurations are restored.

No action is required.

CWP_CWP

Distributes CAPWAP services, receives and distributes CAPWAP packets.

High CPU usage occurs during message queue maintenance, packet distribution and statistics collection, or CAPWAP timer processing (retransmission, fragmentation, reassembly, and state machine), or when a large number of packets exist, traffic is sent continuously, or an attack occurs.

Decrease the service concurrency rate, and expand the system capacity or use high-performance main control units such as SRUH.

CWP_FWD

Creates CAPWAP socket, receives and sends socket packets, and rapidly receives and sends packets.

Traffic is continuously sent when there are a large number of CAPWAP control packets, or a CAPWAP attack exists.

When more than 20 users connect to the switch concurrently, it is normal that the CPU usage of this task is within 15%. You can only expand the capacity to solve the problem.

DEV/HOTT/FMCK/SRMI

Device management task

  • During configuration restoration, active/standby switchover, and card installation stage, the CPU usage may temporarily increase. This is a normal condition.
  • A large number of interrupts are reported when some hardware components fail. In this situation, the CPU usage of this task may increase.

Confirm with Huawei switch resellers whether the hardware is faulty.

For details, see Checking Whether the Problem Is Caused by a Hardware Failure.

DHCP

Implements the DHCP protocol stack and provides the functions such as DHCP snooping and DHCP relay.

The CPU experiences a DHCP attack.

For details, see Checking Whether the Problem Is Caused by a Network Attack.

FIB

Generates IPv4 software forwarding entries on the MPU and delivers the entries to the interface card to guide data forwarding.

A large number of routes are delivered, or routes flap continuously.

No action is required.

FIB6

Manages IPv6 FIB entries, maintains software entries, and requests the hardware adaptation layer to maintain chip entries.

FMAT

Trap management task, which processes the traps generated by all services

A large number of traps are generated. For example, a large number of interfaces alternate between Up and Down states.

The high CPU usage problem is automatically solved when the number of generated traps is stable.

FTPS

Provides the FTP server service and the FC0 and FC1 services.

The CPU usage of the FC task becomes high while large files are being transferred, especially when multiple large files are transferred concurrently.

The high CPU usage problem is automatically solved after file transfer ends. To prevent this problem, minimize concurrent transfer of multiple large files.

HTTP

Processes HTTP packets.

The CPU usage becomes high when a large number of external HTTP packets are being processed, for example, web operations are frequently performed.

Reduce the frequency of packet sending triggered by external operations.

INFO

Information center main task, which receives and outputs the logs, alarms, and debugging information generated by service modules

Logs and debugging information are generated frequently. Frequent file writes to the CF card can also cause high CPU usage because of the poor performance of the CF card.

Reduce the frequency of operations that trigger logs and debugging information.

IP

Schedules IP protocol tasks in a unified manner.

A large number of IPv6 packets are received and sent.

Reduce the number of received and sent IPv6 packets by, for example, adjusting the CPCAR.

L2MC

Adapts to and delivers Layer 2 multicast entries. This task is the multicast product LPU adaptation task.

Layer 2 multicast entries are repeatedly updated due to ring network or port flapping.

Check whether ring network or port flapping occurs.

LDP

Implements the LDP protocol stack and maintains LSP databases.

Route flapping occurs.

Prevent session flapping caused by route flapping.

MCSW

Multicast product adaptation task, which processes received and sent multicast packets and delivers Layer 3 multicast entries

  • The switch receives a large number of multicast packets.
  • Multicast entries are repeatedly updated due to route or port flapping.

  • Check whether there are multicast attack packets.
  • Check whether route or port flapping occurs.

MFIB

Manages Layer 3 multicast forwarding entries.

A large number of data/registration packet entries are received, and interfaces frequently flap.

Configure a policy to filter data, find the cause for flapping, and eliminate the flapping.

MPSI

MPLS service LPU adaptation task

  • A large number of LSPs are updated.
  • A large number of L2VPN services are configured, added, or deleted.

Check port flapping and protocol status.

MPSM

MPLS service MPU adaptation task

PAT

Manages patch operations, for example, load, activate, run, and delete patches.

Patches are loaded to the standby MPU and LPUs.

The CPU usage of the PAT task increases for a while when patches are being loaded. No workaround is currently available. To prevent patch loading from affecting services, do not perform batch service operations during patch loading.

PM

Performance management task, which processes performance statistics data and PM configuration commands

When there are many PM configurations (a large amount of statistics data), performance data collection and processing are triggered.

  • Reduce the frequency at which performance statistics are collected.
  • Configure different statistics collection intervals for different statistics collection tasks.

RSVP

Implements the RSVP protocol stack and maintains the CR-LSP database.

RSVP LSP flaps or a large number of RSVP packets are sent and received.

RSVP LSP flapping is often caused by link or IGP flapping. You need to eliminate link or IGP flapping. If a large number of RSVP packets are sent and received, check whether there are invalid RSVP packets.

SFPM

Queries manufacturer information and digital diagnostic information of optical modules.

There are non-certified optical modules on the switch, causing I2C failures.

Replace non-certified optical modules with certified ones.

SNPG

Layer 2 multicast protocol stack task, which processes received and sent Layer 2 multicast packets and delivers Layer 2 multicast entries.

  • The switch receives a large number of Layer 2 multicast packets.
  • Layer 2 multicast entries are repeatedly updated due to ring network or port flapping.

  • Check whether there are a large number of Layer 2 multicast attack packets.
  • Check whether ring network or port flapping occurs.

VIDL

Collects statistics on CPU usage of idle tasks.

This task represents idle time, so a larger value for this task indicates lower CPU usage.

The system calculates the CPU usage for this task based on the duration in which the task occupies CPU resources. Therefore, no action is required.

VT0

Authenticates the user with the user ID 0 and processes commands.

User operations, especially input and output operations, are performed frequently. For example, commands are pasted into the terminal (input) or a large number of display commands are executed (output).

Reduce the frequency at which input and output operations are performed. This problem is automatically solved after the operations end.

VT1

Authenticates the user with the user ID 1 and processes commands.

VT2

Authenticates the user with the user ID 2 and processes commands.

VTYD

Processes login requests of all users.

A large number of user input operations are performed, for example, commands are pasted into the terminal.

Reduce the frequency at which input operations are performed.

WMT_DEV

Device management task:

  • Periodically checks APs.
  • Processes AP-pings.
  • Periodically synchronizes messages between mobility groups.
  • Processes MAP messages.
  • Processes CAPWAP messages.
  • Processes messages of the DEV module.
  • Handles state transitions during AP login and maintains the state machine (including upgrade processing), and handles batch AP login/logout, AP upgrades, and information periodically reported by radios.

During batch AP login/logout, upgrade, radio calibration, or terminal location, a large number of messages from APs are processed concurrently.

Set the interval for scanning the air interface to a larger value and check whether APs frequently go offline.

WMT_SEC

User management task:

  • Processes user login/logout and roaming.
  • Processes the user key negotiation procedure.

More than 20 users connect to the switch or are roaming simultaneously.

When more than 20 users connect to the switch simultaneously, the CPU usage of this task is about 15%, which is spent on user access, authentication, and roaming. Lowering it requires capacity expansion.

WT0

Web service processing task, which processes requests of all web users

Operations are frequently performed on the web platform.

Reduce the frequency at which web operations are performed.

WT1

WT2

UCM/SAM

Processes user login/logout and permission control.

The number of concurrent users is large, or the users go online and offline frequently.

Check whether a large number of users go online and offline and whether the authentication configuration is changed.

If the top tasks on your switch are not included in the preceding table, see CPU-related Tasks and Functions for Modular Switches to find out which services caused the high CPU usage.

If the top tasks on your switch are not included in the preceding table or CPU-related Tasks and Functions for Modular Switches, contact Huawei switch resellers.

The preceding table is only a reference for you to locate a high CPU usage problem. To fix the problem, see How to Fix the High CPU Usage Problem.

Determining Fault Causes According to CPU Usages of Tasks (Fixed Switches)

Run the display cpu-usage command to view the three tasks with the highest CPU usage (in V200R005 and later versions, the tasks are listed in descending order of CPU usage).

Use Table 1-6 to identify why the CPU usage is high and how to resolve it.

Table 1-6 Common tasks with high CPU usages and solutions

Task Name

Description

Reason for High CPU Usage

Solution

VIDL

Collects statistics on CPU usage of idle tasks.

This task represents idle time, so a larger value for this task indicates lower CPU usage.

The system calculates the CPU usage for this task based on the duration in which the task occupies CPU resources. Therefore, no action is required.

bmLINK.0

Linkscan task, which scans interface status and notifies the application modules of interface status changes

A large number of link interruptions are reported, or MIIM access is time-consuming. Link interruptions are caused by loss of signal (LOS) on optical modules. Non-certified or faulty optical modules can generate many abnormal interruptions.

Replace the optical modules with Huawei-certified optical modules.

linkscan

AGNT

Implements the IPv4 SNMP protocol stack and processes SNMP connection between the NMS and switch.

NMS operations are frequently performed.

Analyze network management events and reduce the NMS request rate or block NMS requests if necessary.

AGT6

Implements the IPv6 SNMP protocol stack and processes SNMP connection between the NMS and switch.

ARP

Implements the ARP protocol stack, manages the ARP state machine, and maintains the ARP database.

  • The CAR for packets sent to the CPU is too large, and a large number of ARP packets are received.
  • The aging time is too short.

Adjust the CAR for packets sent to the CPU and aging time.

CFM

Configuration management task, which restores MPU configuration and interface configuration

Configurations are restored.

No action is required.

CWP_CWP

Distributes CAPWAP services, receives and distributes CAPWAP packets.

High CPU usage occurs during message queue maintenance, packet distribution and statistics collection, or CAPWAP timer processing (retransmission, fragmentation, reassembly, and state machine), or when a large number of packets exist, traffic is sent continuously, or an attack occurs.

Decrease the service concurrency rate, and expand the system capacity or use high-performance MPUs such as SRUH.

DEV/HOTT/FMCK/SRMI

Device management task

  • During configuration restoration, an active/standby switchover, or card installation, the CPU usage may temporarily increase. This is normal.
  • A large number of interrupts are reported when some hardware components fail. In this situation, the CPU usage of this task may increase.

Confirm with Huawei switch resellers whether the hardware is faulty. For details, see Checking Whether the Problem Is Caused by a Hardware Failure.

CWP_FWD

Creates the CAPWAP socket, sends and receives socket packets, and provides fast packet sending and receiving.

Traffic is continuously sent when there are a large number of CAPWAP control packets, or a CAPWAP attack exists.

When more than 20 users connect to the switch concurrently, a CPU usage of up to 15% for this task is normal. The only way to lower it is to expand the capacity.

DHCP

Implements the DHCP protocol stack and provides the functions such as DHCP snooping and DHCP relay.

The CPU experiences a DHCP attack.

For details, see Checking Whether the Problem Is Caused by a Network Attack.

ETHA

Ethernet packet distribution and processing task

A large number of protocol packets are sent to the CPU.

Configure the rate limit of protocol packets properly and deploy the attack defense function.

EpldIntTask

Processes CPLD interrupts.

When many CPLD interrupts are generated, the workload of processing these interrupts becomes heavy, and the CPU usage of this task becomes high.

Check whether many CPLD interrupts are generated.

FIB

Generates IPv4 software forwarding entries on the MPU and delivers the entries to the interface card to guide data forwarding.

A large number of routes are delivered, or routes flap continuously.

-

FIB6

Manages IPv6 FIB entries, maintains software entries, and requests the hardware adaptation layer to maintain chip entries.

FMAT

Trap management task, which processes the traps generated by all services

A large number of traps are generated. For example, a large number of interfaces alternate between Up and Down states.

The high CPU usage problem is automatically solved when the number of generated traps is stable.

FTPS

Provides the FTP server service and the FC0 and FC1 services.

The CPU usage of the FC task becomes high while large files are being transferred, especially when multiple large files are transferred concurrently.

The high CPU usage problem is automatically solved after file transfer ends. To prevent this problem, minimize concurrent transfer of multiple large files.

FTS

Upper-layer packet sending and receiving task

When many protocol packets are sent to the CPU, the CPU usage of this task significantly increases.

This is a major cause of high system CPU usage.

Many protocol packets may be sent to the CPU for the following reasons:

  • The CPU is attacked.
  • A network loop occurs.
  • Service traffic is heavy.

  1. For details, see Checking Whether the Problem Is Caused by a Network Attack and Checking Whether the Problem Is Caused by Network Loop.
  2. Confirm with Huawei switch resellers whether service traffic is heavy and ask for help.

HTTP

Processes HTTP packets.

The CPU usage becomes high when a large number of external HTTP packets are being processed, for example, web operations are frequently performed.

Reduce the frequency of packet sending triggered by external operations.

INFO

Information center main task, which receives and outputs the logs, alarms, and debugging information generated by service modules

Logs and debugging information are generated frequently. Frequent file writes to the CF card can also cause high CPU usage because of the poor performance of the CF card.

Reduce the frequency of operations that trigger logs and debugging information.

INT

Processes CPLD interrupts sent by the kernel.

When many CPLD interrupts are generated, the workload of processing these interrupts becomes heavy, and the CPU usage of this task becomes high.

Check whether many CPLD interrupts are generated.

LDP

Implements the LDP protocol stack and maintains LSP databases.

Route flapping occurs.

Prevent session flapping caused by route flapping.

MCSW

Multicast product adaptation task, which processes received and sent multicast packets and delivers Layer 3 multicast entries

  • The switch receives a large number of multicast packets.
  • Multicast entries are repeatedly updated due to route or port flapping.

  • Check whether there are multicast attack packets.
  • Check whether route or port flapping occurs.

MFIB

Manages Layer 3 multicast forwarding entries.

A large number of data/registration packet entries are received, and interfaces frequently flap.

Configure a policy to filter data, find the cause for flapping, and eliminate the flapping.

MPSI

MPLS service LPU adaptation task

  • A large number of LSPs are updated.
  • A large number of L2VPN services are configured, added, or deleted.

Check port flapping and protocol status.

MPSM

MPLS service MPU adaptation task

PAT

Manages patch operations, for example, load, activate, run, and delete patches.

Patches are loaded to the standby MPU and LPUs.

The CPU usage of the PAT task increases for a while when patches are being loaded. No workaround is currently available. To prevent patch loading from affecting services, do not perform batch service operations during patch loading.

PM

Performance management task, which processes performance statistics data and PM configuration commands

When there are many PM configurations (a large amount of statistics data), performance data collection and processing are triggered.

  • Reduce the frequency at which performance statistics are collected.
  • Configure different statistics collection intervals for different statistics collection tasks.

SFPT

Fixed switches' optical module processing task

There are non-certified optical modules on the switch, causing I2C failures.

Replace non-certified optical modules with certified ones.

SNPG

Layer 2 multicast protocol stack task, which processes received and sent Layer 2 multicast packets and delivers Layer 2 multicast entries.

  • The switch receives a large number of Layer 2 multicast packets.
  • Layer 2 multicast entries are repeatedly updated due to ring network or port flapping.

  • Check whether there are a large number of Layer 2 multicast attack packets.
  • Check whether ring network or port flapping occurs.

SOCK

Schedules and processes IP packets.

When many protocol packets are sent to the CPU, the CPU usage of this task significantly increases.

This is a major cause of high system CPU usage.

Many protocol packets may be sent to the CPU for the following reasons:

  • The CPU is attacked.
  • A network loop occurs.
  • Service traffic is heavy.

  1. For details, see Checking Whether the Problem Is Caused by a Network Attack and Checking Whether the Problem Is Caused by Network Loop.
  2. Confirm with Huawei switch resellers whether service traffic is heavy and ask for help.

VT0

Authenticates the user with the user ID 0 and processes commands.

User operations, especially input and output operations, are performed frequently. For example, commands are pasted into the terminal (input) or a large number of display commands are executed (output).

Reduce the frequency at which input and output operations are performed. This problem is automatically solved after the operations end.

VTYD

VTY daemon process, which handles all user login requests

A large number of user input operations are performed, for example, commands are pasted into the terminal.

Reduce the frequency at which input operations are performed.

WT0

Web service processing task, which processes requests of all web users

Operations are frequently performed on the web platform.

Reduce the frequency at which web operations are performed.

bcmDPC

Reports interrupts when chip failures occur.

  • There are unrecoverable soft failure entries on the device, and interrupts are not suppressed.
  • A large number of TC packets are received. As a result, MAC address entries are frequently deleted.

  • Upgrade the patch and restart the device.
  • Solve the problem of TC packets.

bcmL2MOD.0

Chip 0 MAC address entry learning task

MAC address flapping or a hash conflict occurs.

l2au

MAC learning task

MAC address flapping or a hash conflict occurs.

-

l2sy

MAC synchronization task

WMT_DEV

Device management task:

  • Periodically checks APs.
  • Processes AP-pings.
  • Periodically synchronizes messages between mobility groups.
  • Processes MAP messages.
  • Processes CAPWAP messages.
  • Processes messages of the DEV module.
  • Handles state transitions during AP login and maintains the state machine (including upgrade processing), and handles batch AP login/logout, AP upgrades, and information periodically reported by radios.

During batch AP login/logout, upgrade, radio calibration, or terminal location, a large number of messages from APs are processed concurrently.

Set the interval for scanning the air interface to a larger value and check whether APs frequently go offline.

WMT_SEC

User management task:

  • Processes user login/logout and roaming.
  • Processes the user key negotiation procedure.

More than 20 users connect to the switch or are roaming simultaneously.

When more than 20 users connect to the switch simultaneously, the CPU usage of this task is about 15%, which is spent on user access, authentication, and roaming. Lowering it requires capacity expansion.

UCM/SAM

Processes user login/logout and permission control.

The number of concurrent users is large, or the users go online and offline frequently.

Check whether a large number of users go online and offline and whether the authentication configuration is changed.

If the top tasks on your switch are not included in the preceding table, see CPU-related Tasks and Functions for Fixed Switches to find out which services caused the high CPU usage.

If the top tasks on your switch are not included in the preceding table or CPU-related Tasks and Functions for Fixed Switches, contact Huawei switch resellers.

The preceding table is only a reference for you to locate a high CPU usage problem. To fix the problem, see How to Fix the High CPU Usage Problem.
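
The lookup that Table 1-6 supports can be sketched in a few lines of Python. This is only an illustration: the input is assumed to be a list of (task name, CPU percentage) pairs that you have already extracted from the display cpu-usage output, the function names are hypothetical, and only a handful of rows from the table are reproduced.

```python
# Follow-up action per task, copied from the Solution column of Table 1-6.
# Only a few rows are reproduced here; the table remains the reference.
NEXT_STEP = {
    "DHCP": "Check whether the problem is caused by a network attack",
    "FTS": "Check for a network attack or a network loop",
    "SOCK": "Check for a network attack or a network loop",
    "DEV": "Check whether the problem is caused by a hardware failure",
    "SFPT": "Replace non-certified optical modules with certified ones",
    "AGNT": "Reduce the NMS request rate",
}
FALLBACK = "See CPU-related Tasks and Functions for Fixed Switches"


def top_tasks(samples, count=3):
    """Return the busiest tasks, ignoring VIDL because it represents idle time."""
    busy = [(name, pct) for name, pct in samples if name != "VIDL"]
    return sorted(busy, key=lambda item: item[1], reverse=True)[:count]


def advise(samples):
    """Pair each top task with the follow-up action from the lookup."""
    return [(name, pct, NEXT_STEP.get(name, FALLBACK))
            for name, pct in top_tasks(samples)]


if __name__ == "__main__":
    # Hypothetical readings, not taken from a real device.
    readings = [("VIDL", 62), ("SOCK", 20), ("FTS", 9), ("INFO", 2), ("DHCP", 5)]
    for name, pct, action in advise(readings):
        print(f"{name:6} {pct:3}%  {action}")
```

Excluding VIDL first matters because a large VIDL value means the CPU is mostly idle and would otherwise always rank at the top.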

How to Fix the High CPU Usage Problem

After determining the top tasks and reasons, analyze the root causes and take troubleshooting measures.

Checking Whether the Problem Is Caused by a Hardware Failure

If you determine that the problem is caused by a hardware failure according to Determining Fault Causes According to CPU Usages of Tasks (Modular Switches) or Determining Fault Causes According to CPU Usages of Tasks (Fixed Switches) (the DEV, HOTT, FMCK, or SRMI task has a high CPU usage), contact Huawei switch resellers for help.

NOTE:

If services are affected, reset the card that causes high CPU usage (powering off the card is recommended) to recover services temporarily.

Checking Whether the Problem Is Caused by a Network Attack

In some situations, network attacks may cause high CPU usage. Network attacks are initiated by hosts or network devices by sending a large number of forged packets to switches, affecting security and services on the target switches. When a network attack occurs, the switch is busy with the requests from the attack source. Therefore, some tasks occupy many CPU resources, causing a high CPU usage on the switch.

Common Network Attacks

Common network attacks, such as ARP, ARP Miss, and DHCP attacks, can cause a high CPU usage on a switch. These attacks are all initiated by sending a large number of protocol packets; therefore, packet statistics on the switch show a large number of packets sent to the CPU.

  • ARP and ARP-Miss attacks
    • ARP and ARP Miss flood
    • ARP spoofing
  • DHCP protocol packet attack
  • Other attacks
    • ICMP attack
    • DDoS attack
    • Broadcast attack
    • TTL expiry attack
    • IP packet attack initiated using the device's IP address as the destination IP address
    • SSH/FTP/Telnet attacks

Network Attack Locating

  1. Run the display version and display device commands to check the switch version and component types. Record the information for follow-up operations.
  2. Run the display cpu-defend statistics command to view statistics about the packets sent to the CPU, and determine whether too many protocol packets are discarded due to timeout.

    1. Run the reset cpu-defend statistics command to clear statistics about the packets sent to the CPU.
    2. After several seconds, run the display cpu-defend statistics command to view statistics about the packets sent to the CPU.

      If one protocol accounts for an unusually large number of packets, judge from the networking whether that volume is expected. If it is not, the switch is probably under a protocol packet attack.

      <HUAWEI> reset cpu-defend statistics
      <HUAWEI> display cpu-defend statistics all
      Statistics on slot 2:
      -----------------------------------------------------------------------------------------------------------
      Packet Type         Pass(Bytes)  Drop(Bytes)   Pass(Packets)   Drop(Packets)
      -----------------------------------------------------------------------------------------------------------
      arp-miss            0            0             0               0
      arp-request         40800        35768         600             52600
      bgp                 0            0             0               0
      ...
      -----------------------------------------------------------------------------------------------------------

      The preceding output shows that the switch has discarded many ARP request packets. If that volume is not expected on this network, the switch is probably under an ARP attack.

  3. Configure the attack source tracing function to find out the attack source.

    If a CPU is busy with many valid or attack packets, services may be interrupted. The switch provides the local attack defense function to protect the CPU. Local attack defense policies include attack source tracing, port attack defense, CPCAR, and blacklist. For details about local attack defense, see Local Attack Defense Policy.
    1. Create a local attack defense policy based on attack source tracing.
      1. Create an ACL and add the gateway IP address to the whitelist of attack source tracing.
        <HUAWEI> system-view
        [HUAWEI] acl number 2000 
        [HUAWEI-acl-basic-2000] rule 5 permit source 10.1.1.1 0  //10.1.1.1 is the gateway IP address.
        [HUAWEI-acl-basic-2000] quit
      2. Create a local attack defense policy based on attack source tracing.
        [HUAWEI] cpu-defend policy policy1
        [HUAWEI-cpu-defend-policy-policy1] auto-defend enable  //Enable attack source tracing. By default, this function is disabled.
        [HUAWEI-cpu-defend-policy-policy1] undo auto-defend trace-type source-portvlan  //Set the attack source tracing mode to MAC + IP based. By default, attack source tracing is based on source MAC address, source IP address, source interface, and VLAN ID. To delete a mode that is not needed, run the undo auto-defend trace-type command.
        [HUAWEI-cpu-defend-policy-policy1] undo auto-defend protocol 8021x dhcp icmp igmp tcp telnet ttl-expired udp  //Delete the types of traced packets. By default, the types include 802.1X, ARP, DHCP, ICMP, IGMP, TCP, Telnet, TTL expiry, and UDP.
        [HUAWEI-cpu-defend-policy-policy1] auto-defend whitelist 1 acl 2000  //Add the gateway IP address to a whitelist.
        [HUAWEI-cpu-defend-policy-policy1] quit 
        In V200R009 and later versions, the attack source tracing configuration model is redesigned: attack source tracing is enabled by default, and the protocols specified for source tracing overwrite the previous list, which matches how the command is normally used.
        [HUAWEI] cpu-defend policy policy1
        [HUAWEI-cpu-defend-policy-policy1] auto-defend protocol arp //Trace only the source of ARP packets. By default, attack source tracing supports the following types of packets: 802.1X, ARP, DHCP, ICMP, IGMP, TCP, Telnet, TTL expiry, and UDP. In V200R010, the IPv6 packet types DHCPv6, ND, ICMPv6, and MLD are also supported.
        [HUAWEI-cpu-defend-policy-policy1] auto-defend whitelist 1 acl 2000  //Add the gateway IP address to a whitelist.
        [HUAWEI-cpu-defend-policy-policy1] quit 
    2. Apply the local attack defense policy.
      • Modular switches

        Both MPUs and LPUs have their own CPUs. Local attack defense policies are configured separately for MPUs and LPUs.

        Before creating and applying attack defense policies, check attack information on the MPUs and LPUs. If the attack information on the MPUs and LPUs is consistent, apply the same attack defense policy to the MPUs and LPUs; otherwise, apply different policies to them.

        1. Apply an attack defense policy to an MPU.
          <HUAWEI> system-view
          [HUAWEI] cpu-defend-policy policy1 
          [HUAWEI] quit
        2. Apply an attack defense policy to an LPU.
          NOTE:

          If an attack defense policy has been applied to all LPUs, it cannot be applied to a specified LPU. Similarly, if an attack defense policy has been applied to a specified LPU, it cannot be applied to all LPUs.

          • If all LPUs process similar services, apply an attack defense policy to all LPUs.
            <HUAWEI> system-view
            [HUAWEI] cpu-defend-policy policy2 global 
          • If LPUs process different services, apply an attack defense policy to a given LPU.
            <HUAWEI> system-view
            [HUAWEI] slot 1
            [HUAWEI-slot-1] cpu-defend-policy policy2
      • Fixed switches
        • Apply an attack defense policy to a stand-alone switch.
          <HUAWEI> system-view
          [HUAWEI] cpu-defend-policy policy1 global 
        • In a stack:
          • Apply an attack defense policy to the master switch.
            <HUAWEI> system-view
            [HUAWEI] cpu-defend-policy policy1 
          • Apply an attack defense policy to all stacked switches.
            <HUAWEI> system-view
            [HUAWEI] cpu-defend-policy policy1 global 
    3. View attack source information.

      After configuring local attack defense based on attack source tracing, run the display auto-defend attack-source and display auto-defend attack-source slot slot-id commands to view attack source information.

      NOTE:

      The MAC address of the gateway should be excluded from the suspicious attack sources.
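
The statistics check in step 2 can be partly automated when the output of display cpu-defend statistics is captured to a text file. The following Python sketch assumes each data row has the layout shown in the sample output above (a packet type followed by four integer counters). The function names and the threshold of 1000 dropped packets are illustrative assumptions, not values recommended by Huawei. If the capture covers several slots, this sketch keeps only the last row seen for each packet type.

```python
import re

# One data row: packet type followed by four integer counters.
ROW = re.compile(r"^\s*(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def parse_cpu_defend(text):
    """Return {packet_type: (pass_bytes, drop_bytes, pass_pkts, drop_pkts)}."""
    stats = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            stats[match.group(1)] = tuple(int(v) for v in match.groups()[1:])
    return stats


def dropped_types(stats, min_drop_packets=1000):
    """List packet types whose dropped-packet count reaches the threshold, largest first."""
    hits = [(name, c[3]) for name, c in stats.items() if c[3] >= min_drop_packets]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Rows copied from the sample output in this section.
SAMPLE = """\
Statistics on slot 2:
Packet Type         Pass(Bytes)  Drop(Bytes)   Pass(Packets)   Drop(Packets)
arp-miss            0            0             0               0
arp-request         40800        35768         600             52600
bgp                 0            0             0               0
"""

if __name__ == "__main__":
    for name, drops in dropped_types(parse_cpu_defend(SAMPLE)):
        print(f"{name}: {drops} packets dropped")
```

A high drop count only shows which protocol to look at first. Whether the volume is abnormal still depends on the networking, as described in step 2.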

Handling Suggestion

Select an appropriate method based on the attack source information and networking.

  • Configure ARP security to prevent ARP attacks.

    The switch provides ARP security to prevent ARP and ARP Miss packet attacks.

    For details about ARP security, see "ARP Security Solutions" in the Configuration Guide > Security > ARP Security Configuration.

  • Configure a punishment action for attack source tracing: drop attack packets within a given period.
    • Enable the punishment function for attack source tracing and set the punishment action to drop all attack packets within 300s.
      <HUAWEI> system-view
      [HUAWEI] cpu-defend policy policy1
      [HUAWEI-cpu-defend-policy-policy1] auto-defend enable  //Enable attack source tracing. By default, this function is disabled.
      [HUAWEI-cpu-defend-policy-policy1] auto-defend action deny timer 300  //By default, the punishment function for attack source tracing is disabled.
    • Configure a blacklist for local attack defense. The packets from the users in the blacklist are discarded.

      If a source is identified as an attacker (for example, the source address range 1.1.1.0/24), use an ACL to blacklist the users that match its characteristics.

      # Configure ACL 2001 to match the packets with source address 1.1.1.0/24. The switch drops the packets that match the ACL.

      [HUAWEI] acl number 2001
      [HUAWEI-acl-basic-2001] rule permit source 1.1.1.0 0.0.0.255
      [HUAWEI-acl-basic-2001] quit
      [HUAWEI] cpu-defend policy policy1
      [HUAWEI-cpu-defend-policy-policy1] blacklist 1 acl 2001
    • Configure a punishment action for attack source tracing: shut down the interface receiving attack packets.

      Use this punishment action if attack packets are sent from a specified interface and shutting down this interface does not affect services.

      Shutting down an interface may cause a service interruption and affect valid users. Use this method with caution.

      # Shut down the interface that receives attack packets.

      <HUAWEI> system-view
      [HUAWEI] cpu-defend policy policy1
      [HUAWEI-cpu-defend-policy-policy1] auto-defend enable  //Enable attack source tracing. By default, this function is disabled.
      [HUAWEI-cpu-defend-policy-policy1] auto-defend action error-down

Checking Whether the Problem Is Caused by Network Flapping

When network flapping occurs, the network topology changes frequently. The switch is kept busy processing topology change events, causing a high CPU usage. Network flapping includes STP flapping and OSPF route flapping.

STP Flapping

When STP flapping occurs, the switch frequently calculates the STP topology, and updates its MAC address table and ARP table, causing a high CPU usage.

  1. Fault Location
    • If you suspect frequent STP flapping, run the display stp topology-change command multiple times at an interval of several seconds to view the current STP topology change information. Alternatively, check the trap and log information on the switch to determine whether the STP topology has changed.

      # Run the command multiple times. Check whether the value of Number of topology changes increases.

      <HUAWEI> display stp topology-change 
       CIST topology change information
         Number of topology changes             :35
         Time since last topology change        :0 days 1h:7m:30s
         Topology change initiator(notified)    :GigabitEthernet2/0/6
         Topology change last received from     :101b-5498-d3e0
         Number of generated topologychange traps :   38
         Number of suppressed topologychange traps:   8
      
       MSTI 1 topology change information
         Number of topology changes             :0
    • After you confirm that the network topology changes frequently, run the display stp tc-bpdu statistics command multiple times at an interval of several seconds. Check whether interfaces on the switch have received Topology Change (TC) BPDUs. If so, find the source of the TC BPDUs, that is, the device causing the topology change.
      • If only the TC(Send) value increases, the topology change is caused by the local switch.
        • If only the TC(Send) value of a single interface increases, the topology change is caused by this interface.
        • If the TC(Send) values of multiple interfaces increase, check the events and logs on the NMS to analyze the STP topology change reason. Find out the interface causing the flapping.
      • If multiple values in the TC(Send/Receive) column increase, check the event and log information on the NMS to determine whether the local switch causes the topology change, and check whether STP flapping occurs on the device connected to the problematic interface.

      # View statistics about TC/TCN BPDUs on ports.

      <HUAWEI> display stp tc-bpdu statistics  
      -------------------------- STP TC/TCN information --------------------------
       MSTID Port                    TC(Send/Receive)      TCN(Send/Receive)
       0     GigabitEthernet2/0/6        21/4                  0/1 
       0     GigabitEthernet2/0/7        93/0                  0/1 
       0     GigabitEthernet2/0/8        115/0                 0/0 
       0     GigabitEthernet2/0/9        110/0                 0/0 
       0     GigabitEthernet3/0/23       29/5                  0/0
  2. Suggestion
    1. Enable TC protection trap to help you understand how the switch processes TC BPDUs.

      Run the snmp-agent trap enable feature-name mstp and stp tc-protection commands in the system view to enable TC protection trap.

      By default, the switch is protected against topology change attacks. That is, within each interval set by the stp tc-protection interval command, the switch processes at most the number of TC BPDUs set by the stp tc-protection threshold command.

      After the trap is enabled, the switch reports the MSTP_1.3.6.1.4.1.2011.5.25.42.4.2.15 hwMstpiTcGuarded and MSTP_1.3.6.1.4.1.2011.5.25.42.4.2.16 hwMstpProTcGuarded traps.

      For details about the traps, see Alarm Information.

    2. Perform operations according to topology changes.
      • STP topology changes when the access interface alternates between Up and Down.

        Run the stp edged-port enable command in the interface view to set the access interface as an edge port, and run the stp bpdu-protection command in the system or STP process view to enable BPDU protection.

      • The root bridge is changed unexpectedly.

        Run the display stp command. Check whether the CIST Root/ERPC field shows the MAC address of the expected root bridge. If not, the root bridge has changed unexpectedly.

        Run the stp root-protection command in the interface view to enable root protection, ensuring the correct topology.

        <HUAWEI> display stp
        -------[CIST Global Info][Mode MSTP]-------
        CIST Bridge:4096 .707b-e8c8-00e9
        Config Times:Hello 2s MaxAge 20s FwDly 15s MaxHop 20
        Active Times:Hello 2s MaxAge 20s FwDly 15s MaxHop 20
        CIST Root/ERPC:4096 .707b-e8c8-00e9 / 0 (This bridge is the root)
        CIST RegRoot/IRPC:4096 .707b-e8c8-00e9 / 0 (This bridge is the root)
        CIST RootPortId:0.0
        BPDU-Protection:Disabled
        CIST Root Type:Secondary root
        TC or TCN received:1
        TC count per hello:0
        STP Converge Mode:Normal 
        Share region-configuration :Enabled
        Time since last TC:1 days 14h:25m:38s
        Number of TC:2
        Last TC occurred:GigabitEthernet0/0/1
        ----[Port18(GigabitEthernet0/0/1)][LEARNING]----
        Port Protocol:Enabled
        Port Role:Designated Port
        Port Priority:128
        Port Cost(Dot1T ):Config=auto / Active=20000
        Designated Bridge/Port:4096.707b-e8c8-00e9 / 128.18
        Port Edged:Config=default / Active=disabled
        Point-to-point:Config=auto / Active=true
        Transit Limit:6 packets/s
        Protection Type:None
        Port STP Mode:STP 
        Port Protocol Type:Config=auto / Active=dot1s
        BPDU Encapsulation:Config=stp / Active=stp
        PortTimes:Hello 2s MaxAge 20s FwDly 15s RemHop 20
        TC or TCN send:0
        TC or TCN received:0
        BPDU Sent:11
        TCN: 0, Config: 12, RST: 0, MST: 1
        BPDU Received:0
        TCN: 0, Config: 1, RST: 0, MST: 0
    3. If the topology change reason is unknown or the fault persists, collect network information (including interface connections) and logs (the log.log file or the display logbuffer command output), and provide collected information to Huawei switch resellers.
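
The comparison of TC(Send/Receive) counters described under Fault Location can be sketched in Python as follows. This is only an illustration: it assumes that two captures of the display stp tc-bpdu statistics output are available as text in the layout shown above, and the function names parse_tc and classify are hypothetical.

```python
import re

# One data row: MSTID, port name, TC(Send/Receive), TCN(Send/Receive).
ROW = re.compile(r"^\s*(\d+)\s+(\S+)\s+(\d+)/(\d+)\s+(\d+)/(\d+)\s*$")

def parse_tc(output: str) -> dict:
    """Map (mstid, port) -> (tc_sent, tc_received); header and separator lines are skipped."""
    counters = {}
    for line in output.splitlines():
        m = ROW.match(line)
        if m:
            counters[(int(m[1]), m[2])] = (int(m[3]), int(m[4]))
    return counters

def classify(before: dict, after: dict) -> str:
    """Apply the rules from the Fault Location step to two samples."""
    send_up = [k for k in after if k in before and after[k][0] > before[k][0]]
    recv_up = [k for k in after if k in before and after[k][1] > before[k][1]]
    if not send_up and not recv_up:
        return "no TC activity between samples"
    if send_up and not recv_up:
        if len(send_up) == 1:
            return f"local switch, interface {send_up[0][1]}"
        return "local switch, multiple interfaces: check events and logs"
    return "send and receive counters changed: check logs and the connected device"

first = """ MSTID Port                    TC(Send/Receive)      TCN(Send/Receive)
 0     GigabitEthernet2/0/6        21/4                  0/1
 0     GigabitEthernet2/0/7        93/0                  0/1
"""
second = first.replace("93/0", "95/0")  # simulated second capture
print(classify(parse_tc(first), parse_tc(second)))
# local switch, interface GigabitEthernet2/0/7
```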

OSPF Route Flapping

Route flapping causes route re-advertisement and recomputation, increasing the load on the CPU. OSPF is the protocol most commonly configured for dynamic routing, so OSPF route flapping is described here.
  1. Fault Location
    • Run the display ospf peer last-nbr-down command to check the reason why the OSPF neighbor relationship goes Down.

      The reason is displayed in the Immediate Reason and Primary Reason fields.

    • Check logs on the switch to determine why the OSPF neighbor becomes Down.

      Run the display logbuffer command, and you can find the following log information:

      OSPF/3/NBR_DOWN_REASON:Neighbor state leaves full or changed to Down. (ProcessId=[USHORT], NeighborRouterId=[IPADDR],NeighborAreaId=[ULONG], NeighborInterface=[STRING],NeighborDownImmediate reason=[STRING], NeighborDownPrimeReason=[STRING],NeighborChangeTime=[STRING])

      The NeighborDownImmediate reason field indicates the cause for the OSPF neighbor Down event.

  2. Suggestion

    Determine the reason depending on the key fields and take corresponding measures.

    Possible causes of the fault are as follows:
    • Neighbor Down Due to Inactivity

      The Hello packet is not received within the deadtime (set by the ospf timer dead command in the interface view).

      When an OSPF neighbor goes Down for this reason, either the OSPF neighbor relationship flaps or it cannot be set up. Run the display ospf peer brief command to determine which of the two cases applies.
      • OSPF neighbor relationship flaps.

        OSPF neighbor flapping may be caused by a small CPCAR value for OSPF, link flapping or congestion on interfaces, and a large amount of LSA flooding.

        1. Run the display cpu-defend statistics packet-type ospf command to view statistics about the OSPF packets sent to the CPU. If too many OSPF packets are discarded, check whether the switch undergoes an OSPF attack or the CPCAR value for OSPF is too small.
        2. View the log to check whether interfaces alternate between Up and Down. If link flapping or congestion occurs, check the link on the interface.
        3. If the holdtime of the OSPF neighbor relationship is shorter than 20s, run the ospf timer dead interval command to set the holdtime to a value greater than 20s.
        4. Run the sham-hello enable command in the OSPF view to enable the OSPF sham-hello function, so that the switch can maintain the neighbor relationship using non-Hello packets such as LSU packets. This makes the switch more sensitive in detecting the state of OSPF neighbor relationships.
        5. If the fault persists after the preceding operations are performed, contact Huawei switch resellers.
      • OSPF neighbor relationship cannot be set up.

        Check whether the configurations in the OSPF view of devices on both ends are the same. If the configurations such as the OSPF area ID or area type (NSSA, stub area, or common area) are different, the two devices cannot establish an OSPF neighbor relationship.

        Run the display ospf [ process-id ] interface command to check whether OSPF is successfully enabled on the interfaces.

        <HUAWEI> display ospf 1 interface
        
                  OSPF Process 1 with Router ID 2.2.2.2
                          Interfaces
        
         Area: 0.0.0.0          (MPLS TE not enabled)
        Interface           IP Address      Type         State    Cost    Pri
        Eth0/1/1            10.1.1.2        Broadcast    Waiting  1       1
        • If OSPF is not enabled on interfaces, run the ospf enable [ process-id ] area area-id command in the interface view to enable OSPF.
        • If the OSPF process has been enabled on the related interface, run the display ospf error command multiple times at an interval of several seconds to check whether OSPF authentication information on the two devices is the same according to the Bad authentication type and Bad authentication key fields.
          <HUAWEI> display ospf 1 error
          
                    OSPF Process 1 with Router ID 2.2.2.2
                            OSPF error statistics
          
          General packet errors:
           0           : IP: received my own packet     3           : Bad packet
           0           : Bad version                    0           : Bad checksum
           0           : Bad area id                    0           : Drop on unnumbered interface
           0           : Bad virtual link               3           : Bad authentication type
           0           : Bad authentication key         0           : Packet too small
           0           : Packet size > ip length        0           : Transmit error
           0           : Interface down                 0           : Unknown neighbor
           0           : Bad net segment                0           : Extern option mismatch
          

          - If the Bad authentication type or Bad authentication key value keeps increasing, OSPF authentication information on the two devices is different. To configure the same authentication information for the two devices, run the ospf authentication-mode command in the interface view or the authentication-mode command in the OSPF process view.

          - If the Bad authentication type or Bad authentication key value does not increase, the authentication information is the same. If the neighbor intermittently disappears when the display ospf peer command is executed, the OSPF neighbor relationship flaps. Refer to the related information in this section to resolve this problem.

    • Neighbor Down Due to Kill Neighbor

      If the interface is Down, BFD is Down, or the reset ospf process command is executed, the OSPF neighbor relationship goes Down.

      View the NeighborDownPrimeReason field to determine the reason.

    • Neighbor Down Due to 1-Wayhello Received or Neighbor Down Due to SequenceNum Mismatch

      When the OSPF status of the peer device goes Down first, the peer device sends a 1-Way Hello packet to the local device, causing OSPF on the local device to go Down.

      Determine why OSPF status of the peer device becomes Down.

    For other reasons, see OSPF/3/NBR_DOWN_REASON in Log Information.
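
When many neighbor Down events have to be reviewed, the key fields of the OSPF/3/NBR_DOWN_REASON log can be extracted with a short script. The following Python sketch assumes the field layout shown in the log template above. The sample values are invented for illustration, and parse_nbr_down is a hypothetical helper. Because some reason strings contain commas (for example, "M,I,MS bit or SequenceNum Incorrect"), the sketch splits on the known field names instead of on commas.

```python
import re

# Field names taken from the log template in this section.
KEYS = [
    "ProcessId", "NeighborRouterId", "NeighborAreaId", "NeighborInterface",
    "NeighborDownImmediate reason", "NeighborDownPrimeReason", "NeighborChangeTime",
]
_ALT = "|".join(re.escape(k) for k in KEYS)
# A value runs until the next ", <known key>=" or the closing parenthesis at the end of the line.
FIELD = re.compile(rf"({_ALT})=(.*?)(?=,\s*(?:{_ALT})=|\)\s*$)")

def parse_nbr_down(line: str) -> dict:
    """Return the fields of one NBR_DOWN_REASON log line as a dictionary."""
    return {key: value.strip() for key, value in FIELD.findall(line)}

# Hypothetical log line; real values depend on the network.
sample = (
    "OSPF/3/NBR_DOWN_REASON:Neighbor state leaves full or changed to Down. "
    "(ProcessId=1, NeighborRouterId=2.2.2.2,NeighborAreaId=0, "
    "NeighborInterface=Vlanif10,NeighborDownImmediate reason=Neighbor Down Due to Inactivity, "
    "NeighborDownPrimeReason=Hello Not Seen,NeighborChangeTime=2015-03-01 10:20:30)"
)
fields = parse_nbr_down(sample)
print(fields["NeighborDownImmediate reason"])  # Neighbor Down Due to Inactivity
print(fields["NeighborDownPrimeReason"])       # Hello Not Seen
```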

Checking Whether the Problem Is Caused by Network Loop

A network loop will cause MAC flapping. A large number of protocol packets are sent to the CPU, overwhelming the CPU.

  1. Fault Location

    A network loop may have the following symptoms:

    • The CPU usage of a switch exceeds 80%.
    • Indicators of interfaces in the VLAN where a loop has occurred blink faster than usual.
    • MAC flapping frequently occurs.
    • The administrator cannot remotely log in to the switch, and the switch responds slowly to operations on the console port.
    • A lot of ICMP packets are lost in ping tests.
    • The display interface command output shows a large number of broadcast packets received on an interface.
    • Loop alarms are generated after loop detection is enabled.
    • The PCs connected to the switch receive a large number of broadcast or unknown unicast packets.
  2. Suggestion
    1. Observe interface indicators and collect traffic statistics on interfaces to locate the interfaces undergoing broadcast storms.
    2. Check the devices hop by hop according to the topology to locate the devices that cause the loop.
    3. Locate the interface that causes the loop and shut down the interface to remove the loop.
    4. If the fault persists after the preceding operations are performed, collect network information (including interface connections) and logs (the log.log file or the display logbuffer command output), and provide collected information to Huawei switch agents.
NOTE:

This chapter describes only the method of locating network loops and handling suggestions. For more information, see the network loop troubleshooting guide.
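
Step 1 of the suggestions above relies on traffic statistics to find the interfaces undergoing a broadcast storm. As a rough sketch, the following Python function compares two samples of per-interface broadcast packet counters and lists the interfaces whose rate exceeds a threshold. The counter values, the interface names, and the 1000 pps threshold are assumptions for illustration; how the counters are collected (for example, from display interface output or an NMS) is outside the scope of this sketch.

```python
def suspect_interfaces(first: dict, second: dict, interval_s: float, threshold_pps: float) -> list:
    """Return (interface, broadcast pps) pairs at or above the threshold, highest rate first."""
    suspects = []
    for name, count in second.items():
        if name not in first:
            continue  # interface missing from the first sample, no rate can be computed
        rate = (count - first[name]) / interval_s
        if rate >= threshold_pps:
            suspects.append((name, rate))
    return sorted(suspects, key=lambda item: item[1], reverse=True)

# Hypothetical received-broadcast counters sampled 10 seconds apart.
sample_1 = {"GigabitEthernet0/0/1": 1000, "GigabitEthernet0/0/2": 500}
sample_2 = {"GigabitEthernet0/0/1": 61000, "GigabitEthernet0/0/2": 560}
print(suspect_interfaces(sample_1, sample_2, interval_s=10, threshold_pps=1000))
# [('GigabitEthernet0/0/1', 6000.0)]
```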

How to Relieve CPU Load

  1. Plan the network configurations, configure loop prevention protocols, and enable loopback detection to prevent loops.
    • Run the loopback-detect untagged mac-address ffff-ffff-ffff command in the system view to broadcast BPDUs for loopback detection and prevent them from being terminated by unexpected devices.
    • Run the loopback-detect enable command in the interface view to enable loopback detection.

    When the total number of VLANs on the interfaces with loopback detection enabled exceeds 1024, run the loopback-detect action shutdown command on these interfaces to set the action for a detected loopback to shutdown. (The VLAN counter is incremented by 1 every time an interface is added to a VLAN, even when multiple interfaces are added to the same VLAN.)

  2. Configure ARP security to protect the device against ARP or ARP Miss attacks.

    For details about ARP security, see "ARP Security Solutions" in the Configuration Guide > Security > ARP Security Configuration.

  3. On the network prone to DHCP and ARP attacks, such as campus networks, configure local attack defense policies for DHCP and ARP protocol packets.

    This section provides suggestions on local attack defense policies in normal cases. The requirements on different protocol packets sent to the CPU may vary according to the model and version. In practice, configure CPU attack defense based on actual service requirements; otherwise, the configuration may fail or services may be affected.

    • MPU on modular switches
      # 
      cpu-defend policy main-board
       auto-defend enable   //Default configuration for later versions of V200R009
       undo auto-defend trace-type source-portvlan   //Default configuration for later versions of V200R009
       undo auto-defend protocol tcp igmp telnet ttl-expired  //auto-defend protocol arp dhcp (for V200R009)
       auto-defend action deny  
       auto-defend whitelist 1 interface GigabitEthernet x/x/x  //Add interconnected interfaces to the whitelist.
       auto-defend whitelist 2 interface GigabitEthernet x/x/x  //Add uplink interfaces to the whitelist.
      #
      cpu-defend-policy main-board
      #
    • LPU on modular switches
      # 
      cpu-defend policy io-board
       auto-defend enable   //Default configuration for later versions of V200R009
       undo auto-defend trace-type source-portvlan   //Default configuration for later versions of V200R009
       undo auto-defend protocol tcp igmp telnet ttl-expired //auto-defend protocol arp dhcp (for V200R009)
       auto-defend action deny 
       auto-defend whitelist 1 interface GigabitEthernet x/x/x  //Add interconnected interfaces to the whitelist.
       auto-defend whitelist 2 interface GigabitEthernet x/x/x  //Add uplink interfaces to the whitelist.
      # 
      cpu-defend-policy io-board global
      #
    • Fixed switches
      # 
      cpu-defend policy main 
       auto-defend enable   //Default configuration for later versions of V200R009
       undo auto-defend trace-type source-portvlan   //Default configuration for later versions of V200R009
       undo auto-defend protocol tcp igmp telnet ttl-expired //auto-defend protocol arp dhcp (for V200R009)
       auto-defend action deny 
       auto-defend whitelist 1 interface GigabitEthernet x/x/x  //Add interconnected interfaces to the whitelist.
       auto-defend whitelist 2 interface GigabitEthernet x/x/x  //Add uplink interfaces to the whitelist.
      #
      cpu-defend-policy main global
      #
  4. Restrict management access to the switch through SSH, Telnet, and SNMP to administrators only. Configure an ACL so that only the administrator can log in.

    # In VTY 0-14, configure an ACL to allow only the user with source IP address 10.1.1.1/32 to log in to the switch.

    <HUAWEI> system-view
    [HUAWEI] acl 2001
    [HUAWEI-acl-basic-2001] rule 5 permit source 10.1.1.1 0
    [HUAWEI-acl-basic-2001] quit
    [HUAWEI] user-interface vty 0 14
    [HUAWEI-ui-vty0-14] acl 2001 inbound
  5. When a port group has more than 40 member ports and you add these member ports to 4K VLANs at the same time, the CPU usage may jump to over 80% in a short period. Therefore, you are advised to add the member ports to no more than 500 VLANs at a time.
  6. Changing the type of more than 20 ports together may cause the CPU usage to exceed 80% in a short period. Therefore, you are advised to change the type of ports one by one.
  7. Frequent MAC address flapping may result in a high CPU usage. If MAC address flapping is likely to occur frequently on an interface, run the mac-address flapping action error-down command on the interface so that the system sets the interface to the error-down state after detecting MAC address flapping.
  8. Load and activate the patch files of the corresponding software version.

    Visit http://support.huawei.com/enterprise/ to obtain the corresponding patch file and documents (patch release notes and installation guide).

  9. Periodically scan the PCs or servers connected to the switch for viruses.
  10. The switch provides CPCAR values for each protocol. Generally, the default CPCAR values can meet requirements. If service traffic volume is too high, contact Huawei switch resellers to adjust the CPCAR values.
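
Two of the numeric guidelines in the list above can be illustrated with a short Python sketch: how the VLAN counter for loopback detection adds up across interfaces (item 1), and how a large VLAN range can be split into batches of no more than 500 VLANs (item 5). The function names and the interface names are hypothetical. The sketch only computes numbers and does not generate or run any switch commands.

```python
def vlan_counter(interface_vlans: dict) -> int:
    """Sum the VLAN memberships of all interfaces; a VLAN shared by several interfaces counts once per interface."""
    return sum(len(set(vlans)) for vlans in interface_vlans.values())

def vlan_batches(first: int, last: int, batch_size: int = 500) -> list:
    """Split the inclusive VLAN range first..last into consecutive (start, end) batches."""
    batches = []
    start = first
    while start <= last:
        end = min(start + batch_size - 1, last)
        batches.append((start, end))
        start = end + 1
    return batches

# Two interfaces with loopback detection enabled, each added to VLANs 1 to 600.
memberships = {"GE0/0/1": range(1, 601), "GE0/0/2": range(1, 601)}
print(vlan_counter(memberships))         # 1200, which exceeds 1024
print(len(vlan_batches(1, 4094)))        # 9
print(vlan_batches(1, 4094)[-1])         # (4001, 4094)
```

In this example the counter reaches 1200 even though only 600 distinct VLANs exist, so the shutdown action recommended in item 1 would apply.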

Appendix

Commands, Alarms, Logs, and OIDs Related to High CPU Usage

Commands

Table 1-7 Commands related to high CPU usage

Command

Description

display interface [ interface-type ] counters { inbound | outbound }

Displays number of packets sent and received on each interface.

display cpu-usage [ slave | slot slot-id ]

Displays CPU usage statistics.

display cpu-defend statistics [ packet-type packet-type ] [ all | slot slot-id ]

Displays statistics on protocol packets sent to the CPU.

display arp packet statistics

Displays ARP packet statistics.

display dhcp statistics

Displays DHCP packet statistics.

display cpu-defend rate [ packet-type packet-type ] [ slot slot-id | all ]

Displays the rates at which protocol packets are sent to the CPU.

display cpu-defend policy [ policy-name ]

Displays information about the attack defense policy.

display auto-defend configuration [ cpu-defend policy policy-name | slot slot-id | mcu ]

Displays information about attack source tracing.

display cpu-defend configuration

Displays CAR values, including the rate at which packets are sent to the CPU and CPU queues to which protocol packets are sent.

display logbuffer [ size value | slot slot-id | module module-name | security | level { severity | level } ] *

Displays log information on the switch.

display trapbuffer [ size value ]

Displays trap information on the switch.

display stp [ process process-id ] [ instance instance-id ] topology-change

Displays information about STP topology changes.

display stp [ process process-id ] [ instance instance-id ] [ interface interface-type interface-number | slot slot-id ] tc-bpdu statistics

Displays STP TC BPDU statistics.

reset cpu-defend statistics [ packet-type packet-type ] [ all | slot slot-id ]

Clears statistics on packets sent to the CPU.

cpu-defend policy policy-name

Configures an attack defense policy.

blacklist blacklist-id acl acl-number

Configures an ACL-based blacklist.

whitelist whitelist-id acl acl-number

Configures an ACL-based whitelist.

queue packet-type packet-type queue-value

Specifies the queue number of the CPU to which protocol packets are sent.

auto-defend enable

Enables the attack source tracing function.

undo auto-defend trace-type { source-mac | source-ip | source-portvlan } *

Deletes the source tracing mode.

undo auto-defend protocol { 8021x | arp | dhcp | dhcpv6 | icmp | icmpv6 | igmp | mld | nd | tcp | telnet | ttl-expired | udp }*

Deletes the packet type in attack source tracing.

auto-defend whitelist whitelist-number { acl acl-number | interface interface-type interface-number }

Configures a whitelist for attack source tracing. The users in the whitelist are excluded from attack source tracing.

auto-defend alarm enable

Enables event report in attack source tracing.

auto-defend action { deny [ timer time-length ] | error-down }

Enables punish action for attack source tracing and specifies the action.

auto-port-defend whitelist whitelist-number { acl acl-number | interface interface-type interface-number }

Configures the whitelist for port attack defense.

System view: cpu-defend-policy policy-name [ global ]

Slot view: cpu-defend-policy policy-name

Applies the attack defense policy. (The command format depends on switch models and versions. In this example, the modular switch runs V200R007.)

Alarm Information

  1. ENTITYTRAP_1.3.6.1.4.1.2011.5.25.219.2.14.1 hwCPUUtilizationRising //The CPU usage of the switch exceeded the threshold.
    ENTITYTRAP/4/ENTITYCPUALARM:OID [oid] CPU utilization exceeded the pre-alarm threshold.(Index=[INTEGER],  
     EntityPhysicalIndex=[INTEGER], PhysicalName=[OCTET], EntityThresholdType=[INTEGER], EntityThresholdValue=[INTEGER],  
     EntityThresholdCurrent=[INTEGER], EntityTrapFaultID=[INTEGER].) 
  2. BASETRAP_1.3.6.1.4.1.2011.5.25.129.2.4.1 hwCPUUtilizationRisingAlarm //The CPU usage of the switch exceeded the threshold.
    BASETRAP/2/CPUUSAGERISING: OID [oid] CPU utilization exceeded the pre-alarm threshold.(Index=[INTEGER], 
    BaseUsagePhyIndex=[INTEGER], UsageType=[INTEGER], UsageIndex=[INTEGER], Severity=[INTEGER], ProbableCause=[INTEGER],  
     EventType=[INTEGER], PhysicalName="[OCTET]", RelativeResource="[OCTET]", UsageValue=[INTEGER], UsageUnit=[INTEGER],  
    UsageThreshold=[INTEGER])
  3. MSTP_1.3.6.1.4.1.2011.5.25.42.4.2.15 hwMstpiTcGuarded //After TC protection is enabled on an MSTP-enabled switch, extra TC BPDUs that are received after the number of TC BPDUs received in a specified period has exceeded the threshold are processed after the TC protection time expires.
    MSTP/4/TCGUARD:OID [OID] The instance received TC message exceeded the threshold will be deferred to deal with at the end of TC protection time. (InstanceID=[INTEGER]) 
  4. MSTP_1.3.6.1.4.1.2011.5.25.42.4.2.16 hwMstpProTcGuarded //After TC protection is enabled for an MSTP process, extra TC BPDUs that are received after the number of TC BPDUs received in a specified period has exceeded the threshold are processed after the TC protection time expires.
    MSTP/1/PROTCGUARD:OID [OID] MSTP process's instance received TC message exceeded the threshold will be deferred to deal with at the end of TC protection time. (ProcessID=[INTEGER], InstanceID=[INTEGER])

Log Information

  1. DEFD/6/CPCAR_DROP_MPU //The rate of packets sent to the CPU exceeded the CPCAR value on the MPU.
    DEFD/6/CPCAR_DROP_MPU:Rate of packets to cpu exceeded the CPCAR limit on the MPU. (Protocol=[STRING], CIR/CBS=[ULONG]/[ULONG], ExceededPacketCount=[STRING])

    Parameter              Description
    Protocol               Protocol type.
    CIR/CBS                Committed information rate and committed burst size.
    ExceededPacketCount    Packet count exceeded.

  2. DEFD/6/CPCAR_DROP_LPU //The rate at which packets are sent to the CPU exceeded the CPCAR values on the LPU.
    DEFD/6/CPCAR_DROP_LPU:Rate of packets to cpu exceeded the CPCAR limit on the LPU in slot [STRING]. (Protocol=[STRING], CIR/CBS=[ULONG]/[ULONG], ExceededPacketCount=[STRING])

    Parameter              Description
    slot                   Slot ID.
    Protocol               Protocol type.
    CIR/CBS                Committed information rate and committed burst size.
    ExceededPacketCount    Packet count exceeded.

  3. SECE/4/PORT_ATTACK //A lot of attack packets from the corresponding VLAN were received on the interface.
    SECE/4/PORT_ATTACK:Port attack occurred.(Slot=[STRING], SourceAttackInterface=[STRING], OuterVlan/InnerVlan=[ULONG]/[ULONG], AttackProtocol=[STRING], AttackPackets=[ULONG] packets per second)

    Parameter                Description
    Slot                     Slot of an MPU or LPU.
    SourceAttackInterface    Interface that initiates the attack.
    OuterVlan                Outer VLAN ID or single VLAN ID of the attack source.
    InnerVlan                Inner VLAN ID of the attack source.
    AttackProtocol           Attack packet type.
    AttackPackets            Rate of attack packets, in pps.

  4. SECE/4/USER_ATTACK //User attack information was generated on an MPU or LPU.
    SECE/4/USER_ATTACK:User attack occurred.(Slot=[STRING], SourceAttackInterface=[STRING], OuterVlan/InnerVlan=[ULONG]/[ULONG], UserMacAddress=[STRING], AttackProtocol=[STRING], AttackPackets=[ULONG] packets per second)

    Parameter                Description
    Slot                     Slot of an MPU or LPU.
    SourceAttackInterface    Interface that initiates the attack.
    OuterVlan                Outer VLAN ID or single VLAN ID of the attack source.
    InnerVlan                Inner VLAN ID of the attack source.
    UserMacAddress           MAC address of the attack source.
    AttackProtocol           Attack packet type.
    AttackPackets            Rate of attack packets, in pps.

  5. SECE/4/SPECIFY_SIP_ATTACK //The attack source information is displayed when a switch is attacked.
    SECE/4/SPECIFY_SIP_ATTACK:The specified source IP address attack occurred.(Slot=[STRING], SourceAttackIP = [STRING], AttackProtocol=[STRING], AttackPackets=[ULONG] packets per second)

    Parameter         Description
    Slot              Slot of an MPU or LPU.
    SourceAttackIP    IP address of the attack source.
    AttackProtocol    Attack packet type.
    AttackPackets     Rate of attack packets, in pps.

  6. SECE/4/PORT_ATTACK_OCCUR //When the switch detects attack packets on an interface, the switch starts attack defense on the interface.
    SECE/4/PORT_ATTACK_OCCUR:Auto port-defend started.(SourceAttackInterface=[STRING], AttackProtocol=[STRING])

    Parameter                Description
    SourceAttackInterface    Interface that initiates the attack.
    AttackProtocol           Attack packet type.

  7. SECE/6/PORT_ATTACK_END //After an attack source is excluded, the switch cancels attack defense on the interface.
    SECE/6/PORT_ATTACK_END:Auto port-defend stop.(SourceAttackInterface=[STRING], AttackProtocol=[STRING],ExceededPacketCountInSlot=[STRING])

    Parameter                    Description
    SourceAttackInterface        Interface that initiates the attack.
    AttackProtocol               Attack packet type.
    ExceededPacketCountInSlot    Count of dropped packets. After attack defense is triggered on multiple interfaces, packet loss does not occur only on interfaces recorded in the log. (Added in R10)

  8. VOSCPU/4/CPU_USAGE_HIGH //The CPU was overloaded. Names of the tasks whose CPU usages rank top three and their CPU usages were displayed. If these tasks contained sub-tasks, names of the sub-tasks and their CPU usages were also displayed.
    VOSCPU/4/CPU_USAGE_HIGH:The CPU is overloaded (CpuUsage=[ULONG]%, Threshold=[ULONG]%), and the tasks with top three CPU occupancy are: [CPU-resources-usage]

    Parameter                Description
    [CPU-resources-usage]    Names of the tasks whose CPU usages rank top three and their CPU usage. If these tasks contained sub-tasks, names of the sub-tasks and their CPU usages were also displayed.
    CpuUsage                 Current CPU usage.
    Threshold                CPU usage threshold.

  9. OSPF/3/NBR_DOWN_REASON //The neighbor status goes Down.
    OSPF/3/NBR_DOWN_REASON:Neighbor state leaves full or changed to Down. (ProcessId=[USHORT], NeighborRouterId=[IPADDR], NeighborAreaId=[ULONG], NeighborInterface=[STRING],NeighborDownImmediate reason=[STRING], NeighborDownPrimeReason=[STRING], NeighborChangeTime=[STRING])

    Parameter

    Description

    ProcessId

    Process ID.

    NeighborRouterId

    Neighbor router ID.

    NeighborAreaId

    Neighbor area ID.

    NeighborInterface

    Neighbor interface.

    NeighborDownImmediate reason

    Possible reasons why the OSPF neighbor goes Down:

    Neighbor Down Due to Inactivity: The switch does not receive any Hello packets from the OSPF neighbor within the Dead Time.

    Neighbor Down Due to LL Down LLDown: The switch does not receive any LLD packet from the OSPF neighbor within the Dead Time.

    Neighbor Down Due to Kill Neighbor: The interface connected to the OSPF neighbor is Down, the BFD session on the interface is Down, or the reset ospf process command has been executed. You can view the NeighborDownPrimeReason field to determine the specific cause.

    Neighbor Down Due to 1-Wayhello Received or Neighbor Down Due to SequenceNum Mismatch: The OSPF status on the peer interface goes Down and the remote device sends a 1-Way Hello packet to the local device. As a result, the OSPF status of the local device also changes to Down.

    Neighbor Down Due to AdjOK?: The AdjOK? event times out.

    Neighbor Down Due to BadLSreq: The BadLSReq event occurs on the interface.

    NeighborDownPrimeReason

    Possible reasons why the neighbor goes Down:

    Hello Not Seen: No Hello packet is received.

    Interface Parameter Mismatch: The interface settings on two ends of a link do not match.

    Logical Interface State Change: The logic interface status changes.

    Physical Interface State Change: The physical interface status changes.

    OSPF Process Reset: The OSPF process restarts.

    Area reset: The area is reset due to an area type change.

    Area Option Mis-match: The options of the areas to which interfaces on both ends belong do not match.

    Vlink Peer Not Reachable: The virtual link neighbor is unreachable.

    Sham-Link Unreachable: The Sham-Link neighbor is unreachable.

    Undo Network Command: The network command is undone.

    Undo NBMA Peer: The neighbor configuration on the NBMA interface is cleared.

    Passive Interface Down: The silent-interface command is executed on the local interface.

    Opaque Capability Enabled: The opaque capability is enabled.

    Opaque Capability Disabled: The opaque capability is disabled.

    Virtual Interface State Change: The virtual link interface status changes.

    BFD Session Down: The BFD session goes Down.

    Down Retransmission Limit Exceed: The maximum number of retransmissions is reached.

    1-Wayhello Received: A 1-way Hello packet is received.

    Router State Change from DR or BDR to DROTHER: The local interface role is changed from DR or BDR to DROTHER.

    Neighbor State Change from DR or BDR to DROTHER: The neighbor interface role is changed from DR or BDR to DROTHER.

    NSSA Area Configure Change: The configuration of the NSSA area is modified.

    Stub Area Configure Change: The configuration of the stub area is modified.

    Received Invalid DD Packet: An invalid DD packet is received.

    Not Received DD during RouterDeadInterval: No DD packet is received during Dead timer restart.

    M,I,MS bit or SequenceNum Incorrect: The M, I, and MS bits in received DD packets are different from those defined in the protocol.

    Unable Opaque Capability,Find 9,10,11 Type Lsa: The LSAs of types 9, 10, and 11 are received, but the Opaque capability is not enabled.

    Not NSSA,Find 7 Type Lsa in Summary List: The local area does not belong to NSSA, but Type-7 LSA exists in Summary.

    LSrequest Packet,Unknown Reason: An LSR packet is received due to an unknown reason.

    NSSA or STUB Area,Find 5 ,11 Type Lsa: The local area belongs to NSSA or Stub, but Type-5 and Type-11 LSAs exist.

    LSrequest Packet,Request Lsa is Not in the Lsdb: The neighbor requests an LSA through LSR from the local process or area, but the LSA does not exist in the LSDB of the local process.

    LSrequest Packet, exist same lsa in the Lsdb: The process receives an LSA, which exists in the local LSDB and neighbor request list.

    LSrequest Packet, exist newer lsa in the Lsdb: The process receives an updated LSA, which exists in the local LSDB and neighbor request list.

    Neighbor state was not full when LSDB overflow: The LSDB overflows, but the neighbor status is not Full.

    Filter LSA configuration change: The configuration of LSA filter is modified.

    ACL changed for Filter LSA: The ACL configuration of LSA filter is modified.

    Reset Ospf Peer: The OSPF neighbor is reset.

    NeighborChangeTime

    Time when the status changes.

OID Information

Object: hwEntityCpuUsage
    OID: 1.3.6.1.4.1.2011.5.25.31.1.1.1.1.5
    Data Type: Integer32
    Description: CPU usage. Value range: 2-100
    Implemented Specifications: read-only

Object: hwEntityCpuUsageThreshold
    OID: 1.3.6.1.4.1.2011.5.25.31.1.1.1.1.6
    Data Type: Integer32
    Description: CPU usage threshold. Value range: 2-100. Default value: 80 for modular switches; 95 for fixed switches
    Implemented Specifications: read-write

Local Attack Defense Policy

The switch provides a local attack defense policy to protect its CPU. When the CPU receives a large number of valid packets or malicious attack packets, this function protects the CPU to prevent service interruption.

Function Overview

As shown in Figure 1-8, local attack defense policies include attack source tracing, port attack defense, CPCAR, and blacklist. The port attack defense and CPCAR functions are enabled by default.

Improper CPCAR adjustment will affect network services. To modify the CPCAR settings, contact Huawei switch resellers.

Figure 1-8 Security capability of the CPU

Attack Source Tracing

After attack source tracing is enabled, the switch analyzes and collects statistics on the packets sent to the CPU. The switch provides thresholds for these packets and treats packets that exceed the thresholds as attack packets. The switch then locates the attack source by source interface and IP address, reports logs to users, and takes measures against the attack source, such as discarding the attack packets or shutting down the attacked interface.

  1. Set the source tracing mode.

    The switch supports the following attack source tracing modes:

    • Source IP address-based tracing: defends against Layer 3 attack packets.
    • Source MAC address-based tracing: defends against Layer 2 attack packets with a fixed source MAC address.
    • Interface+VLAN based tracing: defends against Layer 2 attack packets with different source MAC addresses.

    If you do not know the type of the attack packets, configure all of the preceding modes.

  2. Set the packet type in attack source tracing.

    The switch can perform attack source tracing for each of 802.1X, ARP, DHCP, ICMP, IGMP, TCP, Telnet, TTL 1, UDP, DHCPv6, MLD, ICMPv6, and ND packets, or all of them.

    When an attack occurs, you may be unable to identify the type of attack packets. The auto-defend protocol command allows you to flexibly specify the types of traced packets.

  3. Set the attack defense action.

    After identifying an attack source, the switch takes one of the following actions to stop the source from attacking the switch:

    • Discards the attack packets within a period.
    • Shuts down the interface receiving the attack packets.

  4. Configure the whitelist.

    If you want to exclude some users from attack source tracing and punishment actions, add the users to the whitelist. The switch does not take attack source tracing actions on the users in the whitelist.

    Generally, uplink interfaces need to be added to the whitelist to prevent impact on services.

  5. Set the attack source tracing threshold.

    You can set the attack source tracing threshold, the sampling rate, and the event report threshold.

In Figure 1-9, the source tracing mode is source IP address-based, the threshold is 4 pps, and the punishment action is to discard packets. If the rate of packets that a source sends to the CPU exceeds the threshold, the system considers that an attack has occurred, generates a log in which the attack source address is 10.3.2.1, and discards packets from this address for a certain period of time.

Figure 1-9 Attack source tracing
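
The threshold check in the Figure 1-9 example can be sketched in Python. This is only an illustration, not the switch's implementation: the function name, the fixed one-second buckets, and the sample traffic are assumptions, and the sampling rate and event report threshold are ignored.

```python
from collections import Counter


def trace_attack_sources(packets, threshold_pps=4):
    """Return source IPs whose packet count in a one-second bucket exceeds the threshold.

    packets: iterable of (timestamp_seconds, source_ip) tuples for packets sent to the CPU.
    """
    per_second = Counter()
    for timestamp, source_ip in packets:
        # Bucket packets by whole second and by source IP (source IP-based tracing).
        per_second[(int(timestamp), source_ip)] += 1
    attackers = {ip for (_, ip), count in per_second.items() if count > threshold_pps}
    return sorted(attackers)


# Hypothetical traffic: 10.3.2.1 sends 6 packets in one second, 10.3.2.2 sends 2.
packets = [(0.1 * i, "10.3.2.1") for i in range(6)]
packets += [(0.2, "10.3.2.2"), (0.7, "10.3.2.2")]
attack_sources = trace_attack_sources(packets)
print(attack_sources)
```

With the 4 pps threshold from the example, only 10.3.2.1 is reported as an attack source. A real switch would then log the event and apply the configured punishment action.
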
Port Attack Defense

If the packets sent from one interface to the CPU occupy too much bandwidth, packets from other interfaces cannot be sent to the CPU, causing a service interruption. Port attack defense controls the number of packets sent to the CPU.

After port attack defense is configured, a switch can trace the source and limit the rate of packets sent to the CPU based on ports, protecting the CPU against DoS attacks.

By default, the port attack defense function is enabled. The switch calculates the rate of packets received on an interface. If the packet rate exceeds the threshold within the aging time, the switch considers that an attack has occurred. The switch then traces the source, limits the rate of attack packets on the port, and records a log.

The switch takes the following measures when limiting the rate:

  • When the packet rate does not exceed the limit (the value is the same as the CPCAR value in the attack defense policy), the switch moves the packets to a low-priority queue and then sends them to the CPU.

    The switch calculates the rate of protocol packets received by the interface, and performs attack source tracing and rate limiting on the attack packets. When the rate of protocol packets received by an interface exceeds the threshold, the switch considers that an attack has occurred and sends a log. The switch moves the packets to a low-priority queue (generally queue 2; for details about queues, see CPCAR), and then sends the packets to the CPU.

  • When the packet rate exceeds the limit, the switch discards the excess packets.

Port attack defense provides the following functions:

  • Attack defense for the specified protocol packets

    The switch can perform port attack defense for each of ARP Request, ARP Reply, DHCP, ICMP, IGMP, and IP fragment packets, or for all of them.

  • Whitelist

    If you want to exclude some users from attack source tracing, add the users to the whitelist.

    Generally, the uplink interface needs to be added to the whitelist so that network-side protocol packets and packets from authorized users are sent to the CPU and processed promptly.

  • Port attack defense thresholds

    You can set the attack source tracing threshold, the sampling rate, and the aging time.

    When an attack occurs, you may be unable to identify the type of attack packets. The auto-port-defend protocol command allows you to flexibly specify the packet types to which port attack defense applies.

In Figure 1-10, both port 1 and port 2 send ARP request and DHCP packets to the CPU. The rate of ARP request packets sent by port 1 and the rate of DHCP packets sent by port 2 exceed the threshold. The switch considers that an attack has occurred, and moves the packets to queue 2, which has a low priority.

Figure 1-10 Port attack defense

By default, port attack defense is enabled on a switch. The rate limiting actions taken by port attack defense have less impact on services than the rate limiting actions taken by attack source tracing.
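
The per-port decision described above can be sketched as follows. This is a simplified illustration under stated assumptions: the function name, the threshold of 30 pps, the rate limit of 64 pps, and the traffic rates are hypothetical, and the aging time and sampling rate are not modeled.

```python
LOW_PRIORITY_QUEUE = 2  # the text above names queue 2 as the usual low-priority queue


def port_defense(rates_pps, attack_threshold_pps, rate_limit_pps):
    """Classify per-(port, protocol) packet rates.

    Returns {(port, protocol): (queue, forwarded_pps, dropped_pps)} for the flows
    judged to be attacks. Flows at or below the threshold are left out because
    they keep their normal handling.
    """
    decisions = {}
    for key, rate in rates_pps.items():
        if rate <= attack_threshold_pps:
            continue  # not an attack
        # Attack traffic goes to the low-priority queue; anything above the
        # rate limit is discarded.
        forwarded = min(rate, rate_limit_pps)
        decisions[key] = (LOW_PRIORITY_QUEUE, forwarded, rate - forwarded)
    return decisions


# Hypothetical rates modeled on Figure 1-10: ARP requests on port 1 and
# DHCP packets on port 2 exceed the threshold.
rates = {
    ("port1", "arp-request"): 100,
    ("port1", "dhcp"): 5,
    ("port2", "arp-request"): 3,
    ("port2", "dhcp"): 80,
}
result = port_defense(rates, attack_threshold_pps=30, rate_limit_pps=64)
print(result)
```

Only the two offending flows are moved to queue 2. The other protocol on each port is untouched, which is why port attack defense has less impact on services than punishing the whole source.
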

CPCAR

The Control Plane Committed Access Rate (CPCAR) limits the rate of packets sent to the CPU to protect the control plane. For packets sent to the CPU, the switch performs the following types of rate limiting:

  1. Rate limiting based on protocol

    The switch specifies a threshold for each protocol. When the rate of protocol packets exceeds the threshold, the switch discards the packets so that each protocol can be processed promptly.

  2. Scheduling and rate limiting based on queue

    After protocol-based rate limiting is performed, the switch moves packets to queues depending on their layer (management, control, or forwarding) and importance. The queues have different priorities, and the packets in the queues are scheduled based on these priorities. When a conflict occurs, the packets in high-priority queues are processed first. In addition, the switch can limit the rate of each queue, restricting the maximum rate of packets sent from that queue to the CPU. This keeps the switch running stably when the CPU has a high load.

    The switch has eight queues, numbered 0 to 7. A queue with a larger ID has a higher priority. To view the packet queues, run the display cpu-defend configuration all command.

  3. Unified rate limiting

    On a stable network, the number of packets sent to the CPU is within an acceptable range. If a large number of packets are sent to the CPU within a short period, the CPU is busy processing these packets, resulting in a high CPU usage. To restrict the total number of packets processed by the CPU, the switch performs rate limiting on all packets to ensure normal running of the CPU.

In Figure 1-11, a large number of protocol packets are sent to the CPU. The switch processes them as follows:

  1. Performs rate limiting on protocol packets based on protocol type.
  2. Moves the packets to the queue assigned to each protocol. A queue with a larger ID has a higher priority.
  3. Limits the rate of all packets. If the packet rate exceeds the threshold, the switch discards the packets in low-priority queues.

Figure 1-11 Packet rate limiting by CPCAR
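
The three stages can be sketched as a small Python model. It is a sketch under stated assumptions rather than the switch's algorithm: the function name, protocol labels, limits, and queue assignments are hypothetical, and the per-queue rate limit is omitted for brevity.

```python
def cpcar(arrivals_pps, protocol_limit_pps, protocol_queue, total_limit_pps):
    """Return the packets per second delivered to the CPU for each protocol."""
    # Stage 1: protocol-based rate limiting.
    after_protocol = {
        proto: min(rate, protocol_limit_pps[proto])
        for proto, rate in arrivals_pps.items()
    }
    # Stage 2: queue scheduling, highest queue ID (highest priority) first.
    order = sorted(after_protocol, key=lambda proto: protocol_queue[proto], reverse=True)
    # Stage 3: unified rate limiting. Traffic that does not fit in the total
    # budget is dropped, starting with the low-priority queues.
    budget = total_limit_pps
    delivered = {}
    for proto in order:
        delivered[proto] = min(after_protocol[proto], budget)
        budget -= delivered[proto]
    return delivered


# Hypothetical values for illustration only.
arrivals = {"ospf": 50, "arp-request": 500, "icmp": 300}
limits = {"ospf": 100, "arp-request": 128, "icmp": 64}
queues = {"ospf": 5, "arp-request": 3, "icmp": 2}
delivered = cpcar(arrivals, limits, queues, total_limit_pps=200)
print(delivered)
```

In this run the protocol limits cut ARP requests to 128 pps and ICMP to 64 pps. The unified limit of 200 pps then leaves only 22 pps for ICMP, because it sits in the lowest-priority queue.
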

The CPCAR does not take effect on the management interface. If the network connected to a management interface undergoes a serious attack, users may fail to log in to the switch through the management interface. You are advised to scan the PCs for viruses or replan the network.

The switch provides a default CPCAR setting for each protocol. Improper CPCAR settings will affect services on the network. To adjust CPCAR values for specified types of protocol packets based on services and network environment, contact Huawei switch resellers.

Generally, the default CPCAR settings can meet requirements.

Blacklist

When a switch receives a large number of protocol packets that overwhelm the CPU, the switch may fail to process valid protocol packets, or protocol flapping may occur. You can use methods such as packet capture and attack source tracing to determine the characteristics of the attack source (such as its MAC or IP address), and then configure a blacklist to discard these packets.

You can create a blacklist on a device and add users with specified characteristics to the blacklist. The device then discards the packets from these users. In Figure 1-12, blacklist 1 matches the packets with source IP address 10.1.1.0/24 and blacklist 2 matches packets with source IP address 10.2.2.0/24. When these packets are sent to the CPU, the switch discards them.

Figure 1-12 Blacklist
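
The matching in Figure 1-12 can be illustrated with Python's standard ipaddress module. The function name and test addresses are assumptions, and a real blacklist matches packets against ACL rules rather than a Python table.

```python
import ipaddress

# Blacklist IDs and source networks taken from the Figure 1-12 example.
BLACKLISTS = {
    1: ipaddress.ip_network("10.1.1.0/24"),
    2: ipaddress.ip_network("10.2.2.0/24"),
}


def matching_blacklist(source_ip):
    """Return the ID of the first blacklist that matches source_ip, or None."""
    address = ipaddress.ip_address(source_ip)
    for blacklist_id, network in BLACKLISTS.items():
        if address in network:
            return blacklist_id
    return None


# Packets whose source matches a blacklist would be discarded before reaching the CPU.
for ip in ("10.1.1.25", "10.2.2.200", "10.3.3.3"):
    print(ip, matching_blacklist(ip))
```
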

Configuring a Local Attack Defense Policy

  1. Create a local attack defense policy.

    1. Run the system-view command to enter the system view.
    2. Run the cpu-defend policy policy-name command to create an attack defense policy and enter its view.
    3. Configure attack source tracing.
      1. Run the auto-defend enable command to enable attack source tracing.
      2. Run the auto-defend trace-type { source-ip | source-mac | source-portvlan }* command to set the attack source tracing mode.
      3. Run the auto-defend protocol { all | { 8021x | arp | dhcp | icmp | igmp | tcp | telnet | ttl-expired | udp } * } command to set the packet type for attack source tracing.
      4. Run the auto-defend whitelist whitelist-number { acl acl-number | interface interface-type interface-number } command to configure a whitelist.
      5. Run the auto-defend action { deny [ timer time-length ] | error-down } command to enable punishment of attack sources and set the punishment action.
    4. Configure port attack defense.
      1. Run the auto-port-defend enable command to enable port-based attack defense.

        By default, the port attack defense function is enabled.

      2. Run the auto-port-defend protocol { all | { arp-request | arp-reply | dhcp | icmp | igmp | ip-fragment } * } command to set the packet type in port attack defense.

        By default, port attack defense is applicable to ARP Request, ARP Reply, DHCP, ICMP, IGMP, and IP fragment packets.

    5. Set the rate limit for protocol packets.

      Two rules control how protocol packets are sent to the CPU: car and deny. When both the car and deny rules are configured for the same protocol type, the rule configured later takes effect.

      • To enable CPCAR limiting for the packets sent to the CPU and set the threshold, run the car { packet-type packet-type | user-defined-flow flow-id } cir cir-value [ cbs cbs-value ] command.
      • To set the action taken on the packets sent to the CPU to discard, run the deny { packet-type packet-type | user-defined-flow flow-id } command.
    6. Run the blacklist blacklist-id acl acl-number command to create a blacklist.

      A maximum of eight blacklists can be configured in an attack defense policy.

      NOTE:

      Packets matching the ACL applied to a blacklist are discarded, regardless of whether the ACL contains a permit or deny rule.

  2. Apply the local attack defense policy.

    After a local attack defense policy is created, the policy must be applied.

    • Modular switches

      Both MPUs and LPUs have their own CPUs. Local attack defense policies are configured separately for MPUs and LPUs.

      Before creating and applying attack defense policies, check attack information on the MPUs and LPUs. If the attack information on the MPUs and LPUs is consistent, apply the same attack defense policy to the MPUs and LPUs; otherwise, apply different policies to them.

      1. Apply an attack defense policy to the MPU.
        1. Run the system-view command to enter the system view.
        2. Run the cpu-defend-policy policy-name1 command to apply the attack defense policy.
      2. Apply an attack defense policy to an LPU.
        NOTE:

        If an attack defense policy has been applied to all LPUs, it cannot be applied to the specified LPU. In a similar manner, if an attack defense policy has been applied to a specified LPU, it cannot be applied to all LPUs.

        • If all LPUs process similar services, apply an attack defense policy to all LPUs.

          Run the cpu-defend-policy policy-name2 global command to apply an attack defense policy.

        • If LPUs process different services, apply an attack defense policy to the specified LPU.
          1. Run the slot slot-id command to enter the slot view.
          2. Run the cpu-defend-policy policy-name2 command to apply an attack defense policy.

          An attack defense policy applied to a slot view takes effect only for the LPU in this slot.

    • Fixed switches
      • On a stand-alone switch:
        1. Run the system-view command to enter the system view.
        2. Run the cpu-defend-policy policy-name global command to apply the attack defense policy globally.
      • In a stack:
        1. Run the system-view command to enter the system view.
        2. Apply the attack defense policy.
          • To apply the attack defense policy to all stacked devices, run the cpu-defend-policy policy-name global command.
          • To apply the attack defense policy to the master device, run the cpu-defend-policy policy-name command.
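
As a rough illustration of the procedure above, the following Python sketch assembles the command sequence for a stand-alone fixed switch. The command keywords come from the steps in this section. The policy name, whitelist interface, timer value, and ACL number are hypothetical placeholders, and the quit command is assumed to return from the policy view to the system view. Check the command reference for your switch model before using any of these values.

```python
def build_policy_commands(policy_name, trace_types, protocols,
                          whitelist_interface, deny_timer, blacklist_acl):
    """Return the CLI commands that create and globally apply an attack defense policy."""
    return [
        "system-view",
        f"cpu-defend policy {policy_name}",
        # Attack source tracing
        "auto-defend enable",
        "auto-defend trace-type " + " ".join(trace_types),
        "auto-defend protocol " + " ".join(protocols),
        f"auto-defend whitelist 1 interface {whitelist_interface}",
        f"auto-defend action deny timer {deny_timer}",
        # Blacklist (the referenced ACL must already exist)
        f"blacklist 1 acl {blacklist_acl}",
        # Assumed: return to the system view, then apply the policy globally
        "quit",
        f"cpu-defend-policy {policy_name} global",
    ]


# All argument values below are hypothetical examples.
commands = build_policy_commands(
    policy_name="test",
    trace_types=["source-ip", "source-mac"],
    protocols=["arp", "dhcp", "icmp"],
    whitelist_interface="GigabitEthernet 0/0/1",
    deny_timer=300,
    blacklist_acl=2001,
)
print("\n".join(commands))
```

Port attack defense is left out of the sketch because it is enabled by default. CPCAR is also left out because this document recommends keeping the default CPCAR values.
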

Tasks Occupying CPU Resource

Task Name

Description

BUFM

Outputs debugging information.

1731

Implements the Y.1731 protocol stack, manages the protocol state machine, and maintains the protocol database.

_EXC

Processes system exception events.

_TIL

Monitors and processes deadloops caused by software exceptions.

AAA

Interacts with modules such as the UCM and RADIUS to process user authentication messages, and maintains authentication and authorization entries.

ACL

Controls access users.

ADPG

Maintains dynamic VLAN-related chip entries (adaptation layer task).

ADPT

Implements the EFM protocol stack, manages the protocol state machine, and maintains the protocol database.

age_task

Ages out MAC address entries.

AGNT

Implements the IPv4 SNMP protocol.

AGT6

Implements the IPv6 SNMP protocol.

ALM

Adds, clears, and manages alarm information.

ALS

Implements automatic laser shutdown.

AM

Manages IP address pools and addresses for modules such as DHCP.

AMCP

Synchronizes data from MPU to SPU (application layer protocol).

APP

Schedules Layer 3 services in a unified manner.

ARP

Implements the ARP protocol stack, manages the ARP state machine, and maintains the ARP database.

au_msg_hnd

Processes AU messages, which are used for MAC entry learning and delivery.

bcmC

Counts the number of packets on chip ports.

bcmD

Implements asynchronous message processing in chip drive software.

bcmR

Receives packets from the chip.

bcmT

Transmits packets to the chip.

bcmX

Transmits packets to the chip of specified type asynchronously.

bcmL2MOD.0

Learns MAC address entries.

BEAT

Sends and receives heartbeat packets to monitor inter-board communication.

BFD

Implements the BFD protocol stack, manages the protocol state machine, and maintains the protocol database.

bmLI

Scans interface status and notifies the application modules of interface status changes.

BOX

Outputs the data stored in the black box, including error and exception information generated during system operations.

BULK_CLASS

Manages the USB flash drive (operating system task).

BULK_CLASS_IRP

Manages USB I/O request packets (operating system task).

BusM A

Manages USB bus (operating system task).

CCTL

Collects and schedules performance data in batches.

CDM

Manages configuration data.

CFM

Recovers configurations.

CHAL

Completes functions at the hardware adaptation layer.

CKDV

Controls and manages the clock module.

CMD_Switching

Listens on sockets.

CMDA

Executes commands in batches.

cmdExec

Executes commands.

CSBR

Checks configuration consistency between the active and standby MPUs.

CSPF

Implements the CSPF protocol stack and completes path computation.

CssC

Handles cluster events.

CSSM

Implements cluster protocol stack and manages cluster status.

DEFD

Monitors traffic sent to the CPU and maintains CPU defense data.

DELM

Deletes MAC address entries in STP.

DEV

Manages hardware modules on the switch.

DEVA

Handles subcard hot swapping.

DFSU

Loads logic files.

DHCP

Implements the DHCP protocol stack and provides the functions such as DHCP snooping and DHCP relay.

DLDP

Implements the DLDP protocol stack, manages the protocol state machine, and maintains the protocol database.

DSMS

Processes environment alarms generated by the environment monitoring system.

EAP

Implements 802.1x authentication, MAC address authentication, and MAC address bypass authentication, manages the protocol state machine, and maintains the protocol database.

Ecm

Manages low-level inter-board communication.

EFMT

Sends 802.3ah test packets.

EHCD_IH

Drives USB host controller (operating system task).

ELAB

Manages electronic labels.

EOAM

Implements the EOAM 802.1ag protocol, manages the protocol state machine, and maintains the protocol database.

Eout

Outputs debugging information about the ECM task.

FBUF

Sends packets.

FCAT

Obtains the packets sent or received by the CPU for fault location.

FECD

Processes MOD synchronization messages.

FIB

Generates IPv4 forwarding entries on the control plane and delivers the entries to the forwarding plane to guide data forwarding.

FIB6

Manages IPv6 FIB entries, maintains software entries, and requests the hardware adaptation layer to maintain chip entries.

FM93

Outputs fault information.

FMAT

Manages faults.

FMCK

Detects device faults.

FMON

Monitors logic card failures.

frag_add

Synchronizes MAC entries from the hardware table to the software table, traverses the hardware table, and adds the MAC address entries that do not exist in the software table to the software table.

frag_del

Synchronizes MAC entries from the hardware table to the software table, traverses the software table, and deletes the MAC entries that do not exist in the hardware table from the software table.

FTPS

Offers the FTP service.

FTS

Receives packets. This task is created by FECD. After the driver receives packets that do not need to be processed by the super task, it sends the packets to the FTS task for processing.

GREP

Manages GRE forwarding entries in chip (adaptation layer task).

GTL

Manages common data such as memory and character strings.

GVRP

Implements the GVRP protocol stack, manages the protocol state machine, and maintains the protocol database.

HACK

Processes HA response messages.

HOTT

Manages hot swapping of interface cards.

HS2M

Synchronizes data between the active and standby MPUs to ensure high reliability.

HVRP

Implements the HVRP protocol stack, manages the protocol state machine, and maintains the protocol database.

IFNT

Processes interface status change events.

IFPD

Manages interfaces, maintains interface database, and processes interface status change events.

INFO

Receives and sends logs, traps, and debugging information generated by service modules.

IP

Schedules IP protocol tasks in a unified manner.

IPCQ

Retransmits IPC messages upon message transmission failures.

IPCR

Sends, receives, and distributes IPC messages to related service modules.

IPMC

Adapts to Layer 3 multicast protocols, responds to changes on the control plane, and issues forwarding entries.

ISSU

Provides smooth upgrade for firmware.

ITSK

Sends, receives, and distributes various protocol packets.

L2

Schedules Layer 2 services in a unified manner.

L2MC

Listens for IGMP/MLD packets on interfaces and implements fast join and leave for group member interfaces.

L2V

Manages VPLS and VLL services, maintains control plane data, and requests the adaptation layer to maintain forwarding entries in chip.

L3I4

Delivers IPv4 unicast forwarding entries from LPUs.

L3IO

Delivers entries of Layer 3 protocols, such as URPF and VRRP, to interface cards.

L3M4

Adapts to the ARP protocol on the MPU, delivers IPv4 unicast forwarding entries, and responds to the changes at the control plane.

L3MB

Adapts to Layer 3 protocols, such as URPF and VRRP, on the MPU, and delivers forwarding entries.

LACP

Implements the LACP protocol stack, manages the LACP state machine, and maintains the LACP database.

LCS

Manages licenses.

LCSP

Loads authorized features allowed by the license file.

LDP

Implements the LDP protocol stack and maintains the LDP LSP database.

LDRV

Synchronizes software versions between active and standby MPUs.

LDT

Implements the LDT protocol stack, manages the protocol state machine, and maintains the protocol database.

LHAL

Provides the hardware adaptation layer to shield hardware differences.

LINK

Schedules link layer tasks in a unified manner.

linkscan

Monitors the status of links.

LLDP

Implements the LLDP protocol stack, manages the LLDP state machine, and maintains the LLDP database.

LOAD

Loads the system image file and patch packages.

LSPA

Maintains LSP forwarding entries and instructs the hardware adaptation layer to maintain chip entries.

LSPM

Creates, updates, and deletes LSPs.

MCSW

Adapts to Layer 3 multicast protocols, responds to changes on the control plane, and issues forwarding entries.

MERX

Processes the packets received on the management interface.

MFF

Implements the MAC forced forwarding (MFF) function.

MFIB

Manages Layer 3 multicast forwarding entries.

MIRR

Implements port mirroring.

MOD

Manages, distributes, and reclaims module numbers.

MPLS

Implements MPLS protocol stack, and distributes, manages, and reclaims labels.

MSYN

Synchronizes MAC entries between cards.

MTR

Collects memory usage data at scheduled time.

mv_rxX

Handles packet receiving queues in CPU X (X is an integer ranging from 0 to 7).

NDIO

Delivers IPv6 unicast forwarding entries from LPUs.

NDMB

Adapts to the ND protocol on the MPU, issues IPv6 unicast forwarding entries, and responds to changes on the control plane.

NQAC

Acts as the NQA client to respond to and process NQA packets.

NQAS

Acts as the NQA server to respond to and process NQA events and packets.

NSA

Manages chip entries at the VRP NetStream adaptation layer.

NTPT

Implements the NTP protocol stack, manages the protocol state machine, and maintains the protocol database.

OAM

Implements the MPLS OAM protocol stack, manages the protocol state machine, and maintains the protocol database.

OAM1

Adapts to the OAM 802.1ag protocol, responds to protocol-layer changes, and responds to changes on the forwarding plane.

OAMI

Processes packets received from logic cards.

OAMT

Responds to protocol changes and maintains chip entries (adaptation layer task).

OS

Operating system task.

Ping

Quickly responds to ping packets.

PNGI

Provides fast ping reply on LPUs.

PNGM

Provides fast ping reply on MPUs.

Port

Processes chip debugging commands.

port_statistics

Collects port statistics.

PPI

Maintains interface status on chips (adaptation layer task).

PTAL

Implements redirection authentication, authentication and authorization, manages the protocol state machine, and maintains the protocol database.

QOSA

Manages QoS configurations and maintains chip entries.

QOSB

Delivers QoS entries to LPUs and maintains QoS entries.

RACL

Creates session table entries based on the initial TCP/UDP/ICMP packet, and monitors and ages out session table entries.

RDS

Implements the RADIUS protocol stack, manages the protocol state machine, and maintains the protocol database.

RMON

Monitors the system remotely.

root

System root task.

ROUT

Completes route learning for routing protocols, selects best routes, and delivers routes to the FIB.

RPCQ

Provides the remote procedure call function.

RRPP

Implements the RRPP protocol stack on interface cards, detects interface status quickly, and delivers hardware entries.

RSA

Calculates the RSA key.

RSVP

Implements the RSVP protocol stack and maintains the CR-LSP database.

RTMR

Manages scheduled tasks.

SAM

Delivers service entries to LPUs and maintains the entries.

SAPP

Manages application layer protocol dictionary and whitelist, maintains software entries and instructs the adaptation layer to set chip status.

SDKD

Detects the status of the interfaces connected to the backplane and collects the packet rate on the interfaces.

SDKE

Displays LSW chip entries.

SECB

Delivers security entries to LPUs and maintains the security entries.

SECE

Implements security functions such as ARP security, IP security, and CPU security, manages the protocol state machine, and maintains protocol databases.

SERVER

TCP/IP server task.

SFPM

Queries manufacturer information and digital diagnosis information of optical modules.

SLAG

Implements the E-Trunk function.

SMAG

Smart link agent that can quickly detect and process interface status change events.

SMLK

Implements the Smart Link protocol stack, manages the protocol state machine, and maintains the protocol database.

smsL

Loads the environment monitoring module.

smsR

Sends environment monitoring requests.

smsT

Enables the environment monitoring system to send packets.

SNPG

Listens on and processes IGMP and MLD protocol packets.

SOCK

Schedules and processes IP packets.

SRMI

Processes external interrupts.

SRMT

Device management timer task.

SRVC

Processes DHCP packets related to IP sessions, and interacts with the user management module and AAA module to complete authorization and accounting.

STFW

Super forwarding task that maintains forwarding entries in the trunk memory.

STND

Assists the operating system in task and event scheduling.

STP

Implements the STP protocol stack, manages the STP state machine, and maintains the STP database.

STRA

Monitors traffic, identifies attacking traffic, and punishes attack sources.

STRB

Monitors LPUs and identifies attack traffic.

SUPP

Processes interrupt messages and timer messages in the device management module.

t1

Temporary task (operating system task).

TACH

Implements the HWTACACS protocol stack, manages the protocol state machine, and maintains the protocol database.

TAD

Transmits traps.

TARP

Processes trap messages.

tBulkClnt

Manages the USB driver (operating system task).

TCPKEEPALIVE

Maintains TCP connections.

TCTL

Controls the upload of batch collected performance data.

tDcacheUpd

Updates the disk cache (operating system task).

tExcTask

Handles exceptions (operating system task).

TICK

Processes the system clock.

tLogTask

Processes logs (operating system task).

TM

Maintains chip entries for the access service.

tNetTask

Processes network-related events (operating system task).

TNLM

Manages tunnels.

TNQA

Schedules NQA client tasks in a unified manner.

TRAF

Collects statistics on VLL, VPLS, and L3VPN.

TRAP

Processes trap messages.

tRlogind

Enables remote login to virtual terminals (operating system task).

tTelnetd

Telnet server task (operating system task).

TTNQ

Schedules NQA server tasks in a unified manner.

tUsbPgs

Device management task that manages USB plug-in and plug-out (operating system task).

tWdbTask

Debugging proxy task (operating system task).

U 34

Processes user's commands.

UCM

Interacts with the AAA module to process user status and maintain user entries.

UDPH

UDP Helper

USB

USB-based upgrade task.

usbPegasusLib

USB host LIB (operating system task).

usbPegasusLib_IRP

USB host I/O LIB (operating system task).

UTSK

User framework task that optimizes protocol processing to ensure preferential processing of protocol packets.

VCON

Serial port redirection task.

VFS

Manages the virtual file system.

VIDL

Collects statistics on CPU usage of idle tasks.

VMON

Monitors system task running.

VOAM

Offers NQA VPLS MAC diagnosis.

VP

Receives and sends VP packets between boards.

VPR

Receives VP packets between boards.

VPRE

Processes VP messages.

VPS

Sends VP packets between boards.

VRPT

Timer test task.

VRRP

Implements the VRRP protocol stack, manages the VRRP state machine, and maintains the VRRP database.

VT

Virtual terminal task.

VT0

Authenticates the first login user and processes the user's commands.

VTRU

Processes the Up/Down events of V Trunk.

VTYD

Processes login requests of all users.

WEB

Implements Web authentication.

WEBS

Allows users to log in to the device through Web.

XMON

Traces system task running.

XQOS

Service quality task.
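
When reading a list of busy tasks, it can help to annotate each task name with its description from the table above. The following Python sketch is illustrative only: the `TASK_DESCRIPTIONS` dictionary holds a handful of rows copied from the table, and the `annotate` helper and its sample input are hypothetical names introduced here, not part of any Huawei tool.

```python
# Illustrative subset of the task table above; extend it with the rows you need.
TASK_DESCRIPTIONS = {
    "TICK": "Processes the system clock.",
    "VIDL": "Collects statistics on CPU usage of idle tasks.",
    "VRRP": ("Implements the VRRP protocol stack, manages the VRRP state "
             "machine, and maintains the VRRP database."),
    "VTYD": "Processes login requests of all users.",
    "tNetTask": "Processes network-related events (operating system task).",
    "tLogTask": "Processes logs (operating system task).",
}


def annotate(task_names):
    """Return (name, description, is_os_task) for each task name.

    Task names are matched exactly (for example, TM and tNetTask differ in
    case conventions), so no case folding is applied. Names that are not in
    the local dictionary get a placeholder description.
    """
    rows = []
    for name in task_names:
        description = TASK_DESCRIPTIONS.get(name, "Not in the local table.")
        is_os_task = "(operating system task)" in description
        rows.append((name, description, is_os_task))
    return rows


if __name__ == "__main__":
    # Hypothetical list of task names, for example copied by hand from a CPU usage display.
    for name, description, is_os_task in annotate(["tNetTask", "VRRP", "ABCD"]):
        marker = "OS" if is_os_task else "--"
        print(f"{name:<10} [{marker}] {description}")
```

Keeping the descriptions in a plain dictionary makes it easy to add rows for the tasks that matter on a given switch model.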

CPU-related Tasks and Functions for Modular Switches

Task Name

Description

Reason for High CPU Usage

Solution

_EXC

Processes system exception events.

In normal cases, this task does not cause high CPU usage. The task is scheduled only when product or service exceptions occur.

-

_TIL

Monitors and processes deadloops caused by software exceptions.

In normal cases, this task does not cause high CPU usage. The task is scheduled only when a product or service task fails to be scheduled or a deadloop occurs.

-

1AGA

EOAM_1AG super task for delivering module events.

-

-

1AGAGT

EOAM_1AG super task for delivering module events.

-

-

AAA

Manages user authentication, authorization, and accounting.

Authentication, authorization, and accounting are performed for a large number of users.

Reduce the number of online users.

ACL

Controls access users.

Too many ACLs are delivered at a time.

Increase the interval between ACL configuration operations.

ADPT

Layer 2 adaptation task. Processes BFD VLANIF interface Down events and CFD logic interruption events, and sets the timer for the EFM module.

-

-

ALM

Adds, clears, and manages alarm information.

-

-

AM

Manages IP address pools and IP addresses for modules such as DHCP.

A large number of users apply for IP addresses.

Reduce the number of users who apply for IP addresses.

AMCP

Synchronizes data from MPUs to SPUs (application-layer management and control protocol).

-

-

APP

Centrally schedules Layer 3 service tasks.

Multiple tasks are performed to process many service messages.

Run the display utask-info utask-id slice-time command to check which UTASK task takes a long time.

APS

Processes Ethernet protection switching events.

-

-

ARPA

Processes ARP attack defense events.

Many ARP attacks are detected on the switch.

Filter out packets from unauthorized users on interfaces.

CWP_BUP

Processes MAP messages.

In normal cases, this task does not cause high CPU usage.

Decrease the service concurrency rate, and expand the system capacity or use high-performance main control units such as SRUH.

ASFI

Processes sFlow messages on LPUs.

sFlow sampling is configured on a large number of interfaces, and the sampling ratio or sampling interval is too small.

Deploy the sFlow service properly, and configure the sampling ratio and sampling interval based on actual traffic on the interfaces.

ASFM

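Processes sFlow messages on MPUs.

sFlow sampling is configured on a large number of interfaces, and the sampling ratio or sampling interval is too small.

Deploy the sFlow service properly, and configure the sampling ratio and sampling interval based on actual traffic on the interfaces.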

ASMN

Manages ASs in an SVF system.

-

-

bcmCNTR.0

Collects traffic statistics on chip 0.

-

-

bcmCNTR.1

Collects traffic statistics on chip 1.

-

-

bcmCNTR.2

Collects traffic statistics on chip 2.

-

-

bcmD

BCM debugging task.

A large amount of debugging information is printed.

-

bcmI

bcmINTR task that processes kernel interrupts.

Many kernel interrupts are reported.

-

bcmIbodSync.0

Resolves the buffer exceptions on HG interfaces of chip 0.

Synchronization is performed frequently.

-

bcmIbodSync.2

Resolves the buffer exceptions on HG interfaces of chip 2.

Synchronization is performed frequently.

-

bcmIpfixDma.0

Collects service traffic statistics on the Ipfix register of chip 0.

The register is frequently accessed.

-

bcmIpfixDma.2

Collects service traffic statistics on the Ipfix register of chip 2.

The register is frequently accessed.

-

bcmL2age.0

Ages out MAC address entries on chip 0.

-

-

bcmL2age.2

Ages out MAC address entries on chip 2.

-

-

bcmMEM_SCAN.0

Periodically checks the memory on chip 0.

-

-

bcmMEM_SCAN.1

Periodically checks the memory on chip 1.

-

-

bcmMEM_SCAN.2

Periodically checks the memory on chip 2.

-

-

bcmPortMon.0

Monitors status of ports on chip 0.

The status of a port changes frequently.

-

bcmPortMon.1

Monitors status of the FBUF port on chip 1.

The status of a port changes frequently.

-

bcmPortMon.2

Monitors status of ports on chip 2.

The status of a port changes frequently.

-

bcmXGS3AsyncTX

Synchronizes packet sending information.

-

-

BEAT

Sends and receives heartbeat packets to monitor inter-card communication.

-

-

BFD

Implements the BFD protocol stack, manages the protocol state machine, and maintains the protocol database.

A large number of BFD sessions flap.

Delete or shut down BFD sessions.

BFDA

BFD adaptation task that processes IPC messages as well as ARP and MAC address change messages.

-

-

BFDS

Processes BFD sending and detection timers and other events.

-

-

BOX

Exports the data stored in the black box, including error and exception information generated during system operations.

Errors, assertions, exceptions or deadloops occur on the device.

-

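BOX_Out

Exports the data stored in the black box, including error and exception information generated during system operations.

Errors, assertions, exceptions or deadloops occur on the device.

-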

BTRC

Traces internal debugging functions.

The trace function is enabled.

Disable the trace function.

BULK_CLASS_IRP

Manages USB I/O request packets (operating system task).

-

-

BusM A

Manages the USB bus (operating system task).

-

-

CAPM

Processes CAPWAP events.

There are too many online users.

Reduce the number of online users.

CCTL

Collects and schedules performance data in batches.

Data is being collected.

No action is required.

CHAL

Completes functions at the hardware adaptation layer.

-

-

CKDV

Controls and manages the clock module.

-

-

CLKI

Processes the timers, IPC messages, and interrupt messages in the clock module of the MPU.

-

-

CMDA

Executes commands in batches.

Many service commands are delivered in batches.

Reduce the number of commands delivered in batches.

co0

Serial port task.

User operations, especially input and output operations, are performed frequently. For example, commands are pasted into the terminal (input) or a large number of display commands are executed (output).

Reduce the frequency at which input and output operations are performed. This problem is automatically solved after the operations end.

COMT

Commits ACL configurations to APs.

A large number of APs go online concurrently.

Plan the network properly and avoid many concurrent online APs.

CSBR

Checks configuration consistency between the active and standby MPUs.

This task is rarely used and is unlikely to cause high CPU usage.

No action is required.

CSPF

Calculates paths for TE tunnels.

The TEDB for CSPF frequently changes.

Check whether the link or IGP flaps. If so, rectify the fault.

CSS

Sets up CSS systems and maintains the status and topology (main CSS task).

-

-

CSST

Tests CSS links and monitors the CSS link status.

-

-

CSSD

Delays bringing CSS ports Down so that CSS port status changes will not cause CSS split within a short time.

-

-

CSSF

Performs cross-version upgrades in a CSS system quickly.

-

-

CSSP

Sends and receives protocol packets in a CSS system.

-

-

CWP_DTLS

Performs DTLS encryption.

DTLS links are created or disabled, DTLS negotiation is performed, or APs set up DTLS links in batches.

This task is used when APs go online through DTLS links. However, it is rarely used. If this task causes high CPU usage, disable DTLS based on the network requirements.

LBS

Locates terminals and analyzes the spectrum of non-wireless devices.

The air scan interval is too short or the radio environment is complex.

Increase the air scan interval to a proper value by considering both the location precision and CPU usage.

DCPI

Monitors IP traffic (IP FPM).

Many configurations are enabled and the measurement interval is short.

Avoid many configurations and increase the measurement interval.

DEFD

Processes CPU defense events.

Too many packets are sent to the CPU.

Limit the rate of packets sent to the CPU.

DEVA

Loads and initializes FSUs, synchronizes entity trees, and performs active/standby switchovers (auxiliary device management task).

-

-

DFSU

Loads and initializes FSUs.

-

-

DIAG

Equipment module task on the MPU.

-

-

DLDP

Sends and receives DLDP protocol packets and manages the protocol state machine.

DLDP is enabled on too many interfaces and the interval at which DLDP packets are sent is too short.

  • Run the dldp interval command to adjust the interval at which DLDP packets are sent.
  • Disable DLDP on the interfaces that do not require the DLDP function.

DRVD

Processes diagnosis messages for the drive module.

-

-

DSMS

Processes environment alarms generated by the environment monitoring system.

-

-

EAP

Performs MAC address and 802.1X authentication.

Authentication is performed for a large number of MAC and 802.1X users.

Reduce the number of authentication users.

Ecm

Manages low-level inter-card communication.

-

-

EFMT

Sends 802.3ah test packets.

-

-

EHCD_IH0

Processes EHCI interrupts (VxWorks operating system task).

-

-

ELAB

Manages electronic labels.

-

-

EOAM

Implements the EOAM 802.1ag protocol, manages the protocol state machine, and maintains the protocol database.

The associated service flaps.

This task rarely causes high CPU usage. If the problem occurs, ensure that the associated service does not flap.

Eout

Exports debugging information about the ECM task.

-

-

ERPS

Initializes global ACL rules and registers events for ERPS (ERPS adaptation task).

-

-

ESAP

eSAP adaptation task.

There are too many online APs and users.

Reduce the number of online APs and users.

esm_recovery.0

Fixes soft errors on the extended TCAM of chip 0.

Soft errors occur in entries on extended chips.

Collect information about faulty entries and restart the card.

esm_recovery.2

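Fixes soft errors on the extended TCAM of chip 2.

Soft errors occur in entries on extended chips.

Collect information about faulty entries and restart the card.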

EZOP

Manages the EasyOperation function. This function is used to upgrade the software version and load configurations and patches in batches.

-

-

EZPP

Manages and processes EasyOperation packets.

-

-

FCAT

Obtains packets.

Too many packets are obtained and printed frequently.

-

FECD

Processes messages at the FECD layer.

Too much diagnostic information is printed.

-

FLOW

Performs traffic measurement.

Too much traffic needs to be collected and analyzed.

Disable the sFlow service in the case of heavy network traffic.

FMES

Exports device fault information and monitors the chip and CPLD status.

-

-

FNTL

Exchanges kernel-mode and user-mode packets (fast path task).

-

-

FTS_

Sends packets to and receives packets from the CPU.

A large number of protocol packets are sent to or received from the CPU.

Check whether attacks exist.

GEM

Manages general events.

This task is currently not executed.

No action is required.

GEMR

Manages general events.

This task is currently not executed.

No action is required.

GLRM

License adaptation task that registers license-controlled items.

-

-

GREI

GRE module adaptation task on the LPU.

-

-

GREM

GRE module adaptation task on the MPU.

-

-

GRES

Task corresponding to the label and token resource modules.

CPU usage is high because of an application that requests resources. The GRESM task itself usually does not cause high CPU usage.

Check whether the service that applies for labels or tokens flaps.

GRSA

Creates RSA and DSA key pairs.

GTL

Manages common data such as memory and character strings.

This task will not cause high CPU usage.

No action is required.

GVRP

Receives and sends GVRP packets, and processes internal messages of the GVRP protocol.

The CPU usage is high due to a large number of VLANs that need dynamic GARP registration or a large network radius.

Increase the GARP timer values.

HVRP

Processes HVRP command lines, sends and receives packets, and processes timer messages.

-

-

IFAD

Processes IPC messages delivered by the VCT.

VCT detection is performed frequently.

-

IFLP

Collects traffic statistics on a management interface periodically.

A large number of interfaces are configured and the measurement interval is short.

-

IFNT

Processes interface status change events.

The interface flaps frequently.

-

IFPD

Manages interfaces, maintains interface database, and processes interface status change events.

There are a large number of interfaces, interface link status flaps, or optical modules become faulty.

-

IFWL

Processes wireless interface-related events.

A large number of APs go online and offline, a large number of AP interfaces change, or a large number of STAs go online or offline concurrently.

-

INPT

Serial port task.

-

-

IPCK

Processes the received IPC messages and sends ACK messages to the peer.

The processing is simple, so this task will not cause high CPU usage.

-

IPCQ

Retransmits IPC messages upon message transmission failures.

Retransmissions are infrequent, so this task will not cause high CPU usage.

-

IPCR

Sends, receives, and distributes IPC messages to related service modules.

-

-

IPFP

Monitors IP traffic (IP FPM).

Many configurations are enabled and the measurement interval is short.

-

IS2U

ISSU function adaptation task.

-

-

ISC6

Processes commands of IPsec6 and encrypts packets.

This task will not cause high CPU usage.

-

ITSK

Sends, receives, and distributes various protocol packets.

A large number of protocol packets are sent and received.

-

JOB

Maintenance assistant task.

When the maintenance assistant meets the trigger conditions, the CPU usage will be high if many commands in the script are executed in batches.

Reduce the number of commands in the script.

L2

Centrally schedules Layer 2 service tasks, and supports the MGR, ErrorDown, BPTNL, LNP, VCMP, MFLP, VLAN, and QinQ features.

LNP: There are too many interfaces.

VCMP: VLANs are frequently deleted or created.

BPTNL: A large number of packets are transparently transmitted.

LNP: This feature rarely causes high CPU usage. If the problem occurs, identify the cause of interface flapping and avoid frequent flapping.

VCMP: Avoid frequently creating or deleting VLANs.

BPTNL: Configure transparent transmission of protocol packets on interfaces.

L2_E

Main task of the EOAM feature.

The associated service flaps.

This task rarely causes high CPU usage. If the problem occurs, ensure that the associated service does not flap.

L2_P

Supports LACP, HGMP, 3AH, and ELMI features.

-

-

L2_R

Supports ERPS, RRPP, and SEP features.

Incorrect connections exist after a protocol is deployed and the device suffers from a TC packet attack.

Ensure that physical loops are closed.

L2_T

Supports the Eth-Trunk feature.

-

-

L2IF

Processes real-time backup and batch backup of MAC address and VLAN information.

-

-

L2PQ

Processes IPC messages of Layer 2 protocols.

-

-

L2V

Processes L2VPN services, including VLL and VPLS.

Flapping occurs on the public network. As a result, a large number of services send mapping packets, and connections are re-established.

Resolve the flapping on the public network.

L3I4

Processes Layer 3 IPv4 services on the LPU.

-

-

L3IO

Processes Layer 3 services in the common module on the LPU.

-

-

L3M4

Processes Layer 3 IPv4 services on the MPU.

-

-

L3MB

Processes Layer 3 services in the common module on the MPU.

-

-

LAGAGT

Agent task on the LPU for sending and receiving LACP negotiation packets.

A large number of LACP negotiation packets are received and LACP flaps frequently.

Analyze the configuration and traffic on interfaces, and verify that the Eth-Trunk service is normal.

LBDT

Sends, receives, and processes loopback detection packets.

LBDT is configured for many VLANs and interfaces.

Disable LBDT in some VLANs and on some interfaces.

WMT_PM

Collects PM performance data.

eSight collects AP data periodically.

Adjust the PM performance measurement interval.

LCSP

License adaptation task that registers license-controlled items.

-

-

LDCM

Command line task in the load module.

-

-

LDT

Sends, receives, and processes loop detection packets.

-

-

LDTP

Receives loop detection packets.

LDT is configured for many VLANs and interfaces.

Disable LDT in some VLANs and on some interfaces.

LHAL

Provides the hardware adaptation layer for LPUs to shield hardware differences.

-

-

LINK

Centrally schedules link layer tasks.

Multiple tasks are performed to process many service messages.

Run the display utask-info utask-id slice-time command to check which UTASK task takes a long time.

LLDP

Sends, receives, and processes LLDP protocol packets.

A switch receives a large number of LLDP protocol packets because it has too many LLDP neighbors.

Reduce the number of LLDP neighbors on the switch.

LNP

LNP protocol task.

-

-

LOAD

Loads the system image file and patch packages.

-

-

LRCV

Receives packets in the load module on the MPU.

-

-

LSPA

MPLS LSP process (MPLS LSP AGENT) task.

-

-

LSPM

Processes LSP services.

LDP, RSVP, or BGP LSPs flap frequently, triggering LSP creation and deletion.

Determine which type of LSPs flap, such as LDP, BGP, or RSVP LSP. LSP flapping is typically caused by IGP, BGP, or VPN route flapping.

LT0

Local Telnet task. This task is rarely used on live networks.

This task is rarely used on live networks.

No action is required.

MACL

Creates and updates MQC traffic policies.

Too many traffic policies are created and frequently updated.

Increase the interval between MQC traffic policy configuration operations.

MACRESTORE

Retrieves bottom-layer MAC software entries.

-

-

MAD

Processes MAD in direct mode.

-

-

MADP

Processes MAD in relay mode.

-

-

MCSF

Processes multicast entries delivered to SFUs.

Multicast entries are repeatedly updated due to route or port flapping.

Check whether route or port flapping occurs.

MDNS

Processes mDNS protocol packets.

A large number of mDNS packets are sent to the CPU.

Limit the rate of mDNS packets sent to the CPU, and check whether too many mDNS packets are caused by external attacks or network loops.

MERX

Processes the packets received on the management interface.

The management interface receives a large number of packets.

Rate limiting on the management interface protects against attacks that send a large number of packets.

METH

Redirection task for the management interface.

-

-

MFF

MFF task.

ARP-MFF packets are processed.

Configure the rate limit of ARP-MFF packets properly and deploy the attack defense function.

Mirr

Processes the mirroring service.

A large number of configurations are synchronized in the batch backup process.

Reduce the mirroring configurations.

MOD

Learns MAC address entries.

MAC address flapping or a hash conflict occurs.

-

MPSF

MPLS service adaptation task on the SFU.

-

-

NDIO

Layer 3 IPv6 adaptation task on the LPU.

-

-

NDMB

Layer 3 IPv6 adaptation task on the MPU.

-

-

MTR

Collects memory usage data at scheduled time.

-

-

NFPT

Manages scheduled tasks.

This task will not cause high CPU usage.

No action is required.

NQAF

Provides the NQA FTPR function.

The NMS frequently uses FTP to obtain the results of NQA test instances.

Decrease the frequency of operations.

NSA

Processes the NetStream service.

A large number of flows are sent to the CPU of an LPU.

Use flexible flows to decrease the number of flows.

NTLK

Netlink fast path for exchanging kernel-mode and user-mode messages.

-

-

NTPT

Provides the NTP clock synchronization function.

A large number of NTP attack packets are received.

Configure NTP authentication.

OAM

Implements the MPLS OAM protocol stack, manages the protocol state machine, and maintains the protocol database.

-

-

OAM1

Adapts to the OAM 802.1ag protocol, responds to protocol-layer changes, and responds to changes on the forwarding plane.

-

-

OAMI

Processes packets received from logical cards.

-

-

OAMT

Responds to protocol changes and maintains chip entries (adaptation layer task).

-

-

OS

Operating system virtual task.

This task will not cause high CPU usage.

-

PARITY_CHECK

Detects soft errors in entries.

Soft errors occur in entries.

-

PATC

Manages patches.

-

-

PCAI

Processes the iPCA service on LPUs.

-

-

PCAM

Processes the iPCA service on MPUs.

-

-

PGMC

XMPP-side connection task for free mobility.

-

-

PGMP

Manages free mobility policies.

-

-

PGMX

XMPP-side task for free mobility.

-

-

PMS

Uploads performance measurement files. This task is triggered when automatic upload of performance measurement files is enabled.

Generally, files are not uploaded frequently and the sizes of uploaded files are small. Therefore, this task will not cause high CPU usage.

No action is required.

PNGI

Processes the Layer 3 fast ping service on LPUs.

-

-

PNGM

Processes the Layer 3 fast ping service on MPUs.

-

-

POE

Checks whether PDs are present and checks the grading status and power control policies of the PDs.

-

-

POE+

Processes the PPPoE plus protocol.

A large number of PPPoE packets are sent to the CPU.

  • Reduce the number of PPPoE users.
  • Limit the rate of PPPoE packets sent to the CPU, and check whether too many PPPoE packets are caused by external attacks or network loops.

PPI

Maintains VLAN and MAC address data and delivers entries (L2 adaptation task).

Network loops or flapping occurs, or port security is configured on multiple ports.

  • Ensure that no network loop or flapping occurs.
  • Check whether the ports configured with port security alternate between Up and Down or switch VLANs frequently. If so, reduce the frequency of operations.

PPP

Processes the PPPoE protocol.

A large number of PPPoE packets are sent to the CPU.

  • Reduce the number of PPPoE users.
  • Limit the rate of PPPoE packets sent to the CPU, and check whether too many PPPoE packets are caused by external attacks or network loops.

PTAL

Processes Portal authentication.

A large number of HTTP packets for Portal authentication are sent to the CPU.

  • Reduce the number of authentication users.
  • Limit the rate of HTTP packets sent to the CPU, and check whether too many HTTP packets are caused by external attacks or network loops.

QOSA

Processes QoS services on MPUs.

Too many messages on MPUs are backed up to standby MPUs in the batch backup process.

Reduce QoS configurations.

QOSB

Processes QoS services on LPUs.

Too many messages on MPUs are backed up to standby MPUs in the batch backup process.

Reduce QoS configurations.

RACL

Processes reflective ACLs.

Many reflective ACLs are configured using commands and updated frequently.

Increase the interval between reflective ACL configuration operations.

RDS

Processes the RADIUS protocol.

A large number of RADIUS packets are sent to the CPU.

  • Reduce the number of authentication users.
  • Limit the rate of RADIUS packets sent to the CPU, and check whether too many RADIUS packets are caused by external attacks or network loops.

RMON

Monitors the system remotely.

This task will not cause high CPU usage.

No action is required.

root

System root task.

-

-

ROUT

Completes route learning for routing protocols, selects best routes, and delivers routes to the FIB.

A large number of multicast packets are received, and multicast entries are updated due to route changes or interface changes.

Configure multicast filtering policies.

RRPP

Implements the RRPP protocol stack on LPUs, detects interface status quickly, and delivers hardware entries.

A common FDB attack occurs.

Check whether hubs are introduced to the network.

SAM

Delivers authentication entries to the LPU.

A large number of users go online.

Reduce the number of authentication users.

SAPP

Manages the application layer protocol dictionary and whitelist, maintains software entries, and instructs the adaptation layer to set chip status.

This task will not cause high CPU usage.

No action is required.

SCFT

Shields commands at the link layer.

This task currently processes no messages.

-

SDKD

Detects HG interconnection interfaces.

An error occurs in detection task processing.

-

SDKE

SDK diagnosis task.

Too much diagnostic information is printed.

-

SECB

Task corresponding to the security module on the LPU.

A large number of protocol packets are sent to the CPU of the LPU.

-

SECE

Implements security functions such as ARP security, IP security, and CPU security, manages the protocol state machine, and maintains protocol databases.

A large number of protocol packets are sent to the CPU.

-

SEPP

Processes the received IPC messages and configures instance status (SEP proxy task).

-

-

SIMC

Simulates high CPU usage.

-

-

SIMU

Processes the task of simulating high CPU usage (main simulation task).

-

-

SLAG

Sends and receives packets of the E-Trunk feature.

A large number of E-Trunks are configured and status flapping occurs.

This task rarely causes high CPU usage. If the problem occurs, shut down E-Trunk member interfaces to avoid flapping.

SMac

Dynamically configures a static MAC address based on the active/standby status of the MPU.

-

-

SMAG

Smart Link proxy task that processes link-down and shutdown events.

-

-

SMLK

Processes Smart Link and Monitor Link protocols.

-

-

smsLoad

Processes loading events.

-

-

smsRqDeal

Processes the request messages sent from a CANbus node.

-

-

smsRsDeal

Processes the response messages sent from a CANbus node.

-

-

smsRx

Processes the response and request messages on the management interface sent from the CANbus.

-

-

smsTimer

SMS internal scheduled task.

-

-

smsTx

Processes the response and request messages sent from SMS to CANbus.

-

-

socdmadesc.0

Reads chip 0 information to the CPU through SBUSDMA.

-

-

socdmadesc.2

Reads chip 2 information to the CPU through SBUSDMA.

-

-

SPM

Manages the energy-saving function.

-

-

SPTM

Super task management.

-

-

SRVC

Processes DHCP packets related to IP sessions, and interacts with the user management module and AAA module to complete authorization and accounting.

A large number of DHCP packets are sent to the CPU or authentication is performed for a large number of concurrent users.

Configure the rate limit of protocol packets properly and deploy the attack defense function.

STFW

Super forwarding task that maintains forwarding entries in the Eth-Trunk memory.

Eth-Trunk member interfaces are frequently created and deleted.

Avoid creating or deleting Eth-Trunk member interfaces frequently.

STP

Implements the STP protocol stack, manages the STP state machine, and maintains the STP database.

Incorrect connections exist after STP is deployed and the device suffers from a TC packet attack.

Check the configuration and configure TC suppression.

STRA

Processes attack source tracing and port attack defense.

A large number of protocol packets are sent to the CPU.

Configure the rate limit of protocol packets properly and deploy the attack defense function.

SUPP

Processes interrupt messages and timer messages in the device management module.

-

-

TACH

Receives authentication, authorization, and accounting requests sent from the AAA module, and transmits the requests to the TACACS server. The TACACS server processes these requests and returns the results to the AAA module.

An attacker continuously sends authentication requests.

Filter out unauthorized IP addresses through methods such as configuring a firewall, to prevent access from the unauthorized IP addresses.

TARP

Provides the ARP-Ping detection function.

ARP-Ping detection is performed frequently.

Reduce the frequency of ARP-Ping detection.

TCBM

Monitors whether blocking occurs and sends IPCR, RPCQ, and IPC synchronization messages.

This task is a scheduled monitoring task and will not cause high CPU usage.

No action is required.

TCTL

Collects and schedules performance data in batches.

Information is being collected in batches.

-

TM

Distributes authentication entries.

A large number of users go online.

Reduce the number of authentication users.

TNLM

Manages tunnels.

Tunnel flapping occurs.

Identify the tunnel that flaps and isolate the source of the flapping.

TNQA

Provides the NQA client function.

Too many NQA test instances are configured and the execution period is too short.

Reduce the number of NQA test instances or increase the execution period.

TOPO

Manages SVF topologies.

-

-

UMBR

Discovers neighbors in an SVF system.

-

-

TPLS

Manages sessions. This task is used when a switch is interconnected with a controller.

-

-

TRAF

Collects statistics on VLL, VPLS, and L3VPN traffic. The switch supports only VPLS traffic measurement.

This task rarely causes high CPU usage. If the problem occurs, it is caused by VPLS service flapping due to interface or route flapping.

Resolve the VPLS service flapping problem.

TRAP

Processes trap messages.

A large number of traps are generated. For example, a large number of interfaces alternate between Up and Down states.

CPU usage returns to normal automatically once the number of generated traps stabilizes.

TRUN

Processes Eth-Trunk status change events and LACP protocol packets (Eth-Trunk adaptation task).

This task may have high CPU usage when there are a large number of Eth-Trunks, interface status flaps, or optical modules become faulty.

Ensure that the interfaces and optical modules are normal: Check whether an interface frequently alternates between Up and Down states according to log information and alarm information. If so, check whether the optical module on the interface is faulty or a non-Huawei-certified optical module is used. Additionally, analyze the configuration and traffic volume on the interface.

TTNQ

Provides the NQA server function.

Too many NQA test instances are configured and the execution period is too short.

Reduce the number of NQA test instances or increase the execution period.

TTVP

Provides the VPLS network detection function.

VPLS detection is performed frequently.

Reduce the frequency of VPLS detection.

TUNL

Processes control and configuration messages in the TUNNEL module.

The same source interface is configured for a large number of tunnels, and the interface status or configuration is checked. Keepalive is configured on a large number of GRE tunnel interfaces.

This task rarely causes high CPU usage. If the problem occurs, avoid configuring keepalive on many GRE tunnel interfaces.

UCM

Manages authentication users.

A large number of users go online.

Reduce the number of authentication users.

UDPM

Processes the Layer 3 UDP Helper service on MPUs.

-

-

USA

Processes the authentication service in the SVF and policy association scenario.

-

-

USBL

Loads software to a USB flash drive.

-

-

UTSK

User framework task that optimizes protocol processing to ensure preferential processing of protocol packets.

UTASK commands are registered and timers are created during device registration. After device registration is complete, this task no longer processes messages and will not cause high CPU usage.

No action is required.

UVMC

Manages templates on the controller.

-

-

VCLK

Wakes up the TICK clock.

-

No action is required.

VCMP

Runs the VCMP protocol.

-

-

VCON

Serial port redirection task on the LPU.

-

-

VFSD

Clears junk data from the file system periodically.

This task has a low priority.

No action is required.

VMON

Monitors system task running.

Timer messages and command messages are processed in the VMON module. The processing logic is simple, so the task will not cause high CPU usage.

No action is required.

VMSH

Displays information about other cards in the VMON module.

Information is queried in the VMON module. The processing logic is simple, so the task will not cause high CPU usage.

No action is required.

VOAM

Provides the NQA VPLS MAC diagnostic function.

VPLS and MAC address detections are performed quickly and frequently.

Decrease the frequency of operations.

VRPT

Temporary timer test task during system startup. It stops after the system is properly started.

This task will not cause high CPU usage.

No action is required.

VRRP

Implements the VRRP protocol stack, manages the VRRP state machine, and maintains the VRRP database.

A large number of VRRP groups are configured and interface flapping occurs.

This task rarely causes high CPU usage. If the problem occurs, shut down the interfaces where VRRP is configured, to avoid flapping.

WADP

WLAN adaptation task.

A large number of APs go online and offline, a large number of AP interfaces change, or a large number of STAs go online or offline concurrently.

Replan the network and limit the number of online APs and users.

WMT_SYS

Manages WLAN components.

AP performance data is collected or messages are exchanged between WMNG modules.

If the CPU usage is high periodically, no action is required. If the CPU usage is high continuously, collect logs.

WEB

Web authentication service.

A large number of Portal authentication packets are sent to the CPU.

Limit the rate of Portal packets sent to the CPU, and check whether too many Portal packets are caused by external attacks or network loops.

WEB_

Adapts to the web system, including checking and decompressing the loaded web files.

-

-

WMT_SRV

WLAN component task that delivers configurations and backs up data in batches:

  • Processes configuration delivery messages (MAP and timer messages).
  • Processes CAPWAP messages.
  • Maintains status transition of the configuration delivery module.
  • Initializes the WESS, WQOS, and WGLB tasks.
  • Processes messages received on the radio module from other modules.
  • Processes messages reported by WVAP.
  • Processes the reported radio location information.
  • Processes HSB event notifications and HSB packets.
  • Notifies other modules of AP status changes.

  • Configurations are delivered when APs go online in batches.
  • Data is backed up in the case of dual-link HSB or VRRP.
  • The HSB service frequently flaps due to link flapping. As a result, batch deletion and batch backup are performed.
  • Backup is performed periodically.

This task is mainly used for HSB backup on WLAN products. It rarely causes high CPU usage on switches.

WMT_IDS

Detects wireless intrusions:

  • Checks validity of detection entries, processes detection entry mapping, and generates entries for countered rogue devices.
  • Generates attack detection entries and reports attack alarms.

A large number of APs are detected, or the detection interval is short.

This task rarely causes high CPU usage.

ArrmThread

Radio calibration task.

During radio calibration, neighbor information reported by APs is continuously processed. Because the algorithm is complex and the calculation workload is heavy, CPU usage becomes high.

Configure scheduled calibration during off-peak hours.

WLAN_AgeList

Ages out WPA and WPA2 users.

Packets are retransmitted due to WPA key negotiation timeout, and there are a large number of concurrent WPA users.

This task rarely causes high CPU usage.

WAPI_RCV_PKT

Receives WAPI authentication packets.

There are a large number of WAPI authentication users.

This task rarely causes high CPU usage.

WPA_AgeList

Provides a WLAN component aging mechanism and ages out users.

User deassociation, aging, and authentication timeout events are processed.

Bring users offline in batches.

XSTP

Processes internal messages in the XSTP agent, and sends and receives packets.

VBST is enabled on a large number of interfaces and VLANs.

Reduce the number of interfaces and VLANs that participate in VBST calculation.
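
The table above can be used as a lookup: given the name of a task that shows high CPU usage, find the documented reason and action. The following is a minimal Python sketch of that idea, using a few rows copied from the table. The names TaskInfo, TASKS, and advise are hypothetical and exist only for this illustration; they are not part of any Huawei tool or command.

```python
# Illustrative lookup built from a few rows of the task table above.
# TaskInfo, TASKS, and advise are assumptions made for this sketch.
from typing import NamedTuple, Optional


class TaskInfo(NamedTuple):
    description: str
    reason: Optional[str]  # None where the table shows "-"
    action: Optional[str]  # None where the table shows "-"


TASKS = {
    "UCM": TaskInfo(
        "Manages authentication users.",
        "A large number of users go online.",
        "Reduce the number of authentication users.",
    ),
    "UDPM": TaskInfo(
        "Processes the Layer 3 UDP Helper service on MPUs.",
        None,
        None,
    ),
    "XSTP": TaskInfo(
        "Processes internal messages in the XSTP agent, "
        "and sends and receives packets.",
        "VBST is enabled on a large number of interfaces and VLANs.",
        "Reduce the number of interfaces and VLANs that participate "
        "in VBST calculation.",
    ),
}


def advise(task_name: str) -> str:
    """Return the documented action for a task, or a fallback message."""
    info = TASKS.get(task_name)
    if info is None:
        return f"{task_name}: not in this excerpt of the table."
    if info.action is None:
        return f"{task_name}: no documented action."
    return f"{task_name}: {info.action}"


if __name__ == "__main__":
    for name in ("XSTP", "UDPM", "ABCD"):
        print(advise(name))
```

In practice, the task names would come from the CPU usage output of the switch; how that output is collected and parsed is outside the scope of this sketch.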

CPU-related Tasks and Functions for Fixed Switches

Task Name

Description

Reason for High CPU Usage

Solution

OSVT

Operating system virtual task.

This task will not cause high CPU usage.

No action is required.

POE

Manages the power over Ethernet (PoE), including PD class, power-on, and power-off.

The CPU usage is generally stable at about 8%. Too many reported interrupts may result in high CPU usage.

Run the display trapbuffer command to check whether traffic passes through PDs or whether PDs are frequently powered on and off, and then observe the CPU usage after the PDs are removed.

bcmCNTR.0

Collects traffic statistics on chip 0.

-

-

1AGA

EOAM_1AG super task for delivering module events.

-

-

port_statistics

Collects traffic statistics.

-

-

AAA

Manages user authentication, authorization, and accounting.

Authentication, authorization, and accounting are performed for a large number of users.

Reduce online users.

ACL

Creates and updates ACLs.

Too many ACLs are delivered at a time.

Increase the interval between ACL configuration operations.

ADPT

Processes BFD VLANIF interface Down events and CFD logic interruption events, and sets the timer for the EFM module.

-

-

ALM

Manages hardware fault alarms, including the temperature sensors, power modules, fan modules, and optical modules.

An alarm is generated on the device.

Run the display alarm all command to check whether alarms are frequently displayed, and clear the alarms.

ALS

Manages loss of signal (LOS) on interfaces, including LOS detection and handling.

-

-

AM

Manages IP address pools and IP addresses for modules such as DHCP.

A large number of users apply for IP addresses.

Reduce the number of users who apply for IP addresses.

APP

Centrally schedules Layer 3 service tasks.

Multiple tasks are performed to process many service messages.

Run the display utask-info utask-id slice-time command to check which UTASK task takes a long time.

ASFI

Processes sFlow messages on LPUs.

sFlow sampling is configured on a large number of interfaces, and the sampling ratio or sampling interval is too small.

Deploy the sFlow service properly, and configure the sampling ratio and sampling interval based on actual traffic on the interfaces.

ASFM

Processes sFlow messages on MPUs.

ASMN

Manages ASs in an SVF system.

-

-

BATT

Manages batteries.

-

-

BFD

Implements the BFD protocol stack, manages the protocol state machine, and maintains the protocol database.

A large number of BFD sessions flap.

Delete or shut down BFD sessions.

BFDA

BFD adaptation task that processes IPC messages as well as ARP and MAC address change messages.

-

-

BFDS

Processes BFD sending and detection timers and other events.

-

-

BOX_Out

Outputs information stored in the black box, which records the errors and exceptions that occur while the product is running. The black box provides only the mechanism for recording, querying, and obtaining information. Users need to record the information according to the functions provided by the black box.

Errors, assertions, exceptions or deadloops occur on the device.

No action is required.

BOX

BPDU

BPDU adaptation task that processes timer messages of the BPDU module.

-

-

BTRC

Traces internal debugging functions.

The trace function is enabled.

Disable the trace function.

CAPM

Processes CAPWAP events.

There are too many online users.

Reduce the number of online users.

CLKI

Processes the timers, IPC messages, and interrupt messages in the clock module of the MPU.

-

-

CSPF

Calculates paths for TE tunnels.

The TEDB for CSPF frequently changes.

Check whether the link or IGP flaps. If so, rectify the fault.

CWP_BUP

Processes MAP messages.

In normal cases, this task does not cause high CPU usage.

Decrease the service concurrency rate, and expand the system capacity or use high-performance main control units such as SRUH.

CWP_DTLS

Performs DTLS encryption.

DTLS links are created or disabled, DTLS negotiation is performed, or APs set up DTLS links in batches.

This task is used when APs go online through DTLS links. However, it is rarely used. If this task causes high CPU usage, disable DTLS based on the network requirements.

DCPI

Monitors IP traffic (IP FPM).

Many configurations are enabled and the measurement interval is short.

Avoid many configurations and increase the measurement interval.

DEFD

Processes CPU defense events.

Too many packets are sent to the CPU.

Limit the rate of packets sent to the CPU.

DEVA

Handles subcard hot swapping.

-

-

DLDP

Sends and receives DLDP protocol packets and manages the protocol state machine.

DLDP is enabled on too many interfaces and the interval at which DLDP packets are sent is too short.

  • Run the dldp interval command to adjust the interval at which DLDP packets are sent.
  • Disable DLDP on the interfaces that do not require the DLDP function.

EAP

Performs MAC address and 802.1X authentication.

Authentication is performed for a large number of MAC and 802.1X users.

Reduce the number of authentication users.

ECM

Manages Ethernet channels and maintains the channel status, validity, and channel switchover.

-

-

ECMM

Manages Ethernet channel configurations, including stack port configuration.

-

-

EDBG

Records the maintainability of Ethernet channels.

-

-

EFMT

Sends 802.3ah test packets.

-

-

EOAM

Implements the EOAM 802.1ag protocol, manages the protocol state machine, and maintains the protocol database.

The associated service flaps.

This task rarely causes high CPU usage. If the problem occurs, ensure that the associated service does not flap.

ESAP

eSAP adaptation task.

There are too many online APs and users.

Reduce the number of online APs and users.

EZOP

Task of the EasyOperation function.

The CPU usage becomes high for a period of time after a device starts, and then returns to the normal range.

-

EZPP

Packet sending and receiving task of the EasyDeploy function.

-

-

FCAT

Obtains packets.

Too many packets are obtained and printed frequently.

-

FECD

Processes messages at the FECD layer.

Too much diagnostic information is printed.

-

FLOW

Performs traffic measurement.

Too much traffic needs to be collected and analyzed.

Disable the sFlow service in the case of heavy network traffic.

FSP

Manages stack tasks and maintains stack topologies, links, and states.

-

-

GEM

Manages general events.

This task is currently not executed.

No action is required.

GEMR

GREI

GRE module adaptation task on the LPU.

-

-

GREM

GRE module adaptation task on the MPU.

-

-

GRES

Task corresponding to the label and token resource modules.

The CPU usage is high because of the application that requests resources. The GRESM task itself usually does not cause high CPU usage.

Check whether the service that applies for labels or tokens flaps.

GRSA

Creates RSA and DSA key pairs.

This task will not cause high CPU usage.

No action is required.

GVRP

Receives and sends GVRP packets, and processes internal messages of the GVRP protocol.

The CPU usage is high due to a large number of VLANs that need dynamic GARP registration or a large network radius.

Increase the timer value based on the product documentation.

HTPD

Processes the built-in Portal.

A large number of HTTP packets for Portal authentication are sent to the CPU.

  • Reduce the number of authentication users.
  • Limit the rate of HTTP packets sent to the CPU, and check whether too many HTTP packets are caused by external attacks or network loops.

IFAD

Processes IPC messages delivered by the VCT.

VCT detection is performed frequently.

Perform VCT detection properly.

IFLP

Collects traffic statistics on a management interface periodically.

A large number of interfaces are configured and the measurement interval is short.

Avoid collecting statistics on a large number of interfaces and increase the measurement interval.

IFNT

Processes interface status change events.

The interface flaps frequently.

Configure the interface status suppression.

IFPD

Manages interfaces, maintains interface database, and processes interface status change events.

There are a large number of interfaces, interface link status flaps, or optical modules become faulty.

Ensure that the interfaces and optical modules are normal:

Check whether an interface frequently alternates between Up and Down states according to log information and alarm information. If so, check whether the optical module on the interface is faulty or a non-Huawei-certified optical module is used. Additionally, analyze the configuration and traffic volume on the interface.

INPT

Serial port task.

-

-

IP

Schedules IP protocol tasks in a unified manner.

ND entries are frequently created and deleted, for example, when the IPv6 protocol flaps.

Decrease the frequency of operations.

IPCK

Processes the received IPC messages and sends ACK messages to the peer.

The processing logic is simple, so this task will not cause high CPU usage.

No action is required.

IPCQ

Retransmits IPC messages upon message transmission failures.

Retransmission is infrequent, so this task will not cause high CPU usage.

No action is required.

IPCR

Sends, receives, and distributes IPC messages to related service modules.

-

-

IPFP

Task of the IP FPM detection function.

A large number of instances are running simultaneously.

Reduce the number of configurations.

ISC6

Processes commands of IPsec6 and encrypts packets.

This task will not cause high CPU usage.

No action is required.

ITSK

Sends, receives, and distributes various protocol packets.

A large number of protocol packets are sent and received.

Reduce the number of received and sent packets, for example, by adjusting the CPCAR.

JOB

Maintenance assistant task.

When the maintenance assistant meets the trigger conditions, the CPU usage will be high if many commands in the script are executed in batches.

Reduce the number of commands in the script.

L2

Supports the MGR, ErrorDown, BPTNL, LNP, VCMP, MFLP, VLAN, and QinQ features.

  • LNP: There are too many interfaces.
  • VCMP: VLANs are frequently deleted or created.
  • BPTNL: A large number of packets are transparently transmitted.

  • LNP: This feature rarely causes high CPU usage. If the problem occurs, check the cause of interface flapping and avoid frequent flapping.
  • VCMP: Avoid frequently creating or deleting VLANs.
  • BPTNL: Configure transparent transmission of protocol packets on interfaces.

L2IF

Processes real-time backup and batch backup of MAC address and VLAN information.

-

-

L2PQ

Processes IPC messages of Layer 2 protocols.

-

-

L2V

Processes L2VPN services, including VLL and VPLS.

Flapping occurs on the public network. As a result, a large number of services send mapping packets, and connections are re-established.

Solve the problem of public network flapping.

L2_E

Main task of the EOAM feature.

The associated service flaps.

This task rarely causes high CPU usage. If the problem occurs, ensure that the associated service does not flap.

L2_P

Supports LACP, HGMP, 3AH, and ELMI features.

-

-

L2_R

Supports ERPS, RRPP, and SEP features.

Incorrect connections exist after a protocol is deployed and the device suffers from a TC packet attack.

Ensure that physical loops are closed.

L2_T

Supports the Eth-Trunk feature.

-

-

L3I4

Processes Layer 3 IPv4 services on the LPU.

-

-

L3IO

Processes Layer 3 services in the common module on the LPU.

-

-

L3M4

Processes Layer 3 IPv4 services on the MPU.

-

-

L3MB

Processes Layer 3 services in the common module on the MPU.

-

-

LAGA

Agent task on the LPU for sending and receiving LACP negotiation packets.

A large number of LACP negotiation packets are received and the LACP frequently flaps.

Analyze the configuration and traffic on interfaces, and verify that the Eth-Trunk service is normal.

LAGAGT

LBDT

Sends, receives, and processes loopback detection packets.

LBDT is configured for many VLANs and interfaces.

Disable LBDT in some VLANs and on some interfaces.

LDUP

Pre-loads files when a card is upgraded.

-

-

LGBF

Records log files of the drive module.

-

-

LINK

Centrally schedules link layer tasks.

Multiple tasks are performed to process many service messages.

Run the display utask-info utask-id slice-time command to check which UTASK task takes a long time.

LLDP

Sends, receives, and processes LLDP protocol packets.

A switch receives a large number of LLDP protocol packets because it has too many LLDP neighbors.

Reduce the number of LLDP neighbors on the switch.

LNP

LNP protocol task.

-

-

LOAD

Loading task that handles the following events: members joining and leaving a stack, load module management, load packet reception, packet retransmission upon timeout, ACK retransmission upon timeout, and timer events.

-

-

LSPA

MPLS LSP process (MPLS LSP AGENT) task.

-

-

LSPM

Processes LSP services.

LDP, RSVP, or BGP LSPs flap frequently, triggering LSP creation and deletion.

Determine which type of LSP flaps (LDP, BGP, or RSVP). LSP flapping is typically caused by IGP, BGP, or VPN route flapping.

MAC

Processes MAC address flapping.

A network loop occurs.

Remove the loop. For details, see Checking Whether the Problem Is Caused by Network Loop.

MACL

Processes traffic policies.

Many traffic policies are applied frequently.

Avoid adding or deleting rules when many traffic policies are applied.

MACRESTORE

Retrieves bottom-layer MAC software entries.

MAC address tables age out and cannot be sent to the CPU. A large number of software and hardware tables cannot be synchronized.

-

MAD

Processes MAD in direct mode.

-

-

MADP

Processes MAD in relay mode.

-

-

MBRB

Discovers neighbors in an SVF system.

-

-

MDNS

Processes mDNS protocol packets.

A large number of mDNS packets are sent to the CPU.

Limit the rate of mDNS packets sent to the CPU, and check whether too many mDNS packets are caused by external attacks or network loops.

MERX

Processes the packets received on the management interface.

The management interface receives a large number of packets.

Configure rate limiting on the management interface to defend against attacks that send a large number of packets.

METH

Management interface task.

-

-

MFF

MFF task.

ARP-MFF packets are processed.

Configure the rate limit of ARP-MFF packets properly and deploy the attack defense function.

MSYN

Synchronizes MAC entries between cards.

A network loop occurs.

Remove the loop. For details, see Checking Whether the Problem Is Caused by Network Loop.

MTR

Monitors memory information.

-

-

Mirr

Processes the mirroring service.

A large number of configurations are synchronized in the batch backup process.

Reduce the mirroring configurations.

NDIO

Layer 3 IPv6 adaptation task on the LPU.

-

-

NDMB

Layer 3 IPv6 adaptation task on the MPU.

-

-

NFPT

Manages scheduled tasks.

This task will not cause high CPU usage.

No action is required.

NTLK

Netlink fast path for exchanging kernel-mode and user-mode messages.

-

-

NTPT

Provides the NTP clock synchronization function.

A large number of NTP attack packets are received.

Configure NTP authentication.

OAM

Implements the MPLS OAM protocol stack, manages the protocol state machine, and maintains the protocol database.

-

-

OAM1

Adapts to the OAM 802.1ag protocol, responds to protocol-layer changes, and responds to changes on the forwarding plane.

-

-

PARITY_CHECK

Detects soft errors in entries.

Soft errors occur in entries.

-

PATC

Patch adaptation module.

-

-

PMS

Uploads performance measurement files. This task is triggered when automatic upload of performance measurement files is enabled.

Generally, files are not uploaded frequently and the sizes of uploaded files are small. Therefore, this task will not cause high CPU usage.

-

PNGI

Processes the Layer 3 fast ping service on LPUs.

-

-

PNGM

Processes the Layer 3 fast ping service on MPUs.

-

-

POE+

Processes the PPPoE plus protocol.

A large number of PPPoE packets are sent to the CPU.

  • Reduce the number of PPPoE users.
  • Limit the rate of PPPoE packets sent to the CPU, and check whether too many PPPoE packets are caused by external attacks or network loops.

PPI

Maintains VLAN and MAC address data and delivers entries (L2 adaptation task).

Network loops or flapping occurs, or port security is configured on multiple ports.

PS

Processes built-in Portal authentication.

A large number of users are authenticated by the built-in Portal authentication function.

Reduce the number of authentication users.

PTAL

Processes Portal authentication.

A large number of HTTP packets for Portal authentication are sent to the CPU.

  • Reduce the number of authentication users.
  • Limit the rate of HTTP packets sent to the CPU, and check whether too many HTTP packets are caused by external attacks or network loops.

RDS

Processes the RADIUS protocol.

A large number of RADIUS packets are sent to the CPU.

  • Reduce the number of authentication users.
  • Limit the rate of RADIUS packets sent to the CPU, and check whether too many RADIUS packets are caused by external attacks or network loops.

RMON

Delivers authentication entries to the LPU.

A large number of users go online.

Reduce the number of authentication users.

ROUT

Distributes authentication entries.

RPCQ

Manages authentication users.

RSVP

Processes the authentication service in the SVF and policy association scenario.

-

-

RTMR

Web authentication service.

A large number of Portal authentication packets are sent to the CPU.

Limit the rate of Portal packets sent to the CPU, and check whether too many Portal packets are caused by external attacks or network loops.

SAM

Delivers authentication entries to the LPU.

A large number of users go online.

Reduce the number of authentication users.

SAPP

Manages the application layer protocol dictionary and whitelist, maintains software entries, and instructs the adaptation layer to set chip status.

-

-

SECE

Implements security functions such as ARP security, IP security, and CPU security, manages the protocol state machine, and maintains protocol databases.

A large number of protocol packets are sent to the CPU.

Configure the rate limit of protocol packets properly and deploy the attack defense function.

SLAG

Sends and receives packets of the E-Trunk feature.

A large number of E-Trunks are configured and status flapping occurs.

This task rarely causes high CPU usage. If the problem occurs, shut down E-Trunk member interfaces to avoid flapping.

SMAG

Smart Link proxy task that processes link-down and shutdown events.

-

-

SMLK

Processes Smart Link and Monitor Link protocols.

-

-

SPM

Provides the intelligent power supply function to save energy.

-

-

SPTM

Manages the super timer.

-

-

SRM

Manages system resources, including fans, subcards, and power supplies.

-

-

SRMT

Device management timer task.

-

Replace the optical modules with Huawei-certified optical modules.

STFW

Super forwarding task that maintains forwarding entries in the Eth-Trunk memory.

Eth-Trunk member interfaces are frequently created and deleted.

Avoid creating or deleting Eth-Trunk member interfaces frequently.

STND

Assists the operating system in task and event scheduling.

This task has a low priority and the process logic is simple, so the task will not cause high CPU usage.

No action is required.

STP

Implements the STP protocol stack, manages the STP state machine, and maintains the STP database.

Incorrect connections exist after STP is deployed and the device suffers from a TC packet attack.

Check the configuration and configure TC suppression.

STRA

Processes attack source tracing and port attack defense.

A large number of protocol packets are sent to the CPU.

Configure the rate limit of protocol packets properly and deploy the attack defense function.

TACH

Receives authentication, authorization, and accounting requests sent from the AAA module, and transmits the requests to the TACACS server. The TACACS server processes these requests and returns the results to the AAA module.

An attacker continuously sends authentication requests.

Filter out unauthorized IP addresses through methods such as configuring a firewall, to prevent access from the unauthorized IP addresses.

TARP

Provides the ARP-Ping detection function.

ARP-Ping detection is performed frequently.

Reduce the frequency of ARP-Ping detection.

TCBM

Monitors whether blocking occurs and sends IPCR, RPCQ, and IPC synchronization messages.

This task is a scheduled monitoring task and will not cause high CPU usage.

No action is required.

TM

Distributes authentication entries.

A large number of users go online.

Reduce the number of authentication users.

TNLM

Manages tunnels.

Tunnel flapping occurs.

Identify the tunnel that flaps and isolate the source of the flapping.

TNQA

Provides the NQA client function.

Too many NQA test instances are configured and the execution period is too short.

Reduce the number of NQA test instances or lengthen the execution period.

TOPO

Manages SVF topologies.

-

-

TPLA

Manages SVF templates, and configures delivery and computing for ASs.