High CPU Usage on S Series Switches
- Introduction
- Prerequisites
- Knowledge About High CPU Usage
- CPU and CPU Usage Overview
- CPU and CPU Usage Working Mechanism
- How to Locate the High CPU Usage Problem
- How to Fix the High CPU Usage Problem
- How to Relieve CPU Load
- Appendix
- Related Information
Introduction
This document describes the CPU and CPU usage on Huawei S series switches and explains how to locate and rectify faults when CPU usage is high. It also provides typical examples and references for maintenance engineers.
Prerequisites
The functions and commands supported may differ between models. This document uses V200R007 as an example. For the functions and commands available on your switch, see the documentation for that switch.
Knowledge About High CPU Usage
This section covers the background on high CPU usage on switches: its impact, its causes, how to locate faults, how to lower CPU usage, and how to prevent high CPU usage.
CPU and CPU Usage Overview
CPU - The Core of a Switch
A switch uses the distributed architecture, including forwarding and control planes. The forwarding plane implements Layer 2 and Layer 3 forwarding; the control plane implements forwarding control.
As shown in Figure 1-1, the control plane uses a general-purpose embedded CPU and the forwarding plane uses a forwarding chip:
- The forwarding chip implements Layer 2 and Layer 3 forwarding, for example, updating the MAC address table for Layer 2 forwarding and the Layer 3 forwarding table for IP forwarding. The forwarding chip forwards data with high throughput.
- The CPU maintains software entries, such as routing and ARP entries, and configures the hardware Layer 3 forwarding table in the chip based on the software forwarding entries. The CPU can also provide software-based Layer 3 forwarding, but its processing capability is low compared with that of the forwarding chip.
Packets on a network can be classified into control packets and data packets depending on their functions. If a switch does not have any hardware forwarding entry, the first packet reaching the switch is forwarded by the CPU and a Layer 3 forwarding hardware entry is created. The follow-up packets enter the forwarding chip through the inbound interface. Figure 1-2 shows this process.
- Flow 1 (data packets) is sent out by the forwarding chip, and does not pass the CPU. The flow processing does not consume CPU resources.
- Flow 2 (control packets and some data packets) is sent to the CPU through the forwarding chip. The CPU determines whether to send the flow out or terminate it. Flow 2 consumes CPU resources and cannot be forwarded at high speed.
The Layer 2 and Layer 3 hardware entries in the forwarding chip determine whether a switch can implement high-speed forwarding; however, the hardware entries in the forwarding chip are created based on the software entries maintained in the CPU. Therefore, the CPU is the core of a switch.
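The first-packet behavior described above can be illustrated with a minimal sketch. It is a toy model, not the switch's implementation: the class name `ToySwitch`, the exact-match lookup, and the interface name are assumptions made for illustration only.

```python
class ToySwitch:
    """Toy model: the CPU owns software entries, the chip owns hardware entries."""

    def __init__(self, software_routes):
        self.software_routes = dict(software_routes)  # maintained by the CPU
        self.hardware_table = {}                      # programmed into the forwarding chip
        self.cpu_packets = 0                          # packets that consumed CPU resources

    def forward(self, dst_ip):
        # Fast path: a hardware entry exists, so the CPU is not involved.
        if dst_ip in self.hardware_table:
            return self.hardware_table[dst_ip]
        # Slow path: no hardware entry, so the packet is sent to the CPU.
        self.cpu_packets += 1
        port = self.software_routes.get(dst_ip)
        if port is not None:
            # The CPU creates the hardware entry from its software entry.
            self.hardware_table[dst_ip] = port
        return port


switch = ToySwitch({"10.1.1.2": "GE0/0/1"})  # hypothetical route and port
for _ in range(3):
    switch.forward("10.1.1.2")
print(switch.cpu_packets)  # only the first packet reached the CPU
```

Only the first packet of the flow takes the slow path; later packets hit the hardware entry, which is why the hardware entries determine forwarding speed while the CPU determines whether those entries exist.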
CPU Usage
After a switch starts, the CPU runs more than 200 active tasks to manage the switch and monitor Layer 3 entry learning. The number of tasks may vary by switch model. In addition, when more features are configured on a switch, more tasks run in the system.
CPU usage is the percentage of the amount of time a CPU spends processing non-idle tasks. It has the following characteristics:
- Constantly changing: A switch's CPU usage keeps changing with system operations and changes of the environment.
- Non-real-time: CPU usage data reflects CPU usage within a statistical period.
- Entity-relevant: CPU usage is calculated based on physical CPU. Generally, each service card on a switch has an independent physical CPU. Therefore, the CPU usages of different cards are calculated separately.
CPU usage reflects how much of a statistical period the CPU spends running tasks. In Figure 1-3, task A occupies the CPU for 10 ms, task B occupies the CPU for 30 ms, and the CPU is then idle for 60 ms. The same pattern repeats in the next 100 ms. Over this period, the CPU usage is (10 ms + 30 ms) / 100 ms = 40%. A high CPU usage indicates that the switch is running many tasks.
CPU usage is therefore directly related to CPU performance and is a key indicator of switch performance.
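The 40% figure from the Figure 1-3 example can be reproduced with a short calculation. This is a sketch of the arithmetic only; the function name and the dictionary layout are illustrative assumptions.

```python
def cpu_usage_percent(busy_ms_by_task, period_ms):
    """Return the share of a statistical period spent on non-idle tasks."""
    if period_ms <= 0:
        raise ValueError("period_ms must be positive")
    busy_ms = sum(busy_ms_by_task.values())
    return 100.0 * busy_ms / period_ms


# Task A runs for 10 ms and task B for 30 ms in each 100 ms period.
print(cpu_usage_percent({"A": 10, "B": 30}, period_ms=100))  # 40.0
```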
CPU and CPU Usage Working Mechanism
How Does a CPU Process Packets (Modular Switch)
Huawei switches forward data packets through the forwarding chip without involving the CPU. The following packets will be sent to the CPU for processing on a switch:
- Protocol packets to be terminated by the switch
All packets destined for the switch, including:
- Control packets of protocols, such as STP, LLDP, LNP, LACP, VCMP, DLDP, EFM, GVRP, and VRRP
- Route update packets of routing protocols, such as RIP, OSPF, BGP, and IS-IS
- SNMP, Telnet, SSH packets
- ARP and ND reply packets
- Packets requiring special processing
- ICMP packets carrying options
- IPv6 packets with hop-by-hop option
- IPv4/IPv6 packets with a TTL value less than or equal to 1
- Packets with the switch's local IP address as the destination address
- ARP/ND/FIB Miss packets
- Packets forwarded to the CPU by matching ACL
- Packets discarded by the deny action in ACL rules after the logging function is enabled
- Packets redirected to the CPU by traffic policies
- Multicast-related packets
- PIM, IGMP, MLD, and MSDP protocol packets
- Unknown IP multicast packets
- Packets related to other features
- DHCP packets
- ARP and ND broadcast request packets
- Layer 2 protocol packets forwarded through software by L2PT (Devices on two ends of a tunnel forward Layer 2 protocol packets through software, and intermediate devices forward these packets through chip.)
In Figure 1-4, multiple rate limiting operations are performed on the packets that are sent to the CPU of an MPU. For example, forwarding chips and SFU chips will limit the rate. The rate limiting ensures security of the MPU CPU.
In Figure 1-5, rate limiting on each chip or logic includes protocol-based rate limiting, queue-based rate limiting, and port-based rate limiting. The following table lists the default CPU rate limits on non-X1E LPUs of the S9300 running V200R007. To check the default CPU rate limiting configuration on other switch models and versions, run the display cpu-defend configuration all command.
Packet Type | Rate Limit on LPU (in kbit/s) | Rate Limit on MPU (in kbit/s) |
---|---|---|
802.1x, arp-miss, mpls-ping, nd, nd-miss, loopbacktest, nd-redirect | 64 | 64 |
smart-link, lacp, lldp, dldp, ttl-expired, mpls-ttl-expired, ntp, hw-tacacs, fib-miss, hgmp-bc, smlk-rrpp, hotlimit, mpls-vccv-ping, arp-request, arp-reply, arp-mff, vpls-arp | 64 | 128 |
eoam-3ah, mpls-one-label | 64 | 256 |
vpls-igmp, mpls-rsvp, ipmc-invalid, bpdu | 64 | 512 |
vrrp, bgp4plus, vrrp6, hvrp, ssh, ftp, snmp, gvrp, eoam-1ag-lblt, pppoe, hopbyhop, hgmp-mc, hgmp-uc, nac-nd, nd-snp-rs, nd-snp-rans, nd-snp-na, mad, nac-arp | 128 | 128 |
mpls-oam, igmp, pim, rip, telnet, tcp, fib-hit, rrpp, udp-helper | 128 | 256 |
stp, mld, unknown-multicast, bpdu-tunnel, ipmc-miss | 128 | 512 |
fib6-hit, mpls-fib-hit | 128 | 1024 |
icmp | 192 | 256 |
http, pimv6, icmpv6, easy-operation, eoam-1ag, heart-packet | 256 | 256 |
isis, ospf, ospf-hello, bgp, bfd, mpls-ldp, ripng, ospfv3, nac-dhcp, vpls-dhcp-request, vpls-dhcp-reply, nac-dhcpv6, ospfv3-uc | 256 | 512 |
dhcp-client, dhcpv6-request, dhcpv6-reply, radius, y1731 | 512 | 512 |
dhcp-server | 512 | 1024 |
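To make the effect of a per-protocol limit in kbit/s concrete, the following sketch applies a token bucket to packets of one protocol. The source does not state which algorithm the switch uses, so the token bucket, the class name `TokenBucket`, the burst size, and the assumption that 1 kbit/s equals 1000 bit/s are all illustrative.

```python
class TokenBucket:
    """Illustrative rate limiter; not the switch's actual implementation."""

    def __init__(self, rate_kbps, burst_bits):
        self.rate_bps = rate_kbps * 1000   # assumes 1 kbit/s = 1000 bit/s
        self.capacity = float(burst_bits)  # hypothetical burst allowance
        self.tokens = float(burst_bits)
        self.last_time = 0.0

    def allow(self, packet_bits, now):
        """Return True if the packet may be sent to the CPU at time `now` (seconds)."""
        elapsed = now - self.last_time
        self.last_time = now
        # Refill tokens for the elapsed time, up to the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_bps)
        if packet_bits <= self.tokens:
            self.tokens -= packet_bits
            return True
        return False  # packet exceeds the limit and is dropped


# arp-miss is limited to 64 kbit/s in the table above.
arp_miss = TokenBucket(rate_kbps=64, burst_bits=64000)
print(arp_miss.allow(64000, now=0.0))  # True: the burst allowance is used up
print(arp_miss.allow(512, now=0.0))    # False: no tokens left yet
print(arp_miss.allow(512, now=0.5))    # True: 32000 bits were refilled in 0.5 s
```

Packets that exceed the limit never reach the CPU, which is how rate limiting protects the MPU CPU during a burst.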
Queue ID on an LPU | Packet Type | Description |
---|---|---|
7 | lacp | Fast protocol packets (fast protocols have fast responses in interaction, for example, the response time of BFD is within 100 ms. The loss of a few packets will cause protocol flapping.) |
6 | vp (VRRP packets are moved from queue 5 to queue 6 in V200R010.) | Packets sent from an LPU's CPU to the MPU's CPU |
5 | stp, smart-link, ldt, lldp, dldp, vrrp, mpls-oam, isis, pim, rip, ospf, ospf-hello, bgp, bfd, mpls-rsvp, mpls-ldp, mpls-ttl-expired, ntp, ripng, ospfv3, bgp4plus, pimv6, vrrp6, hvrp, telnet, ssh, mpls-ping, gvrp, bpdu-tunnel, rrpp, eoam-3ah, eoam-1ag, eoam-1ag-lblt, nd, y1731, mpls-one-label, loopbacktest, bpdu, nap, hgmp-mc, hgmp-uc, hgmp-bc, nd-redirect, nd-snp-rs, nd-snp-rans, nd-snp-na, mad, smlk-rrpp, ospfv3-uc | Important control plane protocol packets |
4 | other | - |
3 | arp-request, arp-reply, dhcp-client, dhcp-server, gmp, vpls-igmp, icmp, 8021x, http, dhcpv6-request, dhcpv6-reply, icmpv6, mld, ftp, snmp, radius, hw-tacacs, tcp, easy-operation, fib-hit, fib-miss, arp-miss, unknown-packet, udp-helper, arp-mff, pppoe, hopbyhop, mpls-vccv-ping, fib6-hit, nd-miss, nac-dhcp, vpls-arp, vpls-dhcp-request, vpls-dhcp-reply, nac-arp, icmp-ttl-expired, mpls-fib-hit, nac-nd, nac-dhcpv6, heart-packet | Important control plane protocol packets |
2 | ttl-expired, hotlimit | Secondary control plane protocol packets |
1 | unknown-multicast, ipmc-invalid, ipmc-miss | Secondary control plane protocol packets |
0 | other | - |
Queue ID on an MPU | Packet Type | Description |
---|---|---|
7 | lacp | Fast protocol packets (fast protocols have fast responses in interaction, for example, the response time of BFD is within 100 ms. The loss of a few packets will cause protocol flapping.) |
6 | vp (VP packets are the same as those in the original protocol packet queue in V200R003 and later versions. VRRP packets are moved from queue 5 to queue 6 in V200R010.) | Packets sent from an LPU's CPU to the MPU's CPU |
5 | stp, smart-link, ldt, lldp, dldp, vrrp, mpls-oam, isis, pim, rip, ospf, ospf-hello, bgp, bfd, mpls-rsvp, mpls-ldp, mpls-ttl-expired, ntp, ripng, ospfv3, bgp4plus, pimv6, vrrp6, hvrp, telnet, ssh, mpls-ping, gvrp, bpdu-tunnel, rrpp, eoam-3ah, eoam-1ag, eoam-1ag-lblt, nd, y1731, loopbacktest, bpdu, nap, hgmp-mc, hgmp-uc, hgmp-bc, nd-redirect, nd-snp-rs, nd-snp-rans, nd-snp-na, mad, smlk-rrpp, ospfv3-uc | Important control plane protocol packets |
4 | other | - |
3 | arp-request, arp-reply, dhcp-client, dhcp-server, gmp, vpls-igmp, icmp, 8021x, http, dhcpv6-request, dhcpv6-reply, icmpv6, mld, ftp, snmp, radius, hw-tacacs, tcp, easy-operation, fib-hit, fib-miss, arp-miss, unknown-packet, udp-helper, arp-mff, pppoe, hopbyhop, mpls-vccv-ping, fib6-hit, nd-miss, nac-dhcp, mpls-one-label, vpls-arp, vpls-dhcp-request, vpls-dhcp-reply, nac-arp, icmp-ttl-expired, mpls-fib-hit, nac-nd, nac-dhcpv6, heart-packet | Important control plane protocol packets |
2 | ttl-expired, hotlimit | Secondary control plane protocol packets |
1 | unknown-multicast, ipmc-invalid, ipmc-miss | Secondary control plane protocol packets |
0 | sFlow, NetStream | Data packets or messages |
A switch determines into which CPU queues packets will be placed based on the packets' importance and plane (management, control, or forwarding plane). A CPU queue has a priority. For example, when both the Telnet management packets and dhcp-client protocol packets are sent to the CPU, the CPU first processes the Telnet management packets in queue 5. This mechanism ensures device stability and manageability under a heavy CPU load. The CPU can use a weighting mechanism to ensure that packets in low-priority queues can be processed. On a stable network, the number of packets sent to the CPU is limited within a specified range, and therefore the CPU usage remains within a proper range. If a large number of packets are sent to the CPU within a short period, the CPU is busy processing these packets, resulting in a high CPU usage.
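The Telnet versus dhcp-client example can be sketched as a strict-priority dequeue over eight queues. This is a simplified illustration that ignores the weighting mechanism mentioned above; the class and method names are hypothetical.

```python
from collections import deque


class CpuQueues:
    """Illustrative strict-priority model: a larger queue ID is served first."""

    def __init__(self, levels=8):
        self.queues = [deque() for _ in range(levels)]

    def enqueue(self, queue_id, packet):
        self.queues[queue_id].append(packet)

    def dequeue(self):
        """Return (queue_id, packet) from the highest non-empty queue, or None."""
        for queue_id in range(len(self.queues) - 1, -1, -1):
            if self.queues[queue_id]:
                return queue_id, self.queues[queue_id].popleft()
        return None


cpu = CpuQueues()
cpu.enqueue(3, "dhcp-client")  # queue 3 in the tables above
cpu.enqueue(5, "telnet")       # queue 5 in the tables above
print(cpu.dequeue())  # (5, 'telnet') is processed first although it arrived later
print(cpu.dequeue())  # (3, 'dhcp-client')
```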
How Does a CPU Process Packets (Fixed Switch)
Huawei switches forward data packets through hardware without involving the CPU. The following packets will be sent to the CPU for processing on a switch:
- Protocol packets to be terminated by the switch
All packets destined for the switch, including:
- Control packets of protocols, such as STP, LLDP, LNP, LACP, VCMP, DLDP, EFM, GVRP, and VRRP
- Route update packets of routing protocols, such as RIP, OSPF, BGP, and IS-IS
- SNMP, Telnet, SSH packets
- ARP and ND reply packets
- Packets requiring special processing
- ICMP packets carrying options
- IPv6 packets with hop-by-hop option
- IPv4/IPv6 packets with a TTL value less than or equal to 1
- Packets with the switch's local IP address as the destination address
- ARP/ND/FIB Miss packets
- Packets processed using ACLs
- Packets discarded by the deny action in ACL rules after the logging function is enabled
- Packets redirected to the CPU by traffic policies
- Multicast
- PIM, IGMP, MLD, and MSDP protocol packets
- Unknown IP multicast packets
- Other features
- DHCP packets
- ARP and ND broadcast request packets as well as the ARP packets sent when dynamic ARP inspection (DAI) is configured on a Layer 2 switch
- Layer 2 protocol packets forwarded through software by L2PT (Devices on two ends of a tunnel forward Layer 2 protocol packets through software, and intermediate devices forward these packets through hardware.)
- In N:1 VLAN mapping, the first packet is sent to the CPU, and other packets are forwarded by hardware.
A switch uses QoS mechanisms to prioritize packets sent to the CPU and ensure preferential processing of important packets. The switch groups the packets sent to the CPU into eight queues by priority. The types of packets sent to the CPU may vary between switch models. Table 1-4 and Figure 1-6 list typical packets that are sent to the CPU on the S5700LI. A larger queue ID indicates a higher priority.
Queue ID | Packet Type | Description |
---|---|---|
7 | IPC, RPC, LACP | Internal management packets |
6 | VP (VP packets are the same as those in the original protocol packet queue in V200R003 and later versions.) | Internally forwarded protocol packets |
5 | Telnet, SSH, LNP, DHCP | Management plane protocol packets |
4 | ARP Request | Important control plane protocol packets |
3 | STP, SMLK, EOAM, VCMP | Important control plane protocol packets |
2 | LBDT, LLDP, DLDP, IGMP, ICMP, NTP, 802.1x, GVRP, L2PT, ARP Miss, FTP, SNMP | Control plane protocol packets |
1 | Other | - |
0 | sFlow, NetStream | Data packets or messages |
A switch determines into which CPU queues packets will be placed based on the packets' importance and plane (management, control, or forwarding plane). A CPU queue has a priority. For example, when Telnet management packets and Layer 2 protocol packets transparently forwarded through L2PT are sent to the CPU, the CPU first processes the Telnet management packets in queue 5. This mechanism ensures device stability and manageability under a heavy CPU load. The CPU can use a weighting mechanism to ensure that packets in low-priority queues can be processed. On a stable network, the number of packets sent to the CPU is limited within a specified range, and therefore the CPU usage remains within a proper range. If a large number of packets are sent to the CPU within a short period, the CPU is busy processing these packets, resulting in a high CPU usage.
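The weighting mechanism mentioned above keeps low-priority queues from being starved. The source does not say which algorithm the switch uses; the sketch below assumes a simple weighted round robin, and the function name, weights, and packet labels are hypothetical.

```python
from collections import deque


def weighted_round_robin(queues, weights, budget):
    """Serve up to `budget` packets; per round, queue q sends at most weights[q] packets.

    `queues` maps queue ID to a deque of packets. Queues are visited from the
    highest ID to the lowest. Returns the served (queue_id, packet) pairs in order.
    """
    if any(weights[queue_id] < 1 for queue_id in queues):
        raise ValueError("every queue needs a weight of at least 1")
    served = []
    while len(served) < budget and any(queues.values()):
        for queue_id in sorted(queues, reverse=True):
            for _ in range(weights[queue_id]):
                if not queues[queue_id] or len(served) >= budget:
                    break
                served.append((queue_id, queues[queue_id].popleft()))
    return served


# Telnet is in queue 5 and L2PT is in queue 2 in Table 1-4.
queues = {5: deque(["telnet"] * 4), 2: deque(["l2pt"] * 4)}
order = weighted_round_robin(queues, weights={5: 3, 2: 1}, budget=6)
print([queue_id for queue_id, _ in order])  # [5, 5, 5, 2, 5, 2]
```

Queue 5 gets most of the CPU's attention, yet queue 2 is still served once per round, which is the behavior the weighting mechanism is meant to guarantee.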
Impact of High CPU Usage
The CPU on a switch will be overloaded if the forwarding plane sends packets to the CPU at high speeds (for example, the CPU receives a large number of packets within a short time due to a loop on the network) or a task consumes CPU resources for a long time. When this occurs, the CPU may be unable to process other tasks in a timely manner, which may cause exceptions in services.
High CPU usage adversely affects the system processing capability and may result in the following network problems:
- Nonresponse to management requests
- Failure to set up a Telnet or SSH session with the switch, causing a failure to manage the switch, slow response of the switch, or delay in command execution
- SNMP timeout
- Long delay or even timeout of MAC/IP ping operations
- DHCP or 802.1X service failures caused by the switch's failure to forward or respond to requests from clients
- Changes in the STP topology or even loops
A switch maintains root and alternate ports based on the BPDUs periodically received on its CPU. If the upstream device cannot send BPDUs in a timely manner because its CPU is busy or the switch's CPU is too busy to process received BPDUs, the switch considers the original path to the root bridge to have failed and selects a new root port, causing network reconvergence. If the switch also has an alternate port, the switch uses the alternate port as the new root port. In this situation, a loop may occur on the network.
- Changes in the routing topology
Hello packets of dynamic routing protocols are processed by the CPU. If the CPU is too busy to process the received Hello packets or send Hello packets, route flapping occurs. For example, OSPF flapping, BGP flapping, or VRRP flapping may occur in this situation.
- Flapping of reliability detection protocols
The CPU is responsible for keepalive of detection protocols such as 802.3ah, 802.1ag, DLDP, BFD, and MPLS OAM. If a busy CPU cannot transmit or receive protocol packets promptly, protocol flapping occurs, which affects service traffic forwarding.
- LACP Eth-Trunk link flapping
LACP packets are processed by the CPU. If the CPU is too busy to receive and send LACP packets, the Eth-Trunk link will flap between Up and Down states.
- Dropping of software forwarded packets or increasing delay in forwarding such packets
- Increased memory usage on the switch
Normal High CPU Usage Situations
A high CPU usage will cause service faults, for example, Border Gateway Protocol (BGP) route flapping, frequent Virtual Router Redundancy Protocol (VRRP) switchovers, or even user login failures. In some situations, a high CPU usage does not affect the network. For example, when a switch is reading optical transceiver information or traffic is bursting, the CPU usage may sharply increase. This is a normal and acceptable situation. Therefore, a high CPU usage may not be caused by faults. If a switch cannot process services for a long time, check whether a fault has occurred.
A high CPU usage resulting from the following events is normal and does not need to be handled. If the CPU usage can automatically restore to a normal range, you do not need to perform any operations.
- Traffic bursts.
- A card starts.
- The switch reads information about multiple optical transceivers simultaneously.
- The switch is calculating the spanning tree.
On a device running the Multiple Spanning Tree Protocol (MSTP), the CPU usage is proportional to the number of instances and active ports. On a device running VLAN-based Spanning Tree (VBST), each VLAN runs an independent instance. Therefore, VBST uses more CPU resources than MSTP when they have the same number of VLANs and active ports.
- The switch performs a large-scale routing table update after receiving route update messages.
When a switch receives a route update message, the switch updates routing information and delivers it to the control plane, which consumes CPU resources. In a cluster/stack system, the switch also needs to synchronize routing information to other member switches.
During routing table update, the following factors affect the CPU usage:
- Number of entries in the routing table
- Update frequency
- Number of routing processes receiving the update messages
- Number of member switches in a cluster/stack
- The switch is running the copy cfcard:/ command or outputting a large amount of debugging information.
- The NMS frequently operates the switch.
- Other events
- Fast MAC address learning on a port running the sticky MAC function
- Many ports are added to many VLANs (For example, a user performs configuration in a port group to add many ports to many VLANs or change link types of the ports.)
- The switch frequently receives a large number of IGMP request messages.
- The switch processes a large number of concurrent DHCP requests (For example, a switch that functions as a DHCP server restores connections with a large number of users.)
- ARP broadcast storm.
- Ethernet broadcast storm.
- Software forwarding of a large number of concurrent protocol packets (For example, L2PT transparently transmits a large number of BPDUs or the DHCP relay/snooping module forwards a large number of DHCP packets within a short time.)
- A large number of data packets cannot be forwarded through the forwarding chip and are sent to the CPU (such as ARP Miss).
- Ports alternate between Up and Down.
How to Locate the High CPU Usage Problem
- When the network access speed of a user is slow or the video service is intermittently interrupted, determine whether the problem was caused by a high CPU usage according to Figure 1-7.
- You can also check the CPU usage according to Figure 1-7 during routine operation.
Checking the Switch and Version Information
Run the display version and display device commands to check the switch version and component types. Record the information for follow-up operations.
- Run the display version command to view the switch software version.
# Run the display version command.
<HUAWEI> display version
Huawei Versatile Routing Platform Software
VRP (R) software, Version 5.160 (S7700 V200R007C00)
Copyright (C) 2000-2013 HUAWEI TECH CO., LTD
Quidway S7703 Terabit Routing Switch uptime is 0 week, 0 day, 1 hour, 3 minutes
BKP 0 version information:
1. PCB Version : LE02BAKB VER.A
2. Supporting PoE : No
3. Board Type : ES0B017712P0
4. MPU Slot Quantity : 2
5. LPU Slot Quantity : 3
……
The VRP (R) software, Version 5.160 field indicates that this is an S7700 switch running V200R007.
- Run the display device command to check the switch model, whether the switch is in a cluster/stack, and LPUs (only on modular switches).
# Run the display device command to check the component types and status.
<HUAWEI> display device
S7712's Device status:
Slot  Sub  Type          Online   Power     Register      Status  Role
-------------------------------------------------------------------------------
6     -    ES0D0X4UXC00  Present  PowerOn   Registered    Normal  NA
8     -    ES0D0F48TC00  Present  PowerOn   Registered    Normal  NA
9     -    ES0D0G24SC00  Present  PowerOn   Registered    Normal  NA
10    -    -             Present  PowerOff  Unregistered  -       NA
14    -    ES0D00SRUA00  Present  PowerOn   Registered    Normal  Master
PWR1  -    -             Present  PowerOn   Registered    Normal  NA
CMU1  -    LE0DCMUA0000  Present  PowerOn   Registered    Normal  Master
FAN1  -    -             Present  PowerOn   Registered    Normal  NA
FAN2  -    -             Present  PowerOn   Registered    Normal  NA
FAN3  -    -             Present  PowerOn   Registered    Normal  NA
FAN4  -    -             Present  PowerOn   Registered    Normal  NA
The preceding information shows that this is a stand-alone S7712, with the ES0D00SRUA00 (MPU), LE0DCMUA0000 (CMU), and ES0D0X4UXC00/ES0D0F48TC00/ES0D0G24SC00 (LPUs) installed.
Checking the CPU Usage
- Run the display cpu-usage command to view the CPU usage.
After several seconds, run the display cpu-usage command again to verify the CPU Usage field.
A switch is considered running normally if its long-term average CPU usage does not exceed 80% and its highest temporary CPU usage does not exceed 95%.
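The two thresholds above can be expressed as a simple check over a series of CPU usage samples. This is a sketch: the function name and the idea of collecting samples into a list are assumptions, while the 80% and 95% limits come from the rule stated above.

```python
def cpu_usage_is_normal(samples, average_limit=80.0, peak_limit=95.0):
    """Return True if the average does not exceed 80% and no sample exceeds 95%."""
    if not samples:
        raise ValueError("at least one sample is required")
    average = sum(samples) / len(samples)
    return average <= average_limit and max(samples) <= peak_limit


print(cpu_usage_is_normal([55, 60, 68]))  # True
print(cpu_usage_is_normal([55, 60, 96]))  # False: the peak exceeds 95%
print(cpu_usage_is_normal([85, 90, 88]))  # False: the average exceeds 80%
```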
Command | Command Description for Modular Switches | Command Description for Fixed Switches |
---|---|---|
display cpu-usage | Displays the CPU usage of the active MPU. NOTE: Generally, the CPU usage of a standby MPU will not be high, so it is not displayed. | Displays the CPU usage of the switch. |
display cpu-usage slot slot-id | Non-cluster: displays the CPU usage of the specified interface card. Cluster: displays the CPU usage of the cluster. | Non-stack: displays the CPU usage of the switch when the slot-id value is 0. Stack: displays the CPU usage of the switch specified by slot-id. |
# Check the CPU usage of a non-cluster modular switch.
<HUAWEI> display cpu-usage
CPU Usage Stat. Cycle: 10 (Second)
CPU Usage : 88% Max: 92%
CPU Usage Stat. Time : 2010-12-18 15:35:56
CPU utilization for five seconds: 68%: one minute: 60%: five minutes: 55%.
Max CPU Usage Stat. Time : 2015-01-27 10:08:10.
TaskName  CPU  Runtime(CPU Tick High/Tick Low)  Task Explanation
VIDL      82%  8/ 4c8b1ff                       DOPRA IDLE
OS        12%  1/2c684bff                       Operation System
...
The preceding information shows that the CPU usage of the switch reaches 88%.
Follow-up: Find out the tasks occupying high CPU usage and focus on the top 3 tasks (in V200R005 and later versions, the tasks are listed in a descending order of CPU usage). For details, see Determining Fault Causes According to CPU Usages of Tasks (Modular Switches) and Determining Fault Causes According to CPU Usages of Tasks (Fixed Switches).
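When the output is saved to a file, picking out the busiest tasks can be scripted. The following sketch parses text shaped like the sample output above; the regular expressions and the function name are assumptions, and the real output layout may differ between models and versions. The idle task VIDL is skipped because it counts idle time rather than load.

```python
import re

# A task row starts with a task name followed by a percentage, e.g. "OS 12% ...".
TASK_LINE = re.compile(r"^\s*(\S+)\s+(\d+)%(?:\s|$)")
# The overall figure appears as "CPU Usage : 88% Max: 92%".
TOTAL_LINE = re.compile(r"CPU Usage\s*:\s*(\d+)%")


def parse_cpu_usage(output, top_n=3, idle_tasks=("VIDL",)):
    """Return (overall_usage, top_tasks) parsed from display cpu-usage text."""
    total_match = TOTAL_LINE.search(output)
    total = int(total_match.group(1)) if total_match else None
    tasks = []
    for line in output.splitlines():
        match = TASK_LINE.match(line)
        if match and match.group(1) not in idle_tasks:
            tasks.append((match.group(1), int(match.group(2))))
    # Sort explicitly, because versions earlier than V200R005 may not sort the list.
    tasks.sort(key=lambda item: item[1], reverse=True)
    return total, tasks[:top_n]


SAMPLE = """\
CPU Usage Stat. Cycle: 10 (Second)
CPU Usage : 88% Max: 92%
TaskName  CPU  Runtime(CPU Tick High/Tick Low)  Task Explanation
VIDL      82%  8/ 4c8b1ff                       DOPRA IDLE
OS        12%  1/2c684bff                       Operation System
"""

print(parse_cpu_usage(SAMPLE))  # (88, [('OS', 12)])
```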
- Check whether related alarms have been reported on the NMS.
When a switch connects to an NMS, check whether there is a high CPU usage alarm on the NMS.
When the CPU usage exceeds the alarm threshold (set by the set cpu-usage threshold command in the system view, and the default CPU usage alarm threshold is 80%), the switch reports the following alarms to the NMS. Obtain the high CPU usage information according to the alarm messages.
- hwCPUUtilizationRising
- hwCPUUtilizationRisingAlarm
For details about the alarms, see Alarm Information.
- Check whether the log records a high CPU usage.
View the system log files or run the display logbuffer command to check whether the system has recorded logs about high CPU usage.
The system log may include the current or historical high CPU usage records.
Related log: VOSCPU/4/CPU_USAGE_HIGH. For details about this log, see Log Information.
Determining Fault Causes According to CPU Usages of Tasks (Modular Switches)
Run the display cpu-usage command to view the top 3 tasks occupying high CPU usage (in V200R005 and later versions, the tasks are listed in a descending order of CPU usage).
Find out the reason why CPU usage is high and the solution according to Table 1-5.
Task Name |
Description |
Reason for High CPU Usage |
Solution |
---|---|---|---|
AGNT |
Implements the IPv4 SNMP protocol stack and processes SNMP connection between the NMS and switch. |
NMS operations are frequently performed. |
Figure out a solution according to the network management events. Lower the rate at which the NMS sends requests or shield the requests from the NMS. |
AGT6 |
Implements the IPv6 SNMP protocol stack and processes SNMP connection between the NMS and switch. |
||
ARP |
Implements the ARP protocol stack, manages the ARP state machine, and maintains the ARP database. |
|
Adjust the CAR for packets sent to the CPU and aging time. |
bcmRx/bcmT/FTS/FBUF/VP/VPR/VPS/SOCK/ARPA |
Packet receiving/sending task |
When many protocol packets are sent to the CPU, the CPU usage of this task significantly increases. This is a major cause for high system CPU usage. The reasons why many protocol packets are sent to the CPU include:
|
|
bcmDPC |
Reports interrupts when chip failures occur. |
|
|
bcmL2MOD.0 |
Chip 0 MAC address entry learning task |
MAC address flapping or a hash conflict occurs. |
|
bcmL2MOD.2 |
Chip 2 MAC address entry learning task |
||
bmLINK.0 |
Chip 0 linkscan task, which scans interface status and notifies the application modules of interface status changes |
A large number of link interruptions are reported or miim access is time-consuming. Link interruptions are caused by LOS of optical modules. Non-certified optical modules and optical module failures will lead to many abnormal interruptions (non-standard optical modules will cause this situation). |
Replace the optical modules with Huawei-certified optical modules. |
bmLINK.1 |
Chip 1 linkscan task, which scans interface status and notifies the application modules of interface status changes |
||
bmLINK.2 |
Chip 2 linkscan task, which scans interface status and notifies the application modules of interface status changes |
||
CFM |
Configuration management task, which restores MPU configuration and interface configuration |
Configurations are restored. |
No action is required. |
CWP_CWP |
Distributes CAPWAP services, receives and distributes CAPWAP packets. |
High CPU usage occurs during message queue maintenance, packet distribution and statistics collection, or CAPWAP timer processing (retransmission, fragmentation, reassembly, and state machine), or when a large number of packets exist, traffic is sent continuously, or an attack occurs. |
Decrease the service concurrency rate, and expand the system capacity or use high-performance main control units such as SRUH. |
CWP_FWD |
Creates CAPWAP socket, receives and sends socket packets, and rapidly receives and sends packets. |
Traffic is continuously sent when there are a large number of CAPWAP control packets, or a CAPWAP attack exists. |
When more than 20 users connect to the switch concurrently, it is normal that the CPU usage of this task is within 15%. You can only expand the capacity to solve the problem. |
DEV/HOTT/FMCK/SRMI |
Device management task |
|
Confirm with Huawei switch resellers whether the hardware is faulty. For details, see Checking Whether the Problem Is Caused by a Hardware Failure. |
DHCP |
Implements the DHCP protocol stack and provides the functions such as DHCP snooping and DHCP relay. |
The CPU experiences a DHCP attack. |
For details, see Checking Whether the Problem Is Caused by a Network Attack. |
FIB |
Generates IPv4 software forwarding entries on the MPU and delivers the entries to the interface card to guide data forwarding. |
When a large number of routes are delivered, route flapping continuously occurs. |
No action is required. |
FIB6 |
Manages IPv6 FIB entries, maintains software entries, and requests the hardware adaptation layer to maintain chip entries. |
||
FMAT |
Trap management task, which processes the traps generated by all services |
A large number of traps are generated. For example, a large number of interfaces alternate between Up and Down states. |
The high CPU usage problem is automatically solved when the number of generated traps is stable. |
FTPS |
Provides the FTP server service and FC0 as well as FC1 services. |
The CPU usage of the FC task becomes high when large files are being transferred, for example, a large file is being transferred and even multiple large files are being transferred concurrently. |
The high CPU usage problem is automatically solved after file transfer ends. To prevent this problem, minimize concurrent transfer of multiple large files. |
HTTP |
Processes HTTP packets. |
The CPU usage becomes high when a large number of external HTTP packets are being processed, for example, web operations are frequently performed. |
Reduce the frequency of packet sending triggered by external operations. |
INFO |
Information center main task, which receives and outputs the logs, alarms, and debugging information generated by service modules |
When logs and debugging information are frequently triggered, frequently writing files to the CF card may also cause a high CPU usage due to poor performance of the CF card. |
Reduce the frequency at which operations triggered by logs and debugging information are performed. |
IP |
Schedules IP protocol tasks in a unified manner. |
A large number of IPv6 packets are received and sent. |
Reduce the number of received and sent IPv6 packets by, for example, adjusting the CPCAR. |
L2MC |
Adapts to and delivers Layer 2 multicast entries. This task is the multicast product LPU adaptation task. |
Layer 2 multicast entries are repeatedly updated due to ring network or port flapping. |
Check whether ring network or port flapping occurs. |
LDP |
Implements the LDP protocol stack and maintains LSP databases. |
Route flapping occurs. |
Prevent session flapping caused by route flapping. |
MCSW |
Multicast product adaptation task, which processes received and sent multicast packets and delivers Layer 3 multicast entries |
|
|
MFIB |
Manages Layer 3 multicast forwarding entries. |
A large number of data/registration packet entries are received, and interfaces frequently flap. |
Configure a policy to filter data, find the cause for flapping, and eliminate the flapping. |
MPSI |
MPLS service LPU adaptation task |
|
Check port flapping and protocol status. |
MPSM |
MPLS service MPU adaptation task |
||
PAT |
Manages patch operations, for example, load, activate, run, and delete patches. |
Patches are loaded to the standby MPU and LPUs. |
The CPU usage of the PAT task will increase for a while when patches are being loaded. Currently, no proper approach is available to solve this problem. To prevent patch loading from affecting services, do not perform batch service operations during patch loading. |
PM |
Performance management task, which processes performance statistics data and PM configuration commands |
When there are many PM configurations (a large amount of statistics data), performance data collection and processing are triggered. |
|
RSVP |
Implements the RSVP protocol stack and maintains the CR-LSP database. |
RSVP LSP flaps or a large number of RSVP packets are sent and received. |
RSVP LSP flapping is often caused by link or IGP flapping. You need to eliminate link or IGP flapping. If a large number of RSVP packets are sent and received, check whether there are invalid RSVP packets. |
SFPM |
Queries manufacturer information and digital diagnostic information of optical modules. |
There are non-certified optical modules on the switch, causing I2C failures. |
Replace non-certified optical modules with certified ones. |
SNPG |
Layer 2 multicast protocol stack task, which processes received and sent Layer 2 multicast packets and delivers Layer 2 multicast entries. |
|
|
VIDL |
Collects statistics on CPU usage of idle tasks. |
A larger value for this task indicates a lower CPU usage. |
The system calculates the CPU usage for this task based on the duration in which the task occupies CPU resources. Therefore, no action is required. |
VT0 |
Authenticates the user with the user ID 0 and processes commands. |
User operations, especially input and output operations, are frequently performed. For example, commands are copied to the screen (input) or a large number of display commands are executed (output). |
Reduce the frequency at which input and output operations are performed. This problem is automatically solved after the operations end. |
VT1 |
Authenticates the user with the user ID 1 and processes commands. |
||
VT2 |
Authenticates the user with the user ID 2 and processes commands. |
||
VTYD |
Processes login requests of all users. |
A large number of user input operations are performed, for example, commands are copied to the screen. |
Reduce the frequency at which input operations are performed. |
WMT_DEV |
Device management task:
|
During batch AP login/logout, upgrade, radio calibration, or terminal location, a large number of messages from APs are processed concurrently. |
Set the interval for scanning the air interface to a larger value and check whether APs frequently go offline. |
WMT_SEC |
User management task:
|
More than 20 users connect to the switch or are roaming simultaneously. |
When more than 20 users connect to the switch simultaneously, the CPU usage of this task is about 15% because CPU resources are used to process user access, authentication, and roaming. In this case, capacity expansion is required. |
We0 |
WebServer task. |
A large number of HTTP packets are processed. |
Run the undo http server enable command to disable web users from logging in through HTTP, and run the undo http secure-server enable command to disable web users from logging in through HTTPS. |
We1 |
|||
WT0 |
Web service processing task, which processes requests of all web users |
Operations are frequently performed on the web platform. |
Reduce the frequency at which web operations are performed. |
WT1 |
|||
WT2 |
|||
UCM/SAM |
Processes user login/logout and permission control. |
The number of concurrent users is large, or the users go online and offline frequently. |
Check whether a large number of users go online and offline and whether the authentication configuration is changed. |
If the top tasks on your switch are not included in the preceding table, see CPU-related Tasks and Functions for Modular Switches to find out which services caused the high CPU usage.
If the top tasks on your switch are not included in the preceding table or CPU-related Tasks and Functions for Modular Switches, contact Huawei switch resellers.
The preceding table is only a reference for you to locate a high CPU usage problem. To fix the problem, see How to Fix the High CPU Usage Problem.
Determining Fault Causes According to CPU Usages of Tasks (Fixed Switches)
Run the display cpu-usage command to view the three tasks with the highest CPU usage (in V200R005 and later versions, the tasks are listed in descending order of CPU usage).
Use Table 1-6 to find the reason for the high CPU usage and the corresponding solution.
Task Name | Description | Reason for High CPU Usage | Solution |
---|---|---|---|
VIDL | Collects statistics on CPU usage of idle tasks. | A larger value for this task indicates a lower CPU usage. | The system calculates the CPU usage for this task based on the duration in which the task occupies CPU resources. Therefore, no action is required. |
bmLINK.0 | Linkscan task, which scans interface status and notifies the application modules of interface status changes | A large number of link interruptions are reported or MIIM access is time-consuming. Link interruptions are caused by LOS of optical modules. Non-certified optical modules and optical module failures lead to many abnormal interruptions. | Replace the optical modules with Huawei-certified optical modules. |
linkscan | Same as bmLINK.0. | Same as bmLINK.0. | Same as bmLINK.0. |
AGNT | Implements the IPv4 SNMP protocol stack and processes SNMP connection between the NMS and switch. | NMS operations are frequently performed. | Analyze network management events and reduce the NMS request rate or block NMS requests if necessary. |
AGT6 | Implements the IPv6 SNMP protocol stack and processes SNMP connection between the NMS and switch. | Same as AGNT. | Same as AGNT. |
ARP | Implements the ARP protocol stack, manages the ARP state machine, and maintains the ARP database. | | Adjust the CAR for packets sent to the CPU and the aging time. |
CFM | Configuration management task, which restores MPU configuration and interface configuration | Configurations are restored. | No action is required. |
CWP_CWP | Distributes CAPWAP services, receives and distributes CAPWAP packets. | High CPU usage occurs during message queue maintenance, packet distribution and statistics collection, or CAPWAP timer processing (retransmission, fragmentation, reassembly, and state machine), or when a large number of packets exist, traffic is sent continuously, or an attack occurs. | Decrease the service concurrency rate, and expand the system capacity or use high-performance main control units such as SRUH. |
DEV/HOTT/FMCK/SRMI | Device management task | | Confirm with Huawei switch resellers whether the hardware is faulty. For details, see Checking Whether the Problem Is Caused by a Hardware Failure. |
CWP_FWD | Creates CAPWAP socket, receives and sends socket packets, and rapidly receives and sends packets. | Traffic is continuously sent when there are a large number of CAPWAP control packets, or a CAPWAP attack exists. | When more than 20 users connect to the switch concurrently, it is normal that the CPU usage of this task is within 15%. You can only expand the capacity to solve the problem. |
DHCP | Implements the DHCP protocol stack and provides the functions such as DHCP snooping and DHCP relay. | The CPU experiences a DHCP attack. | For details, see Checking Whether the Problem Is Caused by a Network Attack. |
ETHA | Ethernet packet distribution and processing task | A large number of protocol packets are sent to the CPU. | Configure the rate limit of protocol packets properly and deploy the attack defense function. |
EpldIntTask | Processes CPLD interrupts. | When many CPLD interrupts are generated, the workload of processing these interrupts becomes heavy, and the CPU usage of this task becomes high. | Check whether many CPLD interrupts are generated. |
FIB | Generates IPv4 software forwarding entries on the MPU and delivers the entries to the interface card to guide data forwarding. | When a large number of routes are delivered, route flapping continuously occurs. | - |
FIB6 | Manages IPv6 FIB entries, maintains software entries, and requests the hardware adaptation layer to maintain chip entries. | Same as FIB. | Same as FIB. |
FMAT | Trap management task, which processes the traps generated by all services | A large number of traps are generated. For example, a large number of interfaces alternate between Up and Down states. | The high CPU usage problem is automatically solved when the number of generated traps is stable. |
FTPS | Provides the FTP server service and FC0 as well as FC1 services. | The CPU usage of the FC task becomes high when large files are being transferred, for example, a single large file or multiple large files concurrently. | The high CPU usage problem is automatically solved after file transfer ends. To prevent this problem, minimize concurrent transfer of multiple large files. |
FTS | Upper-layer packet sending and receiving task | When many protocol packets are sent to the CPU, the CPU usage of this task significantly increases. This is a major cause for high system CPU usage. The reasons why many protocol packets are sent to the CPU include: | |
HTTP | Processes HTTP packets. | The CPU usage becomes high when a large number of external HTTP packets are being processed, for example, web operations are frequently performed. | Reduce the frequency of packet sending triggered by external operations. |
INFO | Information center main task, which receives and outputs the logs, alarms, and debugging information generated by service modules | Logs and debugging information are triggered frequently. Frequent writes to the CF card can also cause high CPU usage because the CF card performs poorly. | Reduce the frequency at which operations triggered by logs and debugging information are performed. |
INT | Processes CPLD interrupts sent by the kernel. | When many CPLD interrupts are generated, the workload of processing these interrupts becomes heavy, and the CPU usage of this task becomes high. | Check whether many CPLD interrupts are generated. |
LDP | Implements the LDP protocol stack and maintains LSP databases. | Route flapping occurs. | Prevent session flapping caused by route flapping. |
MCSW | Multicast product adaptation task, which processes received and sent multicast packets and delivers Layer 3 multicast entries | | |
MFIB | Manages Layer 3 multicast forwarding entries. | A large number of data/registration packet entries are received, and interfaces frequently flap. | Configure a policy to filter data, find the cause for flapping, and eliminate the flapping. |
MPSI | MPLS service LPU adaptation task | | Check port flapping and protocol status. |
MPSM | MPLS service MPU adaptation task | Same as MPSI. | Same as MPSI. |
PAT | Manages patch operations, for example, load, activate, run, and delete patches. | Patches are loaded to the standby MPU and LPUs. | The CPU usage of the PAT task will increase for a while when patches are being loaded. Currently, no proper approach is available to solve this problem. To prevent patch loading from affecting services, do not perform batch service operations during patch loading. |
PM | Performance management task, which processes performance statistics data and PM configuration commands | When there are many PM configurations (a large amount of statistics data), performance data collection and processing are triggered. | |
SFPT | Fixed switches' optical module processing task | There are non-certified optical modules on the switch, causing I2C failures. | Replace non-certified optical modules with certified ones. |
SNPG | Layer 2 multicast protocol stack task, which processes received and sent Layer 2 multicast packets and delivers Layer 2 multicast entries. | | |
SOCK | Schedules and processes IP packets. | When many protocol packets are sent to the CPU, the CPU usage of this task significantly increases. This is a major cause for high system CPU usage. The reasons why many protocol packets are sent to the CPU include: | |
VT0 | Authenticates the user with the user ID 0 and processes commands. | User operations, especially input and output operations, are frequently performed. For example, commands are copied to the screen (input) or a large number of display commands are executed (output). | Reduce the frequency at which input and output operations are performed. This problem is automatically solved after the operations end. |
VTYD | VTY daemon process, which handles all user login requests | A large number of user input operations are performed, for example, commands are copied to the screen. | Reduce the frequency at which input operations are performed. |
We0 | WebServer task. | A large number of HTTP packets are processed. | Run the undo http server enable command to disable web users from logging in through HTTP, and run the undo http secure-server enable command to disable web users from logging in through HTTPS. |
We1 | Same as We0. | Same as We0. | Same as We0. |
WT0 | Web service processing task, which processes requests of all web users | Operations are frequently performed on the web platform. | Reduce the frequency at which web operations are performed. |
bcmDPC | Reports interrupts when chip failures occur. | | |
bcmL2MOD.0 | Chip 0 MAC address entry learning task | MAC address flapping or a hash conflict occurs. | |
l2au | MAC learning task | MAC address flapping or a hash conflict occurs. | - |
l2sy | MAC synchronization task | Same as l2au. | Same as l2au. |
WMT_DEV | Device management task: | During batch AP going-online or going-offline, upgrade, radio calibration, or terminal location, a large number of messages from APs are processed. | Set the interval for scanning the air interface to a larger value and check whether APs frequently go offline. |
WMT_SEC | User management task: | More than 20 users connect to the switch or are roaming simultaneously. | When more than 20 users connect to the switch simultaneously, the CPU usage of this task is about 15% because CPU resources are used to process user access, authentication, and roaming. In this case, capacity expansion is required. |
UCM/SAM | Processes user login/logout and permission control. | The number of concurrent users is large, or the users go online and offline frequently. | Check whether a large number of users go online and offline and whether the authentication configuration is changed. |
If the top tasks on your switch are not included in the preceding table, see CPU-related Tasks and Functions for Fixed Switches to find out which services caused the high CPU usage.
If the top tasks on your switch are not included in the preceding table or CPU-related Tasks and Functions for Fixed Switches, contact Huawei switch resellers.
The preceding table is only a reference for you to locate a high CPU usage problem. To fix the problem, see How to Fix the High CPU Usage Problem.
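The lookup described above (take the busiest tasks, then consult Table 1-6) can be sketched in a few lines of Python. This is a minimal illustration, not a Huawei tool: it assumes the task names and CPU percentages have already been extracted from the display cpu-usage output into pairs, and the names `TASK_HINTS` and `top_busy_tasks` are hypothetical. Only a handful of rows from Table 1-6 are reproduced in the dictionary.

```python
# Hints condensed from a few rows of Table 1-6 (not the complete table).
TASK_HINTS = {
    "bmLINK.0": "Link interruptions or non-certified optical modules",
    "AGNT": "NMS operations are frequently performed",
    "DHCP": "Possible DHCP attack",
    "FTS": "Many protocol packets are sent to the CPU",
    "SOCK": "Many protocol packets are sent to the CPU",
    "l2au": "MAC address flapping or a hash conflict",
}

# VIDL is the idle task: a larger value means a lower real CPU usage,
# so it is excluded before ranking.
IDLE_TASK = "VIDL"


def top_busy_tasks(samples, count=3):
    """Return the `count` busiest non-idle tasks with a hint for each.

    `samples` is a list of (task_name, cpu_percent) pairs. How they are
    extracted from the switch output is outside this sketch (assumption).
    """
    busy = [(name, pct) for name, pct in samples if name != IDLE_TASK]
    busy.sort(key=lambda item: item[1], reverse=True)
    return [
        (name, pct, TASK_HINTS.get(name, "Not listed here; see the task reference"))
        for name, pct in busy[:count]
    ]


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    sample = [("VIDL", 62.0), ("SOCK", 21.0), ("bmLINK.0", 9.0),
              ("AGNT", 5.0), ("INFO", 1.0)]
    for name, pct, hint in top_busy_tasks(sample):
        print(f"{name:10s} {pct:5.1f}%  {hint}")
```

Excluding VIDL before sorting matters because on a lightly loaded switch the idle task would otherwise always rank first.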
How to Fix the High CPU Usage Problem
After determining the top tasks and reasons, analyze the root causes and take troubleshooting measures.
Checking Whether the Problem Is Caused by a Hardware Failure
If you determine that the problem is caused by a hardware failure according to Determining Fault Causes According to CPU Usages of Tasks (Modular Switches) or Determining Fault Causes According to CPU Usages of Tasks (Fixed Switches) (the DEV, HOTT, FMCK, or SRMI task has a high CPU usage), contact Huawei switch resellers for help.
If services are affected, reset the card that causes high CPU usage (powering off the card is recommended) to recover services temporarily.
Checking Whether the Problem Is Caused by a Network Attack
In some situations, network attacks may cause high CPU usage. Network attacks are initiated by hosts or network devices by sending a large number of forged packets to switches, affecting security and services on the target switches. When a network attack occurs, the switch is busy with the requests from the attack source. Therefore, some tasks occupy many CPU resources, causing a high CPU usage on the switch.
Common Network Attacks
Common network attacks, such as ARP, ARP Miss, and DHCP attacks, can cause a high CPU usage on a switch. These attacks are all initiated by sending a large number of protocol packets; therefore, packet statistics on the switch show a large number of packets sent to the CPU.
- ARP and ARP Miss attacks
  - ARP and ARP Miss flood
  - ARP spoofing
- DHCP protocol packet attack
- Other attacks
  - ICMP attack
  - DDoS attack
  - Broadcast attack
  - TTL expiry attack
  - IP packet attack initiated using the device's IP address as the destination IP address
  - SSH/FTP/Telnet attacks
Network Attack Locating
- Run the display version and display device commands to check the switch version and component types. Record the information for follow-up operations.
- Run the display cpu-defend statistics command to view statistics about the packets sent to the CPU, and determine whether too many protocol packets are discarded due to timeout.
- Run the reset cpu-defend statistics command to clear statistics about the packets sent to the CPU.
- After several seconds, run the display cpu-defend statistics command to view statistics about the packets sent to the CPU.
If there are too many packets of a protocol, determine whether it is normal depending on the networking. If not, there is a high probability that the switch is undergoing a protocol packet attack.
<HUAWEI> reset cpu-defend statistics
<HUAWEI> display cpu-defend statistics all
Statistics on slot 2:
-----------------------------------------------------------------------------------------------------------
Packet Type         Pass(Bytes)         Drop(Bytes)         Pass(Packets)         Drop(Packets)
-----------------------------------------------------------------------------------------------------------
arp-miss                      0                   0                     0                     0
arp-request               40800               35768                   600                 52600
bgp                           0                   0                     0                     0
...
-----------------------------------------------------------------------------------------------------------
The preceding information shows that the switch has discarded many ARP request packets. If these packets are abnormal, the switch is under an ARP attack.
- Configure the attack source tracing function to find out the attack source.
If a CPU is busy with many valid or attack packets, services may be interrupted. The switch provides the local attack defense function to protect the CPU. Local attack defense policies include attack source tracing, port attack defense, CPCAR, and blacklist.
- Create a local attack defense policy based on attack source tracing.
- Create an ACL and add the gateway IP address to the whitelist of attack source tracing.
<HUAWEI> system-view
[HUAWEI] acl number 2000
[HUAWEI-acl-basic-2000] rule 5 permit source 10.1.1.1 0 //10.1.1.1 is the gateway IP address.
[HUAWEI-acl-basic-2000] quit
- Create a local attack defense policy based on attack source tracing.
[HUAWEI] cpu-defend policy policy1
[HUAWEI-cpu-defend-policy-policy1] auto-defend enable //Enable attack source tracing. By default, this function is disabled.
[HUAWEI-cpu-defend-policy-policy1] undo auto-defend trace-type source-portvlan //Set the attack tracing mode to MAC + IP based. By default, attack source tracing is based on source MAC address, source IP address, source interface, and VLAN ID. To delete unneeded mode, run the undo auto-defend trace-type command.
[HUAWEI-cpu-defend-policy-policy1] undo auto-defend protocol 8021x dhcp icmp igmp tcp telnet ttl-expired udp //Delete the types of traced packets. By default, the types include 802.1X, ARP, DHCP, ICMP, IGMP, TCP, Telnet, TTL expiry, and UDP.
[HUAWEI-cpu-defend-policy-policy1] auto-defend whitelist 1 acl 2000 //Add the gateway IP address to a whitelist.
[HUAWEI-cpu-defend-policy-policy1] quit
Beginning with later versions of V200R009, the attack source tracing configuration model is redesigned: attack source tracing is enabled by default, and the traced protocols are configured in overwrite mode to match common usage.
[HUAWEI] cpu-defend policy policy1
[HUAWEI-cpu-defend-policy-policy1] auto-defend protocol arp //Trace only the source of ARP packets. By default, attack source tracing supports the following types of packets: 802.1X, ARP, DHCP, ICMP, IGMP, TCP, Telnet, TTL expiry, and UDP. In V200R010, the IPv6 packet types DHCPv6, ND, ICMPv6, and MLD are also supported.
[HUAWEI-cpu-defend-policy-policy1] auto-defend whitelist 1 acl 2000 //Add the gateway IP address to a whitelist.
[HUAWEI-cpu-defend-policy-policy1] quit
- Apply the local attack defense policy.
- Modular switches
Both MPUs and LPUs have their own CPUs. Local attack defense policies are configured differentially for MPUs and LPUs.
Before creating and applying attack defense policies, check attack information on the MPUs and LPUs. If the attack information on the MPUs and LPUs is consistent, apply the same attack defense policy to the MPUs and LPUs; otherwise, apply different policies to them.
- Apply an attack defense policy to an MPU.
<HUAWEI> system-view
[HUAWEI] cpu-defend-policy policy1
[HUAWEI] quit
- Apply an attack defense policy to an LPU.
If an attack defense policy has been applied to all LPUs, it cannot be applied to a specified LPU. Similarly, if an attack defense policy has been applied to a specified LPU, it cannot be applied to all LPUs.
- If all LPUs process similar services, apply an attack defense policy to all LPUs.
<HUAWEI> system-view
[HUAWEI] cpu-defend-policy policy2 global
- If LPUs process different services, apply an attack defense policy to a given LPU.
<HUAWEI> system-view
[HUAWEI] slot 1
[HUAWEI-slot-1] cpu-defend-policy policy2
- Fixed switches
- Apply an attack defense policy to a stand-alone switch.
<HUAWEI> system-view
[HUAWEI] cpu-defend-policy policy1 global
- In a stack:
- Apply an attack defense policy to the master switch.
<HUAWEI> system-view
[HUAWEI] cpu-defend-policy policy1
- Apply an attack defense policy to all stacked switches.
<HUAWEI> system-view
[HUAWEI] cpu-defend-policy policy1 global
- View attack source information.
After configuring local attack defense based on attack source tracing, run the display auto-defend attack-source and display auto-defend attack-source slot slot-id commands to view attack source information.
The MAC address of the gateway should be excluded from the suspicious attack sources.
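The check on display cpu-defend statistics in the locating steps above (clear the counters, wait, then look for packet types with many drops) can be scripted once the command output has been captured as text. The following Python sketch assumes the column layout shown in the sample output above. The function name `dropped_packet_types` and the `min_drops` threshold are illustrative assumptions, not Huawei recommendations.

```python
import re

# One data row: packet type followed by four integer counters
# (Pass bytes, Drop bytes, Pass packets, Drop packets).
ROW = re.compile(r"^\s*(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def dropped_packet_types(output, min_drops=1000):
    """Return {packet_type: dropped_packets} for rows at or above min_drops.

    min_drops is an arbitrary illustrative threshold; choose a value that
    fits the networking, as the text advises.
    """
    suspects = {}
    for line in output.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # banners, separators, the header, and "..." are skipped
        drop_packets = int(match.group(5))
        if drop_packets >= min_drops:
            suspects[match.group(1)] = drop_packets
    return suspects


# Sample rows copied from the command output shown in this section.
SAMPLE = """\
Statistics on slot 2:
--------------------------------------------------------------------
Packet Type   Pass(Bytes)  Drop(Bytes)  Pass(Packets)  Drop(Packets)
--------------------------------------------------------------------
arp-miss                0            0              0              0
arp-request         40800        35768            600          52600
bgp                     0            0              0              0
...
--------------------------------------------------------------------
"""

if __name__ == "__main__":
    print(dropped_packet_types(SAMPLE))
```

A flagged packet type is only a candidate: whether the volume is abnormal still depends on the networking, as noted above.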
Handling Suggestion
Select an appropriate method based on the attack source information and networking.
- Configure ARP security to prevent ARP attacks.
The switch provides ARP security to prevent ARP and ARP Miss packet attacks.
For details about ARP security, see "ARP Security Solutions" in the Configuration Guide > Security > ARP Security Configuration.
- Configure a punishment action for attack source tracing: drop attack packets within a given period.
- Enable the punishment function for attack source tracing and set the punishment action to drop all attack packets within 300s.
<HUAWEI> system-view
[HUAWEI] cpu-defend policy policy1
[HUAWEI-cpu-defend-policy-policy1] auto-defend enable //Enable attack source tracing. By default, this function is disabled.
[HUAWEI-cpu-defend-policy-policy1] auto-defend action deny timer 300 //By default, the punishment function for attack source tracing is disabled.
- Configure a blacklist for local attack defense. The packets from the users in the blacklist are discarded.
If a source is identified as an attacker (for example, the attack source addresses are in 1.1.1.0/24), use an ACL to add the users with these characteristics to the blacklist.
# Configure ACL 2001 to match the packets with source address 1.1.1.0/24. The switch drops the packets that match the ACL.
[HUAWEI] acl number 2001
[HUAWEI-acl-basic-2001] rule permit source 1.1.1.0 0.0.0.255
[HUAWEI-acl-basic-2001] quit
[HUAWEI] cpu-defend policy policy1
[HUAWEI-cpu-defend-policy-policy1] blacklist 1 acl 2001
- Configure a punishment action for attack source tracing: shut down the interface receiving attack packets.
Use this punishment action if attack packets are sent from a specified interface and shutting down this interface does not affect services.
Shutting down an interface may cause a service interruption and affect valid users. Use this method with caution.
# Shut down the interface that receives attack packets.
<HUAWEI> system-view
[HUAWEI] cpu-defend policy policy1
[HUAWEI-cpu-defend-policy-policy1] auto-defend enable //Enable attack source tracing. By default, this function is disabled.
[HUAWEI-cpu-defend-policy-policy1] auto-defend action error-down
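The blacklist ACL in the suggestions above matches 1.1.1.0/24 with the wildcard mask 0.0.0.255, which is the bitwise inverse of the subnet mask 255.255.255.0. As a small worked example, the Python sketch below derives the rule text from a CIDR prefix using the standard ipaddress module. The helper name `acl_rule_for` is hypothetical, and the generated line only mirrors the rule syntax used in this document.

```python
import ipaddress


def acl_rule_for(cidr):
    """Build a basic-ACL rule line that matches the given IPv4 network."""
    # strict=True rejects input such as 1.1.1.5/24 that has host bits set.
    network = ipaddress.ip_network(cidr, strict=True)
    # hostmask is the inverse of the netmask, i.e. the ACL wildcard mask.
    return f"rule permit source {network.network_address} {network.hostmask}"


if __name__ == "__main__":
    print(acl_rule_for("1.1.1.0/24"))   # rule permit source 1.1.1.0 0.0.0.255
    print(acl_rule_for("10.1.1.1/32"))  # rule permit source 10.1.1.1 0.0.0.0
```

For a single host the wildcard is 0.0.0.0, which the whitelist example in this document writes in the short form 0.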
Checking Whether the Problem Is Caused by Network Flapping
When network flapping occurs, the network topology frequently changes. The switch is busy with network switching events, causing a high CPU usage. Network flapping includes STP flapping and OSPF route flapping.
STP Flapping
When STP flapping occurs, the switch frequently calculates the STP topology, and updates its MAC address table and ARP table, causing a high CPU usage.
- Fault Location
- If you suspect frequent STP flapping, run the display stp topology-change command several times at an interval of several seconds to view the current STP topology change information. Alternatively, you can check the trap and log information on the switch to determine whether the STP topology has changed.
# Run the command multiple times. Check whether the value of Number of topology changes increases.
<HUAWEI> display stp topology-change
CIST topology change information
Number of topology changes :35
Time since last topology change :0 days 1h:7m:30s
Topology change initiator(notified) :GigabitEthernet2/0/6
Topology change last received from :101b-5498-d3e0
Number of generated topologychange traps : 38
Number of suppressed topologychange traps: 8
MSTI 1 topology change information
Number of topology changes :0
- After you confirm that the network topology changes frequently, run the display stp tc-bpdu statistics command repeatedly at an interval of several seconds. Check whether interfaces on the switch have received Topology Change (TC) BPDUs. If so, find out the source of the TC BPDUs, that is, the device causing the topology change.
- If only the TC(Send) value increases, the topology change is caused by the local switch.
- If only the TC(Send) value of a single interface increases, the topology change is caused by this interface.
- If the TC(Send) values of multiple interfaces increase, check the events and logs on the NMS to analyze the STP topology change reason. Find out the interface causing the flapping.
- If multiple values in the TC(Send/Receive) column increase, check the event and log information on the NMS to determine whether the local switch causes the topology change, and check whether STP flapping occurs on the device connected to the problematic interface.
# View statistics about TC/TCN BPDUs on ports.
<HUAWEI> display stp tc-bpdu statistics
-------------------------- STP TC/TCN information --------------------------
MSTID Port                    TC(Send/Receive)  TCN(Send/Receive)
0     GigabitEthernet2/0/6    21/4              0/1
0     GigabitEthernet2/0/7    93/0              0/1
0     GigabitEthernet2/0/8    115/0             0/0
0     GigabitEthernet2/0/9    110/0             0/0
0     GigabitEthernet3/0/23   29/5              0/0
- Suggestion
- Enable TC protection trap to help you understand how the switch processes TC BPDUs.
Run the snmp-agent trap enable feature-name mstp and stp tc-protection commands in the system view to enable TC protection trap.
By default, protection against topology change attacks is enabled on a switch. That is, within the interval set by stp tc-protection interval, the switch processes at most the number of TC BPDUs set by stp tc-protection threshold.
After the trap is enabled, the switch reports the MSTP_1.3.6.1.4.1.2011.5.25.42.4.2.15 hwMstpiTcGuarded and MSTP_1.3.6.1.4.1.2011.5.25.42.4.2.16 hwMstpProTcGuarded traps.
For details about the traps, see Alarm Information.
- Perform operations according to topology changes.
- STP topology changes when the access interface alternates between Up and Down.
Run the stp edged-port enable command in the interface view to set the access interface as an edge port, and run the stp bpdu-protection command in the system or STP process view to enable BPDU protection.
- The root bridge is changed unexpectedly.
Run the display stp command. Check whether the MAC address in the CIST Root/ERPC field is the MAC address of the expected root bridge. If not, the root bridge has changed unexpectedly.
Run the stp root-protection command in the interface view to enable root protection, ensuring the correct topology.
<HUAWEI> display stp
-------[CIST Global Info][Mode MSTP]-------
CIST Bridge:4096 .707b-e8c8-00e9
Config Times:Hello 2s MaxAge 20s FwDly 15s MaxHop 20
Active Times:Hello 2s MaxAge 20s FwDly 15s MaxHop 20
CIST Root/ERPC:4096 .707b-e8c8-00e9 / 0 (This bridge is the root)
CIST RegRoot/IRPC:4096 .707b-e8c8-00e9 / 0 (This bridge is the root)
CIST RootPortId:0.0
BPDU-Protection:Disabled
CIST Root Type:Secondary root
TC or TCN received:1
TC count per hello:0
STP Converge Mode:Normal
Share region-configuration :Enabled
Time since last TC:1 days 14h:25m:38s
Number of TC:2
Last TC occurred:GigabitEthernet0/0/1
----[Port18(GigabitEthernet0/0/1)][LEARNING]----
Port Protocol:Enabled
Port Role:Designated Port
Port Priority:128
Port Cost(Dot1T ):Config=auto / Active=20000
Designated Bridge/Port:4096.707b-e8c8-00e9 / 128.18
Port Edged:Config=default / Active=disabled
Point-to-point:Config=auto / Active=true
Transit Limit:6 packets/s
Protection Type:None
Port STP Mode:STP
Port Protocol Type:Config=auto / Active=dot1s
BPDU Encapsulation:Config=stp / Active=stp
PortTimes:Hello 2s MaxAge 20s FwDly 15s RemHop 20
TC or TCN send:0
TC or TCN received:0
BPDU Sent:11
TCN: 0, Config: 12, RST: 0, MST: 1
BPDU Received:0
TCN: 0, Config: 1, RST: 0, MST: 0
- If the topology change reason is unknown or the fault persists, collect network information (including interface connections) and logs (the log.log file or the display logbuffer command output), and provide collected information to Huawei switch resellers.
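The comparison of successive display stp tc-bpdu statistics outputs described under Fault Location can be automated. The Python sketch below is an illustration under stated assumptions: it expects the row layout shown in the sample output above, the function names and the returned labels are hypothetical, and the second snapshot is invented for the example. It reports which ports' TC counters grew and applies the Send/Receive rules from the text.

```python
import re

# One data row: MSTID, port name, TC sent/received, TCN sent/received.
ROW = re.compile(r"^\s*(\d+)\s+(\S+)\s+(\d+)/(\d+)\s+(\d+)/(\d+)\s*$")


def parse_tc(output):
    """Map (mstid, port) to (tc_sent, tc_received) for one command output."""
    counters = {}
    for line in output.splitlines():
        match = ROW.match(line)
        if match:  # banner and header lines do not match and are skipped
            key = (int(match.group(1)), match.group(2))
            counters[key] = (int(match.group(3)), int(match.group(4)))
    return counters


def tc_growth(before, after):
    """Return {(mstid, port): (sent_delta, received_delta)} for ports that grew."""
    old, new = parse_tc(before), parse_tc(after)
    growth = {}
    for key, (sent, received) in new.items():
        old_sent, old_received = old.get(key, (0, 0))
        delta = (sent - old_sent, received - old_received)
        if delta[0] > 0 or delta[1] > 0:
            growth[key] = delta
    return growth


def classify(growth):
    """Apply the Send/Receive rules from the text to the observed growth."""
    if not growth:
        return "no-change"
    if any(received > 0 for _, received in growth.values()):
        return "check-neighbor"      # TC BPDUs arrive from another device
    if len(growth) == 1:
        return "local-single-port"   # only TC(Send) grew, on one interface
    return "local-multiple-ports"    # only TC(Send) grew, on several interfaces


# First snapshot: two rows copied from the sample output in this section.
FIRST = """\
MSTID Port                  TC(Send/Receive)  TCN(Send/Receive)
0     GigabitEthernet2/0/6  21/4              0/1
0     GigabitEthernet2/0/7  93/0              0/1
"""
# Hypothetical second snapshot: only TC(Send) on 2/0/7 has grown.
SECOND = FIRST.replace("93/0", "99/0")

if __name__ == "__main__":
    growth = tc_growth(FIRST, SECOND)
    print(growth, classify(growth))
```

The labels only narrow the search. When several ports are involved, the events and logs on the NMS are still needed to find the interface that causes the flapping.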
OSPF Route Flapping
- Fault Location
- Run the display ospf peer last-nbr-down command to check the reason why the OSPF neighbor relationship goes Down.
The reason is displayed in the Immediate Reason and Primary Reason fields.
- Check logs on the switch to determine why the OSPF neighbor becomes Down.
Run the display logbuffer command, and you can find the following log information:
OSPF/3/NBR_DOWN_REASON:Neighbor state leaves full or changed to Down. (ProcessId=[USHORT], NeighborRouterId=[IPADDR],NeighborAreaId=[ULONG], NeighborInterface=[STRING],NeighborDownImmediate reason=[STRING], NeighborDownPrimeReason=[STRING],NeighborChangeTime=[STRING])
The NeighborDownImmediate reason field indicates the cause for the OSPF neighbor Down event.
- Suggestion
Determine the reason depending on the key fields and take corresponding measures.
Possible causes of the fault are as follows:
- Neighbor Down Due to Inactivity
The Hello packet is not received within the deadtime (set by the ospf timer dead command in the interface view).
When an OSPF neighbor is Down, either OSPF neighbor flapping occurs or the OSPF neighbor relationship cannot be set up. Run the display ospf peer brief command to determine which case applies.
- OSPF neighbor relationship flaps.
OSPF neighbor flapping may be caused by a small CPCAR value for OSPF, link flapping or congestion on interfaces, and a large amount of LSA flooding.
- Run the display cpu-defend statistics packet-type ospf command to view statistics about the OSPF packets sent to the CPU. If too many OSPF packets are discarded, check whether the switch undergoes an OSPF attack or the CPCAR value for OSPF is too small.
- View the log to check whether interfaces alternate between Up and Down. If link flapping or congestion occurs, check the link on the interface.
- If the holdtime of the OSPF neighbor relationship is smaller than 20s, run the ospf timer dead interval command to change the holdtime to be greater than 20s.
- Run the sham-hello enable command in the OSPF view to enable the OSPF sham-hello function, so that the switch can maintain the neighbor relationship using non-Hello packets such as LSU. This allows the switch to detect OSPF neighbor relationships sensitively.
- If the fault persists after the preceding operations are performed, contact Huawei switch resellers.
- OSPF neighbor relationship cannot be set up.
Check whether the configurations in the OSPF view of devices on both ends are the same. If the configurations such as the OSPF area ID or area type (NSSA, stub area, or common area) are different, the two devices cannot establish an OSPF neighbor relationship.
Run the display ospf [ process-id ] interface command to check whether OSPF is successfully enabled on the interfaces.
<HUAWEI> display ospf 1 interface
OSPF Process 1 with Router ID 2.2.2.2
Interfaces
Area: 0.0.0.0 (MPLS TE not enabled)
Interface    IP Address    Type         State      Cost  Pri
Eth0/1/1     10.1.1.2      Broadcast    Waiting    1     1
- If OSPF is not enabled on interfaces, run the ospf enable [ process-id ] area area-id command in the interface view to enable OSPF.
- If the OSPF process has been enabled on the related interface, run the display ospf error command multiple times at an interval of several seconds to check whether OSPF authentication information on the two devices is the same according to the Bad authentication type and Bad authentication key fields.
<HUAWEI> display ospf 1 error
OSPF Process 1 with Router ID 2.2.2.2
OSPF error statistics
General packet errors:
0 : IP: received my own packet
3 : Bad packet
0 : Bad version
0 : Bad checksum
0 : Bad area id
0 : Drop on unnumbered interface
0 : Bad virtual link
3 : Bad authentication type
0 : Bad authentication key
0 : Packet too small
0 : Packet size > ip length
0 : Transmit error
0 : Interface down
0 : Unknown neighbor
0 : Bad net segment
0 : Extern option mismatch
If the Bad authentication type or Bad authentication key value keeps increasing, the OSPF authentication information on the two devices is different. To configure the same authentication information on both devices, run the ospf authentication-mode command in the interface view or the authentication-mode command in the OSPF process view.
- If the Bad authentication type or Bad authentication key value does not increase, the authentication information is the same. If the neighbor intermittently disappears when the display ospf peer command is executed, OSPF neighbor relationship flaps. Refer to the related information in this section to resolve this problem.
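The "run the command multiple times and compare" check can be scripted. The sketch below assumes the display ospf error output has already been captured as text twice; it only understands counter lines of the form "value : label" as shown above, and the second capture is a hypothetical example.

```python
import re

# One counter looks like "3 : Bad packet"; a label ends where the next
# "value : " pair starts or at the end of the text.
_COUNTER = re.compile(r"(\d+)\s+:\s+(.*?)(?=\s+\d+\s+:\s|$)")
AUTH_COUNTERS = ("Bad authentication type", "Bad authentication key")


def parse_error_counters(output):
    """Map each counter label in 'display ospf error' output to its value."""
    flat = " ".join(output.split())  # tolerate one- or two-column layouts
    return {label.strip(): int(count) for count, label in _COUNTER.findall(flat)}


def growing_auth_errors(earlier, later):
    """Return the authentication counters that increased between two runs."""
    old, new = parse_error_counters(earlier), parse_error_counters(later)
    return [name for name in AUTH_COUNTERS if new.get(name, 0) > old.get(name, 0)]


# Excerpt of the output shown above, in a two-column layout.
FIRST_RUN = """
General packet errors:
0 : IP: received my own packet    3 : Bad packet
0 : Bad version                   0 : Bad checksum
3 : Bad authentication type       0 : Bad authentication key
"""
# Hypothetical second run, taken a few seconds later.
SECOND_RUN = FIRST_RUN.replace("3 : Bad authentication type",
                               "5 : Bad authentication type")
```

A non-empty result from growing_auth_errors(FIRST_RUN, SECOND_RUN) points to mismatched authentication settings; an empty result suggests looking at neighbor flapping instead.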
- Neighbor Down Due to Kill Neighbor
If the interface is Down, BFD is Down, or the reset ospf process command is executed, the OSPF neighbor relationship goes Down.
View the NeighborDownPrimeReason field to determine the reason.
- Neighbor Down Due to 1-Wayhello Received or Neighbor Down Due to SequenceNum Mismatch
When the OSPF status of the peer device goes Down first, the peer device sends a 1-Way Hello packet to the local device, causing OSPF on the local device to go Down.
Determine why OSPF status of the peer device becomes Down.
For other reasons, see OSPF/3/NBR_DOWN_REASON in Log Information.
Checking Whether the Problem Is Caused by Network Loop
A network loop will cause MAC flapping. A large number of protocol packets are sent to the CPU, overwhelming the CPU.
- Fault Location
A network loop may have the following symptoms:
- The CPU usage of a switch exceeds 80%.
- Indicators of interfaces in the VLAN where a loop has occurred blink faster than usual.
- MAC flapping frequently occurs.
- The administrator cannot remotely log in to the switch, and the switch responds slowly to operations on the console port.
- A lot of ICMP packets are lost in ping tests.
- The display interface command output shows a large number of broadcast packets received on an interface.
- Loop alarms are generated after loop detection is enabled.
- The PCs connected to the switch receive a large number of broadcast or unknown unicast packets.
- Suggestion
- Observe interface indicators and collect traffic statistics on interfaces to locate the interfaces undergoing broadcast storms.
- Check the devices hop by hop according to the topology to locate the devices that cause the loop.
- Locate the interface that causes the loop and shut down the interface to remove the loop.
- If the fault persists after the preceding operations are performed, collect network information (including interface connections) and logs (the log.log file or the display logbuffer command output), and provide the collected information to Huawei switch agents.
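To narrow down the interfaces undergoing a broadcast storm, the received-broadcast counters of each interface can be sampled twice and compared. The following Python is a rough sketch of that comparison; the interface names, counter values, and the 10-second interval are hypothetical, and collecting the counters from the switch is left out.

```python
def rank_broadcast_suspects(first, second, interval_s):
    """Rank interfaces by received broadcast packets per second.

    first and second map an interface name to its received-broadcast
    counter at the start and at the end of the sampling interval.
    """
    rates = {}
    for name, end_count in second.items():
        delta = end_count - first.get(name, end_count)
        rates[name] = max(delta, 0) / interval_s  # ignore counter resets
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


# Hypothetical counters read 10 seconds apart.
T0 = {"GE0/0/1": 1_000, "GE0/0/2": 5_000, "GE0/0/3": 200}
T1 = {"GE0/0/1": 1_050, "GE0/0/2": 905_000, "GE0/0/3": 260}
```

With these sample values, rank_broadcast_suspects(T0, T1, 10) places GE0/0/2 first by a wide margin, which makes it the first interface to check hop by hop.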
This chapter describes only the method of locating network loops and handling suggestions. For more information, see the network loop troubleshooting guide.
How to Relieve CPU Load
- Plan the network configurations, configure loop prevention protocols, and enable loopback detection to prevent loops.
- Run the loopback-detect untagged mac-address ffff-ffff-ffff command in the system view to broadcast BPDUs for loopback detection and prevent them from being terminated by unexpected devices.
- Run the loopback-detect enable command in the interface view to enable loopback detection.
When the total number of VLANs on the interfaces with loopback detection enabled exceeds 1024, run the loopback-detect action shutdown command on these interfaces to set the action for a detected loopback to shutdown. (The VLAN counter is incremented by 1 every time an interface is added to a VLAN, even when multiple interfaces are added to the same VLAN.)
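The counting rule in parentheses is easy to misread, so here is a small worked example in Python. It is only an illustration of the arithmetic: the interface names and VLAN memberships are hypothetical, and the 1024 limit is the value quoted above.

```python
VLAN_LIMIT = 1024  # limit quoted in the text above


def loopback_vlan_count(vlans_by_interface):
    """Count one for every (interface, VLAN) membership.

    The same VLAN joined by several interfaces is counted once per
    interface, as the note above describes.
    """
    return sum(len(set(vlans)) for vlans in vlans_by_interface.values())


def needs_shutdown_action(vlans_by_interface):
    """True when the total exceeds the limit and shutdown is advised."""
    return loopback_vlan_count(vlans_by_interface) > VLAN_LIMIT


# Hypothetical: three loopback-detection interfaces, each in VLANs 1-400.
EXAMPLE = {f"GigabitEthernet0/0/{n}": range(1, 401) for n in (1, 2, 3)}
```

Although only 400 distinct VLANs are involved, the counter for EXAMPLE is 3 x 400 = 1200, which exceeds 1024.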
- Configure ARP security to protect the device against ARP or ARP Miss attacks.
For details about ARP security, see "ARP Security Solutions" in the Configuration Guide > Security > ARP Security Configuration.
- On the network prone to DHCP and ARP attacks, such as campus networks, configure local attack defense policies for DHCP and ARP protocol packets.
This section provides suggestions on local attack defense policies in normal cases. The requirements on different protocol packets sent to the CPU may vary according to the model and version. In practice, configure CPU attack defense based on actual service requirements; otherwise, the configuration may fail or services may be affected.
- MPU on modular switches
#
cpu-defend policy main-board
 auto-defend enable //Default configuration for later versions of V200R009
 undo auto-defend trace-type source-portvlan //Default configuration for later versions of V200R009
 undo auto-defend protocol tcp igmp telnet ttl-expired //auto-defend protocol arp dhcp (for V200R009)
 auto-defend action deny
 auto-defend whitelist 1 interface GigabitEthernet x/x/x //Add interconnected interfaces to the whitelist.
 auto-defend whitelist 2 interface GigabitEthernet x/x/x //Add uplink interfaces to the whitelist.
#
cpu-defend-policy main-board
#
- LPU on modular switches
#
cpu-defend policy io-board
 auto-defend enable //Default configuration for later versions of V200R009
 undo auto-defend trace-type source-portvlan //Default configuration for later versions of V200R009
 undo auto-defend protocol tcp igmp telnet ttl-expired //auto-defend protocol arp dhcp (for V200R009)
 auto-defend action deny
 auto-defend whitelist 1 interface GigabitEthernet x/x/x //Add interconnected interfaces to the whitelist.
 auto-defend whitelist 2 interface GigabitEthernet x/x/x //Add uplink interfaces to the whitelist.
#
cpu-defend-policy io-board global
#
- Fixed switches
#
cpu-defend policy main
 auto-defend enable //Default configuration for later versions of V200R009
 undo auto-defend trace-type source-portvlan //Default configuration for later versions of V200R009
 undo auto-defend protocol tcp igmp telnet ttl-expired //auto-defend protocol arp dhcp (for V200R009)
 auto-defend action deny
 auto-defend whitelist 1 interface GigabitEthernet x/x/x //Add interconnected interfaces to the whitelist.
 auto-defend whitelist 2 interface GigabitEthernet x/x/x //Add uplink interfaces to the whitelist.
#
cpu-defend-policy main global
#
- When the switch is managed through SSH, Telnet, or SNMP, configure an ACL so that only the administrator can log in.
# In VTY 0-14, configure an ACL to allow only the user with source IP address 10.1.1.1/32 to log in to the switch.
<HUAWEI> system-view
[HUAWEI] acl 2001
[HUAWEI-acl-basic-2001] rule 5 permit source 10.1.1.1 0
[HUAWEI-acl-basic-2001] quit
[HUAWEI] user-interface vty 0 14
[HUAWEI-ui-vty0-14] acl 2001 inbound
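In the rule above, the trailing 0 is the wildcard mask (shorthand for 0.0.0.0), so only the single host 10.1.1.1 matches. The following Python sketch illustrates how a wildcard mask is evaluated; it is an explanatory model, not the switch's implementation, and the /24 example is hypothetical.

```python
import ipaddress


def wildcard_match(packet_ip, rule_ip, wildcard):
    """Return True if packet_ip matches rule_ip under the wildcard mask.

    Wildcard bits set to 0 must match exactly; bits set to 1 are ignored.
    """
    packet = int(ipaddress.IPv4Address(packet_ip))
    rule = int(ipaddress.IPv4Address(rule_ip))
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (packet & care) == (rule & care)
```

For example, wildcard_match("10.1.1.1", "10.1.1.1", "0.0.0.0") is true while any other source address fails, and a wildcard of 0.0.0.255 would instead admit the whole 10.1.1.0/24 range.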
- When a port group has more than 40 member ports and you add these member ports to 4K VLANs at the same time, the CPU usage may jump to over 80% in a short period. Therefore, you are advised to add the member ports to no more than 500 VLANs at a time.
- Changing the type of more than 20 ports together may cause the CPU usage to exceed 80% in a short period. Therefore, you are advised to change the type of ports one by one.
- Frequent MAC address flapping may result in a high CPU usage. If MAC address flapping may occur frequently on an interface, run the mac-address flapping action error-down command on the interface to enable the system to set the interface to error-down state after detecting MAC address flapping.
- Load and activate the patch files of the corresponding software version.
Visit http://support.huawei.com/enterprise/ to obtain the corresponding patch file and documents (patch release notes and installation guide).
- Periodically scan the PCs or servers connected to the switch for viruses.
- The switch provides CPCAR values for each protocol. Generally, the default CPCAR values can meet requirements. If service traffic volume is too high, contact Huawei switch resellers to adjust the CPCAR values.
Appendix
Commands, Alarms, Logs, and OIDs Related to High CPU Usage
Commands
Command | Description |
---|---|
display interface [ interface-type ] counters { inbound \| outbound } | Displays number of packets sent and received on each interface. |
display cpu-usage [ slave \| slot slot-id ] | Displays CPU usage statistics. |
display cpu-defend statistics [ packet-type packet-type ] [ all \| slot slot-id ] | Displays statistics on protocol packets sent to the CPU. |
display arp packet statistics | Displays ARP packet statistics. |
display dhcp statistics | Displays DHCP packet statistics. |
display cpu-defend rate [ packet-type packet-type ] [ slot slot-id \| all ] | Displays the rates at which protocol packets are sent to the CPU. |
display cpu-defend policy [ policy-name ] | Displays information about the attack defense policy. |
display auto-defend configuration [ cpu-defend policy policy-name \| slot slot-id \| mcu ] | Displays information about attack source tracing. |
display cpu-defend configuration | Displays CAR values, including the rate at which packets are sent to the CPU and CPU queues to which protocol packets are sent. |
display logbuffer [ size value \| slot slot-id \| module module-name \| security \| level { severity \| level } ] * | Displays log information on the switch. |
display trapbuffer [ size value ] | Displays trap information on the switch. |
display stp [ process process-id ] [ instance instance-id ] topology-change | Displays information about STP topology changes. |
display stp [ process process-id ] [ instance instance-id ] [ interface interface-type interface-number \| slot slot-id ] tc-bpdu statistics | Displays STP TC BPDU statistics. |
reset cpu-defend statistics [ packet-type packet-type ] [ all \| slot slot-id ] | Clears statistics on packets sent to the CPU. |
cpu-defend policy policy-name | Configures an attack defense policy. |
blacklist blacklist-id acl acl-number | Configures an ACL-based blacklist. |
whitelist whitelist-id acl acl-number | Configures an ACL-based whitelist. |
queue packet-type packet-type queue-value | Specifies the queue number of the CPU to which protocol packets are sent. |
auto-defend enable | Enables the attack source tracing function. |
undo auto-defend trace-type { source-mac \| source-ip \| source-portvlan } * | Deletes the source tracing mode. |
undo auto-defend protocol { 8021x \| arp \| dhcp \| dhcpv6 \| icmp \| icmpv6 \| igmp \| mld \| nd \| tcp \| telnet \| ttl-expired \| udp }* | Deletes the packet type in attack source tracing. |
auto-defend whitelist whitelist-number { acl acl-number \| interface interface-type interface-number } | Configures a whitelist for attack source tracing. The users in the whitelist are excluded from attack source tracing. |
auto-defend alarm enable | Enables event report in attack source tracing. |
auto-defend action { deny [ timer time-length ] \| error-down } | Enables punish action for attack source tracing and specifies the action. |
auto-port-defend whitelist whitelist-number { acl acl-number \| interface interface-type interface-number } | Configures the whitelist for port attack defense. |
System view: cpu-defend-policy policy-name [ global ]; slot view: cpu-defend-policy policy-name | Applies the attack defense policy. (The command format depends on switch models and versions. In this example, the modular switch runs V200R007.) |
Alarm Information
- ENTITYTRAP_1.3.6.1.4.1.2011.5.25.219.2.14.1 hwCPUUtilizationRising //The CPU usage of the switch exceeded the threshold.
ENTITYTRAP/4/ENTITYCPUALARM:OID [oid] CPU utilization exceeded the pre-alarm threshold.(Index=[INTEGER], EntityPhysicalIndex=[INTEGER], PhysicalName=[OCTET], EntityThresholdType=[INTEGER], EntityThresholdValue=[INTEGER], EntityThresholdCurrent=[INTEGER], EntityTrapFaultID=[INTEGER].)
- BASETRAP_1.3.6.1.4.1.2011.5.25.129.2.4.1 hwCPUUtilizationRisingAlarm //The CPU usage of the switch exceeded the threshold.
BASETRAP/2/CPUUSAGERISING: OID [oid] CPU utilization exceeded the pre-alarm threshold.(Index=[INTEGER], BaseUsagePhyIndex=[INTEGER], UsageType=[INTEGER], UsageIndex=[INTEGER], Severity=[INTEGER], ProbableCause=[INTEGER], EventType=[INTEGER], PhysicalName="[OCTET]", RelativeResource="[OCTET]", UsageValue=[INTEGER], UsageUnit=[INTEGER], UsageThreshold=[INTEGER])
- MSTP_1.3.6.1.4.1.2011.5.25.42.4.2.15 hwMstpiTcGuarded //After TC protection is enabled on an MSTP-enabled switch, extra TC BPDUs that are received after the number of TC BPDUs received in a specified period has exceeded the threshold are processed after the TC protection time expires.
MSTP/4/TCGUARD:OID [OID] The instance received TC message exceeded the threshold will be deferred to deal with at the end of TC protection time. (InstanceID=[INTEGER])
- MSTP_1.3.6.1.4.1.2011.5.25.42.4.2.16 hwMstpProTcGuarded //After TC protection is enabled for an MSTP process, extra TC BPDUs that are received after the number of TC BPDUs received in a specified period has exceeded the threshold are processed after the TC protection time expires.
MSTP/1/PROTCGUARD:OID [OID] MSTP process's instance received TC message exceeded the threshold will be deferred to deal with at the end of TC protection time. (ProcessID=[INTEGER], InstanceID=[INTEGER])
Log Information
- DEFD/6/CPCAR_DROP_MPU //The rate of packets sent to the CPU exceeded the CPCAR value on the MPU.
DEFD/6/CPCAR_DROP_MPU:Rate of packets to cpu exceeded the CPCAR limit on the MPU. (Protocol=[STRING], CIR/CBS=[ULONG]/[ULONG], ExceededPacketCount=[STRING])
Parameter | Description |
---|---|
Protocol | Protocol type. |
CIR/CBS | Committed information rate and committed burst size. |
ExceededPacketCount | Packet count exceeded. |
- DEFD/6/CPCAR_DROP_LPU //The rate at which packets are sent to the CPU exceeded the CPCAR values on the LPU.
DEFD/6/CPCAR_DROP_LPU:Rate of packets to cpu exceeded the CPCAR limit on the LPU in slot [STRING]. (Protocol=[STRING], CIR/CBS=[ULONG]/[ULONG], ExceededPacketCount=[STRING])
Parameter | Description |
---|---|
slot | Slot ID. |
Protocol | Protocol type. |
CIR/CBS | Committed information rate and committed burst size. |
ExceededPacketCount | Packet count exceeded. |
- SECE/4/PORT_ATTACK //A lot of attack packets from the corresponding VLAN were received on the interface.
SECE/4/PORT_ATTACK:Port attack occurred.(Slot=[STRING], SourceAttackInterface=[STRING], OuterVlan/InnerVlan=[ULONG]/[ULONG], AttackProtocol=[STRING], AttackPackets=[ULONG] packets per second)
Parameter | Description |
---|---|
Slot | Slot of an MPU or LPU. |
SourceAttackInterface | Interface that initiates the attack. |
OuterVlan | Outer VLAN ID or single VLAN ID of the attack source. |
InnerVlan | Inner VLAN ID of the attack source. |
AttackProtocol | Attack packet type. |
AttackPackets | Rate of attack packets, in pps. |
- SECE/4/USER_ATTACK //User attack information was generated on an MPU or LPU.
SECE/4/USER_ATTACK:User attack occurred.(Slot=[STRING], SourceAttackInterface=[STRING], OuterVlan/InnerVlan=[ULONG]/[ULONG], UserMacAddress=[STRING], AttackProtocol=[STRING], AttackPackets=[ULONG] packets per second)
Parameter | Description |
---|---|
Slot | Slot of an MPU or LPU. |
SourceAttackInterface | Interface that initiates the attack. |
OuterVlan | Outer VLAN ID or single VLAN ID of the attack source. |
InnerVlan | Inner VLAN ID of the attack source. |
UserMacAddress | MAC address of the attack source. |
AttackProtocol | Attack packet type. |
AttackPackets | Rate of attack packets, in pps. |
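When many SECE/4/USER_ATTACK logs are present, it can help to summarize them per attack source. The sketch below is one possible way to do that in Python; the field names follow the log format above, while the sample log lines and their values are hypothetical.

```python
import re
from collections import Counter

_USER_ATTACK = re.compile(
    r"SECE/4/USER_ATTACK:.*?SourceAttackInterface=(?P<interface>[^,]+),"
    r".*?UserMacAddress=(?P<mac>[^,]+),"
    r"\s*AttackProtocol=(?P<protocol>[^,]+),"
    r"\s*AttackPackets=(?P<pps>\d+)"
)


def top_attackers(log_lines, limit=3):
    """Return (mac, interface, protocol) keys with their peak rate in pps."""
    peak = Counter()
    for line in log_lines:
        match = _USER_ATTACK.search(line)
        if match:
            key = (match["mac"].strip(), match["interface"].strip(),
                   match["protocol"].strip())
            peak[key] = max(peak[key], int(match["pps"]))
    return peak.most_common(limit)


def _sample(mac, protocol, pps):
    """Build a hypothetical log line in the documented format."""
    return ("SECE/4/USER_ATTACK:User attack occurred.(Slot=MPU, "
            "SourceAttackInterface=GigabitEthernet0/0/1, "
            "OuterVlan/InnerVlan=10/0, UserMacAddress=%s, "
            "AttackProtocol=%s, AttackPackets=%d packets per second)"
            % (mac, protocol, pps))


SAMPLE_LOGS = [
    _sample("0000-0000-0001", "ARP", 120),
    _sample("0000-0000-0002", "DHCP", 40),
    _sample("0000-0000-0001", "ARP", 300),
]
```

The top entry of top_attackers(SAMPLE_LOGS) identifies the MAC address, interface, and protocol to target with a blacklist or other attack defense measures.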
- SECE/4/SPECIFY_SIP_ATTACK //The attack source information is displayed when a switch is attacked.
SECE/4/SPECIFY_SIP_ATTACK:The specified source IP address attack occurred.(Slot=[STRING], SourceAttackIP = [STRING], AttackProtocol=[STRING], AttackPackets=[ULONG] packets per second)
Parameter | Description |
---|---|
Slot | Slot of an MPU or LPU. |
SourceAttackIP | IP address of the attack source. |
AttackProtocol | Attack packet type. |
AttackPackets | Rate of attack packets, in pps. |
- SECE/4/PORT_ATTACK_OCCUR //When the switch detects attack packets on an interface, the switch starts attack defense on the interface.
SECE/4/PORT_ATTACK_OCCUR:Auto port-defend started.(SourceAttackInterface=[STRING], AttackProtocol=[STRING])
Parameter | Description |
---|---|
SourceAttackInterface | Interface that initiates the attack. |
AttackProtocol | Attack packet type. |
- SECE/6/PORT_ATTACK_END //After an attack source is excluded, the switch cancels attack defense on the interface.
SECE/6/PORT_ATTACK_END:Auto port-defend stop.(SourceAttackInterface=[STRING], AttackProtocol=[STRING],ExceededPacketCountInSlot=[STRING])
Parameter | Description |
---|---|
SourceAttackInterface | Interface that initiates the attack. |
AttackProtocol | Attack packet type. |
ExceededPacketCountInSlot | Count of dropped packets. After attack defense is triggered on multiple interfaces, packet loss is not limited to the interfaces recorded in the log. (Added in R10) |
- VOSCPU/4/CPU_USAGE_HIGH //The CPU was overloaded. Names of the tasks whose CPU usages rank top three and their CPU usages were displayed. If these tasks contained sub-tasks, names of the sub-tasks and their CPU usages were also displayed.
VOSCPU/4/CPU_USAGE_HIGH:The CPU is overloaded (CpuUsage=[ULONG]%, Threshold=[ULONG]%), and the tasks with top three CPU occupancy are: [CPU-resources-usage]
Parameter | Description |
---|---|
[CPU-resources-usage] | Names of the tasks whose CPU usages rank top three and their CPU usage. If these tasks contained sub-tasks, names of the sub-tasks and their CPU usages were also displayed. |
CpuUsage | Current CPU usage. |
Threshold | CPU usage threshold. |
- OSPF/3/NBR_DOWN_REASON //The neighbor status goes Down.
OSPF/3/NBR_DOWN_REASON:Neighbor state leaves full or changed to Down. (ProcessId=[USHORT], NeighborRouterId=[IPADDR], NeighborAreaId=[ULONG], NeighborInterface=[STRING],NeighborDownImmediate reason=[STRING], NeighborDownPrimeReason=[STRING], NeighborChangeTime=[STRING])
Parameter
Description
ProcessId
Process ID.
NeighborRouterId
Neighbor router ID.
NeighborAreaId
Neighbor area ID.
NeighborInterface
Neighbor interface.
NeighborDownImmediate reason
Possible reasons why the OSPF neighbor goes Down:
Neighbor Down Due to Inactivity: The switch does not receive any Hello packets from the OSPF neighbor within the Dead Time.
Neighbor Down Due to LL Down LLDown: The switch does not receive any LLD packet from the OSPF neighbor within the Dead Time.
Neighbor Down Due to Kill Neighbor: The interface connected to the OSPF neighbor is Down, the BFD session on the interface is Down, or the reset ospf process command has been executed. You can view the NeighborDownPrimeReason field to determine the specific cause.
Neighbor Down Due to 1-Wayhello Received or Neighbor Down Due to SequenceNum Mismatch: The OSPF status on the peer interface goes Down and the remote device sends a 1-Way Hello packet to the local device. As a result, the OSPF status of the local device also changes to Down.
Neighbor Down Due to AdjOK?: The AdjOK? event times out.
Neighbor Down Due to BadLSreq: The BadLSReq event occurs on the interface.
NeighborDownPrimeReason
Possible reasons why the neighbor goes Down:
Hello Not Seen: No Hello packet is received.
Interface Parameter Mismatch: The interface settings on two ends of a link do not match.
Logical Interface State Change: The logic interface status changes.
Physical Interface State Change: The physical interface status changes.
OSPF Process Reset: The OSPF process restarts.
Area reset: The area is reset due to an area type change.
Area Option Mis-match: The options of the areas to which interfaces on both ends belong do not match.
Vlink Peer Not Reachable: The virtual link neighbor is unreachable.
Sham-Link Unreachable: The Sham-Link neighbor is unreachable.
Undo Network Command: The network command is undone.
Undo NBMA Peer: The neighbor configuration on the NBMA interface is cleared.
Passive Interface Down: The silent-interface command is executed on the local interface.
Opaque Capability Enabled: The opaque capability is enabled.
Opaque Capability Disabled: The opaque capability is disabled.
Virtual Interface State Change: The virtual link interface status changes.
BFD Session Down: The BFD session goes Down.
Down Retransmission Limit Exceed: The maximum number of retransmission times is reached.
1-Wayhello Received: A 1-way Hello packet is received.
Router State Change from DR or BDR to DROTHER: The local interface role is changed from DR or BDR to DROTHER.
Neighbor State Change from DR or BDR to DROTHER: The neighbor interface role is changed from DR or BDR to DROTHER.
NSSA Area Configure Change: The configuration of the NSSA area is modified.
Stub Area Configure Change: The configuration of the stub area is modified.
Received Invalid DD Packet: An invalid DD packet is received.
Not Received DD during RouterDeadInterval: No DD packet is received during Dead timer restart.
M,I,MS bit or SequenceNum Incorrect: The M, I, and MS bits in received DD packets are different from those defined in the protocol.
Unable Opaque Capability,Find 9,10,11 Type Lsa: The LSAs of types 9, 10, and 11 are received, but the Opaque capability is not enabled.
Not NSSA,Find 7 Type Lsa in Summary List: The local area does not belong to NSSA, but Type-7 LSA exists in Summary.
LSrequest Packet,Unknown Reason: An LSR packet is received due to an unknown reason.
NSSA or STUB Area,Find 5 ,11 Type Lsa: The local area belongs to NSSA or Stub, but Type-5 and Type-11 LSAs exist.
LSrequest Packet,Request Lsa is Not in the Lsdb: The neighbor requests an LSA through LSR from the local process or area, but the LSA does not exist in the LSDB of the local process.
LSrequest Packet, exist same lsa in the Lsdb: The process receives an LSA, which exists in the local LSDB and neighbor request list.
LSrequest Packet, exist newer lsa in the Lsdb: The process receives an updated LSA, which exists in the local LSDB and neighbor request list.
Neighbor state was not full when LSDB overflow: The LSDB overflows, but the neighbor status is not Full.
Filter LSA configuration change: The configuration of LSA filter is modified.
ACL changed for Filter LSA: The ACL configuration of LSA filter is modified.
Reset Ospf Peer: The OSPF neighbor is reset.
NeighborChangeTime
Time when the status changes.
OID Information
Object | OID | Data Type | Description | Implemented Specifications |
---|---|---|---|---|
hwEntityCpuUsage | 1.3.6.1.4.1.2011.5.25.31.1.1.1.1.5 | Integer32 | CPU usage. Value range: 2-100 | read-only |
hwEntityCpuUsageThreshold | 1.3.6.1.4.1.2011.5.25.31.1.1.1.1.6 | Integer32 | CPU usage threshold. Value range: 2-100. Default value: 80 for modular switches; 95 for fixed switches | read-write |
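A monitoring script that polls these two objects could compare them as sketched below. This Python only models the comparison, using the default thresholds and value range from the table; retrieving the values over SNMP is not shown, and the function name is hypothetical.

```python
# Default thresholds from the table above, in percent.
DEFAULT_THRESHOLD = {"modular": 80, "fixed": 95}


def cpu_usage_exceeded(usage_percent, switch_type, threshold=None):
    """Return True if the polled CPU usage is above the threshold.

    When no threshold is given, the default for the switch type is used.
    """
    limit = DEFAULT_THRESHOLD[switch_type] if threshold is None else threshold
    if not 2 <= limit <= 100:
        raise ValueError("threshold must be in the range 2-100")
    return usage_percent > limit
```

With the defaults, a reading of 85 counts as high on a modular switch but not on a fixed switch.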
Local Attack Defense Policy
The switch provides a local attack defense policy to protect its CPU. When the CPU receives a large number of valid packets or malicious attack packets, this function protects the CPU to prevent service interruption.
Function Overview
As shown in Figure 1-8, local attack defense policies include attack source tracing, port attack defense, CPCAR, and blacklist. The port attack defense and CPCAR functions are enabled by default.
Improper CPCAR adjustment will affect network services. To modify the CPCAR settings, contact Huawei switch resellers.
Attack Source Tracing
After attack source tracing is enabled, the switch analyzes and collects statistics on the packets sent to the CPU. The switch provides thresholds for packets, and considers the packets exceeding thresholds as attack packets. Then the switch locates the source interface and IP address of the attack source, reports logs to users, and takes measures on the attack source. The switch may also discard the attack packets or shut down the attacked interface.
- Set the source tracing mode.
The switch supports the following attack source tracing modes:
- Source IP address-based tracing: defends against Layer 3 attack packets.
- Source MAC address-based tracing: prevents Layer 2 attack packets with a fixed source MAC address.
- Interface+VLAN based tracing: defends against Layer 2 attack packets with different source MAC addresses.
If you do not know the attack packet type, configure all of the preceding modes.
- Set the packet type in attack source tracing.
The switch can perform attack source tracing for each of 802.1X, ARP, DHCP, ICMP, IGMP, TCP, Telnet, TTL 1, UDP, DHCPv6, MLD, ICMPv6, and ND packets, or all of them.
When an attack occurs, you may be unable to identify the type of attack packets. The auto-defend protocol command allows you to flexibly specify the types of traced packets.
- Set the attack defense action.
After identifying an attack source, the switch takes one of the following actions to prevent the source from attacking the switch:
- Discards the attack packets within a period.
- Shuts down the interface receiving the attack packets.
- Configure the whitelist.
If you want to exclude some users from attack source tracing and punishment actions, add the users to the whitelist. The switch does not take attack source tracing actions on the users in the whitelist.
Generally, uplink interfaces need to be added to the whitelist to prevent impact on services.
- Set the attack source tracing threshold.
The switch supports the attack source tracing threshold, sampling rate, and event report threshold.
In Figure 1-9, the source tracing mode is based on source IP address, the threshold is 4 pps, and the punishment action is to discard packets. If the rate of packets sent to the CPU from one source exceeds the threshold within one second, the system considers that an attack has occurred, generates a log in which the attack source address is 10.3.2.1, and discards packets from this address for a certain period of time.
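The threshold check in this example can be modeled as follows. This Python is a simplified sketch of source IP address-based tracing: it counts packets per source in each one-second window and ignores the sampling rate and the punishment timer; the packet list is hypothetical.

```python
from collections import Counter


def trace_attack_sources(packets, threshold_pps=4):
    """Return the source IPs whose rate exceeds the threshold.

    packets is an iterable of (timestamp_in_seconds, source_ip) pairs for
    packets sent to the CPU. A source is flagged when it sends more than
    threshold_pps packets within any one-second window.
    """
    per_second = Counter((int(ts), src) for ts, src in packets)
    flagged = {src for (_, src), count in per_second.items()
               if count > threshold_pps}
    return sorted(flagged)


# Hypothetical traffic: 10.3.2.1 sends 6 packets in one second,
# 10.3.2.2 sends only 2.
PACKETS = [(i * 0.1, "10.3.2.1") for i in range(6)] + [
    (0.5, "10.3.2.2"), (0.6, "10.3.2.2"),
]
```

With the 4 pps threshold, only 10.3.2.1 is flagged; a source that sends exactly 4 packets in a second does not exceed the threshold.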
Port Attack Defense
If packets sent from one interface to the CPU occupy too much bandwidth, packets from other interfaces cannot be sent to the CPU, which causes service interruption. Port attack defense controls the number of packets that each port sends to the CPU.
After port attack defense is configured, a switch can trace the source and limit the rate of packets sent to the CPU based on ports, protecting the CPU against DoS attacks.
By default, the port attack defense function is enabled. The switch calculates the rate of packets received on an interface. If the packet rate exceeds the threshold within the aging time, the switch considers that an attack has occurred. The switch then traces the source, limits the rate of attack packets on the port, and records a log.
The switch takes the following measures in rate limiting:
- When the packet rate does not exceed the limit (the value is the same as the CPCAR value in attack defense policy), the switch moves the packets to a low-priority queue and then sends them to the CPU.
The switch calculates the rate of protocol packets received by the interface, and performs attack source tracing and rate limiting on the attack packets. When the rate of protocol packets received by an interface exceeds the threshold, the switch considers that an attack has occurred and sends a log. The switch moves the packets to a low-priority queue (generally queue 2; for details about queues, see CPCAR), and then sends the packets to the CPU.
- When the rate of packets exceeds the threshold, the switch discards the packets.
Port attack defense provides the following functions:
- Attack defense for the specified protocol packets
The switch can perform port attack defense for each of ARP Request, ARP Reply, DHCP, ICMP, IGMP, and IP fragment packets, or for all of them.
- Whitelist
If you want to exclude some users from attack source tracing, add the users to the whitelist.
Generally, the uplink interface needs to be added to the whitelist so that network-side protocol packets and packets from authorized users are sent to the CPU and processed promptly.
- Port attack defense thresholds
The switch supports the attack source tracing threshold, sampling rate, and aging time.
When an attack occurs, you may be unable to identify the type of attack packets. The auto-defend protocol command allows you to flexibly specify the types of traced packets.
In Figure 1-10, both port 1 and port 2 send ARP request and DHCP packets to the CPU. The rate of ARP request packets sent by port 1 and the rate of DHCP packets sent by port 2 exceed the threshold. The switch considers that an attack has occurred, and moves the packets to queue 2, which has a low priority.
By default, port attack defense is enabled on a switch. The rate limiting actions taken by port attack defense have less impact than the rate limiting actions taken by attack source tracing.
CPCAR
The Control Plane Committed Access Rate (CPCAR) limits the rate of packets sent to the CPU to protect the control plane. After packets are sent to the CPU, the switch performs the following types of rate limiting:
- Rate limiting based on protocol
The switch specifies a threshold for each protocol. When the rate of protocol packets exceeds the threshold, the switch discards the packets so that each protocol can be processed promptly.
- Scheduling and rate limiting based on queue
After protocol-based rate limiting is performed, the switch moves packets to queues depending on layer (management/control/forwarding) and importance. The queues have different priorities, and the packets in queues are scheduled based on these priorities. When a conflict occurs, the packets in high-priority queues are processed first. In addition, the switch can limit the rate of each queue, that is, restrict the maximum rate at which packets are sent from each queue to the CPU. This ensures stable switch running when the CPU has a high load.
The switch has eight queues: queues 0-7. A queue with a larger ID has a higher priority. To view the packet queues, run the display cpu-defend configuration all command.
- Unified rate limiting
On a stable network, the number of packets sent to the CPU is within an acceptable range. If a large number of packets are sent to the CPU within a short period, the CPU is busy processing these packets, resulting in a high CPU usage. To restrict the total number of packets processed by the CPU, the switch performs rate limiting on all packets to ensure normal running of the CPU.
In Figure 1-11, a large number of protocol packets are sent to the CPU. The switch processes them as follows:
- Performs rate limiting on protocol packets based on protocol type.
- Moves packets to different queues depending on the queues of the protocols. A queue with a larger ID has a higher priority.
- Limits the rate of all packets. If the packet rate exceeds the threshold, the switch discards the packets in low-priority queues.
The CPCAR does not take effect on the management interface. If the network connected to a management interface undergoes a serious attack, users may fail to log in to the switch through the management interface. You are advised to scan the PCs for viruses or replan the network.
The switch provides a default CPCAR setting for each protocol. Improper CPCAR settings will affect services on the network. To adjust CPCAR values for specified types of protocol packets based on services and network environment, contact Huawei switch resellers.
Generally, the default CPCAR settings can meet requirements.
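Before changing any CPCAR value, it usually helps to check the current settings. The following is a minimal sketch: the first command is the one named above, while display cpu-defend statistics all is not named in this section and is assumed to be available on your version. Command output is omitted because it varies by model and version.

```
# Display the CPCAR value and the queue used for each packet type.
<HUAWEI> display cpu-defend configuration all
# Assumption: display the counts of passed and dropped packets sent to the CPU.
<HUAWEI> display cpu-defend statistics all
```

A packet type whose drop count keeps increasing is a candidate for further analysis with attack source tracing.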
Blacklist
When a switch receives a large number of protocol packets, the CPU can be overwhelmed. The switch may then fail to process valid protocol packets, or protocol flapping may occur. You can use methods such as packet obtaining and attack source tracing to determine the characteristics of the attack source (such as its MAC or IP address), and then configure a blacklist to discard these packets.
You can create a blacklist on a device and add users with specified characteristics to the blacklist. The device then discards the packets from these users. In Figure 1-12, blacklist 1 matches the packets with source IP address 10.1.1.0/24 and blacklist 2 matches packets with source IP address 10.2.2.0/24. When these packets are sent to the CPU, the switch discards them.
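The two blacklists in Figure 1-12 could be configured roughly as follows. This is a sketch under assumptions: the ACL numbers 2001 and 2002 and the policy name defend-demo are hypothetical, and the acl and rule commands are standard basic ACL commands that this section does not describe. The policy takes effect only after it is applied.

```
# ACL 2001 matches packets with source IP address 10.1.1.0/24.
<HUAWEI> system-view
[HUAWEI] acl number 2001
[HUAWEI-acl-basic-2001] rule permit source 10.1.1.0 0.0.0.255
[HUAWEI-acl-basic-2001] quit
# ACL 2002 matches packets with source IP address 10.2.2.0/24.
[HUAWEI] acl number 2002
[HUAWEI-acl-basic-2002] rule permit source 10.2.2.0 0.0.0.255
[HUAWEI-acl-basic-2002] quit
# Bind the ACLs to blacklist 1 and blacklist 2 in the policy (hypothetical name).
[HUAWEI] cpu-defend policy defend-demo
[HUAWEI-cpu-defend-policy-defend-demo] blacklist 1 acl 2001
[HUAWEI-cpu-defend-policy-defend-demo] blacklist 2 acl 2002
```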
Configuring a Local Attack Defense Policy
- Create a local attack defense policy.
- Run the system-view command to enter the system view.
- Run the cpu-defend policy policy-name command to create an attack defense policy and enter its view.
- Configure attack source tracing.
- Run the auto-defend enable command to enable attack source tracing.
- Run the auto-defend trace-type { source-ip | source-mac | source-portvlan }* command to set the attack source tracing mode.
- Run the auto-defend protocol { all | { 8021x | arp | dhcp | icmp | igmp | tcp | telnet | ttl-expired | udp } * } command to set the packet type for attack source tracing.
- Run the auto-defend whitelist whitelist-number { acl acl-number | interface interface-type interface-number } command to configure a whitelist.
- Run the auto-defend action { deny [ timer time-length ] | error-down } command to enable the attack source tracing action function and set the action.
- Configure port attack defense.
- Run the auto-port-defend enable command to enable port-based attack defense.
By default, the port attack defense function is enabled.
- Run the auto-port-defend protocol { all | { arp-request | arp-reply | dhcp | icmp | igmp | ip-fragment } * } command to set the packet type in port attack defense.
By default, port attack defense is applicable to ARP Request, ARP Reply, DHCP, ICMP, IGMP, and IP fragment packets.
- Set the rate limit for protocol packets.
Two rules control how protocol packets are sent to the CPU: car and deny. When both the car and deny rules are configured for the same type of protocols, the rule configured later takes effect.
- To enable CPCAR limiting for the packets sent to the CPU and set the threshold, run the car { packet-type packet-type | user-defined-flow flow-id } cir cir-value [ cbs cbs-value ] command.
- To set the action taken on the packets sent to the CPU to discard, run the deny { packet-type packet-type | user-defined-flow flow-id } command.
- Run the blacklist blacklist-id acl acl-number command to create a blacklist.
A maximum of eight blacklists can be configured in an attack defense policy.
Packets matching the ACL applied to a blacklist are discarded, regardless of whether the ACL contains a permit or deny rule.
- Apply the local attack defense policy.
After a local attack defense policy is created, the policy must be applied.
- Modular switches
Both MPUs and LPUs have their own CPUs, so local attack defense policies are configured separately for MPUs and LPUs.
Before creating and applying attack defense policies, check attack information on the MPUs and LPUs. If the attack information on the MPUs and LPUs is consistent, apply the same attack defense policy to the MPUs and LPUs; otherwise, apply different policies to them.
- Apply an attack defense policy to the MPU.
- Run the system-view command to enter the system view.
- Run the cpu-defend-policy policy-name1 command to apply the attack defense policy.
- Apply an attack defense policy to an LPU.
If an attack defense policy has been applied to all LPUs, it cannot be applied to the specified LPU. In a similar manner, if an attack defense policy has been applied to a specified LPU, it cannot be applied to all LPUs.
- If all LPUs process similar services, apply an attack defense policy to all LPUs.
Run the cpu-defend-policy policy-name2 global command to apply an attack defense policy.
- If LPUs process different services, apply an attack defense policy to the specified LPU.
- Run the slot slot-id command to enter the slot view.
- Run the cpu-defend-policy policy-name2 command to apply an attack defense policy.
An attack defense policy applied to a slot view takes effect only for the LPU in this slot.
- Fixed switches
- On a stand-alone switch:
- Run the system-view command to enter the system view.
- Run the cpu-defend-policy policy-name global command to apply the attack defense policy globally.
- In a stack:
- Run the system-view command to enter the system view.
- Apply the attack defense policy.
- To apply the attack defense policy to all stacked devices, run the cpu-defend-policy policy-name global command.
- To apply the attack defense policy to the master device, run the cpu-defend-policy policy-name command.
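The procedure above can be sketched end to end for a stand-alone fixed switch. The command syntax is taken from the steps in this section. The policy name defend-demo, the interface gigabitethernet 0/0/1, the 300-second timer, and the CIR value 128 are illustrative assumptions, not recommended settings.

```
# Create the policy and enter its view.
<HUAWEI> system-view
[HUAWEI] cpu-defend policy defend-demo
# Enable attack source tracing by source MAC and source IP for ARP and DHCP packets.
[HUAWEI-cpu-defend-policy-defend-demo] auto-defend enable
[HUAWEI-cpu-defend-policy-defend-demo] auto-defend trace-type source-mac source-ip
[HUAWEI-cpu-defend-policy-defend-demo] auto-defend protocol arp dhcp
# Exclude the uplink interface (assumed to be GE0/0/1) from attack source tracing.
[HUAWEI-cpu-defend-policy-defend-demo] auto-defend whitelist 1 interface gigabitethernet 0/0/1
# Discard packets from a traced attack source for 300 seconds (assumed value).
[HUAWEI-cpu-defend-policy-defend-demo] auto-defend action deny timer 300
# Set the CPCAR of ARP Request packets (the CIR value 128 is an arbitrary example).
[HUAWEI-cpu-defend-policy-defend-demo] car packet-type arp-request cir 128
[HUAWEI-cpu-defend-policy-defend-demo] quit
# Apply the policy globally.
[HUAWEI] cpu-defend-policy defend-demo global
```

On a modular switch, the last step differs as described above: run cpu-defend-policy policy-name in the system view for the MPU, and either add the global keyword for all LPUs or run the command in the slot view for a specific LPU.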
Tasks Occupying CPU Resource
Task Name | Description |
---|---|
BUFM | Outputs debugging information. |
1731 | Implements the Y.1731 protocol stack, manages the protocol state machine, and maintains the protocol database. |
_EXC | Processes system exception events. |
_TIL | Monitors and processes deadloops caused by software exceptions. |
AAA | Interacts with modules such as the UCM and RADIUS to process user authentication messages, and maintains authentication and authorization entries. |
ACL | Controls access users. |
ADPG | Maintains dynamic VLAN-related chip entries (adaptation layer task). |
ADPT | Implements the EFM protocol stack, manages the protocol state machine, and maintains the protocol database. |
age_task | Ages out MAC address entries. |
AGNT | Implements the IPv4 SNMP protocol. |
AGT6 | Implements the IPv6 SNMP protocol. |
ALM | Adds, clears, and manages alarm information. |
ALS | Implements automatic laser shutdown. |
AM | Manages IP address pools and addresses for modules such as DHCP. |
AMCP | Synchronizes data from MPU to SPU (application layer protocol). |
APP | Schedules Layer 3 services in a unified manner. |
ARP | Implements the ARP protocol stack, manages the ARP state machine, and maintains the ARP database. |
au_msg_hnd | Processes AU messages, which are used for MAC entry learning and delivery. |
bcmC | Counts the number of packets on chip ports. |
bcmD | Implements asynchronous message processing in chip drive software. |
bcmR | Receives packets from the chip. |
bcmT | Transmits packets to the chip. |
bcmX | Transmits packets to the chip of specified type asynchronously. |
bcmL2MOD.0 | Learns MAC address entries. |
BEAT | Sends and receives heartbeat packets to monitor inter-board communication. |
BFD | Implements the BFD protocol stack, manages the protocol state machine, and maintains the protocol database. |
bmLI | Scans interface status and notifies the application modules of interface status changes. |
BOX | Outputs the data stored in the black box, including error and exception information generated during system operations. |
BULK_CLASS | Manages the USB flash drive (operating system task). |
BULK_CLASS_IRP | Manages USB I/O request packets (operating system task). |
BusM A | Manages USB bus (operating system task). |
CCTL | Collects and schedules performance data in batches. |
CDM | Manages configuration data. |
CFM | Recovers configurations. |
CHAL | Completes functions at the hardware adaptation layer. |
CKDV | Controls and manages the clock module. |
CMD_Switching | Listens on sockets. |
CMDA | Executes commands in batches. |
cmdExec | Executes commands. |
CSBR | Checks configuration consistency between the active and standby MPUs. |
CSPF | Implements the CSPF protocol stack and completes path computation. |
CssC | Handles cluster events. |
CSSM | Implements cluster protocol stack and manages cluster status. |
DEFD | Monitors traffic sent to the CPU and maintains CPU defense data. |
DELM | Deletes MAC address entries in STP. |
DEV | Manages hardware modules on the switch. |
DEVA | Handles subcard hot swapping. |
DFSU | Loads logic files. |
DHCP | Implements the DHCP protocol stack and provides the functions such as DHCP snooping and DHCP relay. |
DLDP | Implements the DLDP protocol stack, manages the protocol state machine, and maintains the protocol database. |
DSMS | Processes environment alarms generated by the environment monitoring system. |
EAP | Implements 802.1x authentication, MAC address authentication, and MAC address bypass authentication, manages the protocol state machine, and maintains the protocol database. |
Ecm | Manages low-level inter-board communication. |
EFMT | Sends 802.3ah test packets. |
EHCD_IH | Drives USB host controller (operating system task). |
ELAB | Manages electronic labels. |
EOAM | Implements the EOAM 802.1ag protocol, manages the protocol state machine, and maintains the protocol database. |
Eout | Outputs debugging information about the ECM task. |
FBUF | Sends packets. |
FCAT | Obtains the packets sent or received by the CPU for fault location. |
FECD | Processes MOD synchronization messages. |
FIB | Generates IPv4 forwarding entries on the control plane and delivers the entries to the forwarding plane to guide data forwarding. |
FIB6 | Manages IPv6 FIB entries, maintains software entries, and requests the hardware adaptation layer to maintain chip entries. |
FM93 | Outputs fault information. |
FMAT | Manages faults. |
FMCK | Detects device faults. |
FMON | Monitors logic card failures. |
frag_add | Synchronizes MAC entries from the hardware table to the software table, traverses the hardware table, and adds the MAC address entries that do not exist in the software table to the software table. |
frag_del | Synchronizes MAC entries from the hardware table to the software table, traverses the software table, and deletes the MAC entries that do not exist in the hardware table from the software table. |
FTPS | Offers the FTP service. |
FTS | Receives packets. This task is created by FECD. After the driver receives packets that do not need to be processed by the super task, it sends the packets to the FTS task for processing. |
GREP | Manages GRE forwarding entries in chip (adaptation layer task). |
GTL | Manages common data such as memory and character strings. |
GVRP | Implements the GVRP protocol stack, manages the protocol state machine, and maintains the protocol database. |
HACK | Processes HA response messages. |
HOTT | Manages hot swapping of interface cards. |
HS2M | Synchronizes data between the active and standby MPUs to ensure high reliability. |
HVRP | Implements the HVRP protocol stack, manages the protocol state machine, and maintains the protocol database. |
IFNT | Processes interface status change events. |
IFPD | Manages interfaces, maintains interface database, and processes interface status change events. |
INFO | Receives and sends logs, traps, and debugging information generated by service modules. |
IP | Schedules IP protocol tasks in a unified manner. |
IPCQ | Retransmits IPC messages upon message transmission failures. |
IPCR | Sends, receives, and distributes IPC messages to related service modules. |
IPMC | Adapts to Layer 3 multicast protocols, responds to changes on the control plane, and issues forwarding entries. |
ISSU | Provides smooth upgrade for firmware. |
ITSK | Sends, receives, and distributes various protocol packets. |
L2 | Schedules Layer 2 services in a unified manner. |
L2MC | Listens on IGMP/MLD packets on interfaces and implements fast join/leave group member interfaces. |
L2V | Manages VPLS and VLL services, maintains control plane data, and requests the adaptation layer to maintain forwarding entries in chip. |
L3I4 | Delivers IPv4 unicast forwarding entries from LPUs. |
L3IO | Delivers entries of Layer 3 protocols, such as URPF and VRRP, to interface cards. |
L3M4 | Adapts to the ARP protocol on the MPU, delivers IPv4 unicast forwarding entries, and responds to the changes at the control plane. |
L3MB | Adapts to Layer 3 protocols, such as URPF and VRRP, on the MPU, and delivers forwarding entries. |
LACP | Implements the LACP protocol stack, manages the LACP state machine, and maintains the LACP database. |
LCS | Manages licenses. |
LCSP | Loads authorized features allowed by the license file. |
LDP | Implements the LDP protocol stack and maintains the LDP LSP database. |
LDRV | Synchronizes software versions between active and standby MPUs. |
LDT | Implements the LDT protocol stack, manages the protocol state machine, and maintains the protocol database. |
LHAL | Provides the hardware adaptation layer to shield hardware differences. |
LINK | Schedules link layer tasks in a unified manner. |
linkscan | Monitors the status of links. |
LLDP | Implements the LLDP protocol stack, manages the LLDP state machine, and maintains the LLDP database. |
LOAD | Loads the system image file and patch packages. |
LSPA | Maintains LSP forwarding entries and instructs the hardware adaptation layer to maintain chip entries. |
LSPM | Creates, updates, and deletes LSPs. |
MCSW | Adapts to Layer 3 multicast protocols, responds to changes on the control plane, and issues forwarding entries. |
MERX | Processes the packets received on the management interface. |
MFF | Implements the MAC forced forwarding (MFF) function. |
MFIB | Manages Layer 3 multicast forwarding entries. |
MIRR | Implements port mirroring. |
MOD | Manages, distributes, and reclaims module numbers. |
MPLS | Implements MPLS protocol stack, and distributes, manages, and reclaims labels. |
MSYN | Synchronizes MAC entries between cards. |
MTR | Collects memory usage data at scheduled time. |
mv_rxX | Handles packet receiving queues in CPU X (X is an integer ranging from 0 to 7). |
NDIO | Delivers IPv6 unicast forwarding entries from LPUs. |
NDMB | Adapts to the ND protocol on the MPU, issues IPv6 unicast forwarding entries, and responds to changes on the control plane. |
NQAC | Acts as the NQA client to respond to and process NQA packets. |
NQAS | Acts as the NQA server to respond to and process NQA events and packets. |
NSA | Manages chip entries at the VRP NetStream adaptation layer. |
NTPT | Implements the NTP protocol stack, manages the protocol state machine, and maintains the protocol database. |
OAM | Implements the MPLS OAM protocol stack, manages the protocol state machine, and maintains the protocol database. |
OAM1 | Adapts to the OAM 802.1ag protocol, responds to protocol-layer changes, and responds to changes on the forwarding plane. |
OAMI | Processes packets received from logic cards. |
OAMT | Responds to protocol changes and maintains chip entries (adaptation layer task). |
OS | Operating system task. |
Ping | Quickly responds to ping packets. |
PNGI | Provides fast ping reply on LPUs. |
PNGM | Provides fast ping reply on MPUs. |
Port | Processes chip debugging commands. |
port_statistics | Collects port statistics. |
PPI | Maintains interface status on chips (adaptation layer task). |
PTAL | Implements redirection authentication, authentication and authorization, manages the protocol state machine, and maintains the protocol database. |
QOSA | Manages QoS configurations and maintains chip entries. |
QOSB | Delivers QoS entries to LPUs and maintains QoS entries. |
RACL | Creates session table entries based on TCP/UDP/ICMP initial packet, monitors and ages out session table entries. |
RDS | Implements the RADIUS protocol stack, manages the protocol state machine, and maintains the protocol database. |
RMON | Monitors the system remotely. |
root | System root task. |
ROUT | Completes route learning for routing protocols, selects best routes, and delivers routes to the FIB. |
RPCQ | Provides the remote procedure call function. |
RRPP | Implements the RRPP protocol stack on interface cards, detects interface status quickly, and delivers hardware entries. |
RSA | Calculates the RSA key. |
RSVP | Implements the RSVP protocol stack and maintains the CR-LSP database. |
RTMR | Manages scheduled tasks. |
SAM | Delivers service entries to LPUs and maintains the entries. |
SAPP | Manages application layer protocol dictionary and whitelist, maintains software entries and instructs the adaptation layer to set chip status. |
SDKD | Detects the status of the interfaces connected to the backplane and collects the packet rate on the interfaces. |
SDKE | Displays LSW chip entries. |
SECB | Delivers security entries to LPUs and maintains the security entries. |
SECE | Implements security functions such as ARP security, IP security, and CPU security, manages the protocol state machine, and maintains protocol databases. |
SERVER | TCP/IP server task. |
SFPM | Queries manufacturer information and digital diagnosis information of optical modules. |
SLAG | Implements the E-Trunk function. |
SMAG | Smart link agent that can quickly detect and process interface status change events. |
SMLK | Implements the Smart Link protocol stack, manages the protocol state machine, and maintains the protocol database. |
smsL | Loads the environment monitoring module. |
smsR | Sends environment monitoring requests. |
smsT | Enables the environment monitoring system to send packets. |
SNPG | Listens on and processes IGMP and MLD protocol packets. |
SOCK | Schedules and processes IP packets. |
SRMI | Processes external interrupts. |
SRMT | Device management timer task. |
SRVC | Processes DHCP packets related to IP sessions, and interacts with the user management module and AAA module to complete authorization and accounting. |
STFW | Super forwarding task that maintains forwarding entries in the trunk memory. |
STND | Assists the operating system in task and event scheduling. |
STP | Implements the STP protocol stack, manages the STP state machine, and maintains the STP database. |
STRA | Monitors traffic, identifies attacking traffic, and punishes attack sources. |
STRB | Monitors LPUs and identifies attack traffic. |
SUPP | Processes interrupt messages and timer messages in the device management module. |
t1 | Temporary task (operating system task). |
TACH | Implements the HWTACACS protocol stack, manages the protocol state machine, and maintains the protocol database. |
TAD | Transmits traps. |
TARP | Processes trap messages. |
tBulkClnt | Manages the USB driver (operating system task). |
TCPKEEPALIVE | Maintains TCP connections. |
TCTL | Controls the upload of batch collected performance data. |
tDcacheUpd | Updates the disk cache (operating system task). |
tExcTask | Handles exceptions (operating system task). |
TICK | Processes the system clock. |
tLogTask | Processes logs (operating system task). |
TM | Maintains chip entries for the access service. |
tNetTask | Processes network-related events (operating system task). |
TNLM | Manages tunnels. |
TNQA | Schedules NQA client tasks in a unified manner. |
TRAF | Collects statistics on VLL, VPLS, and L3VPN. |
TRAP | Processes trap messages. |
tRlogind | Enables remote login to virtual terminals (operating system task). |
tTelnetd | Telnet server task (operating system task). |
TTNQ | Schedules NQA server tasks in a unified manner. |
tUsbPgs | Device management task that manages USB plug-in and plug-out (operating system task). |
tWdbTask | Debugging proxy task (operating system task). |
U 34 | Processes user's commands. |
UCM | Interacts with the AAA module to process user status and maintain user entries. |
UDPH | UDP Helper |
USB | USB-based upgrade task. |
usbPegasusLib | USB host LIB (operating system task). |
usbPegasusLib_IRP | USB host I/O LIB (operating system task). |
UTSK | User framework task that optimizes protocol processing to ensure preferential processing of protocol packets. |
VCON | Serial port redirection task. |
VFS | Manages the virtual file system. |
VIDL | Collects statistics on CPU usage of idle tasks. |
VMON | Monitors system task running. |
VOAM | Offers NQA VPLS MAC diagnosis. |
VP | Receives and sends VP packets between boards. |
VPR | Receives VP packets between boards. |
VPRE | Processes VP messages. |
VPS | Sends VP packets between boards. |
VRPT | Timer test task. |
VRRP | Implements the VRRP protocol stack, manages the VRRP state machine, and maintains the VRRP database. |
VT | Virtual terminal task. |
VT0 | Authenticates the first login user and processes the user's commands. |
VTRU | Processes the Up/Down events of V Trunk. |
VTYD | Processes login requests of all users. |
WEB | Implements Web authentication. |
WEBS | Allows users to log in to the device through Web. |
XMON | Traces system task running. |
XQOS | Service quality task. |
CPU-related Tasks and Functions for Modular Switches
Task Name |
Description |
Reason for High CPU Usage |
Solution |
---|---|---|---|
_EXC |
Processes system exception events. |
In normal cases, this task does not cause high CPU usage. The task is scheduled only when product or service exceptions occur. |
- |
_IPC |
IPC message receiving task on the sub-core. |
- |
- |
_VP |
VP message receiving task on the sub-core. |
- |
- |
_TIL |
Monitors and processes deadloops caused by software exceptions. |
In normal cases, this task does not cause high CPU usage. The task is scheduled only when a product or service task fails to be scheduled or a deadloop occurs. |
- |
1AGA |
EOAM_1AG super task for delivering module events. |
- |
- |
1AGAGT |
EOAM_1AG super task for delivering module events. |
- |
- |
AAA |
Manages user authentication, authorization, and accounting. |
Authentication, authorization, and accounting are performed for a large number of users. |
Reduce online users. |
ACL |
Controls access users. |
Too many ACLs are delivered at a time. |
Prolong the interval between configuring ACLs. |
ADPGVRP |
Task of the GVRP adaptation module. |
- |
- |
ADPT |
Layer 2 adaptation task. Processes BFD VLANIF interface Down events and CFD logic interruption events, and sets the timer for the EFM module. |
- |
- |
ALM |
Adds, clears, and manages alarm information. |
- |
- |
AM |
Manages IP address pools and IP addresses for modules such as DHCP. |
A large number of users apply for IP addresses. |
Reduce the number of users who apply for IP addresses. |
AMCP |
Synchronizes data from MPUs to SPUs (application-layer management and control protocol). |
- |
- |
APP |
Centrally schedules Layer 3 service tasks. |
Multiple tasks are performed to process many service messages. |
Run the display utask-info utask-id slice-time command to check which UTASK task takes a long time. |
APS |
Processes Ethernet protection switching events. |
- |
- |
ARPA |
Processes ARP attack defense events. |
Many ARP attacks are detected on the switch. |
Filter out packets from unauthorized users on interfaces. |
BES |
Sub-core task of the basic transaction service module. |
- |
- |
BGPMDT |
Multicast VPN MDT task. |
- |
- |
BGPMVPN |
Multicast VPN MVPN task. |
- |
- |
Boot |
Dumps diagnostic logs of the driver. |
- |
- |
BPDU |
BPDU module task, which processes some timer messages and the asynchronous messages generated when the MPLS module notifies the AC Up and Down events. |
- |
- |
CWP_BUP |
Processes MAP messages. |
In normal cases, this task does not cause high CPU usage. |
Decrease the service concurrency rate, and expand the system capacity or use high-performance main control units such as SRUH. |
ASFI |
Processes sFlow messages on LPUs. |
sFlow sampling is configured on a large number of interfaces, and the sampling ratio or sampling interval is too small. |
Deploy the sFlow service properly, and configure the sampling ratio and sampling interval based on actual traffic on the interfaces. |
ASFM |
Processes sFlow messages on MPUs. |
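sFlow sampling is configured on a large number of interfaces, and the sampling ratio or sampling interval is too small. |
Deploy the sFlow service properly, and configure the sampling ratio and sampling interval based on actual traffic on the interfaces. |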
ASMN |
Manages ASs in an SVF system. |
- |
- |
bcmCNTR.0 |
Collects traffic statistics on chip 0. |
- |
- |
bcmCNTR.1 |
Collects traffic statistics on chip 1. |
- |
- |
bcmCNTR.2 |
Collects traffic statistics on chip 2. |
- |
- |
bcmD |
BCM debugging task. |
A large amount of debugging information is printed. |
- |
bcmI |
bcmINTR task that processes kernel interrupts. |
Many kernel interrupts are reported. |
- |
bcmIbodSync.0 |
Resolves the buffer exceptions on HG interfaces of chip 0. |
Synchronization is performed frequently. |
- |
bcmIbodSync.2 |
Resolves the buffer exceptions on HG interfaces of chip 2. |
Synchronization is performed frequently. |
- |
bcmIpfixDma.0 |
Collects service traffic statistics on the Ipfix register of chip 0. |
The register is frequently accessed. |
- |
bcmIpfixDma.2 |
Collects service traffic statistics on the Ipfix register of chip 2. |
The register is frequently accessed. |
- |
bcmL2age.0 |
Ages out MAC address entries on chip 0. |
- |
- |
bcmL2age.2 |
Ages out MAC address entries on chip 2. |
- |
- |
bcmMEM_SCAN.0 |
Periodically checks the memory on chip 0. |
- |
- |
bcmMEM_SCAN.1 |
Periodically checks the memory on chip 1. |
- |
- |
bcmMEM_SCAN.2 |
Periodically checks the memory on chip 2. |
- |
- |
bcmPortMon.0 |
Monitors status of ports on chip 0. |
The status of a port changes frequently. |
- |
bcmPortMon.1 |
Monitors status of the FBUF port on chip 1. |
The status of a port changes frequently. |
- |
bcmPortMon.2 |
Monitors status of ports on chip 2. |
The status of a port changes frequently. |
- |
bcmXGS3AsyncTX |
Synchronizes packet sending information. |
- |
- |
BEAT |
Sends and receives heartbeat packets to monitor inter-card communication. |
- |
- |
BFD |
Implements the BFD protocol stack, manages the protocol state machine, and maintains the protocol database. |
A large number of BFD sessions flap. |
Delete or shut down BFD sessions. |
BFDA |
BFD adaptation task that processes IPC messages as well as ARP and MAC address change messages. |
- |
- |
BFDS |
Processes BFD sending and detection timers and other events. |
- |
- |
BOX |
Exports the data stored in the black box, including error and exception information generated during system operations. |
Errors, assertions, exceptions or deadloops occur on the device. |
- |
BOX_Out |
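Exports the data stored in the black box, including error and exception information generated during system operations. |
Errors, assertions, exceptions or deadloops occur on the device. |
- |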
BTRC |
Traces internal debugging functions. |
The trace function is enabled. |
Disable the trace function. |
BULK_CLASS_IRP |
Manages USB I/O request packets (operating system task). |
- |
- |
BusM A |
Manages the USB bus (operating system task). |
- |
- |
CAPM |
Processes CAPWAP events. |
There are too many online users. |
Reduce the number of online users. |
CDRF |
Restores the switch to factory default settings. |
- |
- |
CFGMGR |
Task of the configuration management module. |
- |
- |
CMAI |
CMAINT task that implements cloud-based management, maintenance, and diagnosis functions. |
- |
- |
CMNG |
CMNGHA task that implements the active/standby synchronization mechanism of Redis databases in the cloud-managed scenario. |
- |
- |
CMP |
Certificate Management Protocol (CMP) task. |
- |
- |
CMPM |
Reports intelligent O&M data. |
The CPU usage will be high within a short period of time if a large amount of data needs to be centrally reported for many services. |
Disable data reporting for some services. |
CMREG |
Task for registering with the controller. |
- |
- |
CPMN |
COMP task for sub-core component management. |
- |
- |
CSISSU |
Fast ISSU processing task. |
- |
- |
CSTP |
Performs asynchronous processing on time-consuming STP commands in NETCONF mode. |
- |
- |
CWPA |
WLAN processing task. |
A large number of APs go online or offline, interfaces to which many APs connect are changed, or a large number of STAs go online or offline concurrently. |
Plan the network again and limit the number of online APs and STAs. |
CCTL |
Collects and schedules performance data in batches. |
Data is being collected. |
No action is required. |
CHAL |
Completes functions at the hardware adaptation layer. |
- |
- |
CKDV |
Controls and manages the clock module. |
- |
- |
CLKI |
Processes the timers, IPC messages, and interrupt messages in the clock module of the MPU. |
- |
- |
CMDA |
Executes commands in batches. |
Many service commands are delivered in batches. |
Reduce the number of commands delivered in batches. |
co0 |
Serial port task. |
User operations, especially input and output operations, are frequently performed. For example, commands are copied to the screen (input) or a large number of display commands are executed (output). |
Reduce the frequency at which input and output operations are performed. This problem is automatically solved after the operations end. |
COMT |
Commits ACL configurations to APs. |
A large number of APs go online concurrently. |
Plan the network properly and avoid many concurrent online APs. |
CSBR |
Checks configuration consistency between the active and standby MPUs. |
This task is rarely used and is unlikely to cause high CPU usage. |
No action is required. |
CSPF |
Calculates paths for TE tunnels. |
The TEDB for CSPF frequently changes. |
Check whether the link or IGP flaps. If so, rectify the fault. |
CSS |
Sets up CSS systems and maintains the status and topology (main CSS task). |
- |
- |
CSST |
Tests CSS links and monitors the CSS link status. |
- |
- |
CSSD |
Delays bringing CSS ports Down so that CSS port status changes will not cause CSS split within a short time. |
- |
- |
CSSF |
Performs cross-version upgrades in a CSS system quickly. |
- |
- |
CSSP |
Sends and receives protocol packets in a CSS system. |
- |
- |
CWP_DTLS |
Performs DTLS encryption. |
DTLS links are created or disabled, DTLS negotiation is performed, or APs set up DTLS links in batches. |
This task is used when APs go online through DTLS links. However, it is rarely used. If this task causes high CPU usage, disable DTLS based on the network requirements. |
LBS |
Locates terminals and analyzes the spectrum of non-wireless devices. |
The air scan interval is too short or the radio environment is complex. |
Increase the air scan interval to a proper value by considering both the location precision and CPU usage. |
DCPA |
DHCP task. |
- |
- |
DELM |
Deletes MAC addresses in all slots. |
- |
- |
DNSS |
Provides domain name resolution services for cfgmgr. |
- |
- |
DRIV |
Writes LSW statistics on a modular switch to diagnostic logs. |
- |
- |
DCPI |
Monitors IP traffic (IP FPM). |
Many configurations are enabled and the measurement interval is short. |
Avoid many configurations and increase the measurement interval. |
DEFD |
Processes CPU defense events. |
Too many packets are sent to the CPU. |
Limit the rate of packets sent to the CPU. |
DEVA |
Loads and initializes FSUs, synchronizes entity trees, and performs active/standby switchovers (auxiliary device management task). |
- |
- |
DFSU |
Loads and initializes FSUs. |
- |
- |
DIAG |
Equipment module task on the MPU. |
- |
- |
DLDP |
Sends and receives DLDP protocol packets and manages the protocol state machine. |
DLDP is enabled on too many interfaces and the interval at which DLDP packets are sent is too short. |
|
DRVD |
Processes diagnosis messages for the drive module. |
- |
- |
DSMS |
Processes environment alarms generated by the environment monitoring system. |
- |
- |
EAGE |
Distributes Ethernet packets between the main core and sub-core. |
A large number of protocol packets are sent to the CPU. |
Configure the rate limit of protocol packets properly and deploy the attack defense function. |
EAP |
Performs MAC address and 802.1X authentication. |
Authentication is performed for a large number of MAC and 802.1X users. |
Reduce the number of authentication users. |
Ecm |
Manages low-level inter-card communication. |
- |
- |
EFMT |
Sends 802.3ah test packets. |
- |
- |
EHCD_IH0 |
Processes EHCI interrupts (VxWorks operating system task). |
- |
- |
ELAB |
Manages electronic labels. |
- |
- |
EMDI |
Processes eMDI services on LPUs. |
- |
- |
EMDM |
Processes eMDI services on the MPU. |
- |
- |
ETHA |
Distributes and processes Ethernet packets. |
A large number of protocol packets are sent to the CPU. |
Configure the rate limit of protocol packets properly and deploy the attack defense function. |
ETHL |
Distributes and processes Ethernet packets related to the sub-core. |
A large number of protocol packets are exchanged between the main core and sub-core. |
Configure the rate limit of protocol packets properly and deploy the attack defense function. |
EVC |
VXLAN access configuration task. |
- |
- |
Even |
Event management task. |
- |
- |
EVPN |
Processes EVPN services on the MPU. |
- |
- |
EXTAgent |
OPS extended agent module. |
- |
- |
EZDT |
EZOP_Dtls |
- |
- |
EOAM |
Implements the EOAM 802.1ag protocol, manages the protocol state machine, and maintains the protocol database. |
The associated service flaps. |
This task rarely causes high CPU usage. If the problem occurs, ensure that the associated service does not flap. |
Eout |
Exports debugging information about the ECM task. |
- |
- |
ERPS |
ERPS adaptation task for initializing global ACL rules and registering events for ERPS. |
- |
- |
ESAP |
eSAP adaptation task. |
There are too many online APs and users. |
Reduce the number of online APs and users. |
esm_recovery.0 |
Fixes soft errors on the extended TCAM of chip 0. |
Soft errors occur in entries on extended chips. |
Collect information about faulty entries and restart the card. |
esm_recovery.2 |
Fixes soft errors on the extended TCAM of chip 2. |
||
EZOP |
Manages the EasyOperation function. This function is used to upgrade the software version and load configurations and patches in batches. |
- |
- |
EZPP |
Manages and processes EasyOperation packets. |
- |
- |
FCAT |
Obtains packets. |
Too many packets are obtained and printed frequently. |
- |
FECD |
Processes messages at the FECD layer. |
Too much diagnostic information is printed. |
- |
FINT |
Super task for quickly responding to LPU removal and installation interrupts. |
- |
- |
FMCK |
FMEA detection task. |
The interface detection task is time-consuming, but it usually does not cause high CPU usage. |
- |
FMEB |
Fault detection task. |
- |
- |
FTS |
Packet sending and receiving task. |
- |
- |
FWRT |
Flash suppression task. |
- |
- |
FLOW |
Performs traffic measurement. |
Too much traffic needs to be collected and analyzed. |
Disable the sFlow service in the case of heavy network traffic. |
FMES |
Exports device fault information and monitors the chip and CPLD status. |
- |
- |
FNTL |
Fast path task for exchanging kernel-mode and user-mode packets. |
- |
- |
FTS_ |
Sends packets to and receives packets from the CPU. |
A large number of protocol packets are sent to or received from the CPU. |
Check whether attacks exist. |
GEM |
Manages general events. |
This task is currently not executed. |
No action is required. |
GEMR |
Manages general events. |
This task is currently not executed. |
No action is required. |
GLRM |
License adaptation task that registers license-controlled items. |
- |
- |
GREI |
GRE module adaptation task on the LPU. |
- |
- |
GREM |
GRE module adaptation task on the MPU. |
- |
- |
GRES |
Task corresponding to the label and token resource modules. |
The CPU usage is high because of the application that requests the resources. The GRESM task itself usually does not cause high CPU usage. |
Check whether the service that applies for labels or tokens flaps. |
GRSA |
Creates RSA and DSA key pairs. |
||
GTL |
Manages common data such as memory and character strings. |
This task will not cause high CPU usage. |
No action is required. |
GVRP |
Receives and sends GVRP packets, and processes internal messages of the GVRP protocol. |
The CPU usage is high due to a large number of VLANs that need dynamic GARP registration or a large network radius. |
Increase the timer value. |
H2CM |
Main task for HTTP/2 clients. |
- |
- |
H2CT |
HTTP/2 client timer task. |
- |
- |
HACA |
HACA module. |
- |
- |
HACK |
Notifies the HA message sending result. |
- |
- |
HERB |
Intelligent heartbeat task. |
- |
- |
HGMP |
Task of the HGMP adaptation module. |
- |
- |
HOUP |
Smart upgrade task. |
- |
- |
HP2C |
HP2C task for managing HTTP/2 clients. |
- |
- |
HS2M |
HA backup task. |
- |
- |
HSB |
Associates with VRRP to provide the dual-system hot standby service. |
- |
- |
HTPC |
HTTP client task. |
- |
- |
HTPS |
HTTP/HTTPS channel processing task. |
- |
- |
HTPSRD |
HTTPS redirection processing task. |
- |
- |
HWTACACS |
HWTACACS module. |
- |
- |
HVRP |
Processes HVRP command lines, sends and receives packets, and processes timer messages. |
- |
- |
IDE |
Terminal identification module. |
- |
- |
IFMO |
|
- |
- |
IFOA |
Collects MIB data of interfaces and obtains interface Up/Down information. |
- |
- |
IKPI |
Collects the running status and performance indicators of monitored devices. |
- |
- |
IPCC |
Task for communication between cloud-based management processes. |
- |
- |
ISSU |
ISSU backup task. |
- |
- |
IFAD |
Processes IPC messages delivered by the VCT. |
VCT detection is performed frequently. |
- |
IFLP |
Collects traffic statistics on a management interface periodically. |
A large number of interfaces are configured and the measurement interval is short. |
- |
IFNT |
Processes interface status change events. |
The interface flaps frequently. |
- |
IFPD |
Manages interfaces, maintains interface database, and processes interface status change events. |
There are a large number of interfaces, interface link status flaps, or optical modules become faulty. |
- |
IFWL |
Processes wireless interface-related events. |
A large number of APs go online or offline, interfaces to which many APs connect are changed, or a large number of STAs go online or offline concurrently. |
- |
INPT |
Serial port task. |
- |
- |
IPCK |
Processes the received IPC messages and sends ACK messages to the peer. |
The processing is simple and will not cause high CPU usage. |
- |
IPCQ |
Retransmits IPC messages upon message transmission failures. |
Retransmissions are infrequent and will not cause high CPU usage. |
- |
IPCR |
Sends, receives, and distributes IPC messages to related service modules. |
- |
- |
IPFP |
Monitors IP traffic (IP FPM). |
Many configurations are enabled and the measurement interval is short. |
- |
IS2U |
ISSU function adaptation task. |
- |
- |
ISC6 |
Processes commands of IPsec6 and encrypts packets. |
This task will not cause high CPU usage. |
- |
ITSK |
Sends, receives, and distributes various protocol packets. |
A large number of protocol packets are sent and received. |
- |
JOB |
Maintenance assistant task. |
When the maintenance assistant meets the trigger conditions, the CPU usage will be high if many commands in the script are executed in batches. |
Reduce the number of commands in the script. |
L2 |
Centrally schedules Layer 2 service tasks, and supports the MGR, ErrorDown, BPTNL, LNP, VCMP, MFLP, VLAN, and QinQ features. |
LNP: There are too many interfaces. VCMP: VLANs are frequently deleted or created. BPTNL: A large number of packets are transparently transmitted. |
LNP: This feature rarely causes high CPU usage. If the problem occurs, determine the cause of interface flapping and eliminate frequent flapping. VCMP: Avoid frequently creating or deleting VLANs. BPTNL: Configure transparent transmission of protocol packets on interfaces. |
L2_E |
Main task of the EOAM feature. |
The associated service flaps. |
This task rarely causes high CPU usage. If the problem occurs, ensure that the associated service does not flap. |
L2_P |
Supports LACP, HGMP, 3AH, and ELMI features. |
- |
- |
L2_R |
Supports ERPS, RRPP, and SEP features. |
Incorrect connections exist after a protocol is deployed and the device suffers from a TC packet attack. |
Ensure that physical loops are closed. |
L2_T |
Supports the Eth-Trunk feature. |
- |
- |
L2IF |
Processes real-time backup and batch backup of MAC address and VLAN information. |
- |
- |
L2MO |
MAC address learning task. |
- |
- |
l2st |
Processes MVL security event messages. |
- |
- |
lshelp |
Delayed callback task. |
- |
- |
LYNC |
Identifies Lync sessions and sets different priorities for different traffic types. |
- |
- |
L2PQ |
Processes IPC messages of Layer 2 protocols. |
- |
- |
L2V |
Processes L2VPN services, including VLL and VPLS. |
Flapping occurs on the public network. As a result, a large number of services send mapping packets, and connections are re-established. |
Solve the problem of public network flapping. |
L3I4 |
Processes Layer 3 IPv4 services on the LPU. |
- |
- |
L3IO |
Processes Layer 3 services in the common module on the LPU. |
- |
- |
L3M4 |
Processes Layer 3 IPv4 services on the MPU. |
- |
- |
L3MB |
Processes Layer 3 services in the common module on the MPU. |
- |
- |
LAGAGT |
Agent task on the LPU for sending and receiving LACP negotiation packets. |
A large number of LACP negotiation packets are received and the LACP frequently flaps. |
Analyze the configuration and traffic on interfaces, and verify that the Eth-Trunk service is normal. |
LBDT |
Sends, receives, and processes loopback detection packets. |
LBDT is configured for many VLANs and interfaces. |
Disable LBDT in some VLANs and on some interfaces. |
WMT_PM |
Collects PM performance data. |
eSight collects AP data periodically. |
Adjust the PM performance measurement interval. |
LCSP |
License adaptation task that registers license-controlled items. |
- |
- |
LDCM |
Command line task in the load module. |
- |
- |
LDT |
Sends, receives, and processes loop detection packets. |
- |
- |
LDTP |
Receives loop detection packets. |
LDT is configured for many VLANs and interfaces. |
Disable LDT in some VLANs and on some interfaces. |
LHAL |
Provides the hardware adaptation layer for LPUs to shield hardware differences. |
- |
- |
LINK |
Centrally schedules link layer tasks. |
Multiple tasks are performed to process many service messages. |
Run the display utask-info utask-id slice-time command to check which UTASK task takes a long time. |
LLDP |
Sends, receives, and processes LLDP protocol packets. |
A switch receives a large number of LLDP protocol packets because it has too many LLDP neighbors. |
Reduce the number of LLDP neighbors on the switch. |
LNP |
LNP protocol task. |
- |
- |
LOAD |
Loads the system image file and patch packages. |
- |
- |
LRCV |
Receives packets in the load module on the MPU. |
- |
- |
LSPA |
MPLS LSP process (MPLS LSP AGENT) task. |
- |
- |
LSPM |
Processes LSP services. |
LDP, RSVP, or BGP LSPs flap frequently, triggering LSP creation and deletion. |
Determine which type of LSPs flap, such as LDP, BGP, or RSVP LSP. LSP flapping is typically caused by IGP, BGP, or VPN route flapping. |
LT0 |
Local Telnet task. |
This task is rarely used on live networks. |
No action is required. |
MACA |
TCAM MAC aging processing task. |
- |
- |
MACC |
MAC address learning control task. |
- |
- |
MBDS |
Updates lightweight data to the shared memory in a multi-core system. This task is used in the early stage of a multi-core system. |
- |
- |
MCM |
McmDiag task for multi-core diagnosis. |
- |
- |
McmDiag |
Multi-core diagnosis task. |
- |
- |
MCME |
Monitors the status of processes in other cores in a multi-core system. |
- |
- |
MDGW |
mDNS gateway task. |
- |
- |
MDRY |
mDNS relay task. |
- |
- |
MDSP |
mDNS snooping task, which parses and listens on mDNS packets. |
- |
- |
MPAT |
Multi-core patch processing task. |
- |
- |
MSTP_ADP |
Task of the MSTP adaptation module. |
- |
- |
MTP |
Probe protocol management task. |
- |
- |
MACL |
Creates and updates MQC traffic policies. |
Too many traffic policies are created and frequently updated. |
Prolong the interval between configuring MQC traffic policies. |
MACRESTORE |
Retrieves bottom-layer MAC software entries. |
- |
- |
MAD |
Processes MAD in direct mode. |
- |
- |
MADP |
Processes MAD in relay mode. |
- |
- |
MCSF |
Processes multicast entries delivered to SFUs. |
Multicast entries are repeatedly updated due to route or port flapping. |
Check whether route or port flapping occurs. |
MDNS |
Processes mDNS protocol packets. |
A large number of mDNS packets are sent to the CPU. |
Limit the rate of mDNS packets sent to the CPU, and check whether too many mDNS packets are caused by external attacks or network loops. |
MERX |
Processes the packets received on the management interface. |
The management interface receives a large number of packets. |
Configure rate limiting on the management interface to protect the CPU against attacks that send a large number of packets. |
METH |
Redirection task for the management interface. |
- |
- |
MFF |
MFF task. |
ARP-MFF packets are processed. |
Configure the rate limit of ARP-MFF packets properly and deploy the attack defense function. |
Mirr |
Processes the mirroring service. |
A large number of configurations are synchronized in the batch backup process. |
Reduce the mirroring configurations. |
MOD |
Learns MAC address entries. |
MAC address flapping or a hash conflict occurs. |
- |
MPSF |
MPLS service adaptation task on the SFU. |
- |
- |
NDIO |
Layer 3 IPv6 adaptation task on the LPU. |
- |
- |
NDMB |
Layer 3 IPv6 adaptation task on the MPU. |
- |
- |
MTR |
Collects memory usage data at scheduled time. |
- |
- |
NFPT |
Manages scheduled tasks. |
This task will not cause high CPU usage. |
No action is required. |
NPFM |
Chip fault detection and handling task. |
- |
- |
NVO3 |
VXLAN tunnel management task. |
- |
- |
NQAF |
Provides the NQA FTPR function. |
The NMS frequently uses FTP to obtain the results of NQA test instances. |
Decrease the frequency of operations. |
NSA |
Processes the NetStream service. |
A large number of flows are sent to the CPU of an LPU. |
Use flexible flows to decrease the number of flows. |
NTLK |
Netlink fast path for exchanging kernel-mode and user-mode messages. |
- |
- |
NTPT |
Provides the NTP clock synchronization function. |
A large number of NTP attack packets are received. |
Configure NTP authentication. |
OAM |
Implements the MPLS OAM protocol stack, manages the protocol state machine, and maintains the protocol database. |
- |
- |
OAM1 |
Adapts to the OAM 802.1ag protocol, responds to protocol-layer changes, and responds to changes on the forwarding plane. |
- |
- |
OAMI |
Processes packets received from logical cards. |
- |
- |
OAMT |
Responds to protocol changes and maintains chip entries (adaptation layer task). |
- |
- |
OCSP |
OpenSSL task. |
- |
- |
OIDS |
Object management task. |
- |
- |
OMIN |
Responds to configuration delivery, status obtaining, and user-defined operations performed on the controller. |
When a large number of packets are exchanged between the device and controller, the CPU usage may be high for a short period of time. After the exchange is complete, the CPU usage falls into the normal range. |
The CPU usage will automatically fall into the normal range. |
OMLG |
Task for the cloud-based management module to manage syslogs of processes. |
The CPU usage may be high for a short period when a large number of logs are generated. |
The CPU usage will automatically fall into the normal range after services are processed. |
OMMS |
BIN patch of the monitor process of the highest level. |
- |
- |
OMNG |
Handles the exit events of processes for the deception and cloud-based management modules. |
- |
- |
OMSB |
Task for the cloud-based management module to manage creation and exit events of processes and send notifications. |
The CPU usage may be high for a short period of time. |
The CPU usage will automatically fall into the normal range after services are processed. |
OOM1 |
Monitors OOM kill events of cgroup. |
- |
- |
OOM2 |
Monitors OOM kill events of cgroup. |
- |
- |
OPS |
OPS task. |
- |
- |
OPSA |
EXTAgent task. |
- |
- |
OPSC |
Processes OPS commands. |
- |
- |
OPSE |
Executes OPS scripts. |
- |
- |
OSPFv3-FRR |
OSPFv3 FRR task. |
- |
- |
OS |
Operating system virtual task. |
This task will not cause high CPU usage. |
- |
PATB |
Patch task. |
- |
- |
Pers |
Persistency task. |
- |
- |
PKIM |
Public key infrastructure (PKI) task. |
- |
- |
PLRN |
Learns MAC addresses by software. |
- |
- |
PMC |
Handles performance management commands. |
- |
- |
PROC |
Multi-core heartbeat monitoring task. |
- |
- |
PARITY_CHECK |
Detects soft errors in entries. |
Soft errors occur in entries. |
- |
PATC |
Manages patches. |
- |
- |
PCAI |
Processes the iPCA service on LPUs. |
- |
- |
PCAM |
Processes the iPCA service on MPUs. |
- |
- |
PGMC |
XMPP-side connection task for free mobility. |
- |
- |
PGMP |
Manages free mobility policies. |
- |
- |
PGMX |
XMPP-side task for free mobility. |
- |
- |
PMS |
Uploads performance measurement files. This task is triggered when automatic upload of performance measurement files is enabled. |
Generally, files are not uploaded frequently and the sizes of uploaded files are small. Therefore, this task will not cause high CPU usage. |
No action is required. |
PNGI |
Processes the Layer 3 fast ping service on LPUs. |
- |
- |
PNGM |
Processes the Layer 3 fast ping service on MPUs. |
- |
- |
POE |
Checks whether PDs are present and checks the grading status and power control policies of the PDs. |
- |
- |
POE+ |
Processes the PPPoE plus protocol. |
A large number of PPPoE packets are sent to the CPU. |
|
PPI |
Maintains VLAN and MAC address data and delivers entries (L2 adaptation task). |
Network loops or flapping occurs, or port security is configured on multiple ports. |
|
PPP |
Processes the PPPoE protocol. |
A large number of PPPoE packets are sent to the CPU. |
|
PTAL |
Processes Portal authentication. |
A large number of HTTP packets for Portal authentication are sent to the CPU. |
|
QDIA |
Intelligent diagnosis task. |
- |
- |
QOSA |
Processes QoS services on MPUs. |
Too many messages on MPUs are backed up to standby MPUs in the batch backup process. |
Reduce QoS configurations. |
QOSB |
Processes QoS services on LPUs. |
Too many messages on MPUs are backed up to standby MPUs in the batch backup process. |
Reduce QoS configurations. |
RACL |
Processes reflective ACLs. |
Many reflective ACLs are configured using commands and updated frequently. |
Prolong the interval between configuring reflective ACLs. |
RDS |
Processes the RADIUS protocol. |
A large number of RADIUS packets are sent to the CPU. |
|
RMON |
Monitors the system remotely. |
This task will not cause high CPU usage. |
No action is required. |
root |
System root task. |
- |
- |
ROUT |
Completes route learning for routing protocols, selects best routes, and delivers routes to the FIB. |
A large number of multicast packets are received, and multicast entries are updated due to route changes or interface changes. |
Configure multicast filtering policies. |
RPCQ |
Dispatches RPC messages. |
- |
- |
RRPP_ADP |
Task of the RRPP adaptation module. |
- |
- |
RRPP |
Implements the RRPP protocol stack on LPUs, detects interface status quickly, and delivers hardware entries. |
A common FDB attack occurs. |
Check whether hubs are introduced to the network. |
SAID |
Periodically detects faults after the device is started, including data collection, data diagnosis, and troubleshooting. |
- |
- |
SCEP |
Simple Certificate Enrollment Protocol (SCEP) task. |
- |
- |
SCKA |
SOCKADP task. |
- |
- |
SEA |
Monitors the service quality of RTP audio and video applications in real time. |
- |
- |
SFLOW |
sFlow task. |
- |
- |
SMDG |
Intelligent diagnosis task. |
- |
- |
smspicmg |
Processes vCMU hot swapping and resends packets upon timeout events. |
- |
- |
smsvcmu |
Manages power modules and fan modules. |
- |
- |
smsvLd |
vCMU loading task. |
- |
- |
smsvRq |
vCMU request processing task. |
- |
- |
smsvRs |
vCMU response processing task. |
- |
- |
smsvtimer |
vCMU timer task. |
The timer processing takes a long time, but this usually does not cause high CPU usage. |
- |
SOCK |
Schedules and processes IP protocol packets. |
- |
- |
SOTM |
Processes some timer messages of the protocol stack. |
- |