CPU Utilization of an LPU of ME60 Exceeded the Threshold After an LPU Reset Due to Too Many Access Attempts

Publication Date: 2013-09-27
Issue Description

A power outage occurred in a customer's equipment room. After the LPUs of an ME60 were powered on again, some users reported Internet access failures. Various dial-up errors, such as 691 and 678, were displayed.

Exception information on the ME60:

<ME60>display health

7 BSU                 99%           55%  456MB/827MB

15 BSU                 92%           48%  404MB/831MB

<ME60>display cpu-usage slot 7

POXR                 44%         0/3037dd38       POXR                        

POXS                  5%         0/ b12df12        POXS

<ME60>display  cpu-usage slot 15

POXR                 20%         0/3037dd38       POXR                        

POXS                  4%         0/ b12df12        POXS

[ME60-diagnose]display alarm all history

140    Error      13-04-20  10:01:25    The CPU utilization of LPU 7 (Entity)

                                        crosses the warning/critical threshold

                                        [BasCode:0x12200,EntCode:0x0]

144    Error      13-04-20  13:59:09    The CPU utilization of LPU 15 (Entity)

                                        crosses the warning/critical threshold

                                        [BasCode:0x12200,EntCode:0x0]
Handling Process

Huawei completed the following steps to diagnose the problem:

1. Suspected that the problem was caused by an excessive number of user authentication packets.

According to the exception information, the POXR module, which processes user authentication and probe packets, was the main consumer of CPU resources.

2. Checked packet statistics. PPPoE control packets were being dropped, while the number of PADI packets increased rapidly.

[ME60-diagnose]display  cpu-defend  statistics-all  slot  15

CarID Index   Packet-Info                        Passed-Packets   Dropped-Packets

363   253     PES_EXCP_ID_PPPOE_CTRL          11417476           1769575

389  1102     PES_EXCP_ID_PPP_PADI                2854111               0

 

3. Checked the number of users connected to the LPU over the affected period.

The number of users connected to the LPU grew slowly. Normally, the number of PPPoE users on the LPU increases by at least 40 per second.

<ME60>display access-user slot

  Slot                     : 15       Total  user : 2957
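The statistics checked in step 2 can be turned into a drop ratio per packet type. The following is a minimal Python sketch, not a Huawei tool: the function names `parse_rows` and `drop_ratio` are hypothetical, and the column layout is assumed to match the two rows shown above.

```python
import re

# Rows copied from the "display cpu-defend statistics-all slot 15" output above.
SAMPLE = """\
363   253     PES_EXCP_ID_PPPOE_CTRL          11417476           1769575
389  1102     PES_EXCP_ID_PPP_PADI                2854111               0
"""

# Assumed layout: CarID, Index, Packet-Info, Passed-Packets, Dropped-Packets.
ROW = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\S+)\s+(\d+)\s+(\d+)\s*$")


def parse_rows(text):
    """Return one dict per data row; header and blank lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            car_id, index, name, passed, dropped = match.groups()
            rows.append({
                "car_id": int(car_id),
                "index": int(index),
                "name": name,
                "passed": int(passed),
                "dropped": int(dropped),
            })
    return rows


def drop_ratio(row):
    """Fraction of packets dropped out of all packets seen for this row."""
    total = row["passed"] + row["dropped"]
    return row["dropped"] / total if total else 0.0


if __name__ == "__main__":
    for row in parse_rows(SAMPLE):
        print(f'{row["name"]} (index {row["index"]}): {drop_ratio(row):.1%} dropped')
```

For the sample rows, about 13.4% of the PPPoE control packets were dropped, while no PADI packets were dropped. This matches the diagnosis that PADI packets reached the CPU without being limited.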
Root Cause
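A large number of users attempted to connect to the LPU at the same time after the LPU reset. The resulting burst of PADI packets drove the CPU usage of the LPU above the threshold.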
Solution

Reduce the CPCAR value that limits the PADI packets sent to the CPU on the ME60:

1. Check the CPCAR index of PADI packets.

[ME60-diagnose]display  cpu-defend  statistics-all  slot  15

CarID Index   Packet-Info                        Passed-Packets   Dropped-Packets

389    1102   PES_EXCP_ID_PPP_PADI                2854111               0

 

2. Reduce the CPCAR value in the CPU defend policy.

[ME60-cpu-defend-policy-1]car index 1102 cir 50

 

3. Apply the policy to the LPU.

[ME60-slot-15]cpu-defend-policy  1

 

The CPCAR value can be adjusted in V6R2SPC035 and later V6R2 versions, and in V6R5.

After this value is reduced, users reconnect to the LPU gradually. For example, if the value is set to 50, the number of users connected to the LPU increases by about 50 per second.
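The relationship between the configured value and the recovery time can be estimated with simple arithmetic. The sketch below is a rough estimate, not a measured result: it assumes that users come online at a steady rate equal to the observed figure (about 50 per second when the value is 50), and the function name `reconnect_seconds` is hypothetical.

```python
import math


def reconnect_seconds(users, users_per_second):
    """Estimate how long it takes for `users` to come back online.

    Assumes a constant rate of new sessions per second, which is a
    simplification of the behavior described above.
    """
    if users_per_second <= 0:
        raise ValueError("users_per_second must be positive")
    return math.ceil(users / users_per_second)


if __name__ == "__main__":
    # 2957 is the user count shown for slot 15; 8000 is an illustrative
    # count for a heavily loaded LPU.
    for users in (2957, 8000):
        print(f"{users} users at 50/s: about {reconnect_seconds(users, 50)} s")
```

Under these assumptions, an LPU carrying 8000 users needs roughly 160 seconds to bring all users back online at 50 users per second. A lower value lengthens the recovery time but keeps the CPU load lower during recovery.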
Suggestions

After an LPU is reset due to a power-off, an upgrade, or a board operation, PPPoE users automatically redial to the LPU. If the LPU carries a large number of PPPoE users, CPU usage may remain high for a long period, resulting in user access failures.

If the number of PPPoE users on an LPU exceeds 8k, configure the CPCAR parameter of PADI packets properly in the CPU defend policy.
