3. Service topology:
I. Uneven load sharing:
Four existing ME60s in the customer's ISP network functioned as LNSs, and newly added ME60s functioned as LACs. After services were cut over, users went online successfully. The customer reported that the number of sessions over each tunnel was roughly equal before the cutover but differed dramatically after the cutover. Statistics about sessions on one LAC were as follows:
LocalTID RemoteTID RemoteAddress Port Sessions RemoteName
This problem was resolved after LAC and RADIUS-delivered configurations were changed.
II. CPU usage of the board in slot 1 on the LNS was high when the LNS had about 26000 online users.
On an ME60 functioning as an LNS, the LPUK boards in slots 1, 2, 5, and 7 functioned as tunnel boards. However, only one 10G interface on the LPUK board in slot 1 connected to LACs as the network-side egress interface. Therefore, echo request and reply packets from all L2TP users on the tunnel boards in slots 2, 5, and 7 were transparently transmitted to the board in slot 1, and then to LACs.
When one LNS had about 25000 online users, a large number of echo packets were transmitted to the board in slot 1. As a result, the CPU usage of the board in slot 1 often reached about 70%.
1. Checked configurations on the live network.
Four boards were bound to the LNS group, among which only interface 1/0/0 on the board in slot 1 connected to LACs.
2. Checked the CPU usage of the board in slot 1 when an LNS had about 26000 online users.
3. Checked the CPU usage of the board in slot 1 when an LNS had 15981 online users.
The CPU usage was 61%, of which the VPR task accounted for 15%. The board in slot 1 had 5803 users, and the other boards had 10178 users in total: 4022 on the board in slot 2, 2234 on the board in slot 5, and 3922 on the board in slot 7.
4. Checked the VP packets that the board in slot 1 received.
The information was as follows:
According to the preceding information, the board in slot 1 received over 1000 VP packets per second, so the VPR task occupied a large amount of CPU resources. Generally, a board receives fewer than 500 VP packets per second.
5. Analyzed the VP task packets.
After going online, L2TP users and the device exchanged echo packets. The packet transmission interval was determined by the terminal if the terminal initiated detection and was 20s if the device initiated detection.
The statistics about echo packets within 1s were as follows. According to the information, about 1000 packets were transmitted per second when about 10000 users were online. Dividing 10000 users by 1000 packets per second gives an average interval of about 10s per user, much shorter than 20s. This indicated that the echo packets were initiated by the terminals rather than by the device.
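The interval estimate in step 5 is plain arithmetic and can be sketched as follows. This is a minimal illustration, assuming every online user sends echo packets at a steady rate; the function name `echo_interval_seconds` and the constant `DEVICE_ECHO_INTERVAL_S` are hypothetical, and the inputs are the approximate figures quoted in the case.

```python
# Sketch only: estimates the average per-user echo interval from
# aggregate counters. Assumes all users send echoes at a steady rate.

DEVICE_ECHO_INTERVAL_S = 20  # interval used when the device initiates detection


def echo_interval_seconds(online_users: int, packets_per_second: float) -> float:
    """Return the average number of seconds between echo packets per user."""
    if online_users < 0 or packets_per_second <= 0:
        raise ValueError("users must be >= 0 and packet rate must be > 0")
    return online_users / packets_per_second


# Approximate figures from the case: about 10000 users, about 1000 packets/s.
interval = echo_interval_seconds(10000, 1000)
print(interval)                           # 10.0
print(interval < DEVICE_ECHO_INTERVAL_S)  # True: shorter than the device default
```

An interval well below 20s points to terminal-initiated detection, which matches the conclusion drawn in this case.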
Conclusion: The PPP echo packet transmission interval configured on user terminals was short. In addition, because all users in the LNS group shared one network interface board, the CPU usage of the network interface board became high when a large number of users were online.
To resolve the problem, add network-side egress interfaces connecting to LACs and plan routes between LACs and LNSs for L2TP users, to share the load of the board in slot 1.
Perform the following operations:
1. Add network-side egress interfaces on an LNS. The added interfaces must be located on other boards instead of the current network interface board.
2. Change configurations at the LAC-side and configurations that the RADIUS server delivers to the LAC-side so that user packets from LACs are shared by two links.
3. Modify the routes from the LAC to the intermediate switch or core router and outbound routes on the LNS to ensure that replies from the LNS to the LAC can be transmitted to interfaces on two boards.
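As a rough planning aid, the effect of adding egress interfaces on other boards can be estimated with the sketch below. It assumes that echo traffic splits evenly across the egress boards, which the case does not state and which depends on the LAC configurations and routes; the function name `echo_rate_per_egress_board` is hypothetical, and the inputs reuse the approximate figures from the analysis.

```python
# Sketch only: estimates echo packets per second arriving at each
# egress board, ASSUMING an even split of traffic across the boards.

def echo_rate_per_egress_board(online_users: int,
                               echo_interval_s: float,
                               egress_boards: int) -> float:
    """Return the estimated echo packets per second per egress board."""
    if echo_interval_s <= 0 or egress_boards <= 0:
        raise ValueError("interval and board count must be positive")
    return online_users / echo_interval_s / egress_boards


# About 10000 users sending one echo every 10s (figures from the analysis).
for boards in (1, 2):
    print(boards, echo_rate_per_egress_board(10000, 10, boards))
# 1 1000.0
# 2 500.0
```

Under this even-split assumption, a second egress board would roughly halve the echo packet rate that the board in slot 1 has to handle. Actual distribution should be verified on the live network after the change.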