A representative office in country Z reported that the quality of voice services on an NE40E under its supervision at a site in country M had deteriorated. Several hours later, the services recovered automatically.
The NE40E software version is V600R001C00SPC800.
For information about the network topology, see the attachment.
1. Log in to the NE40E on the live network and run a ping test on the interface that carries voice services. The ping test shows packet loss.
2. Check the interface status. The result shows that traffic is not evenly distributed among MP-Group 2/0/1 and the other two MP-Group interfaces. The following workaround is used to address this problem.
The core network engineers confirm that the data traffic on the live network consists entirely of UDP packets and that the destination UDP port numbers of the network services are non-contiguous. In this case, redirection can be configured manually to spread the data traffic across the three links instead of a single link.
On the user-side interface (the GE interface in the figure) of the NE40E, configure complex traffic classification to divide the UDP destination port numbers into three groups, and configure a redirection policy. Then apply the redirection policy to the three groups, specifying one of the three MP-Group interfaces as the outbound interface for each group. Traffic is then load balanced among the three MP-Group interfaces.
3. Ask the field engineers to replace the user-side board (LPU 1 in the figure) with an LPUF-10 as soon as possible to solve the problem.
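The port grouping used in the workaround in step 2 can be sketched as follows. The case does not state how the three groups were chosen, so this is only one plausible approach: it assumes that a rough traffic volume per destination UDP port is known, and it assigns the heaviest ports first to whichever link currently carries the least load. The port numbers, the load figures, and the interface labels `MP-Group-X` and `MP-Group-Y` are hypothetical placeholders, not values from the live network.

```python
def group_ports(port_load, links):
    """Greedy split: take ports from heaviest to lightest and put each one
    on the link with the smallest accumulated load so far."""
    groups = {link: [] for link in links}
    totals = {link: 0 for link in links}
    ordered = sorted(port_load.items(), key=lambda kv: kv[1], reverse=True)
    for port, load in ordered:
        target = min(links, key=lambda link: totals[link])
        groups[target].append(port)
        totals[target] += load
    return groups, totals


# Hypothetical destination UDP ports and relative traffic volumes.
port_load = {2152: 40, 5060: 10, 16384: 25, 16390: 25, 30000: 15, 30010: 20}

# MP-Group 2/0/1 is named in the case; the other two labels are placeholders.
links = ["MP-Group 2/0/1", "MP-Group-X", "MP-Group-Y"]

groups, totals = group_ports(port_load, links)
for link in links:
    print(link, sorted(groups[link]), totals[link])
```

Each resulting group would then correspond to one traffic class whose redirection action points to the matching MP-Group interface. Because the grouping is static, it needs to be revisited whenever the per-port traffic mix changes.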
1. The interface status is checked. On MP-Group 2/0/1, the bandwidth usage exceeds 90%. The other two MP-Group interfaces carry far less traffic than MP-Group 2/0/1. This suggests that congestion on MP-Group 2/0/1 caused the packet loss. However, because the three MP-Group interfaces are configured to load balance traffic together, their traffic volumes should not differ so greatly.
2. The board type is checked. The uplink board (LPU 1 in the figure) on the device is an LPUK, which supports only 2-tuple hash. On the LPUK, different service flows (port numbers are different and protocol-based interworking is unavailable) that have the same source and destination IP addresses are therefore treated as the same flow and directed to the same link. As a result, when the traffic forwarded by the LPUK contains only a few distinct source and destination IP address pairs, one link may be fully utilized while the other links remain nearly idle. This is a hardware limitation of the LPUK. The solution is to replace the LPUK with an LPUF-10, which supports 5-tuple hash.
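The difference between 2-tuple and 5-tuple hashing described above can be illustrated with a small simulation. This is a minimal sketch, not the hash algorithm that the LPUK or LPUF-10 actually uses: it assumes a CRC32 of the flow key taken modulo the number of links, and the IP addresses, port numbers, and flow count are hypothetical.

```python
import zlib
from collections import Counter


def pick_link(key_fields, n_links=3):
    """Map a flow key to a link index using CRC32 (illustrative hash only)."""
    key = "|".join(str(field) for field in key_fields).encode()
    return zlib.crc32(key) % n_links


def distribute(flows, use_five_tuple):
    """Count how many flows land on each link for the chosen hash key."""
    counts = Counter()
    for src_ip, dst_ip, proto, src_port, dst_port in flows:
        if use_five_tuple:
            key = (src_ip, dst_ip, proto, src_port, dst_port)
        else:
            key = (src_ip, dst_ip)  # 2-tuple: port numbers are ignored
        counts[pick_link(key)] += 1
    return counts


# Hypothetical traffic: 300 UDP flows (protocol 17) between one pair of
# hosts, differing only in the destination port.
flows = [("10.0.0.1", "10.0.0.2", 17, 5000, 20000 + i) for i in range(300)]

two_tuple = distribute(flows, use_five_tuple=False)
five_tuple = distribute(flows, use_five_tuple=True)

print("2-tuple:", dict(two_tuple))
print("5-tuple:", dict(five_tuple))
```

With the 2-tuple key, every flow produces the same hash input, so all 300 flows are placed on a single link. With the 5-tuple key, the differing destination ports produce different hash inputs, so the flows are spread over more than one link.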
(1) If an LPUK is used as the uplink board on the NE40E, estimate the impact on services when the NE40E performs IP-based load balancing for traffic that contains only a few distinct source and destination IP address pairs.
(2) Load balancing usage scenarios vary, and the service flows in different scenarios differ considerably. To avoid mistakes and achieve the best load balancing effect, it is recommended that you consult the R&D engineers before deploying load balancing for services.