NE40 VRP3.10 2321
NE20 VRP5.30 23122004
Multiple NE20s connect through E1 interfaces to the CPOS interface of the NE40, with the links bundled into mp-groups. Each mp-group has 2*2M of bandwidth, and EBGP is enabled. The NE40, C6509, and S8512 are interconnected over GE, and OSPF is enabled.
A client PC attached to an NE20 is slow to access an application on a server attached to the S8512. Pinging with large packets shows packet loss.
1. Check CPU and memory usage on the NE40 and NE20. Both are normal, so the loss is not caused by insufficient processing capacity.
2. Pinging the NE40 from the client with large packets shows loss, but the CRC error counters on the mp-group interfaces between the NE20 and NE40 do not increase. The loss is therefore not related to the physical interfaces or the link.
3. Pinging the server from the NE40 shows no loss, so the interfaces and link between the NE40 and the server are normal. The loss occurs on the segment between the NE40 and the NE20.
4. Run an EACL match test on the NE40. When the client pings the server, the NE40 receives the ICMP packets and forwards them to the server. The NE40 also receives the ICMP replies from the server.
5. Clients attached to many different NE20s see loss when pinging the server, so the NE20 is not the problem. The loss occurs at the NE40 CPOS.
6. Capture packets on the GE link between the NE40 and the C6509. The server's replies to the client are mainly non-fragmented 1500-byte TCP packets. The traffic statistics on the mp-group interface between the NE40 and the NE20 show more than 2Mbps. Interface statistics are averages over a sampling period and do not reflect traffic bursts. It is possible that large reply packets arriving from the GE side cause traffic bursts and loss at the NE40 CPOS.
7. Further analysis of the NE40 FIC shows a limit on its scheduling queues. By default, BE traffic can use only 1/32 of the 256K token bucket (8K). If burst traffic exceeds 8K within one second, packets are lost. EF traffic has no such limit. Enable QoS for the client-to-server traffic and classify it as EF. On the mp-group interface, set the EF bandwidth to 3.2M, i.e. EF traffic occupies 80% of the 4M bandwidth (3.2M/4M=80%). After this change, the client pings the server without loss and the service is normal.
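The arithmetic in step 7 can be sketched in Python. This is a simplified illustration, not the actual FIC scheduler: it assumes the 8K allowance acts as a plain per-second cap, keeps the units exactly as the case gives them ("K", "M"), and uses hypothetical function names and arrival values.

```python
# Simplified model of the BE burst limit described in step 7.
# Assumption: the allowance is treated as a per-second cap; the real
# FIC scheduling behaviour may differ.

BUCKET_SIZE_K = 256   # token bucket size from the case (unit "K" as given)
BE_SHARE = 1 / 32     # default share usable by BE traffic, per the case


def be_burst_allowance(bucket_size_k=BUCKET_SIZE_K, share=BE_SHARE):
    """Return the part of the token bucket that BE traffic may use."""
    return bucket_size_k * share


def dropped_per_second(arrivals_k, allowance_k):
    """For each one-second interval, return the traffic above the allowance."""
    return [max(0, arrived - allowance_k) for arrived in arrivals_k]


def ef_share(ef_bandwidth_m, link_bandwidth_m):
    """Return the fraction of the link bandwidth reserved for EF traffic."""
    return ef_bandwidth_m / link_bandwidth_m


if __name__ == "__main__":
    allowance = be_burst_allowance()
    # Hypothetical per-second BE arrivals; only the third exceeds the allowance.
    print(allowance, dropped_per_second([4, 8, 12, 6], allowance))
    print(ef_share(3.2, 4.0))
```

Under this model, only the interval whose arrivals exceed the 8K allowance loses traffic, which matches the observation that large-packet bursts are dropped while the average load looks acceptable.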
Because many devices are involved, locate the fault segment by segment.
Where HIC and FIC boards coexist, traffic between HIC and FIC interfaces may be lost during traffic bursts. Analyze the interface queue scheduling mechanism of the device and apply a matching remedy to keep applications working normally.
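As the analysis noted, interface statistics are averages over a period and can hide bursts. The following minimal Python sketch shows the effect with hypothetical per-second samples; the function names and sample values are illustrative assumptions and do not come from the devices in this case.

```python
# Illustration of how an averaged interface counter can hide a burst.
# All sample values are hypothetical.


def average_and_peak(samples_mbps):
    """Return (average, peak) of per-second traffic samples in Mbps."""
    average = sum(samples_mbps) / len(samples_mbps)
    return average, max(samples_mbps)


def hides_burst(samples_mbps, link_mbps):
    """True when the average fits the link but at least one sample exceeds it."""
    average, peak = average_and_peak(samples_mbps)
    return average <= link_mbps < peak


if __name__ == "__main__":
    samples = [1.0, 1.0, 6.0, 1.0, 1.0]  # one-second samples, hypothetical
    print(average_and_peak(samples), hides_burst(samples, link_mbps=4.0))
```

With these samples the average stays well under the 4M link bandwidth while one second exceeds it, so the averaged counter alone would not reveal the burst.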