Troubleshooting Guide

CloudEngine 16800, 12800, 12800E, 8800, 7800, 6800, and 5800 Series Switches

Packet Fragmentation Leads to a Ping Failure

Keywords

Packet fragmentation, ping failure, CE12800, switch

Abstract

ICMP packet fragments are mis-sequenced during transmission and so are discarded. This leads to a ping failure.

Problem Description

Figure 14-9 Network topology

CO_CS and OP_DS are CE12800s. WN_DS is an S9700 and has NAT deployed. The FW is a non-Huawei firewall.

Fault Symptom

  • The OP server can ping the headquarters' server with small packets but fails with large packets that exceed 1472 bytes.

  • OP_DS and CO_CS can ping the headquarters' server with small packets but fail with large packets that exceed 1472 bytes.

Procedure

If the local device pings the remote device with a payload larger than 1472 bytes, the ICMP packets are fragmented (assuming a 1500-byte MTU: 1500 bytes minus the 20-byte IP header and the 8-byte ICMP header). If the two ends can ping each other with small packets but not with large packets, that is, small packets are forwarded but large packets are not, check whether fragmentation causes packet mis-sequencing or incorrect fragment flags that then lead to forwarding failures.
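The 1472-byte threshold mentioned above can be reproduced with a short calculation. The following is a minimal sketch, assuming a 1500-byte MTU and a 20-byte IPv4 header without options; the function names are illustrative and are not part of any device or tool.

```python
IP_HEADER = 20    # bytes, IPv4 header without options (assumed)
ICMP_HEADER = 8   # bytes, ICMP echo header


def max_unfragmented_payload(mtu: int = 1500) -> int:
    """Largest ICMP echo payload that fits in one IPv4 packet."""
    return mtu - IP_HEADER - ICMP_HEADER


def fragment_sizes(icmp_payload: int, mtu: int = 1500) -> list[tuple[int, int, bool]]:
    """Return (offset in 8-byte units, data bytes, MF bit) for each fragment."""
    total = icmp_payload + ICMP_HEADER          # IP payload to be carried
    per_fragment = (mtu - IP_HEADER) // 8 * 8   # non-final fragments carry a multiple of 8 bytes
    fragments = []
    offset = 0
    while offset < total:
        size = min(per_fragment, total - offset)
        more = offset + size < total            # MF is 1 on every fragment except the last
        fragments.append((offset // 8, size, more))
        offset += size
    return fragments
```

Under these assumptions, a 1472-byte payload fits in a single packet, while a 1473-byte payload produces one large fragment followed by one small fragment, which matches the large and small ICMP request packets seen in the captures.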

Analyze the fault cause according to the location of the device that initiates a ping operation.
  • The OP server cannot ping the headquarters' server with large packets.

    Packet headers are captured on the downlink and uplink ports that connect OP_DS to the FW (marked by the green and blue circles respectively in Figure 14-9). Figure 14-10 and Figure 14-11 show the packet header information captured on the downlink and uplink ports respectively.

    Figure 14-10 Obtained packet header information on the downlink port that connects OP_DS to the FW
    Figure 14-11 Obtained packet header information on the uplink port that connects OP_DS to the FW

    The obtained packet header information shows the following: Packets sent from OP_DS to the FW are in the correct order, with each large ICMP request packet placed before the corresponding small ICMP request packet. Packets sent from the FW to OP_DS are in reverse order, with each small ICMP request packet placed before the corresponding large ICMP request packet. That is, the fragmented packets are mis-sequenced after they pass through the FW.

    When the OP server sends large packets to ping the headquarters' server, ICMP packet fragments are mis-sequenced after being processed by the FW connected to OP_DS in bypass mode. When the ICMP packet fragments arrive at WN_DS, WN_DS discards these fragments because NAT cannot process mis-sequenced fragments.

    To check whether the FW causes the packet mis-sequencing, three of the four member links of the Eth-Trunk outbound interface on the FW are removed so that the Eth-Trunk becomes a single link. After this change, the OP server can ping the headquarters' server successfully with large packets.

    In conclusion, the ping failure results from packet mis-sequencing on the FW's Eth-Trunk interface.

  • OP_DS and CO_CS cannot ping the headquarters' server with large packets.

    The following uses CO_CS as an example. Packet headers are obtained on the uplink port that connects CO_CS to WN_DS, as shown in Figure 14-12.

    Figure 14-12 Obtained packet header information on the uplink port that connects CO_CS to WN_DS

    The obtained packet header information shows that CO_CS receives an ICMP response from the headquarters' server. In the ICMP response, the don't fragment (DF) bit and the more fragments (MF) bit are both set to 1.

    According to RFC 791:
    • don't fragment: The value 0 indicates that a packet can be fragmented, while the value 1 indicates that a packet cannot be fragmented.
    • more fragments: The value 0 indicates that the fragment is the last fragment, while the value 1 indicates that there are other fragments in addition to this fragment.

    If the DF bit of a packet is set to 1 but the packet needs to be fragmented, the packet is discarded. Packets with the DF bit set to 1 can be sent to hosts that cannot reassemble fragments. When CO_CS finds that both the DF bit and the MF bit of the received ICMP response are set to 1, it considers the ICMP response invalid and discards it. As a result, a ping failure occurs.

    CO_CS receives packets with incorrect fragment flag bits and so cannot obtain information about the headquarters' server and intermediate devices. The following analyzes the cause of this problem.

    The headquarters' server sets the DF bit in its ICMP response to 1 in order to discover the minimum MTU, also called the path MTU (PMTU), of the return path. In PMTU discovery, the server assumes an initial PMTU for a path and sends packets on this path that are no longer than the PMTU and have the DF bit set to 1. If an intermediate device cannot forward a packet without fragmenting it, the device discards the packet and returns an ICMP error packet indicating that fragmentation is needed but the DF bit is set to 1. Each time the server receives such an ICMP error packet, it reduces the PMTU of the path. When it no longer receives ICMP error packets, PMTU discovery is complete. The mechanism relies on devices responding with ICMP error packets when they receive packets that need to be fragmented but carry the DF bit 1. In practice, however, some devices forcibly fragment such packets without changing the DF bit from 1 to 0. As a result, CO_CS receives fragments whose DF bit is incorrectly set to 1.

    In conclusion, the ping failure results from the incorrect DF bit of packets processed by the headquarters' server or by intermediate network devices on the path from OP_DS and CO_CS to the headquarters' server.
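The two checks used in the analyses above, fragment order and the DF/MF flag combination, can be applied to captured IPv4 headers with a few lines of code. The following is a minimal sketch based on the RFC 791 header layout (flags and fragment offset in bytes 6 and 7). The function names are illustrative, and `is_suspect` only mirrors the condition described for CO_CS; it is not the switch's actual implementation.

```python
import struct

DF = 0x4000           # don't fragment flag
MF = 0x2000           # more fragments flag
OFFSET_MASK = 0x1FFF  # 13-bit fragment offset, in 8-byte units


def decode_flags(ipv4_header: bytes) -> dict:
    """Decode the flags and fragment offset from bytes 6-7 of an IPv4 header."""
    (field,) = struct.unpack_from("!H", ipv4_header, 6)
    return {
        "df": bool(field & DF),
        "mf": bool(field & MF),
        "offset_bytes": (field & OFFSET_MASK) * 8,
    }


def is_suspect(flags: dict) -> bool:
    """True when DF is set on a packet that is itself a fragment."""
    return flags["df"] and (flags["mf"] or flags["offset_bytes"] > 0)


def in_order(offsets: list[int]) -> bool:
    """True when the fragments of one datagram were seen in increasing offset order."""
    return all(a < b for a, b in zip(offsets, offsets[1:]))
```

For example, a header whose flags field is 0x6000 decodes to DF = 1 and MF = 1 and is reported as suspect, and fragment offsets observed as 1480 followed by 0 are reported as out of order.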

Solution

  • If the OP server cannot ping the headquarters' server with large packets, locate and rectify the trunk hash problem on the non-Huawei firewall. As a workaround, you can change the firewall's Eth-Trunk outbound interface into a single link to prevent this problem. This workaround, however, reduces link bandwidth.
  • If OP_DS or CO_CS cannot ping the headquarters' server with large packets, the customer needs to ensure that all the intermediate network devices on the path from OP_DS or CO_CS to the headquarters' server support PMTU discovery and can respond with ICMP error packets, so that the CE12800s receive ICMP responses with valid DF bits. Test results show that the OP server can identify ICMP responses with both the DF bit and the MF bit set to 1. Therefore, this problem will not affect actual services. You are advised to test the actual networking environment and evaluate the impact.
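The PMTU discovery behavior that the intermediate devices are expected to support can be illustrated with a small simulation. This is a sketch under stated assumptions: the hop MTU values are hypothetical, and each blocking hop is assumed to report its own MTU in the ICMP error packet so that the sender can reduce the PMTU in one step.

```python
def discover_pmtu(link_mtus: list[int], initial_pmtu: int = 1500) -> int:
    """Simulate a sender probing a path with DF=1 packets.

    link_mtus holds the MTU of each hop on the path (hypothetical values).
    A hop whose MTU is smaller than the packet drops it and reports its own
    MTU, standing in for the ICMP "fragmentation needed" error packet.
    """
    pmtu = initial_pmtu
    while True:
        # First hop on the path that cannot forward a packet of size pmtu.
        blocking = next((mtu for mtu in link_mtus if mtu < pmtu), None)
        if blocking is None:
            return pmtu   # no ICMP error returned: discovery is complete
        pmtu = blocking   # reduce the PMTU to the value reported by that hop
```

If a hop fragments the packet instead of returning the error, the sender never learns the smaller MTU, and the receiver sees fragments that still carry the DF bit 1, which is the symptom described in this case.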
Updated: 2020-01-07

Document ID: EDOC1000060766
