Users Fail to Access the Server Through the IPSec Tunnel Because the TCP MSS Value on the AR Is Incorrect

Publication Date: 2015-10-14
Issue Description
As shown in the following figure, the weblogic server is installed on cluster servers of App01 and App04 in the Application Zone of the PIMS Centre. xxx-AR12-01 in the Fixed Information Collection Office (FICO) and PIM001-AR22-01 or PIM001-AR22-02 in the PIMS Centre establish an IPSec tunnel over the ISP L2VPN.



A user accesses the weblogic server through the IPSec tunnel from three PCs and two portable computers. Only one portable computer can access the weblogic server, and only some of the time. The other computers display a blank page. As shown in the following figure, the client keeps waiting for the HTTP response.



Run the display ipsec sa command. The command output shows that the IPSec tunnel has been set up and the configuration is correct.
Handling Process
1. Capture packet headers on the PCs that fail to access the weblogic server and on those that access it successfully.

Use the Follow TCP Stream tool to examine the capture from a PC that fails to access the weblogic server. The capture contains the failed HTTP request and shows that the weblogic server did send a response to the client. As shown in the following figure, the weblogic server returns HTTP 302. After following the redirect, the PC sends another request to the weblogic server. The weblogic server returns HTTP 200, but no body text arrives.



2. Check whether the weblogic server works properly.

A user can access the weblogic server from the CP Centre, indicating that the weblogic server works properly. In addition, the aggregation switches are normal.

3. Ping the IP address of the weblogic server from a computer in the FICO.



The TTL varies between 252 and 253, indicating that the path taken by the data packets changes. The firewall may be blocking ping response packets as attack packets.

4. Check whether the firewall is normal.

Run the undo firewall session link-state check tcp command on the firewall, and test the PCs again. The fault persists, so the firewall is not the cause.

5. Check the L2VPN link of the egress router or ISP.

Check the packets captured during the access failures. The following suspicious data packets are found:



Analyze the data packets. Segments are missing from the TCP stream returned by the weblogic server.



When a PC in the FICO sends large ping packets with fragmentation disabled to the weblogic server, the ping operation fails. For example, the ping 10.248.0.15 -l 1446 -f command fails.
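The size of that ping packet can be worked out with a rough Python sketch. The -l option of the Windows ping command sets the ICMP payload size, and -f sets the DF bit. The header sizes are the standard IPv4 and ICMP values. The 60-byte IPSec overhead is an illustrative assumption, not a figure from this case, because the real overhead depends on the IPSec mode and algorithms in use.

```python
ICMP_HEADER = 8    # bytes, ICMP echo header
IPV4_HEADER = 20   # bytes, IPv4 header without options
ASSUMED_IPSEC_OVERHEAD = 60  # bytes; hypothetical value for illustration only


def ping_ip_packet_size(payload_bytes: int) -> int:
    """Return the IP packet length produced by 'ping -l <payload_bytes>'."""
    return payload_bytes + ICMP_HEADER + IPV4_HEADER


def fits_in_mtu(ip_packet_bytes: int, extra_overhead: int, mtu: int = 1500) -> bool:
    """Return True if the packet plus tunnel overhead fits in one MTU."""
    return ip_packet_bytes + extra_overhead <= mtu


size = ping_ip_packet_size(1446)
print(size)                                       # 1474
print(fits_in_mtu(size, 0))                       # True: fits a plain 1500-byte link
print(fits_in_mtu(size, ASSUMED_IPSEC_OVERHEAD))  # False: too large once encapsulated
```

Under this assumption the packet fits an ordinary Ethernet link but not the tunnel, and because the DF bit is set it cannot be fragmented, which is consistent with the ping failure.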

Check the packets captured during successful access. Many response packets show fragment loss here as well, which is why a PC can access the weblogic server only some of the time.



Fragments are lost in transit along the ISP link, which means that the TCP MSS in use is too large.

6. Change the TCP MSS.

Run the tcp adjust-mss 1200 command in the interface view of the AR to change the TCP MSS to 1200.

After the MSS is changed, all PCs in the FICO can successfully access the weblogic server.
Root Cause
The TCP MSS specifies the maximum segment length. If the MSS plus the TCP and IP packet headers is larger than the MTU, data packets are fragmented and sent out.

In this scenario, the total length of a TCP packet (MSS + TCP header + IP header + IPSec header) is larger than the MTU, so data packets are fragmented before being sent. Fragmentation consumes additional CPU resources, and encrypting and decrypting the fragments consumes CPU resources on devices along the transmission link. When too many CPU resources are consumed, data packets may be lost. This is why only one portable computer could access the weblogic server, and only some of the time, while the other computers displayed a blank page.
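The length calculation above can be sketched in Python. The 20-byte TCP and IP header sizes assume no options, 1460 is the usual MSS on a 1500-byte Ethernet MTU, and the 60-byte IPSec overhead is a hypothetical value chosen for illustration. The actual overhead on the AR is not stated in this case.

```python
IPV4_HEADER = 20   # bytes, no IP options
TCP_HEADER = 20    # bytes, no TCP options
ASSUMED_IPSEC_OVERHEAD = 60  # bytes; illustrative assumption, not from the case


def encapsulated_size(mss: int, ipsec_overhead: int = ASSUMED_IPSEC_OVERHEAD) -> int:
    """Total length: MSS + TCP header + IP header + IPSec header."""
    return mss + TCP_HEADER + IPV4_HEADER + ipsec_overhead


def needs_fragmentation(mss: int, mtu: int = 1500,
                        ipsec_overhead: int = ASSUMED_IPSEC_OVERHEAD) -> bool:
    """Return True if a full-sized segment exceeds the MTU after encapsulation."""
    return encapsulated_size(mss, ipsec_overhead) > mtu


for mss in (1460, 1200):
    print(mss, encapsulated_size(mss), needs_fragmentation(mss))
# 1460 1560 True
# 1200 1300 False
```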
Solution
In IPSec scenarios, taking the TCP, IP, and IPSec headers into account, it is recommended that the TCP MSS be set to 1200 on an interface of the egress router. Devices along the ISP link then do not need to spend CPU resources on fragmenting packets, packet loss does not occur, and the weblogic server can be accessed properly.
Suggestions
In VPN scenarios, consider the size of outgoing data packets. Oversized packets are fragmented, and both the fragmentation and the encryption and decryption of the fragments consume CPU resources on devices along the transmission link. When too many CPU resources are consumed, data packets may be lost. Some upper-layer applications such as HTTP set the Don't Fragment (DF) bit of IP packets to prevent TCP packets from being fragmented. If the DF bit is set and the interface MTU is smaller than the packet length, the router discards the TCP packets because they cannot be fragmented.

Ensure that the MSS plus all header overhead does not exceed the MTU. The maximum MTUs for Ethernet and PPPoE are 1500 bytes and 1492 bytes, respectively. You are advised to set the MSS to 1200 bytes.
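As a rough check of that advice, the largest MSS that avoids fragmentation can be derived from the MTU. The following Python sketch uses the two MTUs given above. The 60-byte IPSec overhead is again a hypothetical figure used only for illustration, and the header sizes assume no TCP or IP options.

```python
def max_mss(mtu: int, ipsec_overhead: int,
            ip_header: int = 20, tcp_header: int = 20) -> int:
    """Largest MSS for which MSS + headers + IPSec overhead still fits the MTU."""
    return mtu - ip_header - tcp_header - ipsec_overhead


LINK_MTUS = {"Ethernet": 1500, "PPPoE": 1492}  # values from the text above
ASSUMED_IPSEC_OVERHEAD = 60                    # bytes; assumption for illustration
RECOMMENDED_MSS = 1200                         # value recommended in this case

for link, mtu in LINK_MTUS.items():
    limit = max_mss(mtu, ASSUMED_IPSEC_OVERHEAD)
    print(f"{link}: max MSS {limit}, headroom at 1200 = {limit - RECOMMENDED_MSS}")
# Ethernet: max MSS 1400, headroom at 1200 = 200
# PPPoE: max MSS 1392, headroom at 1200 = 192
```

With these assumed numbers, an MSS of 1200 leaves headroom on both link types, which allows for larger IPSec overhead or additional encapsulation on the ISP link.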
