The NAT Server Fails to Work Because of Incorrect ARP Entries on the Peer Router of the Outgoing NE40

Publication Date: 2012-07-27
Issue Description
The networking diagram is as follows:
Server―intranet―egress router (NAT)―network cable―optical transceiver―optical fiber―optical transceiver―Internet
On the user's network, a megabit Ethernet port of the egress router serves as the outbound interface. Because the peer end uses optical fiber, the NE40 connects through an Ethernet cable to an optical transceiver, which in turn connects to the fiber. The peer router is connected in the same way. The intranet uses the address segment 192.168.0.0/16, and the public network uses the segment x.x.x.0/28. The extranet IP address of the egress router is x.x.x.1/28 and the IP address of the peer router is x.x.x.2/28. Internal users access the Internet through the egress router, which translates their addresses to x.x.x.3 through x.x.x.10 using NAT. In addition, the intranet server must provide WWW services externally, so it is configured as a NAT server with x.x.x.11 as its public network address and 192.168.1.1 as its VPN address. Previously, the user ran equipment from XX vendor and the network worked normally. As the number of internal users grew, the performance of that equipment no longer met the requirements, so the user replaced the old router with Huawei's NE40.
The NE40 was configured according to the user's requirements and the original configuration, and then took over from the previous router. After the cutover, internal users could access the Internet normally, but extranet users could not access the WWW server.
 
Alarm Information
Null
Handling Process
The engineer changed the address of the extranet port of the NE40 to x.x.x.11/28 and then pinged x.x.x.2 at the remote end. The first pings failed, but after a moment they succeeded. As a result, the ARP entry for x.x.x.11 on the peer router was updated to the MAC address of the NE40, which was what the engineer intended. After the original configuration was restored, the engineer ran the test again and found that the problem was solved.
Root Cause
This problem may be caused by:
The intranet or the intranet server.
Because extranet users could not access the server, the engineer first pinged the server on the NE40. The result proved the server was not Down. Then the engineer tried to access the WWW services provided by the server through a PC in the intranet and succeeded, indicating that the server could provide the services normally. Although the network configuration of the server could not be checked, the engineer was almost sure that there was nothing wrong with it because the configuration of the server that worked normally before was not modified, nor was the network topology.
Then, the extranet.
The engineer tried to ping the IP address of the extranet port of the NE40 from a PC on the public network and succeeded. Then the engineer tried to ping the IP address of the PC and also succeeded. Therefore, the communications with the extranet proved to be normal.
Finally, the NE40.
The engineer checked the configuration of the NE40 and found no problem at first. To confirm that, the engineer conducted the test on the same configuration in the lab and still found no problem. Internal users could access the extranet through NAT and external users could also access the WWW server.
So far, no problem had been found with the internal communication, the external communication, or the NE40, and the troubleshooting came to a standstill.
The engineer continued to test the equipment. When the service was switched to the previous router, all worked well. When the service was switched back to the NE40, the access to the server still failed. From this angle, it seemed that there must be something wrong with the NE40.
The engineer ran the tracert command for x.x.x.11 from the PC on the public network. The trace stopped at a certain address, whereas tracert x.x.x.1 completed normally. To dig further, the engineer accessed the WWW services provided by the server from the PC on the public network and captured packets on both the intranet port and the extranet port of the NE40. The WWW packets destined for x.x.x.11 appeared at the extranet port, but no WWW packets destined for 192.168.1.1 appeared at the intranet port. It seemed that the packets reached the NE40, but the NE40 failed to forward them. With the previous router, everything worked normally, so the problem appeared to lie with the NE40. However, the lab test with the same configuration and the same software version showed nothing wrong.
Was there anything wrong with the IP address of the public network? The engineer tried using a new address x.x.x.12 and found everything normal. However, the problem still existed when the original address was used. Now the workaround was found, but the cause of the problem was still unclear. The problem was not solved completely.
Starting from the earlier finding that tracert x.x.x.11 from the PC on the public network failed while tracert x.x.x.1 succeeded, the engineer thought through the forwarding process: to send a packet to a Layer 3 device, the sender must know both the Layer 3 route and the Layer 2 MAC address of the next hop. The packets arrived at the NE40, which suggested that the peer equipment had learned this information. How, then, did the NE40 process the packets? There was nothing wrong with the configuration or the destination IP address. Could the problem lie in the Layer 2 MAC address? The engineer rechecked the packets captured earlier and found that the destination MAC address was not that of the NE40. This explained the forwarding failure and was the key to solving the problem.
This confirmed that the peer router encapsulated packets for x.x.x.11 with the wrong MAC address because its ARP entry for x.x.x.11 was incorrect. The remaining question was how to make the peer router learn the correct MAC address for x.x.x.11. The peer equipment was connected through the optical transceiver, and the engineer could not log in to it to shut down the interface or reset the ARP entries. The only option was to trigger the update from the local side, for example, by sending gratuitous ARP packets (called free ARP packets in some documents). For this purpose, the engineer changed the address of the extranet port of the NE40 to x.x.x.11/28 and then pinged x.x.x.2 at the remote end. The first pings failed, but after a moment they succeeded. As a result, the ARP entry for x.x.x.11 on the peer router was updated to the MAC address of the NE40, which was what the engineer intended. After the original configuration was restored, the engineer ran the test again and found that the problem was solved.
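To make the idea of a gratuitous ARP packet concrete, the following is a minimal Python sketch that only builds the bytes of such a frame. It is an illustration, not what the NE40 does internally. The MAC address 00:e0:fc:00:00:01 and the documentation-range address 203.0.113.11 (standing in for x.x.x.11) are placeholders, and the function names are hypothetical.

```python
import struct


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def ip_to_bytes(ip: str) -> bytes:
    """Convert a dotted-quad IPv4 address into 4 raw bytes."""
    return bytes(int(part) for part in ip.split("."))


def build_gratuitous_arp(sender_mac: str, ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request.

    In a gratuitous ARP, the sender IP and the target IP are the same
    address, and the frame is broadcast so that every neighbor can
    refresh its ARP entry for that address.
    """
    smac = mac_to_bytes(sender_mac)
    addr = ip_to_bytes(ip)
    broadcast = b"\xff" * 6

    # Ethernet header: destination MAC, source MAC, EtherType 0x0806 (ARP).
    eth_header = broadcast + smac + struct.pack("!H", 0x0806)

    # ARP header: hardware type 1 (Ethernet), protocol type 0x0800 (IPv4),
    # hardware length 6, protocol length 4, opcode 1 (request).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp += smac + addr          # sender MAC, sender IP
    arp += b"\x00" * 6 + addr   # target MAC (unset), target IP == sender IP
    return eth_header + arp


# Placeholder values: a hypothetical MAC and a documentation-range address.
frame = build_gratuitous_arp("00:e0:fc:00:00:01", "203.0.113.11")
print(frame.hex())
```

Actually sending such a frame requires a raw socket and administrator privileges, which is why the sketch stops at building the bytes. In the case above, the same effect was achieved by temporarily assigning the address to the interface and pinging the peer.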
 
Suggestions
The user previously used another router as the egress device, and the public network address of the server was x.x.x.11. Under normal circumstances, the peer equipment of the egress router therefore held an ARP entry for x.x.x.11 that pointed to the MAC address of the previous router. When this router was replaced with the NE40, the optical transceiver kept the link up on the peer side, so the corresponding interface of the peer router did not go Down and the ARP entry was not cleared. The NE40, which used x.x.x.11 only as the address of the NAT server, did not send gratuitous ARP packets for that address on its own initiative. That was why the ARP entry at the remote end was not updated (generally, ARP entries take about two hours to age out). The cutover test on the NE40 ran into this stale ARP entry. By checking the destination MAC address of the packets captured at the extranet port of the NE40 while external users could not access the server, the engineer confirmed that the incorrect MAC address caused the problem, which matched the earlier judgment. Problems with the following features are most likely caused by an incorrect MAC address:
The public network address of the NAT server was used before. Thus, ARP entries corresponding to the address exist on the peer router.
The egress router is configured with NAT Server. Thus, the router does not send gratuitous ARP packets for the NAT server address on its own initiative.
The peer equipment is connected through the optical transceiver. Thus, the peer interface does not go Down and the ARP entries are not cleared automatically.
The simplest way to solve the problem is to refresh the ARP entry on the peer equipment from the local side: temporarily configure the public address of the NAT server as the interface address, ping the IP address of the peer router, and then restore the previous configuration.
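The behavior of the peer router's ARP table during the cutover can be sketched with a small simulation. This is a simplified model under stated assumptions: the class `ArpCache`, the MAC addresses, and the placeholder address 203.0.113.11 (standing in for x.x.x.11) are all hypothetical, and the two-hour aging time is taken from the estimate in this case. Real routers may refresh or age entries differently.

```python
class ArpCache:
    """Toy ARP table: maps an IP address to (MAC address, time learned)."""

    def __init__(self, aging_seconds: int):
        self.aging_seconds = aging_seconds
        self.entries = {}

    def learn(self, ip: str, mac: str, now: int) -> None:
        """Record or overwrite an entry, as happens when an ARP packet
        from that IP address is received."""
        self.entries[ip] = (mac, now)

    def lookup(self, ip: str, now: int):
        """Return the cached MAC, or None if the entry is missing or aged out."""
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, learned_at = entry
        if now - learned_at >= self.aging_seconds:
            del self.entries[ip]
            return None
        return mac


# Hypothetical values for illustration only.
OLD_ROUTER_MAC = "00:11:22:33:44:55"
NE40_MAC = "00:e0:fc:00:00:01"
SERVER_PUBLIC_IP = "203.0.113.11"

peer = ArpCache(aging_seconds=2 * 60 * 60)

# Before the cutover, the peer learned the old router's MAC for the server address.
peer.learn(SERVER_PUBLIC_IP, OLD_ROUTER_MAC, now=0)

# Ten minutes after the cutover, the stale entry is still in use because the
# link never went down and nothing announced the new MAC address.
print(peer.lookup(SERVER_PUBLIC_IP, now=600))

# The workaround makes the NE40 send ARP packets with x.x.x.11 as the sender
# address, so the peer overwrites its entry.
peer.learn(SERVER_PUBLIC_IP, NE40_MAC, now=660)
print(peer.lookup(SERVER_PUBLIC_IP, now=700))
```

The model shows why waiting would also have worked eventually, but only after the aging time expired, which is too long for a service outage.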
Of course, an unreachable NAT address is not necessarily caused by ARP entries. The simplest way to confirm the cause is to capture packets and check whether the destination MAC address is correct. If it is not, solve the problem as described above.
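The destination MAC check can be illustrated with a short Python sketch that reads the first six bytes of a captured Ethernet frame and compares them with the local interface's MAC address. The function names and both MAC addresses are hypothetical, and the sketch assumes an untagged Ethernet II frame exported as raw bytes from a capture tool.

```python
def dest_mac(frame: bytes) -> str:
    """Return the destination MAC address of an Ethernet frame.

    The destination MAC occupies the first 6 bytes of the Ethernet header.
    """
    if len(frame) < 14:
        raise ValueError("frame is shorter than an Ethernet header")
    return ":".join(f"{byte:02x}" for byte in frame[:6])


def is_addressed_to(frame: bytes, local_mac: str) -> bool:
    """Check whether the frame's destination MAC matches the local interface."""
    return dest_mac(frame) == local_mac.lower()


# Hypothetical capture: the header still carries the old router's MAC address
# as destination, followed by a source MAC and EtherType 0x0800 (IPv4).
captured = bytes.fromhex("001122334455" "aabbccddeeff" "0800")
local_interface_mac = "00:E0:FC:00:00:01"

if not is_addressed_to(captured, local_interface_mac):
    print("Destination MAC", dest_mac(captured), "is not the local interface;")
    print("the sender's ARP entry for this address is probably stale.")
```

A frame whose destination MAC does not match the receiving interface is not treated as addressed to the router and is not routed, which matches the symptom of packets appearing at the extranet port but never at the intranet port.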
 
