Customer feedback: some users on the GPRS network cannot access the internet after obtaining an IP address. They have to tear down the session and obtain a new IP address before service returns to normal.
1, Ping and routing analysis
A) The faulty terminal can ping the firewall interface normally.
B) The faulty terminal cannot ping the IP address of the C manufacturer's equipment, which is the next hop toward the external network.
C) The E1000 firewall itself can ping the C manufacturer's address normally.
Analysis of the above symptoms:
The fault most likely lies between the firewall and the C manufacturer's equipment.
2, Firewall debugging command analysis
A) Session information collected on the main firewall for the faulty service shows that the firewall completed NAT translation and sent the packets out, but received no reply packets.
B) When the terminal pings the interface IP address of the C manufacturer's equipment, the firewall statistics show ICMP request packets being received and ICMP packets being sent out. As the figure answer information analysis.JPG shows, the firewall sent out 12 ICMP requests after NAT translation but received 0 ICMP responses.
3, From the above analysis, the request packets sent by the terminal were NAT-translated on the main firewall and sent to the C manufacturer's equipment. The C manufacturer's equipment also sent response packets, but they did not reach the main firewall. Most likely the responses were sent to the standby firewall, which caused the ping failure.
4, After consulting the customer and checking the ARP table on the C manufacturer's equipment, it was confirmed that some addresses in the NAT address pool had been learned with the standby firewall's address. Deleting these incorrect ARP entries and configuring the VRRP ID parameter resolved the problem.
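The ARP table check in step 4 can be sketched as a small script. This is only an illustration: the addresses, MAC values, pool range, and the function name find_misdirected_arp are all hypothetical, and the ARP table is assumed to have been exported from the C manufacturer's equipment into a simple IP-to-MAC mapping.

```python
import ipaddress

def find_misdirected_arp(arp_table, pool_start, pool_end, expected_mac):
    """Return NAT-pool IPs whose ARP entry does not point at the expected MAC."""
    start = ipaddress.IPv4Address(pool_start)
    end = ipaddress.IPv4Address(pool_end)
    bad = []
    for ip, mac in arp_table.items():
        addr = ipaddress.IPv4Address(ip)
        # Only addresses inside the NAT pool are of interest here.
        if start <= addr <= end and mac.lower() != expected_mac.lower():
            bad.append(ip)
    return sorted(bad, key=ipaddress.IPv4Address)

# Hypothetical data: MAC of the main firewall and an exported ARP table.
MAIN_FW_MAC = "00e0-fc00-0001"
arp_table = {
    "203.0.113.2": "00e0-fc00-0001",   # firewall interface, outside the pool
    "203.0.113.10": "00e0-fc00-0001",
    "203.0.113.11": "00e0-fc00-0002",  # learned from the standby firewall
    "203.0.113.12": "00e0-fc00-0001",
    "203.0.113.13": "00e0-fc00-0002",  # learned from the standby firewall
}

suspect = find_misdirected_arp(arp_table, "203.0.113.10", "203.0.113.20", MAIN_FW_MAC)
print(suspect)  # ['203.0.113.11', '203.0.113.13']
```

Any address reported by such a check corresponds to an ARP entry that would need to be deleted on the C manufacturer's equipment.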
A check of the firewall's NAT configuration showed that the NAT address pool IP addresses and the E1000 firewall's external network interface IP address are in the same network segment, but no corresponding VRRP ID had been configured for the pool.
Combined with the symptoms, the on-site service failure occurred because the E1000 firewall NAT address pool had no VRRP ID configured, so the primary and secondary firewalls both responded to ARP requests from the C manufacturer's (Cisco) equipment at the same time. As a result, the C manufacturer's equipment learned the ARP entries for some NAT addresses from the backup firewall and mistakenly sent packets to the backup firewall.
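The configuration condition described above (a NAT address pool in the same segment as the external interface, with no VRRP ID bound) can be checked mechanically. The following is a minimal sketch under stated assumptions: the pool names, addresses, the "vrid" field, and the function pools_missing_vrrp are hypothetical, and the configuration is assumed to have been parsed into plain dictionaries beforehand.

```python
import ipaddress

def pools_missing_vrrp(interface_cidr, nat_pools):
    """Return names of NAT pools that share the interface segment but have no VRRP ID."""
    network = ipaddress.ip_interface(interface_cidr).network
    risky = []
    for pool in nat_pools:
        start = ipaddress.ip_address(pool["start"])
        end = ipaddress.ip_address(pool["end"])
        # Only same-segment pools are flagged, since that is the condition
        # this case describes.
        same_segment = start in network or end in network
        if same_segment and pool.get("vrid") is None:
            risky.append(pool["name"])
    return risky

# Hypothetical parsed configuration.
external_interface = "203.0.113.2/24"
nat_pools = [
    {"name": "pool1", "start": "203.0.113.10", "end": "203.0.113.20", "vrid": None},
    {"name": "pool2", "start": "198.51.100.10", "end": "198.51.100.20", "vrid": None},
    {"name": "pool3", "start": "203.0.113.30", "end": "203.0.113.40", "vrid": 1},
]

print(pools_missing_vrrp(external_interface, nat_pools))  # ['pool1']
```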
The detailed interaction that produces the error is as follows:
1, As principle diagram.JPG shows, the red line is the ARP request sent by the C manufacturer's equipment for an address in the NAT address pool. It is a broadcast packet, so the S5328 Layer 2 switch forwards it to both FW-0 and FW-1.
2, Because FW-0 and FW-1 are configured with the same NAT address pool and the pool has no VRRP ID parameter configured, both the primary and the secondary firewall send an ARP response after receiving the ARP request from the C manufacturer's equipment (the blue line is the FW-0 ARP response and the purple line is the FW-1 ARP response).
3, If the C manufacturer's equipment receives the FW-0 ARP response first and the FW-1 ARP response afterwards, it learns the standby firewall's ARP entry (whereas it should learn the one from FW-0). Service packets from the C manufacturer's equipment to the NAT address are then sent to FW-1.
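The three steps above can be modelled with a toy simulation. It is a sketch only, and it assumes that the ARP cache keeps whichever reply arrives last, as step 3 describes. The Firewall class, the MAC values, and the function names are hypothetical. As a simplification, the master answers with its own MAC when a VRRP ID is configured, whereas a real VRRP master would typically answer with the group's virtual MAC.

```python
from dataclasses import dataclass

@dataclass
class Firewall:
    name: str
    mac: str
    is_master: bool

def arp_replies(firewalls, pool_has_vrid):
    """Return (name, mac) replies, in arrival order, for one ARP request to a NAT pool address."""
    replies = []
    for fw in firewalls:
        # With a VRRP ID bound to the pool, only the master answers.
        if pool_has_vrid and not fw.is_master:
            continue
        replies.append((fw.name, fw.mac))
    return replies

def learned_entry(replies):
    """Model an ARP cache in which the last reply received overwrites earlier ones."""
    return replies[-1] if replies else None

# Arrival order follows step 3: FW-0 replies first, FW-1 second.
firewalls = [
    Firewall("FW-0", "00e0-fc00-0001", is_master=True),
    Firewall("FW-1", "00e0-fc00-0002", is_master=False),
]

print(learned_entry(arp_replies(firewalls, pool_has_vrid=False)))  # ('FW-1', '00e0-fc00-0002')
print(learned_entry(arp_replies(firewalls, pool_has_vrid=True)))   # ('FW-0', '00e0-fc00-0001')
```

Without the VRRP ID the outcome depends on the arrival order of the two replies, which matches the observation that only some NAT addresses were affected.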