Troubleshooting Common Faults in Accessing the Internet Through a NAT-Capable BRAS in Distributed Networking
Introduction
This document describes how to troubleshoot when users cannot go online or access the Internet after NAT services are deployed on a broadband remote access server (BRAS) in distributed networking.
Prerequisites
This document applies to NE40E and ME60 series products running V800R010C00 or later.
Understanding Distributed NAT
NAT can be deployed in either centralized or distributed networking.
- Distributed NAT: In this mode, NAT-capable service boards are installed on devices (for example, BRASs) to perform NAT.
- Centralized NAT: This is an early NAT deployment mode, in which a standalone NAT device performs NAT and is attached to a core router (CR) or BRAS.
Distributed NAT Workflow
NAT can be performed in either the forward (private to public network) or the reverse (public to private network) direction.
- Forward NAT:
- After receiving a packet, a NAT device determines whether to perform forward NAT:
The device matches the user packet against a UCL (numbered 6000 through 9999) bound to a traffic diversion policy:
- If the packet matches the UCL, the device diverts the packet to the NAT service board.
- If the packet does not match the UCL, the device forwards the packet according to the regular forwarding process.
- The packet is diverted to the NAT service board bound to a NAT instance for translation. When the first packet arrives at the NAT service board, the board selects a public IP address from an address pool bound to the NAT instance and a public port number from a port range bound to the instance. The public IP address and port number replace the source IP address and port number, respectively, in the user packet. The NAT board then creates a session entry for the flow and translates subsequent packets by matching them against the session table.
- After translation, the user packet is forwarded to the next hop according to the regular forwarding process.
- Reverse NAT:
- After receiving a packet, a NAT device determines whether to perform reverse NAT:
The device matches the user packet against a traffic diversion policy:
- If the destination address in the packet matches a NAT address pool route contained in the FIB table, reverse NAT needs to be performed.
- If the destination address in the packet matches a route of another type, the device forwards the packet according to the regular forwarding process.
- The NAT device diverts the matching packet to a NAT service board.
The NAT service board performs reverse translation on the user packet based on a NAT mapping entry. The destination public IP address and port number in the user packet are replaced with the private IP address and port number, respectively.
- After reverse NAT is performed, the user packet is forwarded to the next hop according to the regular forwarding process.
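The forward and reverse translation steps above can be illustrated with a minimal Python sketch. It is a toy model, not the behavior of a real NAT service board: the class name `NatInstance`, its fields, and the first-free port allocation strategy are assumptions made for illustration only, and the addresses are sample values.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


@dataclass
class NatInstance:
    """Hypothetical model of a NAT instance: an address pool plus a session table."""
    public_ips: List[str]
    port_start: int = 1024
    port_end: int = 65535
    forward: Dict[Endpoint, Endpoint] = field(default_factory=dict)  # private -> public
    reverse: Dict[Endpoint, Endpoint] = field(default_factory=dict)  # public -> private

    def _allocate(self) -> Endpoint:
        # Assumption: take the first free (public IP, port) pair.
        # Real service boards allocate ports from per-user port ranges.
        for ip in self.public_ips:
            for port in range(self.port_start, self.port_end + 1):
                if (ip, port) not in self.reverse:
                    return (ip, port)
        raise RuntimeError("address pool exhausted")

    def translate_forward(self, src: Endpoint) -> Endpoint:
        """The first packet creates a session; later packets reuse the entry."""
        if src not in self.forward:
            public = self._allocate()
            self.forward[src] = public
            self.reverse[public] = src
        return self.forward[src]

    def translate_reverse(self, dst: Endpoint) -> Optional[Endpoint]:
        """Return the private endpoint mapped to a public destination, if any."""
        return self.reverse.get(dst)


nat = NatInstance(public_ips=["11.1.1.1", "11.1.1.2"])
public = nat.translate_forward(("10.110.10.5", 40000))
print(public)                         # ('11.1.1.1', 1024)
print(nat.translate_reverse(public))  # ('10.110.10.5', 40000)
```

Keeping two dictionaries mirrors the two lookups in the workflow: forward traffic is keyed by the private source, and reverse traffic is keyed by the public destination.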
Distributed NAT Configuration File
Configure the BRAS service on the NAT device so that users can go online. For details, see HUAWEI NE40E Router Configuration Guide - User Access. The configuration of the NAT function is as follows:
license
active nat session-table size 6 slot 1 card 0
#
service-location 1
location slot 1 card 0
#
service-instance-group group1
service-location 1
#
nat instance nat1 id 1
service-instance-group group1
nat address-group group1 group-id 1 11.1.1.1 11.1.1.5
nat outbound 3001 address-group group1
#
user-group group1
#
acl number 3001
rule 10 permit ip source 10.110.10.0 0.0.0.255
#
acl number 6001
rule 1 permit ip source user-group group1
#
traffic classifier c1
if-match acl 6001
#
traffic behavior b1
nat bind instance nat1
#
traffic policy p1
classifier c1 behavior b1
#
traffic-policy p1 inbound
#
domain isp1
user-group group1 bind nat instance nat1
#
ip route-static 11.11.11.0 24 null 0
#
ospf 1
import-route static
#
return
Troubleshooting Flowchart for Common Faults
Distributed NAT Users Fail to Go Online
Run the display aaa online-fail-record command to check and analyze the causes of user access failure (see the table "Causes of user access failures").
<HUAWEI> display aaa online-fail-record
--------------------------------------------------------------------------------
User name : user1@dm1
Domain name : dm1
User MAC : XXXX-XXXX-XXXX
User access type : PPPoE
User interface : GigabitEthernet1/1/5
User access PeVlan/CeVlan : -/-
User IP address : 10.1.1.1
User ID : 1
User authen state : Authened
User acct state : AcctIdle
User author state : AuthorIdle
User login time : 2018-11-10 12:54:45
Online fail reason : Add nat user data fail(Search Public Addr Fail)
--------------------------------------------------------------------------------
Causes of user access failures
Cause | Description and Suggestion |
---|---|
Add nat user data fail(Input Error) | Failed to add a NAT user, because input parameters are incorrect. |
Add nat user data fail(Create User Fail) | Failed to add a NAT user, because the user failed to be created. Verify the session table resource configuration in the license view. |
Add nat user data fail(Port PreAlloc Fail) | Failed to add a NAT user, because port pre-allocation failed. Verify the address pool and port resource configurations. |
Add nat user data fail(Syn User To Spu Fail) | Failed to add a NAT user, because user data synchronization failed. |
Add nat user data fail(Search Public Addr Fail) | Failed to add NAT user data, because of a failure to obtain a public IP address. Check that the address pool configuration is correct. |
Add nat user data fail(add slave user fail) | Failed to add a NAT user, because the backup CGN device or standby CGN service board failed to go online. Check the hardware, resources, and configurations of the backup CGN device or standby CGN service board. |
Add nat user data fail(public resource conflict) | Failed to add a NAT user, because of a public network resource conflict. |
Add nat user data fail(slave VPN mismatch) | Failed to add a NAT user, because the user VPN instance differed from the VPN instance configured on the backup CGN device or standby CGN service board. Verify the VPN configuration on the backup CGN device or standby CGN service board. |
Add nat user data fail(IP Access User Limit) | Failed to add a NAT user, because of user access restriction. Check the maximum number of access users allowed on the NAT device. |
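When many failure records need to be triaged, the lookup in the table above can be scripted. The following Python sketch extracts the cause in parentheses from an "Online fail reason" line and returns the matching suggestion. The function name `suggest` and the dictionary `SUGGESTIONS` are hypothetical, and only the causes for which the table gives a suggestion are included.

```python
import re
from typing import Optional

# Suggestions condensed from the table; keys are the text inside the parentheses.
SUGGESTIONS = {
    "Create User Fail": "Verify the session table resource configuration in the license view.",
    "Port PreAlloc Fail": "Verify the address pool and port resource configurations.",
    "Search Public Addr Fail": "Check that the address pool configuration is correct.",
    "add slave user fail": "Check the hardware, resources, and configurations of the "
                           "backup CGN device or standby CGN service board.",
    "slave VPN mismatch": "Verify the VPN configuration on the backup CGN device or "
                          "standby CGN service board.",
    "IP Access User Limit": "Check the maximum number of access users allowed on the NAT device.",
}

# Matches e.g. "Online fail reason : Add nat user data fail(Search Public Addr Fail)".
REASON_RE = re.compile(r"Online fail reason\s*:\s*.*?\((?P<cause>[^)]*)\)")


def suggest(record_text: str) -> Optional[str]:
    """Return the suggestion for the first fail reason found, or None if unknown."""
    match = REASON_RE.search(record_text)
    if match is None:
        return None
    return SUGGESTIONS.get(match.group("cause").strip())


record = "Online fail reason : Add nat user data fail(Search Public Addr Fail)"
print(suggest(record))  # Check that the address pool configuration is correct.
```

Because the pattern keys on the text in parentheses, it does not depend on the exact prefix of the reason string.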
Distributed NAT Users Go Online but Cannot Access the External Network
- Common Causes
- NAT service board resources are not allocated.
- The NAT configuration is incorrect, preventing NAT session creation.
- There is no route between the NAT gateway and external host.
- The ACL configuration is incorrect.
- An intranet host is unreachable from the NAT gateway.
- The application level gateway (ALG) function is disabled.
- Troubleshooting Procedure
- Check that resources are allocated to the service board.
- Run the display nat session-table size command to check information about session table resources allocated to each service board. For example:
<HUAWEI> display nat session-table size
---------------------------------------------------------------------------
TotalSize :48 M
UsedSize :4 M
FreeSize :44 M
SlotID CpuID CurSessTblSize CfgSessTblSize ValidFlag
1 0(engine) 2 M 2 M Valid
2 1(engine) 2 M 2 M Valid
---------------------------------------------------------------------------
Description of the display nat session-table size command output
Item | Description |
---|---|
TotalSize | Total number of session table resources |
UsedSize | Total number of used session table resources |
FreeSize | Total number of idle session table resources |
SlotID | Slot ID of a service board |
CurSessTblSize | Number of existing session table resources of a CPU |
CfgSessTblSize | Number of session table resources configured for a CPU |
ValidFlag | Flag bit of the session table resources |
If no resources are allocated to the service board or NAT is disabled, reconfigure the function. For details, see "Configuring the NAT Session Table and Bandwidth Resources" in HUAWEI NE40E Router Configuration Guide - NAT and IPv6 Transition Technology.
- Check that the NAT service has correct session or user information.
- Run the display nat session table command to check that a correct session has been created for the NAT service. For example:
<HUAWEI> display nat session table slot 1 engine 0
This operation will take a few minutes. Press 'Ctrl+C' to break ...
Slot: 1 Engine: 0
Current total sessions: 1.
udp: 192.168.3.198:1234[1.1.1.2:2234]--> 11.11.11.11:1024
If the protocol type, IP address, or port number displayed is incorrect, check the NAT service configuration. If this configuration is incorrect, reconfigure the NAT service. For details about how to configure NAT services, see "NAT Basic Configuration" in HUAWEI NE40E Router Configuration Guide - NAT and IPv6 Transition Technology.
If the protocol type, IP address, and port number in the session information are correct, go to Step 3.
- Run the display nat user-information command to check information about online NAT users.
If the IP address, port number, or session restriction displayed is incorrect, check the NAT service configuration. If this configuration is incorrect, reconfigure the NAT service. For details about how to configure NAT services, see "NAT Basic Configuration" in HUAWEI NE40E Router Configuration Guide - NAT and IPv6 Transition Technology.
If the IP address, port number, and session restriction in the user information are correct, go to Step 3.
- Check whether the NAT device can reach the destination host on the external network.
Run the ping command to check reachability.
- If the ping fails, run the display ip routing-table command to view the current routing table. Check whether a correct route to the external network is configured on the device. If the route configuration is incorrect, determine whether to reconfigure the route:
If the external network address to be accessed by the intranet user is on a different network segment than the external network interface of the NAT device and no route is available between them, configure a static route on the gateway so that the intranet packets can be forwarded through the correct interface after being translated by the device.
If the external network address to be accessed by intranet users and the external network interface of the NAT device are on the same network segment, you do not need to configure a static route.
- If the NAT device can ping the external host, go to Step 4.
- Check that the route configuration of the intranet host is correct.
Run the display ip routing-table command to check whether a correct route is configured on the internal host so that packets sent to the external network can be forwarded to the NAT device. If the route configuration of the internal host is incorrect, reconfigure the route. Otherwise, go to Step 5.
- Collect the following information and contact Huawei technical support:
- Execution result of the preceding steps
- Configuration file, log information, and alarm information of the NAT device
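The first two checks in the procedure above can be partly automated by parsing captured command output. The Python sketch below is based only on the sample outputs shown in this document. Two assumptions are made: any ValidFlag value other than "Valid" (or a configured size of 0) is treated as a board without usable resources, and the address in square brackets in a session entry is taken to be the translated public address. The function names are hypothetical.

```python
import re
from typing import Dict, List, Optional

# Row of "display nat session-table size", e.g. "1 0(engine) 2 M 2 M Valid".
SIZE_ROW_RE = re.compile(
    r"^\s*(?P<slot>\d+)\s+(?P<cpu>\d+)\(engine\)\s+"
    r"(?P<cur>\d+)\s*M\s+(?P<cfg>\d+)\s*M\s+(?P<flag>\S+)\s*$"
)

# Session entry, e.g. "udp: 192.168.3.198:1234[1.1.1.2:2234]--> 11.11.11.11:1024".
SESSION_RE = re.compile(
    r"^\s*(?P<proto>\w+):\s*(?P<priv_ip>[\d.]+):(?P<priv_port>\d+)"
    r"\[(?P<pub_ip>[\d.]+):(?P<pub_port>\d+)\]\s*-->\s*"
    r"(?P<dst_ip>[\d.]+):(?P<dst_port>\d+)\s*$"
)


def boards_without_resources(output: str) -> List[str]:
    """Return 'slot/cpu' for rows with a configured size of 0 or a flag other than 'Valid'."""
    bad = []
    for line in output.splitlines():
        m = SIZE_ROW_RE.match(line)
        if m and (int(m["cfg"]) == 0 or m["flag"] != "Valid"):
            bad.append(f"{m['slot']}/{m['cpu']}")
    return bad


def parse_session(line: str) -> Optional[Dict[str, str]]:
    """Split a session entry into protocol, private, public, and destination fields."""
    m = SESSION_RE.match(line)
    return m.groupdict() if m else None


sample = """SlotID CpuID CurSessTblSize CfgSessTblSize ValidFlag
1 0(engine) 2 M 2 M Valid
2 1(engine) 2 M 2 M Valid"""
print(boards_without_resources(sample))  # []
print(parse_session("udp: 192.168.3.198:1234[1.1.1.2:2234]--> 11.11.11.11:1024"))
```

The parsed session fields can then be compared with the expected protocol, address pool, and port range of the NAT instance.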
Some Users Fail to Go Online After a NAT Service Cutover
Fault Description
After the NAT service is cut over on an NE40E, some users fail to go online.
Troubleshooting Procedure
- Run the display aaa online-fail-record command to check causes for user access failures.
Online fail reason : Add CGN user data fail(Port PreAlloc Fail)
According to the table "Causes of user access failures", the user fails to go online because port pre-allocation fails.
- Check the public IP address and port number of the user.
nat instance cpe1 id 1
port-range 4096
nat address-group group1 group-id 0
section 0 1.1.136.0 mask 24
nat outbound 3001 address-group group1
#
The port range size is 4096, so each public IP address provides 16 (65536/4096) port segments. The address pool contains 256 public IP addresses. Therefore, at most 4096 (256 x 16) users can obtain a port segment.
- Check the ACL configuration. Check whether more than 4096 users are allowed to access the network.
acl number 3001
rule 5 permit source 10.1.0.0 0.0.7.255
rule 10 permit source 10.1.8.0 0.0.7.255
rule 15 permit source 10.1.16.0 0.0.7.255
#
- According to the preceding analysis, ACL 3001 permits three network segments of 2048 addresses each, or 6144 private IP addresses in total, which exceeds the 4096 port segments available. As a result, some users cannot obtain port resources and fail to go online. Properly plan the ratio between public and private IP addresses, increase the number of public IP addresses in the NAT instance, or reduce the port range size.
Summary
The number of private network users exceeds the number of port segments that the public address pool can provide. As a result, some users cannot obtain port resources and fail to go online.
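The capacity arithmetic in this case can be reproduced with a short Python sketch. The helper names are hypothetical. The sketch follows this document in counting all 256 addresses of the /24 pool and in assuming 65536 ports per public IP address, and it assumes that the ACL rules do not overlap.

```python
import ipaddress


def port_segments_per_ip(port_range: int, total_ports: int = 65536) -> int:
    """Number of port segments that one public IP address provides."""
    return total_ports // port_range


def max_nat_users(pool_prefix: str, port_range: int) -> int:
    """Upper bound on users: public addresses multiplied by segments per address."""
    pool = ipaddress.ip_network(pool_prefix)
    return pool.num_addresses * port_segments_per_ip(port_range)


def wildcard_host_count(wildcard: str) -> int:
    """Number of addresses matched by an ACL wildcard mask (2 ** number of set bits)."""
    set_bits = bin(int(ipaddress.IPv4Address(wildcard))).count("1")
    return 2 ** set_bits


capacity = max_nat_users("1.1.136.0/24", 4096)  # 256 x 16
permitted = sum(wildcard_host_count(w) for w in ["0.0.7.255"] * 3)  # three ACL rules
print(capacity, permitted, permitted > capacity)  # 4096 6144 True
```

Running the same calculation with a smaller port range (for example, 2048) shows how reducing the port range size raises the user capacity of the same address pool.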
Some Users Fail to Go Online After NAT Is Configured
Fault Description
After the NAT service is configured on an NE40E, some users fail to go online.
Troubleshooting Procedure
- Run the display aaa online-fail-record command to check causes for user access failures.
Online fail reason : Add CGN user data fail(Search Public Addr Fail)
According to the table "Causes of user access failures", the user fails to go online because no public IP address can be obtained for the user.
- Check the public IP address and port number of the user.
nat instance cpe1 id 1
port-range 4096
nat address-group group1 group-id 0
section 0 1.1.1.0 mask 24
nat outbound 3001 address-group group1
#
The port range size is 4096, so each public IP address provides 16 (65536/4096) port segments. The address pool contains 256 public IP addresses. Therefore, at most 4096 (256 x 16) users can obtain a port segment.
- Check the ACL configuration. Check that fewer than 4096 private network users are allowed access, and the ratio between public and private IP addresses does not exceed the limit.
acl number 3001
rule 1 permit source 10.1.1.0 0.255.255.255
#
- The private IP address of the user is 10.1.1.1, which is not matched by ACL 3001. Therefore, the NAT service board cannot assign a public IP address to the user.
Summary
The private network segment is not specified in the ACL. As a result, the NAT service board fails to find a matching public IP address and the user fails to go online.
Related Information
For more information about the NAT service and how to configure it, see NE40E V800R011C00SPC200 Product Documentation.