Huawei S3700 switches and WiMAX stations compatibility issue

Publication Date: 2014-10-30
Issue Description
The customer reported that when he connected the PC client to the WiMAX client, the PC received an IP address (1.1.1.1) from the DHCP server located on the S3700, but no other communication was possible, neither to the Internet (8.8.8.8) nor to the VLAN default gateway (1.1.1.254). When he plugged the PC client into one of the S2700 ports, it could reach the default gateway and the Internet. After he connected the PC client back to the WiMAX client, it could still reach 1.1.1.254 and 8.8.8.8, but once the Huawei devices (the S3700 and the S2700) were rebooted, the problem reappeared. For a better view, refer to the topology below:


Handling Process

1. I asked the customer to provide packet captures taken on the S2700 for every step in which access to the gateway changed between not working and working. The customer saved four captures: one with the devices freshly booted (not working), one with the PC client connected directly to the S2700 (working), one with the PC client moved back behind the WiMAX client station (working), and one after the Huawei devices were rebooted (not working).

2. Analyzing the captures, I found that ARP broadcast messages were not passing through the radio link, meaning that the WiMAX stations did not forward them. When the gateway needed to populate its ARP table, it sent ARP requests to find the MAC address for 1.1.1.1, but all of them were stopped at the radio link. None of them reached the PC client, so it never sent an ARP reply. The ARP entry on the gateway therefore stayed incomplete, and the gateway dropped every packet with destination IP 1.1.1.1.
Stranger still, the WiMAX devices did not apply this “restriction” to DHCP messages, which is why the PC client could obtain an IP address. This made our troubleshooting more difficult.

3. The customer gave us a small “hint”: with another vendor's devices there was no such problem. I asked him to provide captures taken while using the other vendor's devices. The captures looked exactly the same: the WiMAX devices still did not let the ARP broadcast messages through, yet the PC client could access the gateway and the Internet. This pointed us toward the root cause.

4. We did not know exactly how the other vendor's devices behave in this case, but we did know that our switches have a feature called strict ARP learning (arp learning strict). With it enabled (by default), the device learns ARP entries only from ARP reply packets that answer ARP request packets it sent itself, which lets it defend against most ARP attacks. Since the gateway could not get replies to its own requests in this case, it might still be able to learn the entry from other ARP packets, so I suggested that the customer disable this feature, and that command solved the problem. The PC client could access the gateway and the Internet because the gateway now learned the information it needed from another source and could send packets to destination IP 1.1.1.1.
Root Cause
With strict ARP learning (arp learning strict) enabled (by default), the device learns ARP entries only from ARP reply packets that answer ARP request packets it sent itself. Because its ARP requests were dropped on the radio link, the gateway never created an ARP entry for 1.1.1.1.
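
The behavior described above can be illustrated with a small Python model. This is only a simplified sketch, not the switch's actual implementation: the class and function names, the MAC address, and the assumption that some unsolicited ARP packet from the PC client (here an ARP request for the gateway) reaches the gateway are all hypothetical. The case does not record which packet the gateway finally learned from.

```python
from dataclasses import dataclass, field


@dataclass
class ArpPacket:
    op: str          # "request" or "reply"
    sender_ip: str
    sender_mac: str
    target_ip: str


@dataclass
class Gateway:
    ip: str
    strict: bool = True
    arp_table: dict = field(default_factory=dict)
    pending: set = field(default_factory=set)  # IPs this device has asked about

    def send_request(self, target_ip: str) -> None:
        # Remember the outstanding request; the frame itself may be lost.
        self.pending.add(target_ip)

    def receive(self, pkt: ArpPacket) -> None:
        solicited = pkt.op == "reply" and pkt.sender_ip in self.pending
        if self.strict and not solicited:
            return  # strict learning: ignore anything we did not ask for
        self.arp_table[pkt.sender_ip] = pkt.sender_mac
        self.pending.discard(pkt.sender_ip)


def simulate(strict: bool) -> dict:
    gw = Gateway(ip="1.1.1.254", strict=strict)
    # The gateway's request for 1.1.1.1 is dropped on the radio link,
    # so no reply ever comes back.
    gw.send_request("1.1.1.1")
    # Assumption: an unsolicited ARP packet from the PC client arrives.
    gw.receive(ArpPacket("request", "1.1.1.1", "00:11:22:33:44:55", "1.1.1.254"))
    return gw.arp_table


if __name__ == "__main__":
    print("strict enabled: ", simulate(True))
    print("strict disabled:", simulate(False))
```

In this model the ARP table stays empty while strict learning is on, and gains an entry for 1.1.1.1 once it is off, which matches the symptom and the fix in this case.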
Solution
Disable strict ARP learning (arp learning strict) on the VLANIF interface:
[Quidway] interface vlanif 20
[Quidway-Vlanif20] arp learning strict force-disable
