What Is EVPN? How Does BGP EVPN Work?
EVPN Overview
Basic Concepts of EVPN
Why was EVPN developed? The initial VXLAN solution (RFC 7348) does not define a control plane. VXLAN tunnels are manually configured, and host addresses are learned through traffic flooding. This method is easy to implement, but it generates heavy flooding traffic on the network and makes network expansion difficult.
To solve these problems, EVPN is introduced as the control plane of VXLAN, which is a network virtualization overlay (NVO) protocol, as shown in Figure 1-1. EVPN can also function as the control plane of some other protocols. This document describes only EVPN functioning as the control plane of VXLAN.
EVPN uses the MP-BGP mechanism. Before understanding fundamentals of EVPN, let's review MP-BGP.
Traditional BGP-4 peers use Update messages to exchange routing information. An Update message can advertise reachable routes with the same path attribute. These routes are carried in the Network Layer Reachability Information (NLRI) field. BGP-4 can manage only IPv4 unicast routing information, so MP-BGP was developed to support multiple network layer protocols, such as IPv6 and multicast. MP-BGP extends NLRI based on BGP-4. After extension, the description of the address family is added to NLRI to differentiate network layer protocols, such as the IPv6 unicast address family and VPN instance address family.
Similarly, EVPN uses the MP-BGP mechanism and defines a new sub-address family, EVPN address family, in the L2VPN address family. In the EVPN address family, a new type of NLRI is added, that is, EVPN NLRI. EVPN NLRI defines several types of BGP EVPN routes, which can carry information such as the host IP address, MAC address, VNI, and VRF. After a VTEP learns the IP address and MAC address of a connected host, the VTEP can send the information to other VTEPs through MP-BGP routes. In this way, learning of host IP address and MAC address information can be implemented on the control plane, suppressing traffic flooding on the data plane.
Using EVPN as the control plane of VXLAN has the following advantages:
- VTEPs can be automatically discovered and VXLAN tunnels can be automatically established, simplifying network deployment and expansion.
- EVPN can advertise both Layer 2 MAC address information and Layer 3 routing information.
- Flooding traffic is reduced on the network.
BGP EVPN Route Types
This chapter describes the formats and functions of BGP EVPN routes defined in BGP EVPN NLRI.
Overview of the Five Types of EVPN Routes
Table 1-1 lists the five types of EVPN routes defined in EVPN NLRI. Type 1 to Type 4 routes are defined in RFC 7432, and the Type 5 route is defined in draft-ietf-bess-evpn-prefix-advertisement (since published as RFC 9136).
Route Type | Route Description | RFC/Draft |
---|---|---|
Type 1 | Ethernet auto-discovery (A-D) route | RFC 7432 |
Type 2 | MAC/IP advertisement route | RFC 7432 |
Type 3 | Inclusive multicast Ethernet tag route | RFC 7432 |
Type 4 | Ethernet segment route | RFC 7432 |
Type 5 | IP prefix route | draft-ietf-bess-evpn-prefix-advertisement |
Type 1 and Type 4 routes are used in EVPN Ethernet Segment Identifier (ESI) all-active scenarios. The EVPN ESI all-active function allows multi-homing and all-active VXLAN gateways to be deployed based on RFC standards, effectively improving the reliability on the VXLAN access side. Currently, only some CloudEngine switch models support this function. For details, see "EVPN ESI All-Active Function" in the product documentation of CloudEngine switches.
This document focuses on the common Type 2, Type 3, and Type 5 EVPN routes.
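As a rough illustration of how the route types above are distinguished on the wire, the following Python sketch splits a byte string of EVPN NLRIs into individual routes. It assumes the layout from RFC 7432 (a 1-octet route type, a 1-octet length, then the route-type-specific value). The AFI/SAFI values are the IANA assignments for L2VPN/EVPN and are not stated in this document, and the function name `split_evpn_nlri` is hypothetical.

```python
# IANA-assigned identifiers for the EVPN address family (assumption: not
# taken from this document).
AFI_L2VPN = 25
SAFI_EVPN = 70

# Route type names as listed in Table 1-1.
ROUTE_TYPES = {
    1: "Ethernet auto-discovery (A-D) route",
    2: "MAC/IP advertisement route",
    3: "Inclusive multicast Ethernet tag route",
    4: "Ethernet segment route",
    5: "IP prefix route",
}


def split_evpn_nlri(data: bytes):
    """Split concatenated EVPN NLRIs into (route_type, value) pairs."""
    routes = []
    offset = 0
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError("truncated NLRI header")
        route_type = data[offset]
        length = data[offset + 1]
        value = data[offset + 2:offset + 2 + length]
        if len(value) != length:
            raise ValueError("truncated NLRI body")
        routes.append((route_type, value))
        offset += 2 + length
    return routes


# Two dummy NLRIs: a Type 2 route with a 3-byte body and a Type 3 route
# with a 1-byte body (the bodies are placeholders, not valid routes).
example = bytes([2, 3, 0xAA, 0xBB, 0xCC, 3, 1, 0x01])
parsed = split_evpn_nlri(example)
```

The sketch only separates routes by type; decoding each value depends on the per-type formats described in the following sections.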
EVPN Type 2 Route
Format Description
EVPN Type 2 routes, also called MAC/IP advertisement routes, are used by VTEPs to advertise host IP and MAC address information to each other. Figure 1-2 shows the format of a Type 2 route.
The following table describes the fields in the route.
Field | Description |
---|---|
Route Distinguisher | Route distinguisher (RD) of an EVPN instance. It is similar to the RD of an L3VPN instance, and is used to distinguish different EVPN instances. A Layer 2 BD corresponds to an EVPN instance. |
Ethernet Segment Identifier | Unique identifier of the connection between local and peer devices. |
Ethernet Tag ID | VLAN ID configured on the device. |
MAC Address Length | Length of the host MAC address carried in the route. |
MAC Address | Host MAC address carried in the route. |
IP Address Length | Mask length of the host IP address carried in the route. |
IP Address | Host IP address carried in the route. |
MPLS Label1 | L2VNI carried in the route. This field is used to identify a BD. |
MPLS Label2 | L3VNI carried in the route. This field is used to identify a VRF. To isolate tenants on a VXLAN network, different VRFs (L3VPNs) are used to isolate routing tables of different tenants. In this way, routes of different tenants are stored in different VRF routing tables. L3VNIs are used to identify the VRFs. |
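The field layout above can be made concrete with a small decoder. The following Python sketch assumes the field widths from RFC 7432 (8-byte RD, 10-byte ESI, 4-byte Ethernet Tag ID, 1-byte lengths expressed in bits, 3-byte label fields) and assumes that each label field carries the 24-bit VNI directly, as is usual for VXLAN. The function name `parse_type2` and the sample host MAC and IP addresses are hypothetical.

```python
import ipaddress
import struct


def parse_type2(value: bytes) -> dict:
    """Decode the route-type-specific part of a Type 2 (MAC/IP) route."""
    rd_type, rd_admin, rd_number = struct.unpack("!HHI", value[0:8])
    esi = value[8:18]
    (ethernet_tag,) = struct.unpack("!I", value[18:22])
    mac_len_bits = value[22]          # 48 for an Ethernet MAC address
    mac_hex = value[23:29].hex()
    ip_len_bits = value[29]           # 0, 32 (IPv4), or 128 (IPv6)
    ip_end = 30 + ip_len_bits // 8
    host_ip = str(ipaddress.ip_address(value[30:ip_end])) if ip_len_bits else None
    label1 = int.from_bytes(value[ip_end:ip_end + 3], "big")
    label2_raw = value[ip_end + 3:ip_end + 6]   # optional second label
    label2 = int.from_bytes(label2_raw, "big") if len(label2_raw) == 3 else None
    return {
        # Only type 0 RDs (2-byte admin : 4-byte number) are formatted here.
        "rd": f"{rd_admin}:{rd_number}" if rd_type == 0 else value[0:8].hex(),
        "esi": esi.hex(),
        "ethernet_tag": ethernet_tag,
        "mac_len_bits": mac_len_bits,
        "mac": "-".join(mac_hex[i:i + 4] for i in range(0, 12, 4)),
        "host_ip": host_ip,
        "l2vni": label1,
        "l3vni": label2,
    }


# Hypothetical route: RD 10:1, host 192.168.1.2 with MAC 00aa-bb00-0001,
# L2VNI 10, and L3VNI 5000.
sample = (
    struct.pack("!HHI", 0, 10, 1)
    + bytes(10)                               # all-zero ESI
    + struct.pack("!I", 0)                    # Ethernet Tag ID
    + bytes([48]) + bytes.fromhex("00aabb000001")
    + bytes([32]) + ipaddress.ip_address("192.168.1.2").packed
    + (10).to_bytes(3, "big")
    + (5000).to_bytes(3, "big")
)
route = parse_type2(sample)
```

A route that advertises only a MAC address would have an IP Address Length of 0 and no MPLS Label2, in which case `host_ip` and `l3vni` are `None` in this sketch.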
Application Description
The following table describes the application scenarios and functions of Type 2 routes on VXLAN networks.
Scenario | Functions of Type 2 Routes |
---|---|
Host MAC address advertisement | To implement Layer 2 communication between hosts on the same subnet, the VTEPs at both ends must learn host MAC addresses from each other. To achieve this, after a BGP EVPN peer relationship is established between the VTEPs, they exchange Type 2 routes in order to advertise host MAC addresses to each other. For details, see MAC Address Learning Through EVPN in this document. |
Host ARP advertisement | As a MAC/IP route can carry both the MAC address and IP address of a host, this type of route can be used to transmit host ARP entries between VTEPs, thereby implementing host ARP advertisement. Host ARP advertisement applies to scenarios such as ARP broadcast suppression (see ARP Broadcast Suppression on a VXLAN BGP EVPN Network in this document). |
Host IP route advertisement | To implement Layer 3 communication between hosts on different subnets in a distributed gateway scenario, the VTEPs (functioning as Layer 3 gateways) at both ends must learn host IP routes from each other. To achieve this, after a BGP EVPN peer relationship is established between the VTEPs, they exchange Type 2 routes in order to advertise host IP routes to each other. For details, see Host Route Advertisement in this document. |
ND entry flooding | As a Type 2 route can carry both the MAC address and IPv6 address of a host, it can be used to transmit ND entries between VTEPs, implementing ND entry flooding. ND entry flooding applies to scenarios including NS multicast suppression, ND spoofing attack defense, and IPv6 VM migration in a distributed gateway scenario. |
Host IPv6 route advertisement | To implement Layer 3 communication between hosts on different subnets in a distributed gateway scenario, the VTEPs (functioning as Layer 3 gateways) must learn host IPv6 routes from each other. To achieve this, after a BGP EVPN peer relationship is established between the VTEPs, they exchange Type 2 routes in order to advertise host IPv6 routes to each other. |
EVPN Type 3 Route
Format Description
EVPN Type 3 routes are used by VTEPs to advertise L2VNIs and VTEP IP addresses to each other for creating an ingress replication list. That is, Type 3 routes are used for automatic VTEP discovery and dynamic VXLAN tunnel establishment. If there is a reachable route to the peer VTEP's IP address, a VXLAN tunnel is established from the local VTEP to the peer VTEP. Additionally, if the local and remote VNIs are the same, an ingress replication list is created for BUM packet forwarding.
A Type 3 route consists of a prefix (the NLRI) and a PMSI Tunnel attribute. Figure 1-3 shows the format of a Type 3 route. The Originating Router's IP Address field of the NLRI contains VTEP IP address information, and the MPLS Label field of the PMSI attribute contains L2VNI information.
The following table describes the fields in the route.
Field | Description |
---|---|
Route Distinguisher | RD of an EVPN instance. |
Ethernet Tag ID | VLAN ID configured on the device. The value is all 0s in a Type 3 route. |
IP Address Length | Mask length of the local VTEP's IP address carried in the route. |
Originating Router's IP Address | Local VTEP's IP address carried in the route. |
Flags | Flags indicating whether leaf node information is required for the tunnel. In a VXLAN scenario, this field is not used. |
Tunnel Type | Tunnel type carried in the route. In a VXLAN scenario, the value is currently always 6 (Ingress Replication), which is used for BUM packet forwarding. |
MPLS Label | L2VNI carried in the route. |
Tunnel Identifier | Tunnel information carried in the route. In a VXLAN scenario, this field currently also carries the local VTEP's IP address. |
Application Description
For the process of dynamically creating an ingress replication list based on Type 3 routes, see Intra-Subnet VXLAN Tunnel Establishment in this document.
EVPN Type 5 Route
Format Description
EVPN Type 5 routes, also called IP prefix routes, are used to transmit network segment routes. Different from Type 2 routes that transmit only 32-bit (IPv4) or 128-bit (IPv6) host routes, Type 5 routes can transmit network segment routes with mask lengths ranging from 0 to 32 or 0 to 128 bits.
Figure 1-4 shows the format of a Type 5 route.
The following table describes the fields in the route.
Field | Description |
---|---|
Route Distinguisher | RD of an EVPN instance. |
Ethernet Segment Identifier | Unique identifier of the connection between local and peer devices. |
Ethernet Tag ID | VLAN ID configured on the device. |
IP Prefix Length | Mask length of the IP prefix carried in the route. |
IP Prefix | IP prefix carried in the route. |
GW IP Address | Default gateway address. In a VXLAN scenario, this field is meaningless. |
MPLS Label | L3VNI carried in the route. |
Application Description
The IP Prefix Length and IP Prefix fields carry a host IP address or network segment address.
If a host IP address is carried, the route advertises a host route in a distributed gateway scenario, in the same way as a Type 2 route. For details, see Network Segment Route Advertisement in this document.
If a network segment address is carried, the route can be advertised to allow hosts on the VXLAN network to access an external network.
Understanding How BGP EVPN Works as the Control Plane of VXLAN
How does BGP EVPN work on a VXLAN network? This chapter describes how BGP EVPN works as the control plane of VXLAN.
In a scenario where distributed VXLAN gateways are deployed using BGP EVPN, the control plane is responsible for VXLAN tunnel establishment and dynamic MAC address learning, while the forwarding plane is responsible for intra-subnet known unicast packet forwarding, intra-subnet BUM packet forwarding, and inter-subnet packet forwarding. BGP EVPN provides various functions, including host IP route advertisement, host MAC address advertisement, host ARP advertisement, and ARP broadcast suppression. If distributed gateways are deployed on a VXLAN network, BGP EVPN is recommended.
The following sections use a VXLAN network with IPv4 underlay and overlay networks as an example to describe how BGP EVPN works as the control plane of VXLAN.
Intra-Subnet VXLAN Tunnel Establishment
A VXLAN tunnel is determined by a pair of VTEPs. In an intra-subnet communication scenario, a VXLAN tunnel can be established between two VTEPs when they have reachable routes to each other's IP address because they only need to communicate in the same Layer 2 BD. When EVPN is used to dynamically establish a VXLAN tunnel, two VTEPs establish a BGP EVPN peer relationship and exchange Type 3 routes to transmit VNI and VTEP IP address information. A VXLAN tunnel is then dynamically established between them.
The following uses Figure 1-5 as an example to describe how VXLAN tunnels are established between VTEPs through Type 3 routes.
In the figure, Leaf1, Leaf2, and Leaf3 function as VTEPs, and Leaf1 advertises routes to Leaf2 and Leaf3.
- After the VTEP IP address, L2VNI, and EVPN instance are configured on Leaf1 (as shown in the following example command output), Leaf1 advertises EVPN Type 3 routes to Leaf2 and Leaf3. The routes carry information including the L2VNI, local VTEP IP address, RD of the EVPN instance, and export route target (ERT).
[Leaf1]
bridge-domain 10
 vxlan vni 10   //L2VNI
 evpn
  route-distinguisher 1:10   //RD of the EVPN instance
  vpn-target 0:10 export-extcommunity   //ERT of the EVPN instance
  vpn-target 100:5000 export-extcommunity
  vpn-target 0:10 import-extcommunity
#
interface Nve1
 source 1.1.1.1   //VTEP IP address of Leaf1
 vni 10 head-end peer-list protocol bgp
#
- After Leaf2 and Leaf3 receive Type 3 routes from Leaf1, they establish Layer 2 VXLAN tunnels to Leaf1 if there are reachable routes to the VTEP IP address of Leaf1. In addition, if the VNI in the routes is the same as the local VNI, an ingress replication list is created for forwarding broadcast, multicast, and unknown unicast packets.
After Leaf2 and Leaf3 receive EVPN routes from Leaf1, they determine whether to accept the routes based on whether the RT (ERT of the EVPN instance) carried in the routes matches the import route target (IRT) of the local EVPN instance.
After the preceding process, Leaf2 and Leaf3 can create an ingress replication list to Leaf1 to guide the forwarding of BUM packets. Similarly, Leaf1 also creates an ingress replication list to Leaf2 and Leaf3.
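The decision logic described above can be sketched in Python as follows. This is a simplified model, not device behavior: it assumes the received Type 3 routes have already passed the RT check, reduces route reachability to a set of reachable VTEP addresses, and uses hypothetical names (`Type3Route`, `process_type3`) and hypothetical peer addresses and VNIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Type3Route:
    vtep_ip: str   # Originating Router's IP Address
    l2vni: int     # MPLS Label field of the PMSI attribute


def process_type3(routes, local_vni, reachable):
    """Return (tunnels, replication_list) derived from accepted Type 3 routes."""
    tunnels = []
    replication_list = []
    for route in routes:
        # A tunnel is set up only if the peer VTEP address is reachable.
        if route.vtep_ip not in reachable:
            continue
        if route.vtep_ip not in tunnels:
            tunnels.append(route.vtep_ip)
        # The peer joins the ingress replication list only if the VNIs match.
        if route.l2vni == local_vni and route.vtep_ip not in replication_list:
            replication_list.append(route.vtep_ip)
    return tunnels, replication_list


# Hypothetical routes received on Leaf1: one matching peer, one peer with a
# different VNI, and one peer without a reachable underlay route.
received_on_leaf1 = [
    Type3Route("2.2.2.2", 10),
    Type3Route("3.3.3.3", 20),
    Type3Route("4.4.4.4", 10),
]
tunnels, replication_list = process_type3(
    received_on_leaf1, local_vni=10, reachable={"2.2.2.2", "3.3.3.3"}
)
```

In this model, BUM packets in VNI 10 would be replicated only to the peers in `replication_list`.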
MAC Address Learning Through EVPN
When EVPN is used as the control plane of VXLAN, MAC address learning through EVPN can replace MAC address learning through flooding on the data plane, reducing flooding traffic. VTEPs transmit Type 2 routes to learn MAC addresses through EVPN.
The following uses Figure 1-6 as an example to describe how VTEPs learn MAC addresses of remote hosts through EVPN.
In the figure, Leaf1 and Leaf2 function as VTEPs and connect to Host1 and Host2 on the same network segment, respectively. Leaf1 advertises a Type 2 route to Leaf2.
- When Host1 is connected to Leaf1, it sends ARP and DHCP packets to Leaf1. Leaf1 learns the MAC address of Host1 through the packets and records it in the local MAC address table.
After learning the MAC address entry of the local host, Leaf1 advertises an EVPN Type 2 route to its peer Leaf2. The route carries the ERT of the local EVPN instance, VTEP IP address, L2VNI, and MAC address of Host1. The ERT of the local EVPN instance, VTEP IP address, and L2VNI are obtained from the local VTEP configuration. The following is an example:
[Leaf1]
bridge-domain 10
 vxlan vni 10   //L2VNI
 evpn
  route-distinguisher 10:1
  vpn-target 0:10 export-extcommunity   //ERT of the EVPN instance
  vpn-target 100:5000 export-extcommunity
  vpn-target 0:10 import-extcommunity
#
interface Nve1
 source 1.1.1.1   //VTEP IP address of Leaf1
 vni 10 head-end peer-list protocol bgp
#
- After receiving the Type 2 route from Leaf1, Leaf2 learns the MAC address of Host1 and saves it in the MAC address table. The next hop is the VTEP IP address of Leaf1.
Note that Leaf2 determines whether to accept the EVPN route received from Leaf1 based on the route target (RT) of the EVPN instance. An RT is a BGP extended community attribute used to control the advertisement and acceptance of EVPN routes. That is, RTs determine whether the local and peer ends can accept EVPN routes from each other.
There are two types of RTs:
- Export RT (ERT): When the local end advertises an EVPN route, the ERT is carried in the route.
- Import RT (IRT): When receiving an EVPN route from a peer, the local end compares the ERT carried in the route with the local IRT. The local end accepts the route only when the ERT and IRT are the same. Otherwise, the local end discards the route.
In this example, Leaf2 accepts the EVPN route from Leaf1 because the IRT configured on Leaf2 is the same as the ERT configured on Leaf1. That is, the IRT of the EVPN instance configured on Leaf2 is 0:10, which is the same as the ERT configured on Leaf1.
[Leaf2]
bridge-domain 10
 vxlan vni 10   //L2VNI
 evpn
  route-distinguisher 10:2
  vpn-target 0:10 export-extcommunity
  vpn-target 100:5000 export-extcommunity
  vpn-target 0:10 import-extcommunity   //IRT of the EVPN instance
#
After the preceding process, Leaf2 can learn the MAC address of Host1 without sending a broadcast request packet. Similarly, Leaf1 can learn the MAC address of Host2.
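The RT check and the resulting MAC table entry can be sketched as follows. The sketch assumes that a route carrying several RTs is accepted when at least one of them equals a local IRT, and it models the MAC table as a dictionary keyed by (L2VNI, MAC address). The function names and the MAC address of Host1 are hypothetical.

```python
def accepts(route_rts, local_irts):
    """Accept a route when at least one RT it carries equals a local IRT."""
    return bool(set(route_rts) & set(local_irts))


def learn_remote_mac(mac_table, route, local_irts):
    """Install the MAC address from a Type 2 route if the RT check passes."""
    if not accepts(route["rts"], local_irts):
        return False
    # The next hop of the entry is the VTEP that advertised the route.
    mac_table[(route["l2vni"], route["mac"])] = route["next_hop"]
    return True


# Route advertised by Leaf1 for Host1 (the MAC address is hypothetical).
route_from_leaf1 = {
    "rts": {"0:10", "100:5000"},
    "l2vni": 10,
    "mac": "00aa-bb00-0001",
    "next_hop": "1.1.1.1",
}

leaf2_mac_table = {}
installed = learn_remote_mac(leaf2_mac_table, route_from_leaf1, local_irts={"0:10"})
```

With the IRT 0:10 configured on Leaf2, the entry is installed; a VTEP whose IRT matches none of the carried RTs would discard the route.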
Note that EVPN only reduces traffic flooding on the network, but cannot completely prevent it, for example, in the following scenarios:
- If a silent host exists on the network, the host does not send ARP and DHCP packets. As a result, the VTEP connected to the host cannot learn the MAC address of the host and cannot send MAC address information of the host to other VTEPs.
- When a host communicates with another device for the first time, the host broadcasts an ARP Request packet, causing traffic flooding. In this case, ARP broadcast suppression can be used to prevent flooding. For details, see ARP Broadcast Suppression on a VXLAN BGP EVPN Network in this document.
Inter-Subnet VXLAN Tunnel Establishment and Route Advertisement
Host Route Advertisement
EVPN Type 2 routes can advertise both host MAC addresses and host routes because Type 2 routes can also carry host IP addresses with 32-bit masks. Host route advertisement enables hosts on different network segments to communicate with each other in a distributed gateway scenario. VTEPs need to advertise IP routes of connected hosts to each other. Otherwise, the peer VTEP cannot learn routing information about the host connected to the local VTEP and cannot forward inter-subnet packets between hosts at Layer 3. Simply put, "You have to tell me routes to network segments connected to you. Otherwise, I do not know the path to hosts on the network segments."
The following uses Figure 1-7 as an example to describe how VTEPs use EVPN to advertise host routes.
In the figure, Leaf1 and Leaf2 function as VTEPs and Layer 3 gateways, and connect to Host1 and Host2 on different network segments, respectively. Leaf1 advertises a route to Leaf2.
- When Host1 is connected to Leaf1, it sends ARP and DHCP packets to Leaf1. Leaf1 learns the ARP entry of Host1 through the packets. Leaf1 can also obtain the L2VNI, L3VPN instance, and L3VNI associated with the L3VPN instance based on the BD to which Host1 belongs.
Why are an L3VPN instance and an L3VNI needed? Servers of multiple tenants may connect to the same leaf node. To isolate tenants, different L3VPN instances are created on the leaf node to isolate the routing tables of different tenants. In this way, routes of different tenants are stored in different VRF routing tables. L3VNIs are used to identify these L3VPN instances. When the leaf node receives a data packet carrying an L3VNI from a peer, it finds the corresponding L3VPN instance based on the L3VNI, and searches the routing table of the L3VPN instance to forward the packet.
The following example command output shows key configurations used by Leaf1 to obtain the L2VNI, L3VPN instance, and L3VNI associated with the L3VPN instance.
[Leaf1]
ip vpn-instance vpn1   //L3VPN instance
 ipv4-family
  route-distinguisher 20:4
  vpn-target 100:5000 export-extcommunity evpn
  vpn-target 100:5000 import-extcommunity evpn
 vxlan vni 5000   //L3VNI associated with the L3VPN instance
#
bridge-domain 10
 vxlan vni 10   //L2VNI
 evpn
  route-distinguisher 10:4
  vpn-target 0:10 export-extcommunity
  vpn-target 100:5000 export-extcommunity
  vpn-target 0:10 import-extcommunity
#
interface Vbdif10   //Obtain the Layer 3 VBDIF interface and the L3VPN instance bound to the interface based on BD information.
 ip binding vpn-instance vpn1
 ip address 192.168.1.1 255.255.255.0
 mac-address 0000-5e00-0102
 vxlan anycast-gateway enable
 arp collect host enable
#
In conclusion, Leaf1 obtains the following information about Host1: IP address, MAC address, L2VNI, and L3VNI associated with the L3VPN instance bound to the VBDIF interface.
- Leaf1 can generate an EVPN Type 2 route (see the table in the preceding figure) for the EVPN instance based on the information. In addition to the obtained information about Host1, the route also carries information including the ERT of the local EVPN instance, the next hop (the local VTEP's IP address) of the route, and the VTEP's MAC address. Leaf1 advertises the route to its peer Leaf2.
- The EVPN instance on Leaf1 sends Host1's IP address, MAC address, and L3VNI to the local L3VPN instance so that a route to Host1 is generated in the local L3VPN instance.
- After receiving the Type 2 route from Leaf1, Leaf2 learns the IP address of Host1, saves it in the corresponding routing table, and records the corresponding L3VNI. The next hop of the route is the VTEP IP address of Leaf1. The details are as follows:
- Check whether the ERT in the route is the same as the IRT of the EVPN instance on the receiver. If they are the same, Leaf2 accepts the route and obtains the host IP address and MAC address from the route based on the EVPN instance for host ARP advertisement.
- Check whether the ERT in the route is the same as the IRT of the L3VPN instance on the receiver (as shown in the following example configurations). If they are the same, Leaf2 accepts the route, obtains the host IP address and L3VNI from the route based on the L3VPN instance, and generates a route to Host1 in the routing table of the L3VPN instance. The next hop of the route is set to the VXLAN tunnel interface of Leaf1.
Leaf1 (Sender)
ip vpn-instance vpn1
 ipv4-family
  route-distinguisher 20:2
  vpn-target 100:5000 export-extcommunity evpn
  vpn-target 100:5000 import-extcommunity evpn
 vxlan vni 5000
#
bridge-domain 10
 vxlan vni 10
 evpn
  route-distinguisher 10:2
  vpn-target 100:10 export-extcommunity
  vpn-target 100:5000 export-extcommunity   //ERT of the EVPN instance on the sender
  vpn-target 100:10 import-extcommunity
#
Leaf2 (Receiver)
ip vpn-instance vpn1
 ipv4-family
  route-distinguisher 20:3
  vpn-target 100:5000 export-extcommunity evpn
  vpn-target 100:5000 import-extcommunity evpn   //IRT (eIRT) of the L3VPN instance on the receiver
 vxlan vni 5000
#
bridge-domain 20
 vxlan vni 20
 evpn
  route-distinguisher 10:3
  vpn-target 100:20 export-extcommunity
  vpn-target 100:5000 export-extcommunity
  vpn-target 100:20 import-extcommunity
#
- After receiving the route, Leaf2 obtains the VTEP IP address of Leaf1 based on the next hop in the accepted route for the EVPN instance or L3VPN instance. If there is a reachable route to the IP address, a VXLAN tunnel to Leaf1 is established.
After the preceding process, Leaf2 can learn the IP route to Host1 and forward packets to Host1 based on the routing table. Similarly, Leaf1 can learn the IP route to Host2.
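The two RT checks performed by the receiver can be sketched as follows, using the RT values from the example configurations above. The sketch is a simplified model: the function name `import_type2`, the dictionary layout, and the IP and MAC addresses of Host1 are hypothetical, and the next hop is represented by the VTEP IP address of Leaf1.

```python
def import_type2(route, evpn_irts, l3vpn_irts):
    """Apply the EVPN-instance and L3VPN-instance RT checks to a Type 2 route."""
    result = {"arp": None, "vrf_route": None}
    rts = set(route["rts"])
    # Check 1: IRT of the EVPN instance, used for host ARP advertisement.
    if rts & set(evpn_irts):
        result["arp"] = (route["ip"], route["mac"])
    # Check 2: IRT (eIRT) of the L3VPN instance, used for the host IP route.
    if rts & set(l3vpn_irts):
        result["vrf_route"] = {
            "prefix": route["ip"] + "/32",
            "next_hop": route["next_hop"],
            "l3vni": route["l3vni"],
        }
    return result


# Type 2 route advertised by Leaf1 for Host1 (host addresses are hypothetical).
route_from_leaf1 = {
    "rts": {"100:10", "100:5000"},
    "ip": "192.168.1.2",
    "mac": "00aa-bb00-0001",
    "l3vni": 5000,
    "next_hop": "1.1.1.1",
}

# Leaf2: the EVPN instance of BD 20 imports 100:20, the L3VPN instance imports 100:5000.
imported = import_type2(route_from_leaf1, evpn_irts={"100:20"}, l3vpn_irts={"100:5000"})
```

In this example only the L3VPN check matches, so Leaf2 installs a host route in the VRF but does not use the route for ARP in BD 20, which is consistent with Host1 and Host2 being on different subnets.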
Network Segment Route Advertisement
The process of advertising network segment routes is similar to that of advertising host routes. The difference is that network segment routes are advertised through Type 5 routes, whereas Type 2 routes can only be used to advertise 32-bit or 128-bit host routes. Type 5 routes can also be used to advertise 32-bit or 128-bit host routes. For advertisement of 32-bit or 128-bit host routes, the function of Type 5 routes is similar to that of Type 2 routes.
A VXLAN gateway can advertise network segment routes only if the attached network segments are unique across the entire network.
The following uses Figure 1-8 as an example to describe how VTEPs advertise network segment routes. In the figure, Leaf1 and Leaf2 function as VTEPs and Layer 3 gateways. Leaf1 is connected to the network segment 192.168.1.0/24.
- Leaf1 saves the local IP network segment route and advertises the route to Leaf2 through an EVPN Type 5 route. The route carries information including the IP prefix, mask length, and L3VNI of the corresponding VRF (as shown in the table in the preceding figure).
- After receiving the Type 5 route from Leaf1, Leaf2 learns the IP network segment route, saves it in the corresponding routing table, and records the corresponding L3VNI. The next hop of the route is the VTEP IP address of Leaf1.
After receiving the EVPN route from Leaf1, Leaf2 checks whether the RT carried in the EVPN route matches the IRT of the local L3VPN instance to determine whether to add the network segment route to the corresponding VRF routing table. (The RT in the Type 5 route is the ERT of the L3VPN instance.) If the IRT of a VRF is the same as the RT carried in the EVPN route, Leaf2 accepts the route, obtains network segment route and L3VNI information, and generates a network segment route in the routing table. The next hop of the route is set to the VTEP IP address of Leaf1. In addition, if there is a reachable route to the VTEP IP address of Leaf1, a VXLAN tunnel to Leaf1 is established.
Leaf1 (Sender)
ip vpn-instance vpn1
 ipv4-family
  route-distinguisher 20:2
  vpn-target 100:5000 export-extcommunity evpn   //The ERT of the sender in a Type 5 route is the ERT (eERT) of the L3VPN instance.
  vpn-target 100:5000 import-extcommunity evpn
 vxlan vni 5000
#
bridge-domain 10
 vxlan vni 10
 evpn
  route-distinguisher 10:2
  vpn-target 100:10 export-extcommunity
  vpn-target 100:5000 export-extcommunity
  vpn-target 100:10 import-extcommunity
#
Leaf2 (Receiver)
ip vpn-instance vpn1
 ipv4-family
  route-distinguisher 20:3
  vpn-target 100:5000 export-extcommunity evpn
  vpn-target 100:5000 import-extcommunity evpn   //IRT (eIRT) of the L3VPN instance on the receiver
 vxlan vni 5000
#
bridge-domain 20
 vxlan vni 20
 evpn
  route-distinguisher 10:3
  vpn-target 100:20 export-extcommunity
  vpn-target 100:5000 export-extcommunity
  vpn-target 100:20 import-extcommunity
#
After the preceding process, Leaf2 can learn the network segment route of Leaf1 and forward packets to the network segment based on the routing table.
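Once both host routes (from Type 2 routes) and network segment routes (from Type 5 routes) are present in a VRF routing table, a lookup selects the most specific match. The following Python sketch illustrates this with a plain longest-prefix match over a dictionary. The table contents, the `source` labels, and the function name `lookup` are hypothetical, and a real device would use a hardware forwarding table rather than a linear scan.

```python
import ipaddress


def lookup(vrf_routes, destination):
    """Return the entry of the longest matching prefix, or None if no match."""
    dest = ipaddress.ip_address(destination)
    best_net, best_entry = None, None
    for prefix, entry in vrf_routes.items():
        net = ipaddress.ip_network(prefix)
        if dest in net and (best_net is None or net.prefixlen > best_net.prefixlen):
            best_net, best_entry = net, entry
    return best_entry


# Hypothetical VRF table on Leaf2 after learning routes from Leaf1.
vrf_vpn1 = {
    "192.168.1.0/24": {"next_hop": "1.1.1.1", "l3vni": 5000, "source": "Type 5"},
    "192.168.1.2/32": {"next_hop": "1.1.1.1", "l3vni": 5000, "source": "Type 2"},
}

host_match = lookup(vrf_vpn1, "192.168.1.2")
segment_match = lookup(vrf_vpn1, "192.168.1.77")
no_match = lookup(vrf_vpn1, "10.0.0.1")
```

The host address matches the 32-bit route, any other address in the segment falls back to the 24-bit route, and an address outside the segment has no route in this table.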
Packet Forwarding Process on a VXLAN BGP EVPN Network
The following sections use a VXLAN network with IPv4 underlay and overlay networks as an example to describe the packet forwarding process on a VXLAN network where distributed gateways are deployed using BGP EVPN.
Intra-Subnet Packet Forwarding
Intra-subnet packet forwarding is completed only between Layer 2 VXLAN gateways. Layer 3 VXLAN gateways do not need to be aware of the process.
Intra-Subnet Known Unicast Packet Forwarding
As shown in Figure 1-9, Host1 and Host2 are on the same subnet. The following uses the scenario where Host1 sends known unicast packets to Host2 as an example to describe the packet forwarding process on the VXLAN network.
- Host1 sends a packet destined for Host2. If Host1 does not have the MAC address of Host2, Host1 sends broadcast ARP Request packets to obtain the MAC address of Host2. This process is not described in detail here. It is assumed that Host1 has obtained the MAC address of Host2.
- After receiving a packet from Host1, Leaf1 determines the BD to which Host1 belongs based on the inbound interface or VLAN information of the packet and searches for the outbound interface in the BD. (As described in MAC Address Learning Through EVPN, Leaf1 learns the MAC address of Host2 and the outbound interface is VTEP 2.2.2.2.) Leaf1 then encapsulates the packet into a VXLAN packet and forwards it.
- After receiving the VXLAN packet, Leaf2 obtains the Layer 2 BD based on the VNI in the packet and performs VXLAN decapsulation to obtain the inner Layer 2 packet.
- Leaf2 searches the local MAC address table for the outbound interface based on the destination MAC address of the inner packet, and then forwards the packet to Host2.
The process of sending packets from Host2 to Host1 is the same as the preceding process.
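The lookup and encapsulation steps above can be sketched in Python. The 8-byte VXLAN header layout (flags byte with the VNI-valid bit 0x08, 24 reserved bits, 24-bit VNI, 8 reserved bits) and UDP port 4789 follow RFC 7348. The MAC table model, the function names, the port name, and the host MAC addresses are hypothetical, and the outer Ethernet/IP/UDP headers are omitted.

```python
import struct

VXLAN_UDP_PORT = 4789          # destination UDP port defined in RFC 7348
VXLAN_FLAG_VNI_VALID = 0x08    # "I" flag: the VNI field is valid


def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def vni_from_header(header: bytes) -> int:
    """Extract the VNI from a VXLAN header, as the receiving VTEP would."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return vni_word >> 8


def forward_known_unicast(mac_table, vni, dst_mac, inner_frame):
    """Look up the destination MAC in the BD and encapsulate if it is remote."""
    entry = mac_table.get((vni, dst_mac))
    if entry is None:
        return None                      # unknown unicast: handled as BUM traffic
    kind, target = entry
    if kind == "vtep":
        return ("tunnel", target, vxlan_header(vni) + inner_frame)
    return ("port", target, inner_frame)


# Hypothetical MAC table on Leaf1: Host2 is behind VTEP 2.2.2.2, Host1 is local.
leaf1_mac_table = {
    (10, "00aa-bb00-0002"): ("vtep", "2.2.2.2"),
    (10, "00aa-bb00-0001"): ("port", "10GE1/0/1"),
}
action, target, packet = forward_known_unicast(
    leaf1_mac_table, 10, "00aa-bb00-0002", b"inner-frame"
)
```

On the receiving side, `vni_from_header(packet)` recovers VNI 10, which the VTEP maps back to the Layer 2 BD before looking up the inner destination MAC address.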
Intra-Subnet BUM Packet Forwarding
When receiving a BUM packet from a host to another host on the same subnet, a VTEP sends the packet to all VTEPs connecting to hosts on the same subnet.
In the example shown in Figure 1-10, Host1 sends a broadcast packet. After receiving the broadcast packet from Host1, Leaf1 determines the BD to which Host1 belongs based on the inbound interface or VLAN information of the packet, searches for the list of all tunnels in the BD, encapsulates the packet based on the obtained tunnel list, and sends the packet through all tunnels. In this way, the packet is forwarded to Host2 and Host3 on the same subnet.
Inter-Subnet Packet Forwarding
In a distributed gateway scenario shown in Figure 1-11, Leaf1 and Leaf2 function as Layer 3 VXLAN gateways and perform VXLAN encapsulation and Layer 3 forwarding. The spine node functions only as a VXLAN packet forwarding node and does not process VXLAN packets.
The following uses the scenario where Host1 sends packets to Host2 as an example to describe the packet forwarding process on the VXLAN network.
- Since Host1 and Host2 belong to different network segments, Host1 first sends a packet to the gateway (Leaf1) to forward it.
- After receiving a packet from Host1, Leaf1 determines that Layer 3 forwarding is required based on the destination address of the packet. Leaf1 determines the BD to which Host1 belongs based on the inbound interface or VLAN information of the packet, finds the L3VPN instance bound to the BD, and searches the routing table of the L3VPN instance. As described in Inter-Subnet VXLAN Tunnel Establishment and Route Advertisement, Leaf1 can learn the host route of Host2 in the distributed gateway scenario.
Leaf1 obtains the L3VNI and next hop based on the route, performs VXLAN encapsulation, and forwards the packet to Leaf2.
- After receiving the VXLAN packet, Leaf2 decapsulates the packet and finds the corresponding L3VPN instance based on the Layer 3 VNI carried in the packet. Then, Leaf2 searches the routing table of the L3VPN instance and finds that the next hop of the packet is the gateway interface address. Leaf2 replaces the destination MAC address with the MAC address of Host2, replaces the source MAC address with the MAC address of Leaf2, and forwards the packet to Host2.
The process of sending packets from Host2 to Host1 is the same as the preceding process.
ARP Broadcast Suppression on a VXLAN BGP EVPN Network
The Address Resolution Protocol (ARP) resolves IP addresses into MAC addresses. When a host needs to communicate with another host on the same network segment for the first time, it sends a broadcast ARP Request packet to obtain the MAC address of the destination host because it does not have the information. Broadcast ARP Request packets are flooded on a VXLAN network, and a large number of ARP packets consume excessive network resources, deteriorating network performance.
To reduce the adverse impact of broadcast ARP Request packets on the network, you can configure ARP broadcast suppression to minimize ARP packet flooding on the VXLAN network. ARP broadcast suppression can be implemented using two functions: ARP broadcast-to-unicast conversion and Layer 2 proxy ARP.
ARP Broadcast-to-Unicast Conversion
ARP broadcast-to-unicast conversion enables a device to convert broadcast ARP packets into unicast ARP packets and then forward them in unicast mode. ARP broadcast-to-unicast conversion is implemented as follows: A Layer 3 VXLAN gateway generates ARP broadcast suppression entries (containing host IP address, MAC address, VNI, and VTEP IP address information) based on ARP entries and sends the host information to a Layer 2 gateway through EVPN. After receiving a broadcast ARP Request packet, the Layer 2 gateway replaces the broadcast destination MAC address (all Fs) with the learned host MAC address. In this way, the Layer 2 gateway converts the broadcast packet into a unicast packet for forwarding.
In the distributed gateway scenario shown in Figure 1-12, Host1 and Host2 belong to the same subnet but are connected to different VTEPs. The ARP broadcast-to-unicast conversion process is as follows:
- Leaf2 can learn the ARP entry of Host2 from the ARP packet sent by Host2. Leaf2 then generates an ARP broadcast suppression entry based on the ARP entry and advertises the entry to Leaf1 through EVPN. In this way, Leaf1 can learn host information of Host2.
- When Host1 communicates with Host2 for the first time, Host1 sends a broadcast ARP Request packet to obtain the MAC address of Host2.
- After receiving the broadcast ARP Request packet, Leaf1 queries the ARP broadcast suppression table. Because Leaf1 has the host information of Host2, it replaces the broadcast destination MAC address of all Fs in the ARP Request packet with the MAC address of Host2, converting the broadcast ARP packet into a unicast ARP packet. Leaf1 then encapsulates the packet into a VXLAN packet and sends the packet to Leaf2.
If Leaf1 does not have the ARP broadcast suppression entry of Host2, it broadcasts the ARP Request packet according to the normal process.
- After receiving the VXLAN packet, Leaf2 decapsulates the packet and sends the ARP Request packet to Host2.
The ARP broadcast-to-unicast conversion function depends on a Layer 3 gateway. The Layer 3 gateway needs to learn ARP entries of hosts. If the Layer 3 gateway cannot learn ARP entries of hosts, broadcast ARP packets cannot be suppressed.
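The conversion step can be sketched as follows. This is a simplified model in which the ARP broadcast suppression table is a dictionary keyed by (VNI, host IP address) and a frame is a dictionary. The function name and the IP and MAC addresses of Host1 and Host2 are hypothetical.

```python
BROADCAST_MAC = "ffff-ffff-ffff"


def convert_arp_request(suppression_table, vni, frame):
    """Return (action, next_hop_vtep, frame) for a broadcast ARP Request."""
    entry = suppression_table.get((vni, frame["target_ip"]))
    if entry is None:
        # No suppression entry: broadcast according to the normal process.
        return ("flood", None, frame)
    # Replace the all-Fs destination MAC with the learned host MAC address.
    unicast_frame = dict(frame, dst_mac=entry["mac"])
    return ("unicast", entry["vtep"], unicast_frame)


# Entry on Leaf1 for Host2, learned from Leaf2 through EVPN (hypothetical values).
leaf1_suppression = {
    (10, "192.168.1.3"): {"mac": "00aa-bb00-0002", "vtep": "2.2.2.2"},
}
arp_request = {
    "dst_mac": BROADCAST_MAC,
    "src_mac": "00aa-bb00-0001",
    "target_ip": "192.168.1.3",
}
action, vtep, converted = convert_arp_request(leaf1_suppression, 10, arp_request)
```

The original frame is left unchanged, and the converted copy is the one that would be VXLAN-encapsulated and sent to Leaf2.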
Layer 2 Proxy ARP
When ARP broadcast suppression is implemented using the ARP broadcast-to-unicast conversion function, a Layer 3 gateway is required. On a pure Layer 2 network, there is no Layer 3 gateway that can learn ARP entries of hosts. As a result, no ARP broadcast suppression entry can be generated to suppress broadcast ARP packets. ARP broadcast suppression in the preceding Layer 2 scenario can be implemented using the Layer 2 proxy ARP function.
Layer 2 proxy ARP is implemented as follows: After a Layer 2 gateway receives an ARP packet from a host, it obtains host information from the ARP packet, generates an ARP broadcast suppression entry, and sends the host information to other Layer 2 gateways through EVPN. After receiving a broadcast ARP Request packet, a Layer 2 gateway responds to the ARP Request packet based on host information in the ARP broadcast suppression table.
Layer 2 proxy ARP is an effective mechanism that helps minimize the number of broadcast ARP packets. After receiving an ARP Request packet, a Layer 2 gateway enabled with Layer 2 proxy ARP preferentially responds to the packet. It broadcasts the packet only when it fails to respond to the packet.
As shown in Figure 1-13, Host1 and Host2 belong to the same subnet. After BD-based Layer 2 proxy ARP is enabled on Layer 2 gateways, the Layer 2 proxy ARP process is as follows:
- After Layer 2 proxy ARP is enabled on Leaf2, Leaf2 detects ARP packets sent by hosts. After receiving an ARP packet from Host2, Leaf2 generates an ARP broadcast suppression entry based on the ARP packet and advertises the entry to Leaf1 through EVPN. In this way, Leaf1 can also learn host information of Host2.
- When Host1 communicates with Host2 for the first time, Host1 sends a broadcast ARP Request packet to obtain the MAC address of Host2.
- After receiving the broadcast ARP Request packet, Leaf1 queries the ARP broadcast suppression table. Because Leaf1 has the host information of Host2, it directly responds to the ARP Request packet.
If Leaf1 does not have the ARP broadcast suppression entry of Host2, it broadcasts the ARP Request packet according to the normal process.
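In contrast to broadcast-to-unicast conversion, a gateway with Layer 2 proxy ARP answers the request itself. The following sketch models that decision with the same kind of suppression table. The function name, the dictionary-based packet model, and the host addresses are hypothetical.

```python
def proxy_arp(suppression_table, vni, request):
    """Answer an ARP Request locally if possible, otherwise flood it."""
    entry = suppression_table.get((vni, request["target_ip"]))
    if entry is None:
        # No entry: broadcast the request according to the normal process.
        return ("flood", request)
    # Build the ARP Reply on behalf of the target host.
    reply = {
        "op": "reply",
        "sender_ip": request["target_ip"],
        "sender_mac": entry["mac"],
        "target_ip": request["sender_ip"],
        "target_mac": request["sender_mac"],
    }
    return ("reply", reply)


# Entry on Leaf1 for Host2, learned from Leaf2 through EVPN (hypothetical values).
leaf1_proxy_table = {(10, "192.168.1.3"): {"mac": "00aa-bb00-0002"}}
host1_request = {
    "op": "request",
    "sender_ip": "192.168.1.2",
    "sender_mac": "00aa-bb00-0001",
    "target_ip": "192.168.1.3",
}
proxy_action, proxy_result = proxy_arp(leaf1_proxy_table, 10, host1_request)
```

Because the reply is generated on Leaf1, the request never crosses the VXLAN tunnel in this case.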
How Do I Configure VXLAN BGP EVPN?
For details about the commands and parameters for configuring BGP EVPN on Huawei CloudEngine switches, see "Configuring VXLAN in Distributed Gateway Mode Using BGP EVPN." For configuration examples, see "Configuration Examples for VXLANs."