What Is VXLAN?
Virtual eXtensible Local Area Network (VXLAN) is one of the Network Virtualization over Layer 3 (NVO3) technologies defined by the Internet Engineering Task Force (IETF) and is an extension to Virtual Local Area Network (VLAN). VXLAN encapsulates a Layer 2 Ethernet frame into a UDP packet and transmits the packet over a Layer 3 network.
As shown in Figure 1-1, VXLAN is essentially a tunneling technology. It establishes a logical tunnel on the IP network between the source and destination network devices to encapsulate user-side packets and forward them through the tunnel. Servers are connected to different ports of network devices in the data center VXLAN network, which can be considered as a virtual Layer 2 switch.
VXLAN has become the mainstream technology for constructing data center networks because it can meet the requirements of dynamic virtual machine (VM) migration and multi-tenancy in data center networks.
Why Is VXLAN Required?
The need for VXLAN is closely related to the server virtualization trend in data centers. After server virtualization, VMs need to be dynamically migrated, which requires an accessible network. In addition, as data centers grow, the number of tenants increases sharply, which requires isolation of a large number of tenants. VXLAN meets both requirements.
Dynamic VM Migration Requires an Accessible Network
What Is Server Virtualization?
Physical server utilization in traditional data centers is very low (10% to 15% on average), wasting a large amount of power and equipment room space. Server virtualization technology emerged to address this issue. As shown in Figure 1-2, server virtualization divides a physical server into multiple logical servers that are called VMs. Each VM can run independently and has its own operating system, applications, MAC address, and IP address. VMs connect to external networks through the virtual switches (vSwitches) on physical servers.
Server virtualization technology can effectively improve server efficiency and reduce energy consumption and O&M costs, so it has been widely used.
What Is Dynamic VM Migration?
Dynamic VM migration is the process of moving VMs from one physical server to another, while ensuring continuity of services deployed on the VMs. End users are unaware of the process, so administrators can flexibly allocate server resources or maintain and upgrade the physical servers without affecting normal server use by end users.
After server virtualization, dynamic VM migration becomes a common practice. To ensure service continuity during the migration of a VM, the VM's IP address and running status (for example, the TCP session status) must remain unchanged. Therefore, VMs can be dynamically migrated only in the same Layer 2 domain.
As shown in Figure 1-3, the traditional three-layer network architecture limits the dynamic VM migration scope. The migration can occur only in a limited scope and is greatly restricted.
To enable smooth VM migration in a large scope or even across regions, all involved servers must be deployed on a large Layer 2 domain.
How Does VXLAN Meet Network Requirements During Dynamic VM Migration?
It is well known that a Layer 2 switch can implement Layer 2 communication between servers connected to the switch. When a server is migrated from one port of the Layer 2 switch to another port, the IP address of the server can remain unchanged. This meets the requirements for dynamic VM migration. VXLAN was designed to meet these requirements.
As VXLAN is essentially a tunneling technology, when the source and destination ends need to communicate with each other, a virtual tunnel is created on the IP network of the data center to transparently forward user data between the two ends. Because a tunnel can be established between any two endpoints, the resulting topology is close to a full mesh, which can meet the growing communication needs in the data center.
VXLAN can construct a fully connected Layer 2 virtual network based on the data center IP network. This ensures that any two points can communicate with each other through a VXLAN tunnel without focusing on the structure and details of the underlying network. For servers, VXLAN virtualizes the entire data center network into a large Layer 2 virtual switch. All servers are connected to this Layer 2 virtual switch. Servers are unaware of how forwarding is performed within the Layer 2 virtual switch.
Based on the Layer 2 virtual switch, it is easy to understand why VXLAN can implement dynamic VM migration. When a VM is migrated from one port of the Layer 2 virtual switch to another port, the IP address of the VM does not need to be changed.
Other technologies similar to VXLAN include Network Virtualization using Generic Routing Encapsulation (NVGRE) and Stateless Transport Tunneling Protocol (STT). This document describes only VXLAN.
Sharply Increasing Tenants in the Data Center Require Isolation
In a traditional VLAN network, the 12-bit VLAN ID defined by the standards allows a maximum of about 4k (4094 usable) VLANs. After server virtualization, a physical server hosts multiple VMs, each with an independent IP address and MAC address. Public clouds and other large virtualized cloud data centers need to accommodate tens of thousands of tenants or even more, which VLAN cannot support.
How does VXLAN meet these requirements? VXLAN adds a 24-bit VXLAN network identifier (VNI) that is similar to a VLAN ID to a VXLAN header. Theoretically, a maximum of 16M VXLAN segments are supported, meeting the requirements for identification and isolation between large networks. The following describes the functions of VNIs.
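The difference in scale follows directly from the two field widths. The following is a minimal sketch of the arithmetic; the constant names are illustrative, and the figure of 4094 usable VLANs assumes the usual IEEE 802.1Q reservation of VLAN IDs 0 and 4095.

```python
# Identifier space of a 12-bit VLAN ID versus a 24-bit VNI.
VLAN_ID_BITS = 12
VNI_BITS = 24

vlan_values = 2 ** VLAN_ID_BITS     # raw values in a 12-bit field
usable_vlans = vlan_values - 2      # assumes IDs 0 and 4095 are reserved (802.1Q)
vni_values = 2 ** VNI_BITS          # raw values in a 24-bit field, the "16M" above

# How many times larger the VNI space is than the VLAN ID space.
scale_factor = vni_values // vlan_values

print(usable_vlans, vni_values, scale_factor)
```

In other words, every additional bit doubles the number of segments, so the 12 extra bits of the VNI multiply the identifier space by 4096.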
What Are the Differences Between VXLAN and VLAN?
VLAN is a traditional network isolation technology. In accordance with standards, a maximum of about 4k VLANs are available, which cannot meet the tenant isolation requirements of large data centers. In addition, each VLAN is a small and fixed Layer 2 virtual network, which does not support large-scale dynamic VM migration.
VXLAN overcomes the preceding disadvantages of VLAN. VXLAN uses the 24-bit VNI field (as shown in Figure 1-5) to identify up to 16M tenants, compared to a maximum of 4k tenants in VLAN. VXLAN establishes a virtual tunnel between two switches across the basic IP network of the data center and virtualizes the data center network into a large Layer 2 virtual switch to meet the requirements of large-scale dynamic VM migration.
Although VXLAN is an extension to VLAN, VXLAN is quite different from VLAN in terms of virtual tunnel establishment.
Now let's take a look at what VXLAN packets actually look like.
A VXLAN tunnel endpoint (VTEP) adds the following headers to the original Ethernet frame (original L2 frame):
- VXLAN Header
A VXLAN header has eight bytes. It includes a 24-bit VNI field for defining different tenants on the VXLAN network. In addition, it also contains the VXLAN Flags field (8 bits, set to 00001000) and two reserved fields (24 bits and 8 bits, respectively).
- UDP Header
The VXLAN header and the original Ethernet frame are used as UDP data. In the UDP header, the destination port number (VXLAN Port) is fixed at 4789, and the source port number (UDP Src. Port) is calculated using the hash algorithm based on the original Ethernet frame.
- Outer IP Header
It is the encapsulated outer IP header. In the outer IP header, the source IP address (Outer Src. IP) is the IP address of the VTEP connected to the source VM, and the destination IP address (Outer Dst. IP) is the IP address of the VTEP connected to the destination VM.
- Outer MAC Header
It is the encapsulated outer Ethernet header. In the outer Ethernet header, the source MAC address (Src. MAC Addr.) is the MAC address of the VTEP connected to the source VM, and the destination MAC address (Dst. MAC Addr.) is the MAC address of the next hop along the path to the destination VTEP.
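The VXLAN header layout described above can be sketched in a few lines of Python. This is an illustration only, not the implementation used by any device: the function names are hypothetical, and the source port hash (CRC32 of the inner frame mapped into 49152 to 65535) is an assumption, because the text only states that the source port is derived from a hash of the original frame.

```python
import struct
import zlib

VXLAN_UDP_PORT = 4789        # destination port stated in the text
VXLAN_FLAGS = 0b00001000     # flags byte stated in the text


def build_vxlan_header(vni: int) -> bytes:
    """Return the 8-byte header: flags(8) | reserved(24) | VNI(24) | reserved(8)."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags in the top byte, followed by 24 reserved zero bits.
    # Word 2: VNI in the top 24 bits, followed by 8 reserved zero bits.
    return struct.pack("!II", VXLAN_FLAGS << 24, vni << 8)


def parse_vni(header: bytes) -> int:
    """Extract the 24-bit VNI from an 8-byte VXLAN header."""
    _, second_word = struct.unpack("!II", header[:8])
    return second_word >> 8


def pick_source_port(inner_frame: bytes) -> int:
    """Assumed hash: CRC32 of the inner frame mapped into 49152-65535."""
    return 49152 + zlib.crc32(inner_frame) % 16384


header = build_vxlan_header(5000)
print(header.hex(), parse_vni(header))
```

Deriving the source port from the inner frame keeps all packets of one flow on the same path while letting different flows spread across equal-cost paths in the underlay network.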
How Is a VXLAN Tunnel Established?
This section describes how a VXLAN tunnel is established and how VXLAN works.
What Are VTEPs and VNIs in VXLAN?
Next, let's learn about the VXLAN network model and common concepts. In Figure 1-6, two servers communicate through a VXLAN network.
In Figure 1-6, a tunnel is established between two top of rack (TOR) switches to encapsulate the original data frames sent by the source server into VXLAN packets so that the original data frames can be transmitted on the bearer network (such as an IP network). When the VXLAN packets arrive at the TOR switch connected to the destination server, the TOR switch decapsulates these packets into the original data frames, and forwards the frames to the destination server.
On the VXLAN network, there are some new elements, such as VTEP and VNI, that are not included in traditional data center networks. The following describes these new elements.
What Is a VTEP?
In Figure 1-6, a VTEP is an edge device on a VXLAN network. It is the start or end point of a VXLAN tunnel: the VTEP at the start point encapsulates original user data frames, and the VTEP at the end point decapsulates them.
A VTEP can be an independent network device (such as a Huawei CloudEngine series switch) or a virtual switch deployed on a server. The source VTEP encapsulates the original data frames sent by the source server into VXLAN packets and transmits the VXLAN packets to the destination VTEP on the IP network. The destination VTEP then decapsulates the VXLAN packets into the original data frames and forwards the frames to the destination server.
For details about how VTEPs establish a VXLAN tunnel and forward packets through the tunnel, see How Is a VXLAN Tunnel Established?.
What Is a VNI?
Because the VLAN ID field in an Ethernet frame has only 12 bits, VLAN cannot meet isolation requirements on data center networks. The VNI was introduced specifically to solve this problem.
In Figure 1-6, a VNI is a user identifier similar to a VLAN ID. A VNI identifies a tenant. VMs with different VNIs cannot communicate at Layer 2. In Figure 1-5, during VXLAN packet encapsulation, a 24-bit VNI is added to a VXLAN packet, which enables VXLAN to isolate a large number of tenants.
For details about how VNIs are used for VXLAN tunnel establishment and packet forwarding, see How Is a VXLAN Tunnel Established?.
In distributed gateway deployment scenarios, VNIs can be classified into Layer 2 VNIs and Layer 3 VNIs, which have different functions:
Each Layer 2 VNI is mapped to a bridge domain (BD) for intra-subnet forwarding of VXLAN packets. For details, see What Does "the Same Large Layer 2 Domain" Mean?.
A Layer 3 VNI is associated with a VPN instance for inter-subnet forwarding of VXLAN packets. For details about Layer 3 VNIs, see the Ethernet VPN (EVPN) documentation.
Which VTEPs Need to Establish VXLAN Tunnels?
A VXLAN tunnel is established between two VTEPs. A data center network has many VTEPs, as shown in Figure 1-7. Which VTEPs need to establish VXLAN tunnels?
As described previously, VXLAN tunnels allow a Layer 2 domain to extend beyond physical boundaries, so that VMs anywhere on the large Layer 2 network can communicate with each other. Therefore, if VMs connected to different VTEPs require large Layer 2 interconnection, VXLAN tunnels must be established between these VTEPs. In other words, all VTEPs in the same large Layer 2 domain must establish VXLAN tunnels with each other.
For example, as shown in Figure 1-7, assume that the VMs connected to VTEP_1, VTEP_2, and VTEP_3 require large Layer 2 interconnection. A VXLAN tunnel must then be established between every pair of VTEP_1, VTEP_2, and VTEP_3, as shown in Figure 1-8.
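The set of tunnels required for such a full mesh can be enumerated as every unordered pair of VTEPs. The sketch below uses the VTEP names from the example; the function name is illustrative.

```python
from itertools import combinations


def full_mesh_tunnels(vteps):
    """Return every unordered pair of VTEPs that needs a VXLAN tunnel."""
    return list(combinations(vteps, 2))


tunnels = full_mesh_tunnels(["VTEP_1", "VTEP_2", "VTEP_3"])
for local, remote in tunnels:
    print(f"{local} <-> {remote}")
```

For n VTEPs in the same large Layer 2 domain, this yields n(n-1)/2 tunnels, which is one reason why manual tunnel configuration becomes laborious as the number of VTEPs grows.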
What Does "the Same Large Layer 2 Domain" Mean?
"The same large Layer 2 domain" mentioned above is similar to the VLAN on a traditional network. On a VXLAN network, however, it has another name, bridge domain (BD).
We know different VLANs are identified by VLAN IDs, so how are different BDs identified? Actually they are identified by VNIs. For CloudEngine series switches, there is a 1:1 mapping between BDs and VNIs. The mapping is established using commands on a VTEP.
bridge-domain 10 //Create BD 10.
vxlan vni 5000 //Bind VNI 5000 to BD 10.
#
The VTEP will generate the BD-to-VNI mapping table according to the above configuration. The mapping table can be checked using the display vxlan vni command.
<HUAWEI> display vxlan vni
Number of vxlan vni : 1
VNI            BD-ID            State
---------------------------------------
5000           10               up
After the mapping table is generated, the VTEP can add VNIs to incoming packets based on the BDs to which the packets belong. How, then, does the VTEP determine which BDs packets belong to?
Which BDs Do Packets Belong to?
VTEP is one function of a switch. This means that not all the packets that enter the switch will go through a VXLAN tunnel. It is possible that some packets are forwarded according to the common Layer 2 and Layer 3 forwarding processes. Before determining which BDs packets belong to, we need to know which packets will enter a VXLAN tunnel.
Which Packets Enter a VXLAN Tunnel?
Before answering this question, let's think about how a switch receives and sends packets using VLAN technology. Packets must first be processed by interfaces on the switch prior to subsequent processing. Three types of interfaces are defined on traditional networks: access, trunk, and hybrid. Although the three interface types have different application scenarios, they serve the same two purposes: to check which packets are allowed to pass through based on configuration, and to determine how to process the packets that are allowed to pass through.
On a VXLAN network, VTEP interfaces have similar responsibilities. On CloudEngine series switches, these interfaces are logical Layer 2 sub-interfaces, instead of physical interfaces. Similarly, Layer 2 sub-interfaces provide two functions: to check which packets need to enter a VXLAN tunnel based on configuration, and to determine how to process those packets. Different packet encapsulation types can be defined on Layer 2 sub-interfaces, just as different interface types can be configured for interfaces on traditional networks. CloudEngine series switches currently support four packet encapsulation types: dot1q, untag, qinq, and default.
- dot1q: If a dot1q sub-interface receives a single-tagged VLAN packet, the sub-interface forwards only the packet with a specified VLAN ID. If a dot1q sub-interface receives a double-tagged VLAN packet, the sub-interface forwards only the packet with a specified outer VLAN ID.
- untag: An untagged sub-interface accepts only packets that do not carry any VLAN tag.
- qinq: A QinQ sub-interface accepts only packets with specified double VLAN tags.
- default: A default sub-interface accepts all packets, regardless of whether they carry VLAN tags. For VXLAN encapsulation and decapsulation, a default sub-interface does not perform any VLAN tag-related action on the original packets, including the addition, replacement, and removal of VLAN tags.
The configurations of Layer 2 sub-interfaces on VTEPs at both ends of a VXLAN tunnel are not necessarily the same. Because of this, it is possible for two VMs on the same network segment but in different VLANs to communicate through a VXLAN tunnel.
In addition to Layer 2 sub-interfaces, VLANs can be used as service access points. After a VLAN is bound to a BD, the interface added to the VLAN is the VXLAN service access point. Packets entering the interface are processed by a VXLAN tunnel.
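The matching rules of the four encapsulation types listed above can be summarized as a small predicate. This is a simplified sketch, not the device's actual logic: the function and parameter names are hypothetical, and it does not model how a switch chooses between several sub-interfaces configured on the same physical interface.

```python
from typing import Optional, Sequence, Tuple


def accepts(encap: str, tags: Sequence[int],
            vid: Optional[int] = None,
            qinq_pair: Optional[Tuple[int, int]] = None) -> bool:
    """Return True if a sub-interface of this encapsulation type accepts a frame.

    tags lists the frame's VLAN IDs from outermost to innermost ([] = untagged).
    """
    if encap == "default":
        return True                      # all packets, tagged or not
    if encap == "untag":
        return len(tags) == 0            # only packets without any VLAN tag
    if encap == "dot1q":
        # Single- or double-tagged packets: only the outer tag is compared.
        return len(tags) in (1, 2) and tags[0] == vid
    if encap == "qinq":
        # Both tags must match the configured (outer, inner) pair.
        return len(tags) == 2 and tuple(tags) == qinq_pair
    raise ValueError(f"unknown encapsulation type: {encap}")


print(accepts("dot1q", [10], vid=10), accepts("untag", [10]))
```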
Adding a Layer 2 Sub-Interface to a BD
After learning about Layer 2 sub-interfaces, we can easily answer the question of how to determine which BDs packets belong to. All we need to do is add Layer 2 sub-interfaces to specified BDs. The BDs to which packets belong can then be determined based on the Layer 2 sub-interface configuration.
As shown in Figure 1-9, a virtual server has two VMs, VM1 (in VLAN 10) and VM2 (in VLAN 20). The two VMs need to access the VXLAN network to communicate with other VMs. Different Layer 2 sub-interfaces can be configured for VM1 and VM2 on the physical interface 10GE1/0/1 of the VTEP connected to the virtual server and be added to different BDs. Then traffic of VM1 and VM2 is forwarded through different VXLAN tunnels.
In Figure 1-9, the link type of the upstream port of the vSwitch on the virtual server is set to trunk and the PVID is set to 20. In this case, among the packets sent from the vSwitch to the VTEP, there are both tagged packets of VM1 and untagged packets of VM2. Two Layer 2 sub-interfaces can be created on the VTEP's access interface and configured with the dot1q and untag encapsulation types respectively.
The following uses the preceding figure as an example to describe how to configure a CloudEngine switch functioning as the VTEP connected to the virtual server.
On the physical access interface 10GE1/0/1 of the switch, create two Layer 2 sub-interfaces, 10GE1/0/1.1 and 10GE1/0/1.2, and configure their packet encapsulation types to dot1q and untag, respectively.
interface 10GE1/0/1.1 mode l2 //Create Layer 2 sub-interface 10GE1/0/1.1.
encapsulation dot1q vid 10 //Configure 10GE1/0/1.1 to allow only packets with VLAN tag 10 to enter a VXLAN tunnel.
bridge-domain 10 //Add 10GE1/0/1.1 to BD 10.
#
interface 10GE1/0/1.2 mode l2 //Create Layer 2 sub-interface 10GE1/0/1.2.
encapsulation untag //Configure 10GE1/0/1.2 to allow only packets without VLAN tags to enter a VXLAN tunnel.
bridge-domain 20 //Add 10GE1/0/1.2 to BD 20.
#
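With this configuration, the path from an incoming frame to a VNI is a two-step lookup: sub-interface to BD, then BD to VNI. The sketch below models that lookup for frames arriving on 10GE1/0/1. The table and function names are illustrative, and the mapping of BD 20 to VNI 6000 is an assumption made for the example (only BD 10 to VNI 5000 is configured earlier in this document).

```python
# Sub-interface table mirroring the configuration above.
SUB_INTERFACES = {
    "10GE1/0/1.1": {"encap": "dot1q", "vid": 10, "bd": 10},
    "10GE1/0/1.2": {"encap": "untag", "vid": None, "bd": 20},
}

# BD-to-VNI mapping; BD 20 -> VNI 6000 is an assumed value.
BD_TO_VNI = {10: 5000, 20: 6000}


def classify(vlan_tag):
    """Return (sub_interface, bd, vni) for a frame, or None if no sub-interface matches.

    vlan_tag is the frame's VLAN ID, or None for an untagged frame.
    """
    for name, cfg in SUB_INTERFACES.items():
        tagged_match = cfg["encap"] == "dot1q" and vlan_tag == cfg["vid"]
        untagged_match = cfg["encap"] == "untag" and vlan_tag is None
        if tagged_match or untagged_match:
            return name, cfg["bd"], BD_TO_VNI[cfg["bd"]]
    return None


print(classify(10))    # VM1 traffic, tagged with VLAN 10
print(classify(None))  # VM2 traffic, sent untagged because the PVID is 20
```

A frame that matches neither sub-interface (for example, one tagged with VLAN 30) does not enter a VXLAN tunnel in this model.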
How Is a VXLAN Tunnel Established?
Now let's explore how a VXLAN tunnel is established. There are typically two methods to establish a VXLAN tunnel: manual and automatic.
Manually Establishing a VXLAN Tunnel
This method requires a user to manually set the source and destination IP addresses of a VXLAN tunnel to the IP addresses of the local and remote VTEPs respectively. This means that a static VXLAN tunnel is manually established between the local and remote VTEPs.
For CloudEngine series switches, the above configuration is performed on a Network Virtualization Edge (NVE) interface, for example:
interface Nve1 //Create logical interface NVE 1.
source 1.1.1.1 //Configure the IP address of the source VTEP. The IP address of a loopback interface is recommended here.
vni 5000 head-end peer-list 2.2.2.2 //Specify remote VTEP 2.2.2.2 for VNI 5000.
vni 5000 head-end peer-list 2.2.2.3 //Specify remote VTEP 2.2.2.3 for VNI 5000.
#
In the configurations of vni 5000 head-end peer-list 2.2.2.2 and vni 5000 head-end peer-list 2.2.2.3, there are two remote VTEPs that belong to VNI 5000, and their IP addresses are 2.2.2.2 and 2.2.2.3 respectively. According to the two configurations, the local VTEP will generate the following information for the specific VNI:
<HUAWEI> display vxlan vni 5000 verbose
BD ID : 10
State : up
NVE : 288
Source Address : 1.1.1.1
Source IPv6 Address : -
UDP Port : 4789
BUM Mode : head-end
Group Address : -
Peer List : 2.2.2.2 2.2.2.3
IPv6 Peer List : -
According to the Peer List field in the preceding command output, the local VTEP can determine which remote VTEPs belong to the same BD (or same VNI). This determines the scope of the same large Layer 2 broadcast domain. When the VTEP receives broadcast, unknown unicast, and multicast (BUM) packets, it replicates the packets and sends them to the remote VTEPs in the peer list (this is like broadcasting packets within a VLAN). Therefore, the preceding information is also called the ingress replication list. When the VTEP receives known unicast packets, it determines the VXLAN tunnel that the packets will go through based on the local MAC address table. In this case, the remote VTEPs in the peer list function like the outbound interfaces in the MAC address table.
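The two cases described above (BUM traffic and known unicast traffic) can be sketched as a single lookup function. This is a simplified model under stated assumptions: the names are hypothetical, the MAC address table holds only remote entries keyed by (VNI, MAC address), and any destination that is the broadcast address or has no entry is treated as BUM traffic.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"

# Ingress replication list built from the head-end peer-list commands.
PEER_LIST = {5000: ["2.2.2.2", "2.2.2.3"]}


def select_tunnels(vni, dst_mac, mac_table):
    """Return the remote VTEP IP addresses that a frame is sent to."""
    remote = mac_table.get((vni, dst_mac))
    if dst_mac != BROADCAST and remote is not None:
        return [remote]                      # known unicast: a single tunnel
    return list(PEER_LIST.get(vni, []))      # BUM: one copy per peer


mac_table = {(5000, "00:00:00:00:00:0c"): "2.2.2.3"}
print(select_tunnels(5000, BROADCAST, mac_table))
print(select_tunnels(5000, "00:00:00:00:00:0c", mac_table))
```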
In subsequent packet forwarding, you will see how packet forwarding is performed based on the ingress replication list on a VXLAN network.
Automatically Establishing a VXLAN Tunnel
Automatic VXLAN tunnel establishment depends on Ethernet virtual private network (EVPN). For details, see What Is EVPN.
Which Tunnel Should Packets Enter?
There may be more than one VXLAN tunnel that belongs to the same BD. For example, in the mentioned ingress replication list, the source VTEP (1.1.1.1) has two remote VTEPs (2.2.2.2 and 2.2.2.3). So another question arises: which tunnel are packets supposed to enter?
In basic Layer 2 and Layer 3 forwarding, Layer 2 forwarding relies on the MAC address table and Layer 3 forwarding relies on the Forwarding Information Base (FIB) table. If there is no required MAC address entry, a host broadcasts an ARP request packet to obtain the remote MAC address. This implementation is similar on a VXLAN network. In the next section, we will learn the packet forwarding process on a VXLAN network. It will help us determine the tunnel that packets enter.
What Types of VXLAN Gateways Are Available?
Layer 2 and Layer 3 VXLAN Gateways
Layer 2 VXLAN gateway: connects terminals to a VXLAN network and enables intra-subnet communication on the same VXLAN network.
Layer 3 VXLAN gateway: enables inter-subnet communication on a VXLAN network and external network access.
Centralized and Distributed VXLAN Gateways
Layer 3 VXLAN gateways can be further categorized into centralized and distributed gateways.
Centralized VXLAN Gateway
In centralized VXLAN gateway networking, the Layer 3 gateway is deployed on only one device. All traffic sent across subnets is forwarded through the Layer 3 gateway, implementing centralized traffic management.
- Advantage: Inter-subnet traffic can be centrally managed, and gateway deployment and management are simple.
- Disadvantages:
- Forwarding paths are not optimal. Inter-subnet Layer 3 traffic of the same Layer 2 gateway must be transmitted to the centralized Layer 3 gateway for forwarding (shown by the blue dashed line in the preceding figure).
- The ARP entry specification is a bottleneck. ARP entries must be generated for terminals on the centralized Layer 3 gateway. However, the Layer 3 gateway has only a limited number of ARP entries, impeding data center network expansion.
Distributed VXLAN Gateway
Deploying distributed VXLAN gateways addresses the problems in centralized VXLAN gateway deployment. In the spine-leaf networking, leaf nodes function as VTEPs to establish VXLAN tunnels and each can be used as a Layer 3 VXLAN gateway (also a Layer 2 VXLAN gateway). Spine nodes are unaware of the VXLAN tunnels and only forward VXLAN packets between leaf nodes. In the following figure, Server1 and Server2 are on different network segments but both connected to the same leaf node. When Server1 and Server2 communicate with each other, their traffic is forwarded only through this leaf node, but not through any spine node.
The following describes the spine and leaf nodes in distributed gateway deployment:
- Spine node: is used to implement high-speed IP forwarding.
- Leaf node:
- Functions as a Layer 2 VXLAN gateway and connects to physical servers or VMs, allowing tenants to access VXLAN segments.
- Functions as a Layer 3 VXLAN gateway to perform VXLAN encapsulation and decapsulation, allowing for inter-subnet communication and external network access.
Distributed VXLAN gateway networking has the following characteristics:
- Flexible deployment: A leaf node can function as both a Layer 2 VXLAN gateway and a Layer 3 VXLAN gateway.
- Improved network expansion capabilities: A distributed VXLAN gateway (leaf node) only needs to learn the ARP entries of servers attached to itself, whereas a centralized Layer 3 VXLAN gateway needs to learn the ARP entries of all servers on a network. Therefore, the ARP entry specification is no longer a bottleneck on distributed VXLAN gateways.
On a VXLAN network where distributed gateways are deployed, BGP EVPN is recommended as the VXLAN control plane. For details about BGP EVPN, see What Is EVPN.
How Are Packets Forwarded on a VXLAN Network?
This section uses centralized VXLAN gateway networking where VXLAN tunnels are established manually as an example to describe how to implement intra-subnet and inter-subnet communication.
For details about the packet forwarding process on a VXLAN network where distributed gateways are deployed using BGP EVPN, see What Is EVPN.
Intra-Subnet Communication in Centralized VXLAN Gateway Networking
As shown in Figure 1-12, VM_A, VM_B, and VM_C are on the subnet 10.1.1.0/24, and belong to VNI 5000. VM_A wants to communicate with VM_C.
Because this is the first communication with VM_C, VM_A does not have VM_C's MAC address. Therefore, VM_A broadcasts an ARP request packet to obtain VM_C's MAC address.
Next we will see the forwarding process of ARP request and reply packets to learn how a MAC address is learned.
ARP Request Packet Forwarding
Figure 1-13 shows the process of forwarding ARP request packets.
- VM_A broadcasts an ARP request packet to obtain VM_C's MAC address. In the packet, the source MAC address is MAC_A, the destination MAC address is all Fs, the source IP address is IP_A, and the destination IP address is IP_C.
- After VTEP_1 receives the packet, it determines that the packet needs to enter a VXLAN tunnel based on the configuration of the Layer 2 sub-interface on the physical interface (Port_1) that receives the packet, and identifies the BD and VNI to which the packet belongs. VTEP_1 also learns the MAC address entry (MAC address MAC_A, VNI 5000, and inbound interface Port_1) and saves the entry to the local MAC address table. VTEP_1 then replicates the packet for each remote VTEP in the ingress replication list, performs VXLAN encapsulation on each copy, and sends the copies.
In the VXLAN encapsulated packets, the outer source IP address is the IP address of the local VTEP (VTEP_1), the outer destination IP address is the IP address of the remote VTEP (VTEP_2 for VM_B or VTEP_3 for VM_C), the outer source MAC address is the MAC address of the local VTEP, and the outer destination MAC address is the MAC address of the next-hop device towards the destination IP network. After VXLAN encapsulation, the packets are transmitted on the IP network according to the outer MAC and IP addresses, until they arrive at their remote VTEPs.
- After the packets arrive at VTEP_2 and VTEP_3, the VTEPs decapsulate the packets to obtain the original packet sent by VM_A. VTEP_2 and VTEP_3 learn the MAC address entry (VM_A's MAC address, VNI 5000, and the remote VTEP's IP address IP_1), and save the entry to the local MAC address table. Then VTEP_2 and VTEP_3 process the packets based on the Layer 2 sub-interface configuration and broadcast them in the corresponding Layer 2 domain.
After VM_B and VM_C receive the ARP request packet, they check whether the destination IP address is their local IP address. VM_B finds that the destination IP address of the packet is not the local IP address, and discards the packet. VM_C finds that the destination IP address is the local IP address, and responds with an ARP reply packet. Next, let's look at how the ARP reply packet is forwarded.
ARP Reply Packet Forwarding
Figure 1-14 shows the process of forwarding ARP reply packets.
- VM_C unicasts an ARP reply packet in response to the received ARP request packet because it has learned VM_A's MAC address. In the reply packet, the source MAC address is MAC_C, the destination MAC address is MAC_A, the source IP address is IP_C, and the destination IP address is IP_A.
- After VTEP_3 receives the ARP reply packet from VM_C, VTEP_3 identifies the VNI to which the packet belongs (the identification process is the same as the one VTEP_1 performs when it receives the ARP request packet). VTEP_3 learns the MAC address entry (MAC_C, VNI 5000, and inbound interface Port_3), and saves the entry to the local MAC address table. VTEP_3 then encapsulates the packet.
In the VXLAN encapsulated packet, the outer source IP address is the IP address of the local VTEP (VTEP_3), the outer destination IP address is the IP address of the remote VTEP (VTEP_1), the outer source MAC address is the MAC address of the local VTEP, and the outer destination MAC address is the MAC address of the next-hop device towards the destination IP network.
After VXLAN encapsulation, the packet is transmitted on the IP network according to the outer MAC and IP addresses, until it arrives at the remote VTEP.
- After the packet arrives at VTEP_1, VTEP_1 decapsulates the packet to obtain the original packet sent by VM_C. VTEP_1 learns the MAC address entry (VM_C's MAC address, VNI 5000, and remote VTEP's IP address IP_3), and saves the entry to the local MAC address table. VTEP_1 then sends the decapsulated packet to VM_A.
VM_A and VM_C have learned each other's MAC address. After that, VM_A and VM_C will communicate in unicast mode. The VXLAN encapsulation and decapsulation process of unicast packets is similar to that shown in Figure 1-14.
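The MAC address learning that takes place during the ARP exchange can be replayed with a toy model. The class and method names are hypothetical, and the addresses are the symbolic names used in the figures (MAC_A, IP_1, and so on) rather than real values.

```python
class Vtep:
    """Toy VTEP that records MAC address entries keyed by (VNI, MAC address)."""

    def __init__(self, name, ip):
        self.name = name
        self.ip = ip
        self.mac_table = {}   # (vni, mac) -> local port or remote VTEP IP address

    def learn_local(self, vni, mac, port):
        self.mac_table[(vni, mac)] = port

    def learn_remote(self, vni, mac, vtep_ip):
        self.mac_table[(vni, mac)] = vtep_ip


VNI = 5000
vtep1 = Vtep("VTEP_1", "IP_1")
vtep2 = Vtep("VTEP_2", "IP_2")
vtep3 = Vtep("VTEP_3", "IP_3")

# ARP request: VTEP_1 learns MAC_A on Port_1, then replicates to its peers,
# which learn MAC_A against the source VTEP's IP address.
vtep1.learn_local(VNI, "MAC_A", "Port_1")
for peer in (vtep2, vtep3):
    peer.learn_remote(VNI, "MAC_A", vtep1.ip)

# ARP reply: VTEP_3 learns MAC_C on Port_3 and unicasts to the VTEP it
# learned for MAC_A; VTEP_1 then learns MAC_C against IP_3.
vtep3.learn_local(VNI, "MAC_C", "Port_3")
reply_tunnel_destination = vtep3.mac_table[(VNI, "MAC_A")]
vtep1.learn_remote(VNI, "MAC_C", vtep3.ip)

for vtep in (vtep1, vtep2, vtep3):
    print(vtep.name, vtep.mac_table)
```

After the exchange, VTEP_1 and VTEP_3 each hold one local and one remote entry, so subsequent traffic between VM_A and VM_C is sent as known unicast through a single tunnel. VTEP_2 holds only the entry for MAC_A that it learned from the replicated request.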
Inter-Subnet Communication in Centralized VXLAN Gateway Networking
As shown in Figure 1-16, VM_A is on the subnet 10.1.10.0/24 and belongs to VNI 5000, while VM_B is on the subnet 10.1.20.0/24 and belongs to VNI 6000. The Layer 3 gateway addresses of VM_A and VM_B are the IP addresses of interfaces BDIF 10 and BDIF 20 on VTEP_3, respectively. VTEP_3 has routes to 10.1.10.0/24 and 10.1.20.0/24. VM_A wants to communicate with VM_B.
A BDIF interface is similar to a VLANIF interface. That is, a BDIF interface is a logical Layer 3 interface created based on a BD to achieve communication between different subnets or between VXLAN and non-VXLAN networks.
Because this is the first communication with VM_B that resides on a different subnet, VM_A needs to send an ARP broadcast packet to request the MAC address of the gateway (BDIF 10). After obtaining the MAC address of the gateway, VM_A sends a data packet to the gateway. Then the gateway broadcasts an ARP request packet to obtain the MAC address of VM_B. After obtaining the MAC address of VM_B, the gateway sends the data packet to VM_B. The MAC address learning process is the same as that described in Intra-Subnet Communication in Centralized VXLAN Gateway Networking. Assume that VM_A and VM_B have learned their gateways' MAC addresses and that the gateways have learned VM_A and VM_B's MAC addresses. The following describes how data packets are sent from VM_A to VM_B.
On the network shown in the preceding figure:
- VM_A sends a data packet to its gateway. In the packet, the source MAC address is MAC_A, the destination MAC is MAC_10 (the gateway BDIF 10's MAC address), the source IP address is IP_A, and the destination IP address is IP_B.
- After VTEP_1 receives the data packet, it identifies the VNI (VNI 5000) to which the packet belongs, and encapsulates the packet according to the corresponding MAC address entry. In the encapsulated packet, the outer source IP address is the local VTEP's IP address (IP_1), the outer destination IP address is the remote VTEP's IP address (IP_3), the outer source MAC address is the local VTEP's MAC address (MAC_1), and the outer destination MAC address is the MAC address of the next-hop device towards the destination IP network.
After VXLAN encapsulation, the packet is transmitted on the IP network according to the outer MAC and IP addresses, until it arrives at the remote VTEP.
- After the packet arrives at VTEP_3, VTEP_3 decapsulates the packet to obtain the original packet sent by VM_A. VTEP_3 then processes the packet as follows:
- VTEP_3 finds that the destination MAC address of the packet is the MAC address of BDIF 10 on itself and the destination IP address is IP_B (10.1.20.1), so it searches the routing table for the next hop of the route to IP_B.
- VTEP_3 discovers that the next hop is 10.1.20.10 and the outbound interface is BDIF 20. VTEP_3 then searches the ARP table, changes the source MAC address of the original packet to BDIF 20's MAC address (MAC_20), and changes the destination MAC address to VM_B's MAC address (MAC_B).
- When the packet arrives at BDIF 20, BDIF 20 finds that the packet needs to enter the VXLAN tunnel (in VNI 6000) and encapsulates the packet according to the MAC address table. In the encapsulated packet, the outer source IP address is the local VTEP's IP address (IP_3), the outer destination IP address is the remote VTEP's IP address (IP_2), the outer source MAC address is the local VTEP's MAC address (MAC_3), and the outer destination MAC address is the MAC address of the next-hop device towards the destination IP network.
After VXLAN encapsulation, the packet is transmitted on the IP network according to the outer MAC and IP addresses, until it arrives at the remote VTEP.
- After the packet arrives at VTEP_2, VTEP_2 decapsulates the packet to obtain the inner data packet, and sends it to VM_B.
The process of sending a reply packet from VM_B to VM_A is similar to the process described here.
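The processing on the centralized gateway (VTEP_3) can be summarized as decapsulate, route, rewrite the inner MAC addresses, and re-encapsulate with the VNI of the destination subnet. The following sketch models those steps under stated assumptions: all names are hypothetical, the tables are pre-populated as if address learning had already completed, VTEP and MAC addresses are the symbolic names from the figure, and VM_A's IP address (10.1.10.1) is an assumed value.

```python
import ipaddress
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    outer_src_ip: str
    outer_dst_ip: str
    vni: int
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str


# Tables on VTEP_3 (assumed to be already learned).
ROUTES = {"10.1.10.0/24": ("MAC_10", 5000),   # subnet -> (BDIF MAC, VNI)
          "10.1.20.0/24": ("MAC_20", 6000)}
ARP = {"10.1.20.1": "MAC_B"}                   # IP_B -> MAC_B
MAC_TABLE = {(6000, "MAC_B"): "IP_2"}          # (VNI, MAC) -> remote VTEP


def route_at_vtep3(pkt: Packet) -> Packet:
    """Route a decapsulated packet and re-encapsulate it toward the destination."""
    dst = ipaddress.ip_address(pkt.dst_ip)
    for prefix, (bdif_mac, vni) in ROUTES.items():
        if dst in ipaddress.ip_network(prefix):
            dst_mac = ARP[pkt.dst_ip]
            return replace(pkt,
                           outer_src_ip="IP_3",
                           outer_dst_ip=MAC_TABLE[(vni, dst_mac)],
                           vni=vni,
                           src_mac=bdif_mac,   # source MAC becomes the BDIF MAC
                           dst_mac=dst_mac)    # destination MAC becomes MAC_B
    raise LookupError(f"no route to {pkt.dst_ip}")


incoming = Packet("IP_1", "IP_3", 5000, "MAC_A", "MAC_10", "10.1.10.1", "10.1.20.1")
outgoing = route_at_vtep3(incoming)
print(outgoing)
```

Note that the inner IP addresses are unchanged, whereas the inner MAC addresses, the VNI, and the outer IP addresses are all rewritten at the gateway.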
A Layer 3 gateway is required for communication between VXLAN and non-VXLAN networks. The implementation differs from that in Figure 1-16: Packets are encapsulated on the VXLAN network but not on the non-VXLAN network. After the packets from the VXLAN network enter the gateway and are decapsulated, these packets will be forwarded as ordinary unicast packets.
How to Configure VXLAN?
For details about the commands, parameters, precautions, and configuration examples for configuring VXLAN on Huawei CloudEngine switches, visit the Huawei technical support website, select a CloudEngine switch model, open the product documentation of the CloudEngine switch, and choose Configuration > VXLAN Configuration Guide.