SR-MPLS TE
Segment Routing-MPLS Traffic Engineering (SR-MPLS TE) is a new Multiprotocol Label Switching (MPLS) Traffic Engineering (TE) tunneling technique implemented through Interior Gateway Protocol (IGP) extensions. The controller calculates a path for an SR-MPLS TE tunnel and delivers the computed label stack to the ingress node (a forwarder). The ingress uses the label stack to generate an LSP in the SR-MPLS TE tunnel. The label stack therefore controls the path along which packets are transmitted across the network.
SR-MPLS TE Advantages
SR-MPLS TE tunnels meet the requirements of rapidly developing software-defined networking (SDN), which Resource Reservation Protocol-TE (RSVP-TE) tunnels cannot. Table 2-20 compares SR-MPLS TE with RSVP-TE.
| Item | SR-MPLS TE | RSVP-TE |
|---|---|---|
| Label allocation | Labels are assigned and distributed by the extended IGP. Each link is assigned only a single label, and all LSPs share the label, which reduces resource consumption and the workload of maintaining label forwarding tables. | Labels are allocated and distributed by MPLS. Each LSP is assigned a label, which consumes a large number of label resources and results in a heavy workload of maintaining label forwarding tables. |
| Control plane | An IGP is used, which reduces the number of protocols to be run. | RSVP-TE is used, and the control plane is complex. |
| Scalability | High. Tunnel information is carried in packets, so an intermediate device is unaware of SR-MPLS TE tunnels and does not need to maintain tunnel status information; it maintains only forwarding entries, which improves scalability. | Poor. Devices must maintain both tunnel status information and forwarding entries. |
| Path adjustment and control | A service path can be controlled by operating labels on the ingress only; configurations do not need to be delivered to each node, which improves programmability. When a node on the path fails, the controller recalculates the path and updates the label stack on the ingress to complete the path adjustment. | Whether for a planned service adjustment or a passive path adjustment in a fault scenario, configurations must be delivered to each node. |
Related Concepts
Label Stack
A label stack is an ordered set of Adjacency Segment labels stored in a packet header. Each Adjacency Segment label in the stack identifies a specific adjacency of the node that assigned it, and the label stack as a whole describes all the adjacencies along an SR-MPLS TE LSP. During packet forwarding, a node searches for the adjacency mapped to the top label in the packet, removes the label, and forwards the packet over that adjacency. After all labels have been removed from the label stack, the packet has traversed the SR-MPLS TE tunnel and exits it.
Stitching Label and Stitching Node
If the label stack depth exceeds the depth supported by a forwarder, a single label stack cannot carry all the adjacency labels of the whole LSP. In this situation, the controller assigns multiple label stacks for the path: it delivers each label stack to an appropriate node and assigns a special label that associates the label stacks with one another to implement segment-by-segment forwarding. The special label is called a stitching label, and the node it is delivered to is called a stitching node.
The controller places a stitching label at the bottom of a label stack and assigns it to a stitching node. When a packet arrives at the stitching node, the node swaps the stitching label for the label stack associated with it based on the stitching-label-to-label-stack mapping, and then forwards the packet using the label stack of the next segment.
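The relationship between a stitching label and the label stack it points to can be pictured with simple data structures. The following is a minimal, illustrative sketch in Python; the label values mirror the example used later in this section, and the structures are hypothetical rather than an actual device implementation.

```python
# Minimal sketch: a stitching label at the bottom of one label stack points to
# the label stack of the next path segment (values are illustrative).

ingress_stack = [1003, 1006, 100]            # index 0 is the top (outermost) label
stitching_table = {100: [1005, 1009, 1010]}  # mapping delivered to the stitching node

def resolve_stitching(top_label, table):
    """Return the next segment's label stack if the label is a stitching label."""
    return table.get(top_label)

print(resolve_stitching(100, stitching_table))  # [1005, 1009, 1010]
```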
Topology Collection and Label Allocation
Network Topology Collection Modes
Network topology information is collected in either of the following modes:
- A forwarder runs an IGP to collect network topology information and uses BGP-LS to report the information to the controller.
- Both the controller and forwarders run the IGP. The forwarders flood network topology information, and the controller collects the information directly through IGP flooding.
Label Allocation Modes
A forwarder runs an IGP to assign labels and uses BGP-LS to report label information to the controller. SR-MPLS TE mainly uses adjacency labels (Adjacency Segments); node labels can also be used. An adjacency label is assigned by the node at the local end of the adjacency; it is locally valid and unidirectional. Node labels are manually configured and globally valid. Both adjacency labels and node labels are advertised through the IGP. In Figure 2-32, adjacency label 9003 identifies the PE1-to-P3 adjacency and is assigned by PE1, and adjacency label 9004 identifies the P3-to-PE1 adjacency and is assigned by P3.
IGP SR is enabled on PE1, PE2, and P1 through P4 to establish IGP neighbor relationships between each pair of directly connected nodes. In an SR-capable IGP instance, each outbound IGP interface is assigned an SR Adjacency Segment label, and the IGP advertises these labels across the network. Using P3 in Figure 2-32 as an example, IGP-based label allocation works as follows:
- P3 runs IGP to apply for a local dynamic label for an adjacency. For example, P3 assigns adjacency label 9002 to the P3-to-P4 adjacency.
- P3 runs IGP to advertise the adjacency label and flood it across the network.
- P3 uses the label to generate a label forwarding table.
- The other nodes on the network run the IGP to learn the Adjacency Segment label advertised by P3, but they do not generate local forwarding entries for it.
The other nodes assign and advertise adjacency labels in the same way as P3 does, so a label forwarding table is generated on each node. Each node establishes a BGP-LS neighbor relationship with the controller, generates topology information (including SR labels), and reports the topology information to the controller.
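As a rough illustration of this allocation model, the sketch below shows a node installing forwarding entries only for the adjacency labels it assigned itself, while flooding the bindings for other nodes to learn. The data structures are hypothetical; the label values follow the Figure 2-32 example in the text.

```python
# Illustrative sketch of per-node adjacency label allocation (hypothetical structures).

# P3 assigns locally significant labels to its own adjacencies
# (9002 and 9004 are the values used in the example above).
p3_adjacency_labels = {
    "P3->P4": 9002,
    "P3->PE1": 9004,
}

# Only P3 installs forwarding entries for the labels it assigned:
# pop the label and send the packet over the corresponding adjacency.
p3_label_forwarding_table = {
    label: {"action": "pop", "out_adjacency": adjacency}
    for adjacency, label in p3_adjacency_labels.items()
}

# The bindings are flooded through the IGP; other nodes learn them but do not
# install forwarding entries, and BGP-LS reports them to the controller.
flooded_bindings = dict(p3_adjacency_labels)
print(p3_label_forwarding_table)
```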
SR-MPLS TE Tunnel Establishment
SR-MPLS TE Tunnel
Segment Routing-MPLS Traffic Engineering (SR-MPLS TE) runs the SR protocol and uses TE constraints to create tunnels.
In Figure 2-33, a primary LSP is established along the path PE1->P1->P2->PE2, and a backup path is established along the path PE1->P3->P4->PE2. The two LSPs have the same tunnel ID of an SR-MPLS TE tunnel. The LSP originates from the ingress, passes through transit nodes, and is terminated at the egress.
SR-MPLS TE tunnel establishment involves tunnel configuration followed by tunnel setup. Before an SR-MPLS TE tunnel is created, IS-IS or OSPF neighbor relationships must be established between forwarders to implement network-layer connectivity, assign labels, and collect network topology information. Forwarders send label and network topology information to the controller, which uses the information to calculate paths. If no controller is available, enable the CSPF path computation function on the ingress of the SR-MPLS TE tunnel so that the forwarder runs CSPF to compute the path.
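Where the ingress computes the path itself, CSPF essentially prunes links that violate the TE constraints and then runs a shortest-path computation over the remaining topology. The sketch below is a minimal illustration of that idea, assuming a bandwidth constraint; the topology, metrics, and bandwidth figures are hypothetical.

```python
import heapq

# Hypothetical topology: (neighbor, igp_metric, available_bandwidth_mbps) per node.
topology = {
    "PE1": [("P1", 10, 1000), ("P3", 10, 400)],
    "P1":  [("P2", 10, 1000)],
    "P3":  [("P4", 10, 400)],
    "P2":  [("PE2", 10, 1000)],
    "P4":  [("PE2", 10, 400)],
    "PE2": [],
}

def cspf(topo, src, dst, min_bw):
    """Simplified CSPF: drop links below the bandwidth constraint, then run Dijkstra."""
    pruned = {n: [(nbr, cost) for nbr, cost, bw in links if bw >= min_bw]
              for n, links in topo.items()}
    dist, heap = {src: 0}, [(0, src, [src])]
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dst:
            return path
        for nbr, c in pruned[node]:
            if cost + c < dist.get(nbr, float("inf")):
                dist[nbr] = cost + c
                heapq.heappush(heap, (cost + c, nbr, path + [nbr]))
    return None

print(cspf(topology, "PE1", "PE2", min_bw=500))  # ['PE1', 'P1', 'P2', 'PE2']
```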
SR-MPLS TE Tunnel Configuration
SR-MPLS TE tunnel attributes are used to create tunnels. An SR-MPLS TE tunnel can be configured on a controller or a forwarder.
An SR-MPLS TE tunnel is configured on a controller.
The controller runs NETCONF to deliver tunnel attributes to a forwarder (as shown in Figure 2-34). The forwarder runs PCEP to delegate the tunnel to the controller for management.
An SR-MPLS TE tunnel is manually configured on a forwarder.
The forwarder delegates LSPs to the controller for management.
SR-MPLS TE Tunnel Establishment
If a service (for example, VPN) is bound to an SR-MPLS TE tunnel, a device establishes an SR-MPLS TE tunnel based on the following process, as shown in Figure 2-34.
The controller uses the SR-MPLS TE tunnel constraints and its Path Computation Element (PCE) to calculate a path and combines the adjacency labels along the path into a label stack; the label stack is the computation result.
If the label stack depth exceeds the upper limit supported by a forwarder, a single label stack can carry only some of the labels, so the controller must divide the labels for the entire path into multiple label stacks.
In Figure 2-34, the controller calculates a path PE1->P3->P1->P2->P4->PE2 for an SR-MPLS TE tunnel. The path is mapped to two label stacks {1003, 1006, 100} and {1005, 1009, 1010}. Label 100 is a stitching label, and the others are adjacency labels.
The controller delivers the tunnel configuration information and label stack to the forwarder through NETCONF and PCEP, respectively.
In Figure 2-34, the controller delivers the label stacks as follows:
- The controller delivers label stack {1005, 1009, 1010} to P1 and assigns stitching label 100 to associate with that label stack. Label 100 is placed at the bottom of the label stack delivered to PE1.
- The controller delivers label stack {1003, 1006, 100} to the ingress PE1.
- The forwarder uses the delivered tunnel configurations and label stacks to establish an LSP for an SR-MPLS TE tunnel.
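For the two-segment case in this example, the controller's splitting step can be illustrated as follows. This is a simplified sketch assuming a maximum stack depth of three labels and a controller-allocated stitching label of 100; real controllers apply more general placement logic.

```python
# Hedged sketch: split a full adjacency-label path into stitched label stacks
# when the forwarder supports at most `max_depth` labels. The label values follow
# the example in this section; the stitching-label allocation is hypothetical.

full_path_labels = [1003, 1006, 1005, 1009, 1010]  # PE1->P3->P1->P2->P4->PE2
max_depth = 3
stitching_label = 100  # allocated by the controller (hypothetical allocator)

# The last segment carries real labels only; the earlier segment ends with the stitching label.
tail = full_path_labels[-max_depth:]                       # delivered to the stitching node (P1)
head = full_path_labels[:-max_depth] + [stitching_label]   # delivered to the ingress (PE1)

print(head)  # [1003, 1006, 100]   -> PE1
print(tail)  # [1005, 1009, 1010]  -> P1, keyed by stitching label 100
```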
An SR-MPLS TE tunnel does not support MTU negotiation. Therefore, the MTUs configured on the nodes along the SR-MPLS TE tunnel must be the same. For a manually configured SR-MPLS TE tunnel, you can run the mtu command to configure the MTU on the tunnel interface; if you do not, the default MTU of 1500 bytes is used. On a manual SR-MPLS TE tunnel, the smallest of the following values takes effect: the MTU of the tunnel, the MPLS MTU of the tunnel, the MTU of the outbound interface, and the MPLS MTU of the outbound interface.
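The effective value can be expressed directly as the minimum of the four MTUs; the values in the sketch below are hypothetical.

```python
def effective_sr_te_mtu(tunnel_mtu, tunnel_mpls_mtu, out_if_mtu, out_if_mpls_mtu):
    """The smallest of the four configured values takes effect on a manual SR-MPLS TE tunnel."""
    return min(tunnel_mtu, tunnel_mpls_mtu, out_if_mtu, out_if_mpls_mtu)

print(effective_sr_te_mtu(1500, 1500, 1400, 1500))  # 1400
```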
SR-MPLS TE Data Forwarding
A forwarder operates on the labels in a packet according to the label stack mapped to the SR-MPLS TE LSP: at each hop it looks up the outbound interface based on the top label of the label stack, so that the labels guide the packet to the tunnel destination.
SR-MPLS TE Data Forwarding (Adjacency)
The ingress A adds the label stack {1003, 1006, 100} to the packet. Ingress A matches the outer label 1003 against its adjacencies and finds the A-to-B adjacency as the outbound interface. It strips label 1003 off the label stack and forwards the packet downstream through the A-to-B outbound interface.
Node B matches the outer label 1006 against its adjacencies and finds the B-to-C adjacency as the outbound interface. Node B strips label 1006 off the label stack, and the packet carrying the label stack {100} travels over the B-to-C adjacency to the downstream node C.
After stitching node C receives the packet, it identifies stitching label 100 by querying its stitching label entries and swaps the label for the associated label stack {1005, 1009, 1010}. Stitching node C then uses the top label 1005 to find the outbound interface of the C-to-D adjacency, removes label 1005, and forwards the packet carrying the label stack {1009, 1010} over the C-to-D adjacency to the downstream node D. For more details about stitching labels and stitching nodes, see Stitching Label and Stitching Node.
After nodes D and E receive the packet, they treat the packet in the same way as node B. Node E removes the last label 1010 and forwards the data packet to node F.
Egress F receives the packet without a label and forwards the packet along a route that is found in a routing table.
The preceding information shows that after adjacency labels are manually specified, devices strictly forward the data packets hop by hop along the explicit path designated in the label stack. This forwarding method is also called strict explicit-path SR-MPLS TE.
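The hop-by-hop behavior just described, pop the top adjacency label, forward over the matching adjacency, and swap a stitching label for the next segment's stack, can be simulated in a few lines. The per-node tables below are hypothetical and simply mirror the A-to-F example above.

```python
# Hedged simulation of strict explicit-path SR-MPLS TE forwarding with a stitching node.
# Adjacency and stitching tables are per-node and illustrative only.

adjacency_table = {   # node -> {adjacency label assigned by that node: next hop}
    "A": {1003: "B"},
    "B": {1006: "C"},
    "C": {1005: "D"},
    "D": {1009: "E"},
    "E": {1010: "F"},
}
stitching_table = {"C": {100: [1005, 1009, 1010]}}  # stitching node C

def forward(node, stack):
    while stack:
        top = stack[0]
        if top in stitching_table.get(node, {}):
            stack = list(stitching_table[node][top])   # swap stitching label for the next stack
            continue
        next_hop = adjacency_table[node][top]          # adjacency mapped to the top label
        stack = stack[1:]                              # pop the label
        print(f"{node} -> {next_hop}, remaining stack {stack}")
        node = next_hop
    return node  # the egress forwards the unlabeled packet by routing table lookup

forward("A", [1003, 1006, 100])   # reaches F with an empty label stack
```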
SR-MPLS TE Data Forwarding (Node+Adjacency)
SR-MPLS TE in strict explicit-path mode does not support load balancing when equal-cost paths exist. To overcome this drawback, node labels are introduced into SR-MPLS TE paths.
Node A finds an A-B outbound interface based on label 1003 on the top of the label stack. Node A removes label 1003 and forwards the packet to the next hop node B.
Similar to node A, node B finds the outbound interface mapped to label 1006 on the top of the label stack. Node B removes label 1006 and forwards the packet to the next hop node C.
Similar to node A, node C finds the outbound interface mapped to label 1005 on the top of the label stack. Node C removes label 1005 and forwards the packet to the next-hop node D.
Node D processes node label 101 at the top of the label stack. Because this is a node label, it supports load balancing: traffic is balanced over the equal-cost links based on 5-tuple information.
After receiving node label 101, nodes E and G, which are the penultimate hops, remove the label and forward the packets to node F, completing the E2E traffic forwarding.
The preceding information shows that after adjacency and node labels are manually specified, a device can forward the data packets along the shortest path or load-balance the data packets over paths. The paths are not fixed, and therefore, this forwarding method is called loose explicit-path SR-MPLS TE.
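When the top label is a node label and several equal-cost next hops lead toward the identified node, the forwarder typically hashes the packet's 5-tuple to pick one of them. The following sketch illustrates that selection; the next-hop set, the hash choice, and the addresses are hypothetical.

```python
import hashlib

# Hypothetical equal-cost next hops toward the node identified by node label 101.
ecmp_next_hops = {101: ["E", "G"]}

def pick_next_hop(node_label, five_tuple):
    """Hash the 5-tuple to select one of the equal-cost next hops for a node label."""
    members = ecmp_next_hops[node_label]
    digest = hashlib.md5(repr(five_tuple).encode()).digest()
    return members[digest[0] % len(members)]

flow1 = ("10.1.1.1", "10.2.2.2", 6, 1024, 443)   # src IP, dst IP, protocol, sport, dport
flow2 = ("10.1.1.9", "10.2.2.2", 6, 2048, 443)
print(pick_next_hop(101, flow1), pick_next_hop(101, flow2))  # flows may take different links
```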
SR-MPLS TE Tunnel Reliability
SR-MPLS TE tunnel reliability techniques include hot standby (HSB), in addition to TI-LFA FRR.
SR-MPLS TE Hot Standby
With HSB, an HSB LSP is established immediately after the primary LSP is established and remains in the hot backup state. The HSB LSP protects the entire LSP and provides E2E traffic protection.
In Figure 2-37, HSB is configured on the ingress A. After the ingress A creates the primary LSP, the ingress A immediately creates an HSB LSP. An SR-MPLS TE tunnel contains two LSPs. If the ingress detects a primary LSP fault, the ingress switches traffic to the HSB LSP. After the primary LSP recovers, the ingress A switches traffic back to the primary LSP. During the process, the SR-MPLS TE tunnel remains Up.
BFD for SR-MPLS TE
- BFD for SR-MPLS TE LSP: SR-MPLS TE LSPs rely on BFD for link detection. To prevent traffic loss in the case of a primary SR-MPLS TE LSP failure, BFD for SR-MPLS TE LSP can be configured, but a backup LSP must be available. BFD for SR-MPLS TE LSP supports both static and dynamic BFD sessions:
- Static BFD session: The local and remote discriminators are manually specified. The local discriminator of the local node must be equal to the remote discriminator of the remote node. The remote discriminator of the local node must be equal to the local discriminator of the remote node. A discriminator inconsistency causes a failure to establish a BFD session. After the BFD session is established, the interval at which BFD packets are received and the interval at which BFD packets are sent can be modified.
- Dynamic BFD session: The local and remote discriminators do not need to be manually specified. After the SR-MPLS TE tunnel goes Up, a BFD session is triggered. The devices on both ends of a BFD session to be established negotiate the local discriminator, remote discriminator, interval at which BFD packets are received, and interval at which BFD packets are sent.
A BFD session is bound to an SR-MPLS TE LSP. This means that a BFD session is established between the ingress and egress. A BFD packet is sent by the ingress and forwarded to the egress through an LSP. The egress responds to the BFD packet. A BFD session on the ingress can rapidly detect the status of the path through which the LSP passes.
If a link fault is detected, the BFD module notifies the forwarding plane of the fault. The forwarding plane searches for a backup SR-MPLS TE LSP and switches traffic to the backup SR-MPLS TE LSP.
- BFD for SR-MPLS TE tunnel: BFD for SR-MPLS TE tunnel must be used with BFD for SR-MPLS TE LSP.
- BFD for SR-MPLS TE LSP controls primary/backup LSP switchovers, whereas BFD for SR-MPLS TE tunnel checks the actual status of the tunnel.
If BFD for SR-MPLS TE tunnel is not configured, the tunnel status remains Up by default, and the actual tunnel status cannot be determined.
If BFD for SR-MPLS TE tunnel is configured and the BFD session is administratively Down, the BFD session does not work, and the tunnel interface status is unknown.
If BFD for SR-MPLS TE tunnel is configured and the BFD session is not administratively Down, the tunnel interface status is consistent with the BFD status.
The interface status of an SR-MPLS TE tunnel stays consistent with the status of BFD for SR-MPLS TE tunnel (these rules are summarized in the sketch after this list). Because of BFD negotiation, the BFD session goes Up slowly. If a new label stack is delivered for a tunnel in the Down state, bringing BFD for this tunnel Up takes 10 to 20 seconds. As a result, hard convergence of the tunnel is delayed if no protection is enabled for the tunnel.
BFD for SR-MPLS TE (one-arm mode): If a Huawei device serving as the ingress cannot use BFD for SR-MPLS TE LSP to interwork with a non-Huawei device serving as the egress, no BFD session can be established. In this case, one-arm BFD for SR-MPLS TE can be used.
On the ingress, enable BFD and specify the one-arm mode to establish a BFD session. After the BFD session is established, the ingress sends BFD packets to the egress through the transit nodes along the SR-MPLS TE tunnel. When the forwarding plane on the egress receives the BFD packets, it removes the MPLS labels, searches for a route matching the packets' destination IP address (an address of the ingress), and loops the BFD packets back to the ingress. The ingress processes the BFD packets. This is the one-arm detection mechanism.
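As referenced above, the tunnel interface status rules for BFD for SR-MPLS TE tunnel can be summarized in a small decision helper. This is only a reading of the rules in this section; the state names are illustrative, not device output.

```python
def tunnel_interface_status(bfd_for_tunnel_configured, bfd_admin_down=False,
                            bfd_session_status="Up"):
    """Summarize the tunnel interface status rules described above (illustrative)."""
    if not bfd_for_tunnel_configured:
        return "Up (default; actual tunnel status cannot be determined)"
    if bfd_admin_down:
        return "Unknown (BFD session administratively down)"
    return bfd_session_status  # interface status follows the BFD session status

print(tunnel_interface_status(False))
print(tunnel_interface_status(True, bfd_admin_down=True))
print(tunnel_interface_status(True, bfd_session_status="Down"))
```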
The following example shows how VPN traffic recurses to an SR-MPLS TE LSP when BFD for SR-MPLS TE LSP is used.
A, CE1, CE2, and E are deployed in the same VPN, and CE2 advertises a route to E. PE2 assigns the VPN label for E, and PE1 installs the route to E together with the VPN label. The path of the SR-MPLS TE tunnel from PE1 to PE2 is PE1 -> P4 -> P3 -> PE2, and the label stack is {9004, 9003, 9005}. When A sends a packet destined for E, PE1 finds the packet's outbound interface based on label 9004 and adds labels 9003 and 9005 as well as the inner VPN label assigned by PE2. BFD is configured to monitor the SR-MPLS TE tunnel; if BFD enters the DetectDown state, the VPN recurses to another SR-MPLS TE tunnel.
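The label operations on PE1 in this example can be sketched as follows: push the VPN label assigned by PE2 and the SR-MPLS TE label stack, use the top label 9004 to find the outbound interface, and pop it before sending. The SR label stack is from the example; the VPN label value, interface name, and structures are hypothetical.

```python
# Hedged sketch of the PE1 behavior in the example above (hypothetical VPN label).

sr_te_stack = [9004, 9003, 9005]      # index 0 is the top (outermost) label
vpn_label = 30020                     # hypothetical VPN label assigned by PE2 for route E
outbound_interface = {9004: "to-P4"}  # top label resolves the outbound interface

def encapsulate_on_pe1(ip_packet, bfd_state="Up"):
    if bfd_state == "DetectDown":
        # BFD detected a tunnel failure: the VPN recurses to another SR-MPLS TE tunnel.
        return {"action": "recurse to another SR-MPLS TE tunnel"}
    out_if = outbound_interface[sr_te_stack[0]]     # outbound interface from label 9004
    labels_on_wire = sr_te_stack[1:] + [vpn_label]  # 9004 popped; 9003, 9005, VPN label remain
    return {"out_interface": out_if, "labels": labels_on_wire, "payload": ip_packet}

print(encapsulate_on_pe1({"dst": "E"}))
```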
SR-MPLS TE Load Balancing
SR-MPLS TE guides data packet forwarding based on the label stack that the ingress encapsulates into the data packets. By default, each adjacency label identifies a single, specific adjacency, which means traffic cannot be load-balanced even when equal-cost links exist. To address this problem, SR-MPLS TE introduces the parallel adjacency label, which identifies multiple equal-cost links.
In Figure 2-39, two links between nodes B and C and the link between nodes B and D are three equal-cost links. The same adjacency SID (for example, 1001 in Figure 2-39) can be assigned to these links. Such an adjacency SID is called a parallel adjacency label. Like common labels, the parallel adjacency label is also used to compute a path.
When data packets carrying the parallel adjacency label arrive at node B, node B parses the label and uses a hash algorithm to load-balance the traffic over the three links, which uses network resources efficiently and prevents network congestion.
A parallel adjacency label is generated without affecting the allocation of existing adjacency labels between IGP neighbors. After a parallel adjacency label is configured, the device advertises multiple adjacency labels for the same adjacency.
If BFD for SR-MPLS TE is used together with an SR-MPLS TE parallel adjacency label, BFD packets are hashed like any other traffic, so all packets of a BFD session follow a single member link. If that link fails, BFD detects a link Down event even though the other links keep working, which poses a risk of false alarms.
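A parallel adjacency label maps one SID to a set of member links, and the forwarder hashes each flow onto one member, as in this illustrative sketch (the member names, hash choice, and 5-tuples are hypothetical). It also shows why a BFD session, which always presents the same 5-tuple, sticks to a single member link.

```python
import hashlib

# One parallel adjacency label (1001, from Figure 2-39) shared by three equal-cost
# member links on node B; names and the hash choice are illustrative.
parallel_adjacency_members = {1001: ["B-C link 1", "B-C link 2", "B-D link"]}

def member_link(label, five_tuple):
    members = parallel_adjacency_members[label]
    digest = hashlib.md5(repr(five_tuple).encode()).digest()
    return members[digest[0] % len(members)]

print(member_link(1001, ("10.1.1.1", "10.2.2.2", 6, 1025, 80)))   # one data flow
print(member_link(1001, ("10.1.1.2", "10.2.2.2", 6, 2049, 80)))   # another data flow
# A BFD session uses one fixed 5-tuple, so all of its packets hash to the same member link.
```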
DSCP-based Tunneling for IP Packets to Enter SR-MPLS TE Tunnels
Background
Devices can steer IP packets into SR-MPLS TE tunnels based on matching differentiated services code point (DSCP) values, which is a TE tunnel selection method. Unlike the traditional method of load balancing services across TE tunnels, DSCP priority-based forwarding gives higher-priority services higher service quality.
Existing networks face a challenge that they may fail to provide exclusive high-quality transmission resources for higher-priority services. This is because the policy for selecting TE tunnels is based on public network routes or VPN routes, which causes a node to select the same tunnel for services with the same destination IP or VPN address but with different priorities.
With this function enabled, a PE can forward IP packets to tunnels based on their DSCP values. Class-of-service-based tunnel selection (CBTS) supports only eight priorities: BE, AF1, AF2, AF3, AF4, EF, CS6, and CS7, and maps service traffic to these priorities based on a configured traffic policy. Compared with CBTS-based tunneling, DSCP-based tunneling maps the DSCP values of IP traffic directly to SR-MPLS TE tunnels and supports more refined priority management (values 0 to 63), so SR-MPLS TE can be deployed more flexibly based on services.
Implementation
DSCP values can be specified on the tunnel interface of a tunnel to which services recurse, so that the tunnel carries services of one or more priorities. Services with the specified priorities are transmitted only on such tunnels and are not load-balanced across all the tunnels to which they could otherwise recurse. The service class attribute of a tunnel can also be set to "default" so that the tunnel carries mismatching services whose priorities are not otherwise specified.
- Set the DSCP attribute for each SR-MPLS TE tunnel. Assume that the DSCP attributes of SR-MPLS TE tunnels 1, 2, and 3 are 15 through 20, 5 through 10, and default, respectively.
- Based on the DSCP values in the IP packets, the PE maps video traffic to SR-MPLS TE 1 and voice traffic to SR-MPLS TE 2. Ethernet data traffic, whose DSCP values match neither tunnel, is forwarded along SR-MPLS TE 3, the tunnel with the default attribute.
The default DSCP attribute is not a mandatory setting. If the default attribute is not configured, mismatching services will be transmitted along a tunnel that is assigned no DSCP attribute. If such a tunnel does not exist, these services will be transmitted along a tunnel that is assigned the smallest DSCP value.
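The selection rules in this section, match the packet's DSCP against each tunnel's DSCP set, then fall back to the "default" tunnel, then to a tunnel with no DSCP attribute, and finally to the tunnel with the smallest DSCP value, can be sketched as follows. The tunnel names and DSCP ranges are the hypothetical ones used in the example above.

```python
# Hedged sketch of DSCP-based tunnel selection, following the rules above.

tunnels = {
    "SR-MPLS TE 1": {"dscp": set(range(15, 21))},   # 15 through 20 (video)
    "SR-MPLS TE 2": {"dscp": set(range(5, 11))},    # 5 through 10 (voice)
    "SR-MPLS TE 3": {"dscp": "default"},            # everything else
}

def select_tunnel(packet_dscp, tunnels):
    # 1. A tunnel whose DSCP set contains the packet's DSCP value.
    for name, attrs in tunnels.items():
        if isinstance(attrs["dscp"], set) and packet_dscp in attrs["dscp"]:
            return name
    # 2. The tunnel whose DSCP attribute is "default".
    for name, attrs in tunnels.items():
        if attrs["dscp"] == "default":
            return name
    # 3. A tunnel with no DSCP attribute configured.
    for name, attrs in tunnels.items():
        if attrs["dscp"] is None:
            return name
    # 4. Otherwise, the tunnel with the smallest configured DSCP value.
    return min((n for n, a in tunnels.items() if isinstance(a["dscp"], set)),
               key=lambda n: min(tunnels[n]["dscp"]))

print(select_tunnel(18, tunnels))   # SR-MPLS TE 1
print(select_tunnel(46, tunnels))   # SR-MPLS TE 3 (default)
```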
Usage Scenario
- SR-MPLS TE tunnel load balancing on the public network, LDP over SR-MPLS TE, or SR-MPLS TE tunnels (without load balancing) is configured on a PE.
- IP/L3VPN, including IPv4 and IPv6 services, is configured on a PE.
- In the VLL, VPLS, and BGP LSP over SR-MPLS TE scenarios, DSCP-based tunneling for IP packets is not supported.
- This function is not supported on a P.