SR-MPLS TE
Definition
Segment Routing-Traffic Engineering (SR-MPLS TE) is an MPLS TE tunneling technique that uses SR as its control signaling. The controller calculates a path for an SR-MPLS TE tunnel and delivers a label stack that completely matches the path to a forwarder. The forwarder, which is the ingress node of the tunnel, uses the label stack to control the path along which packets are transmitted on the network.
SR-MPLS TE Advantages
Due to the complexity of its control protocol, Resource Reservation Protocol-TE (RSVP-TE) cannot meet the requirements of rapidly developing software-defined networking (SDN). SR-MPLS TE outperforms RSVP-TE in this scenario. Table 5-4 compares SR-MPLS TE with RSVP-TE.
| Item | SR-MPLS TE | RSVP-TE |
| --- | --- | --- |
| Control plane | SR, an extension to the IGP, is used as the control signaling. The control plane is simple: no dedicated MPLS control protocol is needed, reducing the total number of protocols in use. | RSVP-TE is used as the MPLS control protocol, and the control plane is complex. |
| Label distribution | Each link is assigned only a single label, which is shared by all LSPs. This reduces resource consumption and the maintenance workload of label forwarding tables. | A label is distributed to each LSP. The large number of labels for multiple LSPs results in large label forwarding tables, and maintaining these tables creates a heavy workload. |
| Path adjustment and control | An intermediate device does not perceive SR-MPLS TE tunnel information. A service forwarding path can be controlled by operating the label stack on the ingress node only, so configurations do not need to be delivered to each node. When a node on a path fails, the controller recalculates the path and updates the label stack on the ingress node. | Configurations must be delivered to each node. |
Related Concepts
Label Stack
A label stack is a set of link labels arranged in the form of a stack, used to identify a complete label switched path (LSP). Each link label in the stack identifies a specific link, and the label stack from top to bottom describes all links along an SR-MPLS TE LSP. During packet forwarding, a node looks up the link mapped to the link label at the top of the stack in a packet, removes that label, and forwards the packet over the link. After all link labels have been removed from the label stack, the packet has traversed the SR-MPLS TE tunnel to the tunnel destination.
Stitching Label and Stitching Node
If the label stack depth exceeds the upper limit supported by a forwarder, the label stack cannot carry all the link labels of a whole LSP. In this case, the controller must divide the path's labels into multiple label stacks and distribute a special label that associates adjacent label stacks, so that together they identify the whole LSP. This special label is called a stitching label, and the node on which it resides is called a stitching node.
The controller distributes a stitching label to the stitching node, pushes the stitching label onto the bottom of the upstream label stack of the LSP, and associates it with the adjacent downstream label stack. Unlike a link label, a stitching label does not identify a link. When a packet is forwarded to the stitching node according to the upstream label stack, the stitching label is replaced with the associated downstream label stack, which then guides packet forwarding in the downstream direction of the LSP.
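The stitching behavior can be sketched as follows. This is a minimal Python simulation, not a device implementation; the label values and the dict-based stitching table mirror the later Figure 5-10/5-11 example and are purely illustrative.

```python
# Hypothetical simulation of SR-MPLS TE forwarding with a stitching label.
# Link labels map to links; a stitching label is swapped for the associated
# downstream label stack instead of identifying a link.

LINK_LABELS = {1003: "A->B", 1006: "B->C", 1005: "C->D",
               1009: "D->E", 1010: "E->F"}          # illustrative values
STITCH_TABLE = {100: [1005, 1009, 1010]}            # stitching label -> downstream stack

def forward(stack):
    """Walk the label stack and return the sequence of links traversed."""
    path = []
    stack = list(stack)
    while stack:
        top = stack.pop(0)                  # examine the top-of-stack label
        if top in STITCH_TABLE:             # stitching node: swap in downstream stack
            stack = list(STITCH_TABLE[top]) + stack
            continue
        path.append(LINK_LABELS[top])       # link label: pop and forward over the link
    return path

# The upstream stack ends with stitching label 100; the walk still covers
# the whole A-to-F path because the stitching node swaps in the second stack.
print(forward([1003, 1006, 100]))
```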
Label Distribution
Distributed by forwarders
A forwarder runs an IGP (only IS-IS is currently supported) to distribute labels and report label information to the controller.
IS-IS SR is enabled on PE1, PE2, and P1 through P4 to establish IS-IS neighbor relationships between each pair of directly connected nodes. SR-capable IS-IS instances assign SR link labels to all IS-IS outbound interfaces. The link labels are flooded to the entire network through the extended IS-IS SR protocol. The following uses P3 as an example to describe the process of IS-IS-based label distribution according to the network in Figure 5-7.
- P3 runs IS-IS to apply for a local dynamic label for a direct link. For example, P3 distributes link label 9002 to the P3-to-P4 link.
- P3 runs IS-IS to advertise the link label and floods it across the network.
- P3 uses the label to generate a label forwarding table.
- After the other nodes on the network learn the link label advertised by P3 through IS-IS, they do not generate local forwarding entries for it, because an adjacency label is valid only on the node that advertises it.
PE1, PE2, P1, P2, and P4 distribute and advertise link labels in the same way as P3 does, and the label forwarding tables are generated on each node. One or more nodes establish IS-IS or BGP-LS neighbor relationships with the controller, and report topology information, including SR labels, to the controller.
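The distribution process above can be sketched as follows: each node installs forwarding entries only for its own adjacency labels, while all flooded labels go into the topology view reported to the controller. The node names and label values are illustrative.

```python
# Sketch: how a node builds its label forwarding table from flooded
# adjacency labels. Only locally assigned labels are installed; everything
# flooded is kept as topology information for the controller.

flooded = [  # (advertising node, adjacency label, link) - illustrative data
    ("P3", 9002, "P3->P4"),
    ("P1", 9010, "P1->P2"),
]

def build_tables(local_node, advertisements):
    fib = {}        # local label forwarding table
    topology = []   # label info reported to the controller via IS-IS/BGP-LS
    for node, label, link in advertisements:
        if node == local_node:
            fib[label] = link   # adjacency labels are valid only on their advertiser
        topology.append((node, label, link))
    return fib, topology

fib, topo = build_tables("P3", flooded)
print(fib)   # only P3's own adjacency label is installed locally
```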
Distributed by the controller
The controller delivers distributed SR labels to forwarders through NETCONF interfaces.
In Figure 5-8, IS-IS SR-capable forwarders establish IS-IS neighbor relationships with each other. The controller establishes IS-IS or BGP-LS neighbor relationships with the forwarders, and IS-IS or BGP-LS reports the collected network topology information to the controller. The controller distributes a label to each link and uses NETCONF to deliver the labels to each forwarder that is the source node of a link. The forwarder then generates a link label forwarding table.
SR-MPLS TE Tunnel Establishment
SR-MPLS TE Tunnel
Segment Routing Traffic Engineering (SR-MPLS TE) runs the SR protocol and uses TE constraints to establish a tunnel.
In Figure 5-9, a primary LSP is established along the path PE1 -> P1 -> P2 -> PE2, and a backup path is established along the path PE1 -> P3 -> P4 -> PE2. The two LSPs form an SR-MPLS TE tunnel. The LSP originates from the ingress node, passes through transit nodes, and terminates at the egress node.
SR-MPLS TE tunnel establishment involves configuring and establishing an SR-MPLS TE tunnel. Before an SR-MPLS TE tunnel is established, IS-IS neighbor relationships must be established between forwarders, and IS-IS or BGP-LS neighbor relationships must be established between forwarders and the controller to implement network layer connectivity, distribute labels, and collect network topology information. Forwarders send label and network topology information to the controller, which uses the information to calculate paths. If no controller is available, enable the CSPF path computation function on the ingress node of an SR-MPLS TE tunnel so that a forwarder runs CSPF to compute a path.
SR-MPLS TE Tunnel Configuration
SR-MPLS TE tunnel attributes are used to establish tunnels. An SR-MPLS TE tunnel can be configured on the controller or a forwarder.
An SR-MPLS TE tunnel is configured on the controller.
The controller runs NETCONF to deliver tunnel attributes to a forwarder (as shown in Figure 5-10). The forwarder delegates the tunnel to the controller for management.
An SR-MPLS TE tunnel is configured on a forwarder.
The controller obtains tunnel attributes (as shown in Figure 5-10) from a forwarder, on which an SR-MPLS TE tunnel is configured. The forwarder delegates the tunnel to the controller for management.
Tunnel management on the controller includes tunnel path calculation, label stack generation, and tunnel maintenance.
SR-MPLS TE Tunnel Establishment
If a service (such as IPv4, VPN, or LDP) is imported to an SR-MPLS TE tunnel, an SR-MPLS TE tunnel is established based on the following process, as shown in Figure 5-10.
The controller uses SR-MPLS TE tunnel constraints and Path Computation Element (PCE) to calculate paths and combines link labels into a label stack (that is, the calculation result).
Stitching labels cannot be configured using commands, but can only be delivered by the controller through NETCONF.
If the label stack depth exceeds the upper limit supported by a forwarder, the label stack cannot carry all link labels of a whole path, and the controller needs to divide the entire path's labels into multiple label stacks.
As shown in Figure 5-10, the controller calculates a path PE1 -> P3 -> P1 -> P2 -> P4 -> PE2 for the SR-MPLS TE tunnel. The path is mapped to label stacks {1003, 1006, 100} and {1005, 1009, 1010}. Label 100 is a stitching label associated with the label stack {1005, 1009, 1010}, and the others are link labels.
The controller delivers the label stacks to the forwarders through NETCONF.
For the networking shown in Figure 5-10, the process of delivering label stacks by the controller is as follows:
- The controller delivers stitching label 100 and label stack {1005, 1009, 1010} to the stitching node P1.
- The controller delivers label stack {1003, 1006, 100} to the ingress node PE1.
- The forwarders use the received label stacks to establish an LSP for the SR-MPLS TE tunnel.
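The division of a path's labels into chained stacks can be sketched as follows. The splitting strategy and the stitching-label allocation (100, 101, ...) are invented for illustration; a real controller's algorithm may differ.

```python
# Sketch: splitting a path's link labels into label stacks bounded by a
# maximum stack depth, chaining them with stitching labels. Stitching-label
# values starting at 100 are purely illustrative.

def split_into_stacks(link_labels, max_depth, first_stitch_label=100):
    stacks, stitch = [], first_stitch_label
    labels = list(link_labels)
    while len(labels) > max_depth:
        head = labels[:max_depth - 1]       # leave room for the stitching label
        stacks.append(head + [stitch])      # stitching label sits at the stack bottom
        labels = labels[max_depth - 1:]
        stitch += 1
    stacks.append(labels)
    return stacks

# The Figure 5-10 example: five link labels with a stack depth limit of 3
# yields {1003, 1006, 100} and {1005, 1009, 1010}.
print(split_into_stacks([1003, 1006, 1005, 1009, 1010], 3))
```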
SR-MPLS TE Data Forwarding
A forwarder operates a label in a packet based on the label stack mapped to the SR-MPLS TE LSP, searches for an outbound interface hop by hop based on the top label of the label stack, and uses the label to guide the packet to the tunnel destination address.
SR-MPLS TE Data Forwarding (Adjacency Label)
In Figure 5-11, an example is provided to describe the process of forwarding SR-MPLS TE data with manually specified adjacency labels.
- Ingress node A receives a data packet and adds the label stack {1003, 1006, 100} to it. Node A matches adjacent node B against top label 1003, finds the outbound interface, and removes label 1003. The packet carrying label stack {1006, 100} is forwarded to downstream node B through the A-to-B adjacency.
- After receiving the packet, node B searches for the adjacency matching top label 1006 in the label stack, finds that the corresponding outbound interface is the B-to-C adjacency, and removes label 1006. The packet carrying the label stack {100} is forwarded to downstream node C through the B-to-C adjacency.
- After receiving the packet, stitching node C identifies stitching label 100 by querying the stitching label entries, and swaps the label for the associated label stack {1005, 1009, 1010}. Stitching node C searches for the adjacency matching top label 1005 in the label stack, finds that the corresponding outbound interface is the C-to-D adjacency, and removes label 1005. The packet carrying label stack {1009, 1010} is forwarded to downstream node D through the C-to-D adjacency.
- After nodes D and E receive the packet, they forward the packet in the same way as node B. Node E removes the last label 1010 and forwards the data packet to node F.
- Egress node F receives the packet without a label and forwards it according to the routing table.
The preceding information shows that after adjacency labels are manually specified, devices strictly forward the data packets hop by hop along the explicit path designated in the label stack. This forwarding method is also called strict explicit-path SR-MPLS TE.
SR-MPLS TE Data Forwarding (Node and Adjacency Labels)
SR-MPLS TE in strict path mode does not support load balancing when equal-cost paths exist. To overcome this drawback, node labels are introduced to SR-MPLS TE paths.
Node A finds an A-to-B outbound interface based on label 1003 on the top of the label stack. Node A removes label 1003 and forwards the packet to the next-hop node B.
- Similar to node A, node B finds the outbound interface mapped to label 1006 on the top of the label stack. Node B removes label 1006 and forwards the packet to the next-hop node C.
- Similar to node A, node C finds the outbound interface mapped to label 1005 on the top of the label stack. Node C removes label 1005 and forwards the packet to the next-hop node D.
- Node D processes label 101 on the top of the label stack. This node label triggers load balancing: traffic is balanced across the equal-cost links based on 5-tuple information.
- Nodes E and G, which are penultimate hops, receive the packets carrying node label 101, remove the label, and forward the packets to node F, completing end-to-end traffic forwarding.
The preceding information shows that after adjacency and node labels are manually specified, a device can forward the data packets along the shortest path or load-balance the data packets over paths. The paths are not fixed, and therefore, this forwarding method is also called loose explicit-path SR-MPLS TE.
SR-MPLS TE Load Balancing
SR-MPLS TE Load Balancing
SR-MPLS TE guides data packet forwarding based on the label stack that the ingress node encapsulates into the data packets. By default, each adjacency label identifies a specified adjacency, which means that load balancing cannot be performed even if equal-cost links exist. To address the preceding problem, SR-MPLS TE introduces a parallel adjacency label to identify multiple equal-cost links.
In Figure 5-13, two links between nodes B and C and the link between nodes B and D are equal-cost links. The same adjacency SID (for example, 1001 in Figure 5-13) can be configured for these links. Such an adjacency SID is called a parallel adjacency label. Like common labels, the parallel adjacency label is also used in path calculation.
When the data packets carrying the parallel adjacency label arrive at node B, node B parses the parallel adjacency label and uses the hash algorithm to load balance the traffic over the three links, which efficiently uses network resources and prevents network congestion.
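The member-link selection described above can be sketched as a 5-tuple hash. Real devices use hardware hash profiles; the Python `hash` below is only illustrative, and the link names are invented.

```python
# Sketch: hashing a packet's 5-tuple onto one of the equal-cost member links
# behind a parallel adjacency label. The hash function and link names are
# illustrative, not an actual device hash profile.

MEMBER_LINKS = ["B->C (link 1)", "B->C (link 2)", "B->D"]  # three equal-cost links

def pick_link(src_ip, dst_ip, proto, src_port, dst_port):
    key = (src_ip, dst_ip, proto, src_port, dst_port)
    index = hash(key) % len(MEMBER_LINKS)   # same flow always hashes to the same link
    return MEMBER_LINKS[index]

# Packets of one flow stick to one link; different flows may spread out,
# which is also why a BFD session's packets land on a single member link.
flow = ("10.0.0.1", "10.0.1.1", 6, 40000, 80)
assert pick_link(*flow) == pick_link(*flow)
```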
A parallel adjacency label is generated without affecting the distribution of existing adjacency labels between IGP neighbors. After a parallel adjacency label is configured, the load balancing device advertises multiple adjacency labels for the same adjacency.
If BFD for SR-MPLS TE is used and an SR-MPLS TE parallel adjacency label is configured, BFD packets are load balanced like service traffic, but each BFD session's packets are hashed onto a single member link. If that link fails, BFD detects a link Down event even though the other links keep working. In this case, faults may be reported incorrectly.
SR-MPLS TE Reliability
CR-LSP Backup
CR-LSP backup provides end-to-end protection for an SR-MPLS TE tunnel. If the ingress node detects a failure of the primary CR-LSP, it switches traffic to a backup CR-LSP. Traffic moves back to the primary CR-LSP once it recovers.
An SR-MPLS TE tunnel supports CR-LSP backup in hot standby mode only. In this mode, a backup CR-LSP is set up immediately after the primary CR-LSP is set up. When the primary CR-LSP fails, traffic moves to the backup CR-LSP quickly.
Implementation
Path planning
Path planning determines whether the paths of the primary and hot-standby CR-LSPs can partially overlap. A hot-standby CR-LSP can also be established over an explicit path.
A hot-standby CR-LSP supports the following attributes:
- Explicit path
- Hop limit
- Path overlapping
Backup CR-LSP setup
If a new tunnel configuration is committed or a tunnel goes Down, the ingress node attempts to establish a hot-standby CR-LSP and keeps trying until one is successfully established.
Backup CR-LSP attribute modification
If attributes of a backup CR-LSP are modified, the ingress node uses the make-before-break mechanism to reestablish the backup CR-LSP with the updated attributes. After a backup CR-LSP has been successfully reestablished, traffic on the original backup CR-LSP (if it is transmitting traffic) moves to this new backup CR-LSP, and then the original backup CR-LSP is torn down.
Fault detection
SR-MPLS TE does not provide a fault detection mechanism. CR-LSP backup uses BFD for LSP to quickly detect faults.
Traffic switchover
After the primary CR-LSP fails, the ingress node attempts to switch traffic from the primary CR-LSP to a hot-standby CR-LSP.
Traffic switchback
Traffic switches back based on the priorities of the available CR-LSPs: the primary CR-LSP is preferred, followed by the hot-standby CR-LSP.
BFD for SR-MPLS TE
Bidirectional Forwarding Detection (BFD) enables rapid fault detection on the SR-MPLS TE tunnel and SR-MPLS TE LSP to monitor the actual connectivity. If a fault occurs, the SR-MPLS TE tunnel reliability function triggers traffic switchover, improving the reliability of the entire network.
Background
SR-MPLS TE does not provide a connectivity detection mechanism. The status of an SR-MPLS TE tunnel and an SR-MPLS TE LSP is Up by default after the tunnel and LSP are established. Service traffic is lost continuously upon a path failure due to lack of a connectivity detection mechanism. Detection of faults on the SR-MPLS TE LSP and SR-MPLS TE tunnel must be completed by an additional protocol. BFD for SR-MPLS TE is an E2E rapid detection mechanism that can rapidly detect faults in links of an SR-MPLS TE tunnel.
Implementation Process
- BFD for SR-MPLS TE tunnel: This method detects the SR-MPLS TE tunnel connectivity to obtain the real tunnel status. During establishment of an SR-MPLS TE tunnel, the tunnel interface cannot go Up if BFD negotiation fails.
- BFD for SR-MPLS TE LSP: This method detects the LSP connectivity to obtain the real LSP status. During establishment of an SR-MPLS TE LSP, the LSP cannot go Up if BFD negotiation fails. When the primary LSP fails, traffic is quickly switched to the backup LSP.
BFD for SR-MPLS TE Tunnel
If a fault occurs, BFD detects the fault on the SR-MPLS TE tunnel and sets the status of the associated tunnel interface to Down, implementing millisecond-level fault detection.
If the fault is rectified, the tunnel interface goes Up only after BFD goes Up. BFD negotiation must complete before the session is declared Up, which takes a relatively long time, and the tunnel status remains Down until a new tunnel is established. The controller delivers new label stacks, BFD then goes Up, and the associated tunnel interface goes Up. The entire process can take more than ten seconds, so BFD for SR-MPLS TE tunnel has a long convergence time if no other protection mechanism is configured for the SR-MPLS TE tunnel.
BFD for SR-MPLS TE LSP
BFD for SR-MPLS TE LSP can rapidly detect faults on CR-LSPs and notify the forwarding plane of the faults to ensure a fast traffic switchover. Generally, BFD for SR-MPLS TE LSP and hot standby CR-LSP are used together.
Static BFD for SR-MPLS TE LSP: A static BFD session is bound to a CR-LSP. That is, a static BFD session is created between the ingress and egress nodes of a CR-LSP. To detect a backup CR-LSP, you also need to bind a BFD session to it.
Dynamic BFD for SR-MPLS TE LSP: After the BFD for SR-MPLS TE LSP capability is enabled on the ingress node and automatic BFD session creation is enabled on the egress node, the device dynamically creates a BFD session for a CR-LSP. Dynamic BFD for SR-MPLS TE LSP also automatically creates BFD sessions for all CR-LSPs and performs detection.
After a BFD session is created between the ingress and egress nodes of an SR-MPLS TE LSP, a BFD packet is sent by the ingress node and forwarded to the egress node along a CR-LSP. The egress node then responds to the BFD packet. The BFD session at the ingress node can rapidly detect the status of the link through which the LSP passes. If a link fault is detected, BFD notifies the forwarding plane of the fault. The forwarding plane switches service traffic to the backup CR-LSP and reports fault information to the control plane.
Other Functions
One-arm BFD for SR-MPLS TE:
When a device from another vendor is used as the egress node, BFD for SR-MPLS TE fails to create a common BFD session between a Huawei device and the non-Huawei device. BFD for SR-MPLS TE provides the one-arm echo mode to solve the problem.
Using a one-arm BFD session, the ingress node exchanges the source address and destination address in an IP packet header when encapsulating a BFD packet. After the BFD packet is forwarded to the egress node, the egress node searches for a route based on the destination address in the packet and sends the packet back to the ingress node. The ingress node detects the BFD packet to implement the one-arm BFD detection.
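The address swap behind one-arm detection can be sketched as follows. The packet is modeled as a plain dict; the field names and IP addresses are illustrative, not an actual BFD encapsulation.

```python
# Sketch: the address swap used by one-arm (echo-mode) BFD. The ingress
# swaps source and destination so the egress, doing an ordinary route
# lookup on the destination, loops the packet straight back.

def build_one_arm_bfd_packet(ingress_ip, egress_ip):
    # Source and destination are deliberately swapped at the ingress.
    return {"src": egress_ip, "dst": ingress_ip, "type": "BFD echo"}

def egress_forward(packet):
    """Egress node: plain IP forwarding toward the packet's destination."""
    return packet["dst"]

pkt = build_one_arm_bfd_packet("1.1.1.1", "2.2.2.2")
assert egress_forward(pkt) == "1.1.1.1"   # the packet returns to the ingress
```

Because the egress needs only ordinary IP forwarding, no BFD session state is required on the non-Huawei device.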
SR-MPLS TE Group
An SR-MPLS TE group consists of several SR-MPLS TE tunnels that forward service packets with specified CoS values. It is used to implement differentiated services for packets of multiple service types when some equal-cost multi-path routing (ECMP) outbound interfaces are SR-MPLS TE tunnel interfaces.
The ECMP outbound interfaces may be SR-MPLS TE tunnel interfaces, RSVP TE tunnel interfaces, VLANIF interfaces, main interfaces, and Layer 3 sub-interfaces. The SR-MPLS TE group function can be configured on an SR-MPLS TE tunnel to implement priority-based load balancing of traffic based on the matching between the class of service (CoS) value of the SR-MPLS TE tunnel and that of the service packet.
CoS Value of a Tunnel
To carry packets with different priorities, SR-MPLS TE tunnels support nine CoS values, listed in ascending order of priority: default, BE, AF1, AF2, AF3, AF4, EF, CS6, and CS7. The values BE, AF1, AF2, AF3, AF4, EF, CS6, and CS7 map to the eight internal priorities of the device; the value default can map to any internal priority. One SR-MPLS TE tunnel can have one or more CoS values.
CoS Value of a Service Packet
Packet priority mapping
For a packet entering the tunnel, the DSCP or EXP field in the packet header is mapped to a forwarding priority according to the QoS mapping profile. This forwarding priority is the CoS value of the packet.
MQC re-marking
For a packet entering the tunnel, the internal priority is specified according to 5-tuple information of the packet. The specified internal priority is the CoS value of the packet.
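The priority-mapping method can be sketched as a simple lookup. The DSCP-to-CoS table below is a hypothetical profile for illustration, not a device default mapping.

```python
# Sketch: mapping a packet's DSCP value to a tunnel CoS value via a QoS
# mapping profile. The table is illustrative, not an actual device default.

QOS_PROFILE = {46: "EF", 18: "AF2", 10: "AF1", 0: "BE"}   # hypothetical profile

def packet_cos(dscp):
    # Unmapped DSCP values fall back to BE in this sketch.
    return QOS_PROFILE.get(dscp, "BE")

assert packet_cos(46) == "EF"   # e.g. voice traffic marked DSCP 46
```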
Tunnel Selection Principles for an SR-MPLS TE Group
Figure 5-15 shows the principles for selecting an SR-MPLS TE tunnel from an SR-MPLS TE group.
- If the SR-MPLS TE group has an SR-MPLS TE tunnel that matches the CoS value of the packet, the tunnel is used for packet forwarding. Otherwise, go to step 2.
- If the SR-MPLS TE group has an SR-MPLS TE tunnel with the CoS value default, the tunnel is used for packet forwarding. Otherwise, go to step 3.
- If the SR-MPLS TE group has an SR-MPLS TE tunnel that is not configured with the CoS value, the tunnel is used for packet forwarding. Otherwise, go to step 4.
- If the ECMP outbound interfaces include an interface of another type with no priority configured (for example, a VLANIF interface), an interface of that type is preferred for packet forwarding. Otherwise, go to step 5.
- The SR-MPLS TE tunnel with the lowest CoS value in the SR-MPLS TE group is selected for packet forwarding.
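The selection order above can be sketched as a simple function over illustrative tunnel records. The record format and tunnel names are invented; steps 4 and 5 (other ECMP interface types and the lowest-CoS tunnel) are noted but not modeled here.

```python
# Sketch: the tunnel selection order for an SR-MPLS TE group, over
# illustrative tunnel records of the form {"name": ..., "cos": [...]}.

def select_tunnel(tunnels, packet_cos):
    # Step 1: a tunnel whose CoS values match the packet's CoS.
    for t in tunnels:
        if packet_cos in t.get("cos", []):
            return t["name"]
    # Step 2: a tunnel carrying the CoS value "default".
    for t in tunnels:
        if "default" in t.get("cos", []):
            return t["name"]
    # Step 3: a tunnel with no CoS value configured at all.
    for t in tunnels:
        if not t.get("cos"):
            return t["name"]
    # Steps 4-5 (other ECMP interface types, lowest-CoS tunnel) omitted.
    return None

tunnels = [{"name": "tun1", "cos": ["EF"]},
           {"name": "tun2", "cos": ["default"]},
           {"name": "tun3", "cos": []}]
assert select_tunnel(tunnels, "EF") == "tun1"    # step 1 match
assert select_tunnel(tunnels, "AF1") == "tun2"   # falls through to "default"
```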