MPLS TE Reliability
Introduction to MPLS TE Reliability
If attributes of a working MPLS TE tunnel, such as bandwidth, are modified, a new path is set up for the tunnel using modified attributes, and service traffic is switched to the new path. Reliability technologies are required to prevent or minimize packet loss in the process.
If a node or link on a working MPLS TE tunnel fails, reliability technologies are required to set up a backup CR-LSP and switch traffic to the backup CR-LSP, while minimizing packet loss in this process.
When a node on a working MPLS TE tunnel encounters a control plane failure but its forwarding plane is still working properly, reliability technologies are required to ensure nonstop traffic forwarding during fault recovery on the control plane.
Reliability Technology | Description | Function |
---|---|---|
Tunnel attribute update reliability | Ensures reliable traffic transmission when a CR-LSP is set up because of attribute updates. | |
Fault detection | Rapidly detects MPLS TE network faults and triggers protection switching. | |
Traffic protection | Network-level reliability: provides end-to-end path protection and local protection. | |
Traffic protection | Device-level reliability: ensures nonstop forwarding when the control plane fails on a node. | |
Make-Before-Break
The make-before-break mechanism prevents traffic loss during a traffic switchover between two CR-LSPs. This mechanism improves MPLS TE tunnel reliability.
Background
Any change in link or tunnel attributes causes a CR-LSP to be reestablished using new attributes. Traffic is then switched from the previous CR-LSP to the new CR-LSP. If a traffic switchover is triggered before the new CR-LSP is set up, some traffic is lost. The make-before-break mechanism prevents traffic loss.
Implementation
The make-before-break mechanism sets up a new CR-LSP and switches traffic to it before the original CR-LSP is torn down. This mechanism helps minimize data loss and reduces bandwidth consumption. Make-before-break is implemented using the shared explicit (SE) resource reservation style.
The new CR-LSP may compete with the original CR-LSP for bandwidth on some shared links, and it cannot be established if it loses the competition. With make-before-break, the new CR-LSP reuses the bandwidth already reserved for the original CR-LSP on shared links, so that bandwidth is not counted twice. Additional bandwidth is required on links of the new path that do not overlap the links on the original path.
In Figure 4-19, the maximum reservable bandwidth on each link is 60 Mbit/s. A CR-LSP has been set up along Path 1 (Router_1 -> Router_2 -> Router_3 -> Router_4) with the bandwidth of 40 Mbit/s.
A new CR-LSP needs to be set up along Path 2 (Router_1 -> Router_5 -> Router_3 -> Router_4) to forward data through the lightly loaded Router_5. The available bandwidth of the link Router_3 -> Router_4 is only 20 Mbit/s, not enough for the new path. The make-before-break mechanism can be used in this situation to allow the new CR-LSP to use the bandwidth of the link between Router_3 and Router_4 reserved for the original CR-LSP. After the new CR-LSP is established, traffic switches to the new CR-LSP, and the original CR-LSP is torn down.
The make-before-break mechanism can also be used to increase tunnel bandwidth. A new CR-LSP can be established as long as the reservable bandwidth of each shared link is sufficient for the required increase.
On the network shown in Figure 4-19, the maximum reservable bandwidth on each link is 60 Mbit/s. A CR-LSP has been set up along Path 1 with the bandwidth of 30 Mbit/s.
A new CR-LSP needs to be set up along Path 2 to forward data through the lightly loaded Router_5, and the path bandwidth needs to increase to 40 Mbit/s. The available bandwidth of the link Router_3 -> Router_4 is only 30 Mbit/s. The make-before-break mechanism can be used in this situation. This mechanism allows the new CR-LSP to use the bandwidth of the link between Router_3 and Router_4 reserved for the original CR-LSP, and reserves an additional bandwidth of 10 Mbit/s for the new path. After the new CR-LSP is set up, traffic is switched to the new CR-LSP, and the original CR-LSP is torn down.
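The two examples above can be checked with a short admission sketch. It is illustrative only, not the device's admission control: the function names and the link tuples are hypothetical, and it assumes that on a shared link only the increase over the original reservation is newly reserved, as the shared explicit (SE) style description implies.

```python
def extra_reservation(old_bw, new_bw, shared):
    """Bandwidth (Mbit/s) the new CR-LSP must newly reserve on one link."""
    if shared:
        # SE style: old and new CR-LSPs share one reservation on this link,
        # so only the increase (if any) is newly reserved.
        return max(new_bw - old_bw, 0)
    # Non-overlapping link: the full bandwidth must be reserved.
    return new_bw


def can_admit(available, old_bw, new_bw, shared):
    """True if one link can carry the new CR-LSP."""
    return extra_reservation(old_bw, new_bw, shared) <= available


def admit_path(new_links, old_links, available, old_bw, new_bw):
    """True if every link on the new path can carry the new CR-LSP."""
    return all(
        can_admit(available[link], old_bw, new_bw, link in old_links)
        for link in new_links
    )


# Path 1 and Path 2 from the examples, as hypothetical link tuples.
path1 = [("Router_1", "Router_2"), ("Router_2", "Router_3"), ("Router_3", "Router_4")]
path2 = [("Router_1", "Router_5"), ("Router_5", "Router_3"), ("Router_3", "Router_4")]

# First example: 40 Mbit/s on Path 1, so 20 Mbit/s remains on the shared link.
available_1 = {path2[0]: 60, path2[1]: 60, path2[2]: 20}
# Second example: 30 Mbit/s on Path 1, new CR-LSP needs 40 Mbit/s.
available_2 = {path2[0]: 60, path2[1]: 60, path2[2]: 30}
```

In the first example `admit_path(path2, path1, available_1, 40, 40)` succeeds only because the shared link Router_3 -> Router_4 needs no new reservation. In the second, the shared link needs the additional 10 Mbit/s that the text mentions.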
Switching and Deletion Delays
If a node is busy but its upstream or downstream node is idle, a CR-LSP may be torn down before a new CR-LSP is established, causing a temporary traffic interruption.
The make-before-break mechanism uses switching and deletion delay timers to prevent temporary traffic interruption. When the two timers are configured, the system switches traffic to a new CR-LSP after the switching delay time, and then deletes the original CR-LSP after the deletion delay time.
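The ordering enforced by the two timers can be sketched as follows. This is a minimal illustration with hypothetical names; it assumes that the switching delay is counted from the moment the new CR-LSP is established and the deletion delay from the moment traffic is switched, which the text does not state explicitly. The millisecond values are arbitrary.

```python
def mbb_timeline(new_lsp_up_ms, switch_delay_ms, delete_delay_ms):
    """Return (switch_time, delete_time) in milliseconds.

    Assumption: the switching delay starts when the new CR-LSP is up,
    and the deletion delay starts when traffic has been switched.
    """
    switch_time = new_lsp_up_ms + switch_delay_ms
    delete_time = switch_time + delete_delay_ms
    return switch_time, delete_time


# Arbitrary illustrative values: new CR-LSP up at t=1000 ms.
switch_at, delete_at = mbb_timeline(1000, 5000, 7000)
```

Because the original CR-LSP is deleted only after traffic has moved, a busy node that is slow to install the new CR-LSP still has the original path available in the meantime.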
RSVP Hello
The RSVP Hello mechanism rapidly detects reachability between RSVP nodes and triggers the path protection provided by TE FRR. In addition, a node can use the RSVP Hello mechanism to detect whether a neighboring node is in Restart state, so that it can help the neighboring node implement RSVP GR.
Background
RSVP Refresh messages can synchronize PSB and RSB between nodes, monitor reachability between RSVP neighbors, and maintain RSVP neighbor relationships.
This soft state mechanism detects neighbor relationships using Path and Resv messages. Detection is slow, so a link failure cannot promptly trigger a service traffic switchover. RSVP Hello is introduced to solve this problem.
Implementation
RSVP Hello is implemented as follows:
Hello handshake
As shown in Figure 4-20, LSRA and LSRB are directly connected.
When RSVP Hello is enabled on the interface of LSRA, LSRA sends a Hello Request message to LSRB.
If LSRB is enabled with RSVP Hello, LSRB replies to LSRA with a Hello ACK message after receiving the Hello Request message.
After LSRA receives the Hello ACK message from LSRB, LSRA determines that the neighbor LSRB is reachable.
Neighbor loss detection
After a successful Hello handshake, LSRA and LSRB exchange Hello messages. If LSRA receives no Hello ACK message from LSRB after sending three consecutive Hello Request messages to LSRB, LSRA considers the neighbor LSRB lost. TE FRR is triggered and LSRA restarts an RSVP Hello handshake.
Neighbor restart detection
After LSRA detects the loss of the neighbor LSRB (they are both RSVP GR capable), LSRA waits for the Hello Request message carrying a GR extension from LSRB. After receiving this message, LSRA helps LSRB restore RSVP state information and sends a Hello ACK message to LSRB. LSRB receives the Hello ACK message from LSRA and knows that LSRA is helping it implement GR. LSRA and LSRB exchange Hello messages to maintain the restored GR status.
If GR is disabled but TE FRR is enabled on LSRA, LSRA switches traffic to the bypass CR-LSP to ensure uninterrupted traffic transmission when detecting loss of the neighbor LSRB.
If GR is enabled on LSRA, LSRA preferentially uses RSVP GR to ensure uninterrupted traffic transmission on the forwarding plane upon a control plane failure.
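The neighbor loss rule and the resulting action can be sketched as a small state holder. The class and method names are hypothetical, and the sketch models only what the text states: three unanswered Hello Request messages mean the neighbor is lost, RSVP GR is preferred when it is enabled, and TE FRR is used otherwise.

```python
class HelloNeighbor:
    MAX_UNANSWERED = 3  # three consecutive Hello Requests without a Hello ACK

    def __init__(self, gr_enabled, frr_enabled):
        self.gr_enabled = gr_enabled
        self.frr_enabled = frr_enabled
        self.unanswered = 0
        self.state = "init"

    def on_request_sent(self):
        """Count a Hello Request that has not been acknowledged yet."""
        self.unanswered += 1

    def on_ack(self):
        """A Hello ACK proves the neighbor is reachable."""
        self.unanswered = 0
        self.state = "reachable"

    def check(self):
        """Return the action taken on neighbor loss, or None if nothing changed."""
        if self.state != "reachable" or self.unanswered < self.MAX_UNANSWERED:
            return None
        self.state = "lost"
        self.unanswered = 0  # the Hello handshake restarts from scratch
        if self.gr_enabled:
            return "rsvp_gr"  # wait for a Hello Request with a GR extension
        if self.frr_enabled:
            return "te_frr"   # switch traffic to the bypass CR-LSP
        return "none"
```

A caller would invoke `on_request_sent()` on every Hello Request, `on_ack()` on every Hello ACK, and `check()` once per Hello interval.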
Usage Scenario
RSVP Hello applies to scenarios with TE FRR or RSVP GR enabled.
CR-LSP Backup
CR-LSP backup provides end-to-end protection for an MPLS TE tunnel. If the ingress node detects a failure of the primary CR-LSP, it switches traffic to a backup CR-LSP. After the primary CR-LSP recovers, traffic switches back to the primary CR-LSP.
Concepts
CR-LSP backup functions include hot standby, ordinary backup, and the best-effort path:
Hot standby: A hot-standby CR-LSP is set up immediately after the primary CR-LSP is set up. When the primary CR-LSP fails, traffic switches to the hot-standby CR-LSP.
Ordinary backup: An ordinary backup CR-LSP can be set up only after a primary CR-LSP fails. The ordinary backup CR-LSP takes over traffic when the primary CR-LSP fails.
Best-effort path: If both the primary and backup CR-LSPs fail, a best-effort path is set up and takes over traffic.
In Figure 4-21, the primary CR-LSP is set up over the path PE1 -> P1 -> P2 -> PE2, and the backup CR-LSP is set up over the path PE1 -> P3 -> PE2. When both CR-LSPs fail, PE1 sets up a best-effort path PE1 -> P4 -> PE2 to take over traffic.
A best-effort path has no bandwidth reserved for traffic, but has an affinity and a hop limit configured to control the nodes it passes.
Implementation
CR-LSP backup deployment
Determine the paths and bandwidth values. Table 4-15 lists CR-LSP backup deployment items.
Table 4-15 CR-LSP backup deployment
Item | Hot Standby | Ordinary Backup | Best-Effort Path |
---|---|---|---|
Path | Determine whether the paths of primary and hot-standby CR-LSPs partially overlap. A hot-standby CR-LSP can be established over an explicit path. A hot-standby CR-LSP supports the following attributes: explicit path, affinity attribute, hop limit, and path overlapping. | The path of an ordinary CR-LSP can partially overlap the path of the primary CR-LSP, no matter whether the ordinary CR-LSP is set up along an explicit or implicit path. An ordinary backup CR-LSP supports the following attributes: explicit path, affinity attribute, and hop limit. | A best-effort path is automatically calculated by the ingress node. A best-effort path supports the following attributes: affinity attribute and hop limit. |
Bandwidth | A hot-standby CR-LSP has the same bandwidth as a primary CR-LSP by default. Dynamic bandwidth protection can ensure that a hot-standby CR-LSP does not use additional bandwidth when it is not transmitting traffic. | An ordinary backup CR-LSP has the same bandwidth as a primary CR-LSP. | A best-effort path is only a protection path that does not have reserved bandwidth. |
Configuration combination | The hot-standby CR-LSP can be used together with a best-effort path to protect the primary CR-LSP. | The ordinary CR-LSP can only be used alone to protect the primary CR-LSP. | - |
Table 4-16 CR-LSP backup modes
Backup Mode | Description | Advantage | Shortcoming |
---|---|---|---|
Hot standby | A hot-standby CR-LSP is set up over a separate path immediately after a primary CR-LSP is set up. | A rapid traffic switchover can be performed. | If dynamic bandwidth adjustment is disabled, additional bandwidth needs to be reserved for a hot-standby CR-LSP. |
Ordinary backup | The system attempts to set up an ordinary backup CR-LSP if a primary CR-LSP fails. | No additional bandwidth is needed. | Ordinary backup performs a traffic switchover slower than hot standby. |
Best-effort path | The system establishes a best-effort path over an available path if both the primary and backup CR-LSPs fail. | Establishing a best-effort path is easy and a few constraints are needed. | Some quality of service (QoS) requirements cannot be met. |
Backup CR-LSP setup
Multiple CR-LSP backup methods may be supported for a tunnel. The ingress node uses these methods in turn until a CR-LSP is successfully established.
If new tunnel configuration is committed or a tunnel goes Down, the ingress node attempts to establish a hot-standby CR-LSP, an ordinary backup CR-LSP, and a best-effort path in turn, until a CR-LSP is successfully established.
Backup CR-LSP attribute modification
If attributes of a backup CR-LSP are modified, the ingress node uses the make-before-break mechanism to reestablish the backup CR-LSP with the updated attributes. After that backup CR-LSP has been successfully reestablished, traffic on the original backup CR-LSP (if it is transmitting traffic) switches to this new backup CR-LSP, and then the original backup CR-LSP is torn down.
Fault detection
CR-LSP backup supports the following fault detection functions:
- Default error signaling mechanism of RSVP-TE: The fault detection speed is relatively slow.
- Bidirectional forwarding detection (BFD) for CR-LSP: This function is recommended because it implements fast fault detection.
Traffic switchover
After the primary CR-LSP fails, the ingress node attempts to switch traffic from the primary CR-LSP to a hot-standby CR-LSP. If the hot-standby CR-LSP is unavailable, the ingress node attempts to switch traffic to an ordinary backup CR-LSP. If the ordinary backup CR-LSP is unavailable, the ingress node attempts to switch traffic to a best-effort path.
Traffic switchback
Traffic switches back to a path based on the priorities of the available CR-LSPs. Traffic first switches to the primary CR-LSP. If the primary CR-LSP is unavailable, traffic switches to the hot-standby CR-LSP. The ordinary backup CR-LSP has the lowest priority.
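The switchover and switchback rules both reduce to choosing the highest-priority CR-LSP that is currently available. The following minimal sketch shows that selection. The identifiers are hypothetical, and placing the best-effort path last is an assumption taken from the switchover order, because the switchback description does not mention it.

```python
# Highest priority first. The best-effort position is an assumption
# based on the switchover order described in the text.
PRIORITY = ["primary", "hot_standby", "ordinary_backup", "best_effort"]


def select_path(available):
    """Return the highest-priority available CR-LSP, or None if none is up."""
    for name in PRIORITY:
        if name in available:
            return name
    return None
```

For example, `select_path({"hot_standby", "best_effort"})` selects the hot-standby CR-LSP, and as soon as `"primary"` is added to the set the same call selects the primary CR-LSP, which models switchback.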
Dynamic Bandwidth Protection for Hot-standby CR-LSPs
Hot-standby CR-LSPs support dynamic bandwidth protection. The dynamic bandwidth protection function allows a hot-standby CR-LSP to obtain bandwidth resources only after the hot-standby CR-LSP takes over traffic from a faulty primary CR-LSP. This function improves bandwidth efficiency and reduces network costs.
- If the primary CR-LSP fails, traffic immediately switches to the hot-standby CR-LSP with 0 bit/s bandwidth. The ingress node uses the make-before-break mechanism to establish a hot-standby CR-LSP.
- After the new hot-standby CR-LSP has been successfully established, the ingress node switches traffic to this CR-LSP and tears down the hot-standby CR-LSP with 0 bit/s bandwidth.
- After the primary CR-LSP recovers, traffic switches back to the primary CR-LSP. The hot-standby CR-LSP then releases the bandwidth, and the ingress node establishes another hot-standby CR-LSP with 0 bit/s bandwidth.
TE FRR
Traffic engineering fast reroute (TE FRR) provides link protection and node protection for MPLS TE tunnels. If a link or node fails, TE FRR rapidly switches traffic to a backup path, minimizing traffic loss.
Background
A link or node failure triggers a primary/backup CR-LSP switchover. The switchover is not completed until the IGP routes of the backup path converge, CSPF calculates a new path, and a new CR-LSP is established. Traffic is lost during this process.
TE FRR technology can prevent traffic loss during a primary/backup CR-LSP switchover. TE FRR establishes, in advance, a bypass CR-LSP that bypasses the protected link or node. If that link or node fails, the bypass CR-LSP rapidly takes over traffic to minimize loss. At the same time, the ingress node reestablishes a primary CR-LSP.
Concepts
Table 4-17 explains the components shown in Figure 4-22.
Concept | Description |
---|---|
Primary CR-LSP | Protected CR-LSP. |
Bypass CR-LSP | CR-LSP protecting the primary CR-LSP. A bypass CR-LSP is usually in idle state and does not forward service traffic. If the bypass CR-LSP is required to forward service data, it must be assigned sufficient bandwidth. |
PLR | Point of local repair, ingress node of a bypass CR-LSP. The PLR can be the ingress node but not the egress node of the primary CR-LSP. |
MP | Merge point, egress node of a bypass CR-LSP. It must be on the path of the primary CR-LSP but cannot be the ingress node of the primary CR-LSP. |
Classified by | Type | Description |
---|---|---|
Protected object | Link protection | In Figure 4-23 below, the primary CR-LSP passes through the direct link between the PLR (LSRB) and MP (LSRC). Bypass LSP 1 can protect this link, which is called link protection. |
Protected object | Node protection | In Figure 4-23 below, the primary CR-LSP passes through LSRC between the PLR (LSRB) and MP (LSRD). Bypass LSP 2 can protect LSRC, which is called node protection. |
Bandwidth | Bandwidth protection | A bypass CR-LSP is assigned bandwidth higher than or equal to the primary CR-LSP bandwidth, so that the bypass CR-LSP protects the path and bandwidth of the primary CR-LSP. |
Bandwidth | Non-bandwidth protection | A bypass CR-LSP has no bandwidth and protects only the path of the primary CR-LSP. |
Implementation | Manual protection | A bypass CR-LSP is manually configured and bound to a primary CR-LSP. |
Implementation | Auto protection | An auto FRR-enabled node automatically establishes a bypass CR-LSP. The node binds the bypass CR-LSP to a primary CR-LSP if the node receives an FRR protection request and the FRR topology requirements are met. |
A bypass CR-LSP supports the combination of protection modes. For example, manual protection, node protection, and bandwidth protection can be implemented together on a bypass CR-LSP.
Implementation
TE FRR is implemented as follows:
Setup of a primary CR-LSP
A primary CR-LSP is set up in the same way as a common CR-LSP except that the ingress node adds flags into the SESSION_ATTRIBUTE object in a Path message. For example, the local protection desired flag indicates that the primary CR-LSP requires a bypass CR-LSP, and the bandwidth protection desired flag indicates that the primary CR-LSP requires bandwidth protection.
Binding between a bypass CR-LSP and the primary CR-LSP
TE FRR searches for a suitable bypass CR-LSP for the primary CR-LSP. A bypass CR-LSP can be bound to a primary CR-LSP only if the primary CR-LSP has a local protection desired flag. The binding process is completed before a CR-LSP switchover.
Before binding a bypass CR-LSP to a primary CR-LSP, the PLR must obtain the following from the Record Route Object (RRO) in the received Resv message: the outbound interface of the bypass CR-LSP, the next hop label forwarding entry (NHLFE), the label switching router (LSR) ID of the MP, the label allocated by the MP, and the protection type.
The PLR on the primary CR-LSP already knows its next hop (NHOP) and next NHOP (NNHOP). If the egress LSR ID of the bypass CR-LSP is the same as the NHOP LSR ID, the bypass CR-LSP provides link protection. If the egress LSR ID of the bypass CR-LSP is the same as the NNHOP LSR ID, the bypass CR-LSP provides node protection. In Figure 4-24, bypass LSP 1 protects the link between LSRB and LSRC, and bypass LSP 2 protects the node between LSRB and LSRD.
If multiple bypass CR-LSPs are established, the PLR checks, in sequence, whether each bypass CR-LSP provides bandwidth protection, how it was established, and which object it protects. Bypass CR-LSPs providing bandwidth protection are preferred over those that do not provide bandwidth protection. Manual bypass CR-LSPs are preferred over auto bypass CR-LSPs. Bypass CR-LSPs providing node protection are preferred over those providing link protection. Figure 4-24 shows two bypass CR-LSPs. If both bypass CR-LSPs provide bandwidth protection and are manually configured, bypass LSP 2 is bound to the primary CR-LSP because it provides node protection, whereas bypass LSP 1 provides only link protection. If bypass LSP 1 provides bandwidth protection but bypass LSP 2 does not, bypass LSP 1 is bound to the primary CR-LSP.
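The preference order for choosing among several bypass CR-LSPs can be written as a single sort key. This is a sketch under the stated rules only; the `Bypass` type and `pick_bypass` function are hypothetical and do not represent the PLR's actual data structures.

```python
from dataclasses import dataclass


@dataclass
class Bypass:
    name: str
    bandwidth_protection: bool
    manual: bool
    node_protection: bool


def pick_bypass(candidates):
    """Return the preferred bypass CR-LSP, or None if there are no candidates.

    Tuple comparison applies the criteria in order: bandwidth protection
    first, then manual over auto, then node protection over link protection.
    """
    return max(
        candidates,
        key=lambda b: (b.bandwidth_protection, b.manual, b.node_protection),
        default=None,
    )
```

With both bypass CR-LSPs manually configured and providing bandwidth protection, the node-protecting one wins. If only the link-protecting one provides bandwidth protection, it wins instead, matching the two cases in the text.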
After the binding is complete, the primary CR-LSP's NHLFE records the bypass CR-LSP's NHLFE index and an inner label that the MP allocates to the upstream node on the primary CR-LSP. This label is used to forward traffic during a primary/backup CR-LSP switchover.
Fault detection
- Link protection uses a link layer protocol to detect and report faults. The speed of fault detection at the link layer depends on the link type.
- Node protection uses a link layer protocol to detect link faults. If no fault occurs on a link, RSVP Hello or BFD for RSVP is used to detect faults on the protected node.
As soon as a link or node fault is detected, an FRR switchover is triggered.
In node protection, only the link between the protected node and the PLR is protected. The PLR cannot detect faults on the link between the protected node and the MP.
Link fault detection, BFD, and RSVP Hello detect failures in descending order of speed: link fault detection is the fastest and RSVP Hello is the slowest.
Switchover
When the primary CR-LSP fails, service traffic and RSVP messages are switched to the bypass CR-LSP, and the switchover event is advertised to the upstream nodes. Upon receiving a data packet, the PLR pushes an inner label and an outer label into the packet. The inner label is allocated by the MP to the upstream node on the primary CR-LSP, and the outer label is allocated by the next hop on the bypass CR-LSP to the PLR. The penultimate hop of the bypass CR-LSP pops the outer label and forwards the packet with only the inner label to the MP. The MP forwards the packet to the next hop along the primary CR-LSP according to the inner label.
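The label operations described above can be traced with a label stack modeled as a list whose first element is the top label. The function names are hypothetical, label swaps at intermediate bypass hops are omitted, and the label values 1024, 1022, and 34 are the ones used in the Figure 4-25 example.

```python
def plr_frr_forward(stack, mp_label, bypass_label):
    """PLR behavior after an FRR switchover.

    Swap the incoming top label to the label that the MP allocated
    (inner label), then push the label allocated by the next hop on
    the bypass CR-LSP (outer label).
    """
    inner = [mp_label] + stack[1:]
    return [bypass_label] + inner


def penultimate_hop_pop(stack):
    """Penultimate hop of the bypass CR-LSP: pop the outer label."""
    return stack[1:]


at_plr = plr_frr_forward([1024], mp_label=1022, bypass_label=34)
at_mp = penultimate_hop_pop(at_plr)
```

The MP therefore receives the packet with only the inner label, which it allocated itself, and can continue forwarding along the primary CR-LSP as if the packet had arrived over the original path.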
Figure 4-25 shows nodes on the primary and bypass CR-LSPs, labels allocated to the nodes, and behaviors that the nodes perform. The bypass CR-LSP provides node protection. If LSRC or the link between LSRB and LSRC fails, the PLR (LSRB) swaps the inner label 1024 to 1022, pushes the outer label 34 into a packet, and forwards the packet to the next hop along the bypass CR-LSP. The lower part of Figure 4-25 shows the packet forwarding process after a TE FRR switchover.
Switchback
After a TE FRR switchover is complete, the ingress node of the primary CR-LSP reestablishes the primary CR-LSP using the make-before-break mechanism. Service traffic and RSVP messages are switched back to the primary CR-LSP after the primary CR-LSP is successfully reestablished. The reestablished primary CR-LSP is called a modified CR-LSP. The make-before-break mechanism allows the original primary CR-LSP to be torn down only after the modified CR-LSP is set up successfully.
FRR does not take effect if multiple nodes fail simultaneously. After data is switched from the primary CR-LSP to the bypass CR-LSP, the bypass CR-LSP must remain Up to ensure data forwarding. If the bypass CR-LSP fails, the protected data cannot be forwarded using MPLS, and the FRR function fails. Even if the bypass CR-LSP is reestablished, it cannot forward data. Data forwarding is restored only after the primary CR-LSP recovers or is reestablished.
Other Functions
Card removal protection
The card removal protection function protects the physical outbound interface of a primary CR-LSP on a PLR. When the card where the physical outbound interface of the primary CR-LSP resides is removed from the PLR, MPLS TE traffic is rapidly switched to the bypass CR-LSP. After the card is reinstalled, MPLS TE traffic is switched back to the primary CR-LSP if the physical outbound interface of the primary CR-LSP is Up.
If a card with a configured TE tunnel interface is removed, tunnel information is lost. To implement card removal protection, configure the bypass tunnel interface and the physical outbound interface of a bypass CR-LSP on a different card than those of the primary CR-LSP. You are advised to configure the bypass tunnel interfaces of a PLR on the Main Processing Unit (MPU).
After you configure the bypass tunnel interface on the MPU of a PLR, the physical outbound interface of the primary CR-LSP becomes Stale when the interface card fails or is removed. The FRR-protected primary CR-LSP is not deleted. When the card is reinstalled, the physical outbound interface is restored and the primary CR-LSP is reestablished.
N:1 protection
TE FRR supports N:1 protection mode, in which a bypass CR-LSP protects multiple primary CR-LSPs.
Cooperation Between CR-LSP Backup and TE FRR
Combination of CR-LSP backup and TE FRR
CR-LSP ordinary backup and TE FRR: TE FRR can rapidly detect a link failure and switch traffic to the bypass CR-LSP. When both primary and bypass CR-LSPs fail, a backup CR-LSP is established to take over traffic.
CR-LSP hot standby and TE FRR: TE FRR can rapidly detect a link failure and switch traffic to the bypass CR-LSP. Link failure information is then sent to the tunnel ingress node through a signaling protocol and traffic is switched to a backup CR-LSP.
Association between CR-LSP backup and TE FRR
After TE FRR local protection and backup CR-LSP end-to-end protection are deployed, the system supports associated protection of bypass and backup CR-LSPs. After association between CR-LSP backup and TE FRR is enabled:
If CR-LSP ordinary backup is enabled, the following situations occur:
When the protected link or node fails, TE FRR switches traffic to the bypass CR-LSP and attempts to restore the primary CR-LSP and to set up a backup CR-LSP.
After the backup CR-LSP is set up successfully but the primary CR-LSP has not recovered, traffic is switched to the backup CR-LSP.
After the primary CR-LSP recovers, traffic is switched back to the primary CR-LSP, regardless of whether traffic is transmitted along the bypass or backup CR-LSP.
If the backup CR-LSP fails to be set up and the primary CR-LSP is not restored, traffic is transmitted along the bypass CR-LSP.
If CR-LSP hot standby is enabled, the following situations occur:
When the protected link or node fails and the backup CR-LSP is Up, traffic is switched to the bypass CR-LSP and then immediately to the backup CR-LSP. At the same time, the ingress node attempts to restore the primary CR-LSP.
If the backup CR-LSP is Down, traffic is switched in the same manner as in ordinary backup mode.
In CR-LSP hot standby mode, the ingress node attempts to set up a backup CR-LSP while the primary CR-LSP is Up. After the backup CR-LSP is created successfully, more bandwidth is occupied. In CR-LSP ordinary backup mode, the ingress node starts to set up a backup CR-LSP only when the primary CR-LSP is in FRR-in-use state. No more bandwidth is occupied when the primary CR-LSP is working properly. Therefore, association between CR-LSP ordinary backup and TE FRR is recommended.
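Once association is enabled, the cases listed above settle on one rule for which CR-LSP carries traffic: the primary CR-LSP if it is Up, otherwise the backup CR-LSP if it is Up, otherwise the bypass CR-LSP. The sketch below captures only that steady-state outcome, not the transient hop through the bypass CR-LSP in hot standby mode. The function name is hypothetical.

```python
def forwarding_path(primary_up, backup_up, bypass_up):
    """Return the CR-LSP that carries traffic, or None if all are Down."""
    if primary_up:
        return "primary"   # traffic always switches back to the primary
    if backup_up:
        return "backup"    # end-to-end protection is used once it is Up
    if bypass_up:
        return "bypass"    # local TE FRR protection until then
    return None
```

In ordinary backup mode, `backup_up` becomes true only after the failure, so traffic first uses the bypass CR-LSP. In hot standby mode, it is usually already true when the failure occurs.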
SRLG
Shared risk link group (SRLG) is a constraint used when calculating a backup or bypass CR-LSP on a network with CR-LSP hot standby or TE FRR configured. SRLG prevents a backup or bypass CR-LSP from being set up over links that share a risk with the links of the primary CR-LSP, which enhances TE tunnel reliability.
Background
A network administrator often uses CR-LSP hot standby or TE FRR technology to ensure MPLS TE tunnel reliability. However, CR-LSP hot standby or TE FRR may fail in real-world application.
In Figure 4-26, Path 1 is the primary CR-LSP and Path 2 is the bypass CR-LSP. The link between P1 and P2 requires TE FRR protection.
Core nodes P1, P2, and P3 on the backbone network are connected by a transport network device. In Figure 4-26, the top diagram is an abstract version of the actual topology below. NE1 is a transport network device. During network construction and deployment, two core nodes may share links on the transport network. For example, the yellow links in Figure 4-26 are shared by P1, P2, and P3. A shared link failure affects primary and bypass CR-LSPs and makes FRR protection invalid. To enable TE FRR to protect the CR-LSP, bypass and primary CR-LSPs must be set up over links of different risk levels. SRLG technology can be deployed to meet this requirement.
An SRLG is a set of links that share the same risks. If one of the links fails, other links in the group may fail as well. Therefore, protection fails if another link in the same group carries the hot-standby or bypass CR-LSP for the failed link.
Implementation
SRLG is a link attribute, expressed by a numeric value. Links with the same SRLG value belong to a single SRLG.
The SRLG value is advertised to the entire MPLS TE domain using IGP TE. Nodes in a domain can then obtain SRLG values of all the links in the domain. The SRLG value is used in CSPF calculations together with other constraints such as bandwidth.
MPLS TE SRLG works in either of the following modes:
Strict mode: The SRLG value is a mandatory constraint when CSPF calculates paths for hot standby and bypass CR-LSPs.
Preferred mode: The SRLG value is an optional constraint when CSPF calculates paths for hot standby and bypass CR-LSPs. If CSPF fails to calculate a path based on the SRLG value, CSPF excludes the SRLG value when recalculating the path.
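The difference between the two modes can be illustrated with a small shortest-path search that skips links carrying an excluded SRLG value. This is a simplified stand-in for CSPF with hypothetical names and topology: it ignores bandwidth and other constraints, and it assumes the caller has already removed the protected link from the candidate links.

```python
import heapq


def cspf(links, src, dst, excluded_srlgs=frozenset()):
    """Lowest-cost path over links that carry none of the excluded SRLG values."""
    graph = {}
    for a, b, cost, srlgs in links:
        if excluded_srlgs & set(srlgs):
            continue  # link shares a risk with the primary CR-LSP
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == dst:
            return path
        if node in visited:
            continue
        visited.add(node)
        for nxt, cost in graph.get(node, []):
            if nxt not in visited:
                heapq.heappush(queue, (dist + cost, nxt, path + [nxt]))
    return None


def protection_path(links, src, dst, primary_srlgs, mode):
    """Path for a hot-standby or bypass CR-LSP in strict or preferred mode."""
    path = cspf(links, src, dst, frozenset(primary_srlgs))
    if path is None and mode == "preferred":
        # Preferred mode: recalculate without the SRLG constraint.
        path = cspf(links, src, dst)
    return path


# Hypothetical candidate links: (node, node, cost, SRLG values).
candidate_links = [
    ("P1", "P3", 1, {100}),   # shares SRLG 100 with the primary CR-LSP
    ("P3", "P2", 1, set()),
    ("P1", "P4", 5, set()),
    ("P4", "P2", 5, set()),
]
```

With all four links, both modes return the longer path through P4. If only the first two links exist, strict mode returns no path, whereas preferred mode falls back to the path through P3 even though it shares a risk with the primary CR-LSP.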
BFD for MPLS TE
Bidirectional Forwarding Detection (BFD) can quickly detect faults in an MPLS TE tunnel and trigger a traffic switchover when a fault is detected, improving network reliability.
Background
In most cases, MPLS TE uses TE FRR, CR-LSP backup, and TE tunnel protection group to enhance network reliability. These technologies detect faults using the RSVP Hello or RSVP Srefresh mechanism, but the detection speed is slow. When a Layer 2 device such as a switch or hub exists between two nodes, the traffic switchover speed is even slower, leading to traffic loss. BFD uses the fast packet transmission mode to quickly detect faults on MPLS TE tunnels, so that a service traffic switchover can be triggered quickly to better protect the MPLS TE service.
Concepts
Based on BFD session setup modes, BFD is classified into the following types:
Static BFD: Local and remote discriminators of BFD sessions are manually configured.
Dynamic BFD: Local and remote discriminators of BFD sessions are automatically allocated by the system.
For details about BFD, see "BFD Configuration" in Huawei AR Series IOT Gateway Configuration Guide - Reliability.
Implementation
- BFD for Resource Reservation Protocol (RSVP) detects faults on links between RSVP nodes in milliseconds. BFD for RSVP applies to TE FRR networking where a Layer 2 device exists between the PLR and its RSVP neighbor along the primary CR-LSP.
- BFD for CR-LSP can rapidly detect faults on CR-LSPs and notify the forwarding plane of the faults to ensure a fast traffic switchover. BFD for CR-LSP is usually used together with a hot-standby CR-LSP.
- When an MPLS TE tunnel functions as a virtual private network (VPN) tunnel on the public network, BFD for TE tunnel detects faults in the entire TE tunnel. This triggers traffic switchovers for VPN applications including VPN FRR and virtual leased line (VLL) FRR.
BFD for RSVP
When Layer 2 devices exist between neighboring RSVP nodes, the two nodes can detect a link failure based only on the RSVP Hello mechanism. Several seconds are required to complete a switchover. This results in the loss of a great deal of data.
BFD for RSVP detects faults in milliseconds on links between RSVP neighboring nodes. BFD for RSVP applies to the TE FRR networking where Layer 2 devices exist between the PLR and its RSVP neighbor along the primary CR-LSP, as shown in Figure 4-27.
BFD for RSVP can share BFD sessions with BFD for OSPF, BFD for IS-IS, or BFD for Border Gateway Protocol (BGP). When a session is shared, the local node selects the minimum parameter values among the protocols sharing the BFD session as the local BFD parameters. The parameters include the transmit interval, the receive interval, and the local detection multiplier.
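The parameter selection for a shared session amounts to taking the minimum of each parameter across the protocols that share it. The sketch below uses hypothetical dictionary keys and arbitrary values, and it is not the device's configuration model.

```python
PARAMS = ("tx_interval_ms", "rx_interval_ms", "detect_multiplier")


def shared_session_params(per_protocol):
    """Pick the minimum value of each BFD parameter across sharing protocols."""
    return {
        name: min(cfg[name] for cfg in per_protocol.values())
        for name in PARAMS
    }


# Arbitrary illustrative settings for two protocols sharing one session.
configured = {
    "rsvp": {"tx_interval_ms": 100, "rx_interval_ms": 300, "detect_multiplier": 3},
    "ospf": {"tx_interval_ms": 200, "rx_interval_ms": 100, "detect_multiplier": 4},
}
local_params = shared_session_params(configured)
```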
BFD for CR-LSP
BFD for CR-LSP can rapidly detect faults on CR-LSPs and notify the forwarding plane of the faults to ensure a fast traffic switchover. BFD for CR-LSP usually works with a hot-standby CR-LSP or tunnel protection group.
A BFD session is bound to a CR-LSP. That is, a BFD session is set up between ingress and egress nodes. A BFD packet is sent by the ingress node and forwarded to the egress node along a CR-LSP. The egress node then responds to the BFD packet. The BFD session at the ingress node can rapidly detect the status of the path through which the LSP passes.
Upon detecting a link failure, BFD notifies the forwarding plane of the failure. The forwarding plane searches for a backup CR-LSP and switches traffic to it. The forwarding plane then reports fault information to the control plane. If dynamic BFD for CR-LSP is used, the control plane creates a BFD session for the backup CR-LSP. If static BFD for CR-LSP is used, a BFD session can be configured for the backup CR-LSP.
BFD for TE Tunnel
BFD detects faults in the entire TE tunnel and triggers traffic switchovers for VPN applications such as VPN FRR.
BFD for CR-LSP notifies a TE tunnel of faults and triggers service switchovers between CR-LSPs in the TE tunnel. Unlike BFD for CR-LSP, BFD for TE tunnel notifies VPN applications of faults and triggers service switchovers between TE tunnel interfaces.
Differences
Detection Technology | Detection Object | Deployment Position | Usage Scenario | BFD Session Mode |
---|---|---|---|---|
BFD for RSVP | RSVP neighboring relationship | Two neighboring nodes of an RSVP session | Associating with TE FRR | Dynamic |
BFD for CR-LSP | CR-LSP | Ingress and egress nodes | Associating with a hot-standby CR-LSP | Static or dynamic |
BFD for TE Tunnel | MPLS TE tunnel | Ingress and egress nodes | Associating with VPN FRR or VLL FRR | Static |
RSVP GR
RSVP Graceful Restart (GR) ensures uninterrupted traffic transmission on the forwarding plane while the control plane of a node recovers from a failure.
Background
Concepts
RSVP GR is a fast state recovery mechanism for RSVP-TE. As one of the high-reliability technologies, RSVP GR is designed based on non-stop forwarding (NSF).
The GR process involves GR restarter and GR helper routers. The GR restarter restarts the protocol and the GR helper assists in the process.
RSVP GR uses the following messages:
- Hello message with GR extensions: detects the neighbor's GR status.
- GR Path message: sent downstream; carries information about the last Path update.
- Recovery Path message: sent upstream; carries information about the last received Path message.
Implementation
RSVP GR detects the GR status of a neighbor using RSVP Hello extensions.
RSVP GR is implemented as follows:
In Figure 4-30, after the GR restarter triggers a GR, it stops sending Hello messages to its neighbors. If a GR helper does not receive Hello messages for three consecutive intervals, it considers that the neighbor is performing a GR and retains all forwarding information. Meanwhile, the GR restarter interface cards continue to transmit services and to wait for the GR restarter to complete the process.
After the GR restarter starts, it receives Hello messages from neighbors and sends Hello messages in response. Upstream and downstream nodes process Hello messages in different ways:
When the upstream GR helper receives a Hello message, it sends a GR Path message downstream to the GR restarter.
When the downstream GR helper receives a Hello message, it sends a Recovery Path message upstream to the GR restarter.
When receiving the GR Path message and the Recovery Path message, the GR restarter reestablishes the path state block (PSB) and reservation state block (RSB) of the CR-LSP based on the two messages. Information about the CR-LSP on the local control plane is restored.
If the downstream GR helper cannot send Recovery Path messages, the GR restarter reestablishes the local PSB and RSB using only GR Path messages.
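The roles in the GR procedure above can be summarized in a few small functions. The names are hypothetical, and treating the GR Path message as mandatory is an assumption: the text covers only the case where the Recovery Path message is missing.

```python
HELLO_LOSS_INTERVALS = 3  # helper waits three consecutive Hello intervals


def helper_assumes_gr(missed_intervals):
    """True once the helper treats the silent neighbor as performing a GR."""
    return missed_intervals >= HELLO_LOSS_INTERVALS


def helper_reply(position):
    """Message a GR helper sends to the restarter once Hellos resume."""
    if position == "upstream":
        return "GR Path"        # sent downstream to the restarter
    if position == "downstream":
        return "Recovery Path"  # sent upstream to the restarter
    raise ValueError(f"unknown helper position: {position}")


def recovery_sources(received):
    """Messages the restarter uses to rebuild the PSB and RSB, or None."""
    if "GR Path" not in received:
        return None  # assumption: this case is not described in the text
    if "Recovery Path" in received:
        return ["GR Path", "Recovery Path"]
    return ["GR Path"]  # downstream helper could not send Recovery Path
```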