LDP Reliability
Overview of LDP Reliability
| Reliability Technology | Description | Function |
| --- | --- | --- |
| BFD for LDP LSP | Rapidly detects faults on LDP LSPs of an MPLS network and triggers protection switching. | Fault detection |
| LDP-IGP synchronization, Auto LDP FRR | Ensures that traffic is switched to the backup LDP LSP and minimizes packet loss when a working LDP LSP fails. | Traffic protection |
| LDP GR, LDP NSR | Ensures nonstop forwarding on the forwarding plane when the control plane fails on a node. | Traffic protection |
BFD for LDP LSP
Bidirectional Forwarding Detection (BFD) can quickly detect faults on an LDP LSP and trigger a traffic switchover upon an LDP LSP failure to improve network reliability.
Background
As shown in Figure 2-7, an LSR periodically sends Hello messages to its neighboring LSRs to advertise its existence on the network and maintain adjacencies. An LSR creates a Hello timer for each neighbor to maintain an adjacency. Each time the LSR receives a Hello message, the LSR resets the Hello timer. If the Hello timer expires before the LSR receives a new Hello message, the LSR considers that the adjacency is terminated. This mechanism cannot detect link faults quickly, especially when a Layer 2 device is deployed between LSRs.
BFD can quickly detect faults on an LDP LSP and trigger a traffic switchover upon an LDP LSP failure, minimizing packet loss and improving network reliability.
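The hello-based adjacency maintenance described above can be modeled as a simple hold timer that each received Hello message resets. The following is a minimal sketch (class and parameter names are illustrative, not the device implementation); it uses an explicit clock value so the expiry logic is easy to follow:

```python
class HelloAdjacency:
    """Simplified model of LDP hello-based adjacency maintenance."""

    def __init__(self, hold_time, now=0.0):
        self.hold_time = hold_time            # seconds before the adjacency expires
        self.deadline = now + hold_time

    def on_hello(self, now):
        # Each received Hello message resets the Hello hold timer.
        self.deadline = now + self.hold_time

    def is_up(self, now):
        # Once the timer expires without a new Hello, the adjacency
        # is considered terminated.
        return now < self.deadline
```

Because expiry is only noticed after a full hold time elapses, detection is slow compared with BFD, which is exactly the limitation the background section describes.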
Implementation
BFD for LDP LSP is implemented by establishing a BFD session between two nodes on both ends of an LSP and binding the session to the LSP. BFD rapidly detects LSP faults and triggers a traffic switchover. When BFD monitors a unidirectional LDP LSP, the reverse path of the LDP LSP can be an IP link, an LDP LSP, or a traffic engineering (TE) tunnel.
A BFD session that monitors LDP LSPs is negotiated in static mode: The negotiation of a BFD session is performed using the local and remote discriminators that are manually configured for the BFD session to be established. On a local LSR, you can bind an LSP with a specified next-hop IP address to a BFD session with a specified peer IP address.
BFD uses the asynchronous mode to check LSP continuity. That is, the ingress and egress periodically send BFD packets to each other. If one end does not receive BFD packets from the other end within a detection period, BFD considers the LSP Down and sends an LSP Down message to the LSP management (LSPM) module.
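The asynchronous detection described above can be sketched as follows. This is a simplified model under stated assumptions (names, the detection-time formula shown, and the notification string are illustrative, not the device implementation): each end tracks the last packet received, and declares the LSP Down and notifies the LSPM module once no packet arrives within the detection period.

```python
class BfdSession:
    """Sketch of an asynchronous-mode BFD session monitoring an LDP LSP."""

    def __init__(self, local_disc, remote_disc, tx_interval, detect_mult):
        # Discriminators are configured statically, as in static-mode negotiation.
        self.local_disc = local_disc
        self.remote_disc = remote_disc
        # Simplified detection period: transmit interval * detect multiplier.
        self.detect_time = tx_interval * detect_mult
        self.last_rx = 0.0
        self.lsp_up = True

    def on_packet(self, now, your_disc):
        # Accept only packets addressed to our local discriminator.
        if your_disc == self.local_disc:
            self.last_rx = now

    def poll(self, now, notify_lspm):
        # If no packet arrived within the detection period, declare the LSP
        # Down and send a message to the LSP management (LSPM) module.
        if self.lsp_up and now - self.last_rx > self.detect_time:
            self.lsp_up = False
            notify_lspm("LSP down")
```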
LDP-IGP Synchronization
Background
LDP-IGP synchronization synchronizes the status between LDP and an IGP to minimize traffic loss when a network fault triggers LDP and IGP switching.
On a network with active and standby links, if the active link fails, IGP routes and the LSP are switched to the standby link. After the active link recovers, IGP routes are switched back to the active link before LDP convergence is complete, but the LSP along the active link still needs time (for example, for adjacency restoration) before it can be established. During this window, LSP traffic is discarded. Similarly, if the LDP session or adjacency between nodes on the active link fails, the LSP along the active link is deleted; because the IGP still uses the active link, LSP traffic cannot be switched to the standby link and is continuously discarded.
LDP-IGP synchronization supports only OSPFv2 and IS-IS for IPv4.
LDP-IGP synchronization works by adjusting the IGP cost to delay a route switchback until LDP convergence is complete. Before the LSP along the active link is established, the LSP along the standby link is retained so that traffic continues to be forwarded over the standby link. The backup LSP is torn down only after the primary LSP is established successfully.
LDP-IGP synchronization timers are as follows:
- Hold-max-cost timer
- Delay timer
Implementation
- In Figure 2-9, on a network with active and standby links, after the active link recovers, an attempt is made to switch traffic back from the standby link to the active link. Revertive traffic is discarded because, after IGP convergence is complete, the backup LSP becomes unavailable while the primary LSP is not yet established. In this situation, you can configure LDP-IGP synchronization to delay the IGP route switchback until LDP convergence is complete. Before the primary LSP is converged, the backup LSP is retained so that traffic continues to be forwarded over it until the primary LSP is successfully established; the backup LSP is then torn down. The process is as follows:
  1. A link fault is rectified.
  2. An IGP advertises the maximum cost of the active link, delaying the IGP route switchback.
  3. Traffic is still forwarded along the backup LSP.
  4. After the LDP session and adjacency are successfully established, Label Mapping messages are exchanged, instructing the IGP to start synchronization.
  5. The IGP advertises the normal cost of the active link and converges to the original path. The LSP is reestablished and the forwarding entries are delivered within milliseconds.
- If the LDP session or adjacency between nodes on the active link fails, the primary LSP is deleted, but the IGP still uses the active link. As a result, LSP traffic cannot be switched to the standby link, and traffic is continuously discarded. In this situation, you can configure LDP-IGP synchronization. If an LDP session or adjacency fails, LDP informs the IGP that the LDP session or adjacency is faulty. In this case, the IGP advertises the maximum cost of the faulty link. The route is switched to the standby link, and the LSP is also switched to the standby link. The process is as follows:
  1. The LDP session or adjacency between nodes on the active link becomes faulty.
  2. LDP informs the IGP that the LDP session or adjacency along the active link is faulty. The IGP then advertises the maximum cost of the active link.
  3. IGP routes are switched to the standby link.
  4. The LSP is reestablished along the standby link, and forwarding entries are delivered.
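Both processes above hinge on the same lever: while LDP on a link is not converged, the IGP advertises the maximum cost for that link so that routes stay on, or move to, the standby link. A minimal sketch of that cost handling (the class name, constant value, and method names are illustrative assumptions, not the device implementation):

```python
MAX_COST = 16777215  # an illustrative "maximum link cost" value

class SyncInterface:
    """Sketch of LDP-IGP synchronization cost handling on one interface."""

    def __init__(self, normal_cost):
        self.normal_cost = normal_cost
        self.ldp_synced = True   # LDP session/adjacency up, labels exchanged

    def advertised_cost(self):
        # While LDP is not converged on this link, advertise the maximum
        # cost so the IGP prefers (or keeps) the standby link.
        return self.normal_cost if self.ldp_synced else MAX_COST

    def on_ldp_fault(self):
        # LDP informs the IGP that the session or adjacency is faulty.
        self.ldp_synced = False

    def on_ldp_converged(self):
        # Label Mapping exchange complete: resume the normal cost.
        self.ldp_synced = True
```

In the real feature, the Hold-max-cost and Delay timers bound how long the maximum cost is advertised; this sketch omits them for brevity.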
LDP-IGP synchronization state transition mechanism
After LDP-IGP synchronization is enabled on an interface, an IGP queries the status of related interfaces, LDP sessions, and LDP adjacencies based on the process shown in Figure 2-10. The interface then enters a state based on the query result, and state transitions are performed as shown in Figure 2-10. The states differ depending on the IGP protocol in use:
- When OSPF is used, the status transitions based on the flowchart shown in Figure 2-10.
- When IS-IS is used, there is no Hold-normal-cost state. After the Hold-max-cost timer expires, IS-IS advertises the normal cost value of the interface link, but the Hold-max-cost state is still displayed.
Usage Scenario
Benefits
LDP-IGP synchronization reduces the packet loss rate during an active/standby link switchover and improves the reliability of an entire network.
Auto LDP FRR
Auto LDP fast reroute (FRR) provides link backup on an MPLS network. When the primary LSP fails, traffic is quickly switched to the backup LSP, minimizing traffic loss.
Background
On an MPLS network, when the primary link fails, IP FRR ensures fast IGP route convergence and switches traffic to the backup link. However, a new LSP needs to be established, which causes traffic loss. If the LSP fails for a reason other than a primary link failure, traffic is not restored until a new LSP is established, causing a long traffic interruption. Auto LDP FRR addresses these issues on an MPLS network.
Auto LDP FRR, using the liberal label retention mode of LDP, obtains a liberal label, assigns a forwarding entry to the label, and then delivers the forwarding entry to the forwarding plane as the backup forwarding entry for the primary LSP. When the interface goes Down (as detected by the interface itself or by BFD) or the primary LSP fails (as detected by BFD), traffic is quickly switched to the backup LSP.
Concepts
Auto LDP FRR: This automatic approach depends on IP FRR. A backup LSP can be established and its forwarding entries can be delivered only when the source of the liberal label matches the backup route; that is, the liberal label is learned from the outbound interface and next hop of the backup route, and the backup LSP triggering conditions are met. By default, LDP LSP setup is triggered by a 32-bit (host) backup route.
Implementation
In liberal label retention mode, an LSR can receive a Label Mapping message of an FEC from any neighboring LSR. However, only the Label Mapping message sent by the next hop of the FEC can be used to generate a label forwarding table for LSP setup. In contrast, Auto LDP FRR can generate an LSP as the backup of the primary LSP based on Label Mapping messages that are not from the next hop of the FEC. Auto LDP FRR establishes forwarding entries for the backup LSP and adds the forwarding entries to the forwarding table. If the primary LSP fails, traffic is switched to the backup LSP quickly to minimize traffic loss.
In Figure 2-12, the optimal route from LSR_1 to LSR_2 is LSR_1-LSR_2. A suboptimal route is LSR_1-LSR_3-LSR_2. After receiving a label from LSR_3, LSR_1 compares the label with the route from LSR_1 to LSR_2. Because LSR_3 is not the next hop of the route from LSR_1 to LSR_2, LSR_1 stores the label as a liberal label. If a route is available for the source of the liberal label, LSR_1 assigns a forwarding entry to the liberal label as the backup forwarding entry, and then delivers this forwarding entry to the forwarding plane with the primary LSP. In this way, the primary LSP is associated with the backup LSP.
Auto LDP FRR is triggered when an interface failure is detected by the interface itself or BFD, or a primary LSP failure is detected by BFD. After Auto LDP FRR is complete, traffic is switched to the backup LSP using the backup forwarding entry. Then the route is converged from LSR_1-LSR_2 to LSR_1-LSR_3-LSR_2. An LSP is established on the new path (the original backup LSP) and the original primary LSP is deleted. Traffic is forwarded along the new LSP of LSR_1-LSR_3-LSR_2.
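The association between the primary entry and the backup entry built from a liberal label can be sketched as follows (a simplified model; the class, labels, and node names are illustrative, with the LSR names taken from the figure):

```python
class FecEntry:
    """Sketch of a forwarding entry with an Auto LDP FRR backup."""

    def __init__(self, primary_label, primary_nexthop):
        self.primary = (primary_label, primary_nexthop)
        self.backup = None       # built from a matching liberal label, if any
        self.primary_up = True

    def install_backup(self, liberal_label, backup_nexthop):
        # The liberal label learned from a non-next-hop peer becomes the
        # backup entry delivered to the forwarding plane alongside the
        # primary entry.
        self.backup = (liberal_label, backup_nexthop)

    def forward(self):
        # On a primary failure (detected by the interface or BFD), traffic
        # immediately uses the backup entry; no new signaling is needed.
        if self.primary_up or self.backup is None:
            return self.primary
        return self.backup
```

The point of the design is that the switchover is a local forwarding-plane action: the backup entry is pre-installed, so no label exchange is required at failure time.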
Usage Scenario
Figure 2-12 shows a typical application environment of Auto LDP FRR. Auto LDP FRR functions well in a triangle topology but may not take effect in some situations in a rectangle topology.
As shown in Figure 2-13, if the optimal route from LSR_1 to LSR_4 is LSR_1-LSR_2-LSR_4 (with no other route for load balancing), LSR_3 receives a liberal label from LSR_1 and binds it through Auto LDP FRR. If the link between LSR_3 and LSR_4 fails, traffic is switched to the path LSR_3-LSR_1-LSR_2-LSR_4. No loop occurs in this situation.
However, if routes from LSR_1 to LSR_4 are available for load balancing (LSR_1-LSR_2-LSR_4 and LSR_1-LSR_3-LSR_4), LSR_3 may not receive a liberal label from LSR_1 because LSR_3 is a downstream node of LSR_1. Even if LSR_3 receives a liberal label and is configured with Auto LDP FRR, traffic may still be forwarded back to LSR_3 after the traffic switchover, causing a loop. The loop persists until the route from LSR_1 to LSR_4 converges to LSR_1-LSR_2-LSR_4.
LDP GR
LDP Graceful Restart (GR) ensures uninterrupted traffic transmission during a protocol restart or active/standby switchover because the forwarding plane is separated from the control plane.
Background
On an MPLS network, when the GR Restarter restarts a protocol or performs an active/standby switchover, label forwarding entries on the forwarding plane are deleted, interrupting data forwarding.
LDP GR can address this issue and therefore improve network reliability. During a protocol restart or active/standby switchover, LDP GR retains label forwarding entries because the forwarding plane is separated from the control plane. The device still forwards packets based on the label forwarding entries, ensuring data transmission. After the protocol restart or active/standby switchover is complete, the GR Restarter can restore to the original state with the help of the GR Helper.
Concepts
- GR Restarter: has GR capability and restarts a protocol.
- GR Helper: assists in the GR process as a GR-capable neighbor of the GR Restarter.
The device can function only as the GR Helper.
- Forwarding State Holding timer: specifies the duration of the LDP GR process.
- Reconnect timer: controls the time during which the GR Helper waits for LDP session reestablishment. After a protocol restart or active/standby switchover occurs on the GR Restarter, the GR Helper detects that the LDP session with the GR Restarter is Down. The GR Helper then starts this timer and waits for the LDP session to be reestablished before the timer expires.
- Recovery timer: controls the time during which the GR Helper waits for LSP recovery. After the LDP session is reestablished, the GR Helper starts this timer and waits for the LSP to recover before the timer expires.
Implementation
Figure 2-14 shows LDP GR implementation.
LDP GR works as follows:
- An LDP session is set up between the GR Restarter and GR Helper. The GR Restarter and GR Helper negotiate GR capabilities during LDP session setup.
- When restarting a protocol or performing an active/standby switchover, the GR Restarter starts the Forwarding State Holding timer, retains label forwarding entries, and sends an LDP Initialization message to the GR Helper. When the GR Helper detects that the LDP session with the GR Restarter is Down, it retains the label forwarding entries of the GR Restarter and starts the Reconnect timer.
- After the protocol restart or active/standby switchover, the GR Restarter reestablishes an LDP session with the GR Helper. If an LDP session is not reestablished before the Reconnect timer expires, the GR Helper deletes label forwarding entries of the GR Restarter.
- After the GR Restarter reestablishes an LDP session with the GR Helper, the GR Helper starts the Recovery timer. Before the Recovery timer expires, the GR Restarter and GR Helper exchange Label Mapping messages over the LDP session. The GR Restarter restores forwarding entries with the help of the GR Helper, and the GR Helper restores forwarding entries with the help of the GR Restarter. After the Recovery timer expires, the GR Helper deletes all forwarding entries that have not been restored.
- After the Forwarding State Holding timer expires, the GR Restarter deletes label forwarding entries and the GR is complete.
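The GR Helper's side of the steps above is essentially a two-phase timer: wait for session reestablishment (Reconnect timer), then wait for label re-exchange (Recovery timer), deleting retained entries if either phase times out. A minimal sketch under those assumptions (names are illustrative; the real feature deletes only the entries that were not restored, which this model collapses into a single flag):

```python
class GrHelper:
    """Sketch of the GR Helper's Reconnect/Recovery timer handling."""

    def __init__(self, reconnect_time, recovery_time):
        self.reconnect_time = reconnect_time
        self.recovery_time = recovery_time
        self.entries_retained = True
        self.deadline = None
        self.phase = "up"

    def on_session_down(self, now):
        # Session to the GR Restarter lost: keep its label forwarding entries
        # and wait for reestablishment before the Reconnect timer expires.
        self.phase = "reconnect"
        self.deadline = now + self.reconnect_time

    def on_session_reestablished(self, now):
        # Session back up: wait for Label Mapping re-exchange before the
        # Recovery timer expires.
        self.phase = "recovery"
        self.deadline = now + self.recovery_time

    def poll(self, now):
        # On expiry of either timer, the retained entries are deleted.
        if self.phase in ("reconnect", "recovery") and now >= self.deadline:
            self.entries_retained = False
            self.phase = "up"
```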
LDP NSR
LDP Non-Stop Routing (LDP NSR) ensures nonstop data transmission on the control plane and forwarding plane when an active/standby switchover occurs on a device, without the help of neighboring nodes. For details about NSR, see NSR in CloudEngine 12800 and 12800E Series Switches Configuration Guide - Reliability.
LDP NSR is enabled on the device by default and does not need to be configured.
LDP NSR backs up the following information from the active main control board to the standby one:
- LDP control block
- LSP forwarding entries
- Cross connect (XC) information that describes the cross connection between a forwarding equivalence class (FEC) and an LSP
- Labels, including the following types:
  - LDP LSP labels on a public network
  - Virtual Circuit (VC) labels in Martini Virtual Private LAN Service (VPLS) networking
Local-and-Remote LDP Session
A local node can set up both local and remote LDP adjacencies with an LDP peer. That is, the peer is maintained by both local and remote LDP adjacencies.
As shown in Figure 2-15, when the local LDP adjacency is deleted because the link associated with the adjacency fails, the type of the peer may change but the peer status remains unchanged. Depending on the adjacency type, the peer type can be local, remote, or local-and-remote.
When the link is faulty or recovering, the peer type may change as well as the corresponding session type. However, the session stays Up in this process and is not deleted or set to Down.
A typical application of local-and-remote LDP is with a Layer 2 virtual private network (L2VPN). As shown in Figure 2-15, L2VPN services are configured on PE_1 and PE_2. When the direct link between PE_1 and PE_2 is disconnected and then recovers, the changes in the peer and session types are as follows:
- PE_1 and PE_2 have MPLS LDP enabled and establish a local LDP session. Then PE_1 and PE_2 are configured as remote peers and establish a remote LDP session. PE_1 and PE_2 maintain both local and remote adjacencies. In this case, a local-and-remote LDP session exists between PE_1 and PE_2. L2VPN messages are transmitted over this LDP session.
- When the physical link between PE_1 and PE_2 goes Down, the local LDP adjacency goes Down. The route between PE_1 and PE_2 is reachable through P, so the remote LDP adjacency is still Up. The session type changes to a remote session. Since the session is still Up, L2VPN is uninformed of the session type change and does not delete the session. This avoids the neighbor disconnection and recovery process and therefore reduces the service interruption time.
- When the physical link between PE_1 and PE_2 recovers, the local LDP adjacency goes Up. The session is restored to a local-and-remote session and remains Up. Again, L2VPN is not informed of the session type change and does not delete the session, reducing the service interruption time.
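The type transitions above follow directly from which adjacencies are up, while the session itself stays Up as long as at least one adjacency remains. A small sketch of that derivation (function and return values are illustrative, not the device implementation):

```python
def session_type(local_adj_up, remote_adj_up):
    """Derive the LDP peer/session type from the adjacencies that are up.

    The session stays Up across type changes, so upper-layer services
    such as L2VPN are not torn down when only the type changes.
    """
    if local_adj_up and remote_adj_up:
        return "local-and-remote"
    if remote_adj_up:
        return "remote"
    if local_adj_up:
        return "local"
    return None  # no adjacency left: only now does the session go Down
```

For the PE_1/PE_2 example: link down moves the type from "local-and-remote" to "remote" (the remote adjacency via P survives), and link recovery moves it back, with the session Up throughout.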