OSPF GR
Routers generally operate with the control plane and forwarding plane separated. When the network topology remains stable, a restart of the control plane does not affect the forwarding plane, and the forwarding plane can still forward data properly. This separation ensures non-stop service forwarding.
In graceful restart (GR) mode, the forwarding plane continues to direct data forwarding after a routing protocol restarts. The actions on the control plane, such as re-establishment of neighbor relationships and route calculations, do not affect the forwarding plane. Network reliability is improved because service interruption caused by route flapping is prevented.
Basic Concepts of OSPF GR
GR is a technology used to ensure normal traffic forwarding and non-stop forwarding of key services during the restart of routing protocols.
Unless otherwise stated, GR described in this section refers to the GR technology defined in RFC 3623.
GR is one of the high availability (HA) technologies. HA comprises a set of techniques such as fault-tolerant redundancy, link protection, faulty node recovery, and traffic engineering. As a fault-tolerant redundancy technology, GR is widely used to ensure non-stop forwarding of key services during active/standby switchovers and system upgrades.
The following concepts are involved in GR:
Grace-LSA
OSPF supports GR by flooding Grace-LSAs. A Grace-LSA informs neighbors of the GR period, the GR reason, and the interface address when GR starts and ends.
Role of a router during GR
Restarter: the router that restarts. The Restarter can be configured to support totally GR or partly GR.
Helper: the router that helps the Restarter. The Helper can be configured to support planned GR or unplanned GR, or to selectively support GR based on a configured policy.
Conditions that cause GR
Unknown: indicates that GR is triggered for an unknown reason.
Software restart: indicates that GR is triggered by commands.
Software reload/upgrade: indicates that GR is triggered by a software reload or upgrade.
Switch to redundant control processor: indicates that GR is triggered by an abnormal active/standby switchover.
GR period
The GR period cannot exceed 1800 seconds. An OSPF router can exit GR before the GR period expires, whether GR succeeds or fails.
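As a rough illustration of the Grace-LSA contents described above, the following Python sketch encodes the GR period, the restart reason, and the interface address as TLVs. The TLV type numbers, the reason code values, and the 4-byte padding are taken from RFC 3623 as understood here and should be checked against the RFC. The function and constant names are hypothetical, and the LSA header is omitted.

```python
import ipaddress
import struct
from enum import IntEnum

MAX_GRACE_PERIOD = 1800  # seconds; upper bound stated in this section


class RestartReason(IntEnum):
    """Restart reason codes (values assumed from RFC 3623)."""
    UNKNOWN = 0
    SOFTWARE_RESTART = 1
    SOFTWARE_RELOAD_UPGRADE = 2
    SWITCH_TO_REDUNDANT_CP = 3


def _tlv(tlv_type: int, value: bytes) -> bytes:
    """Encode one TLV: 2-byte type, 2-byte length, value padded to 4 bytes."""
    padding = (-len(value)) % 4
    return struct.pack("!HH", tlv_type, len(value)) + value + b"\x00" * padding


def build_grace_lsa_body(grace_period: int, reason: RestartReason,
                         interface_addr: str) -> bytes:
    """Build the TLV body of a Grace-LSA (hypothetical helper, header omitted)."""
    if not 0 < grace_period <= MAX_GRACE_PERIOD:
        raise ValueError("grace period must be between 1 and 1800 seconds")
    addr = ipaddress.IPv4Address(interface_addr).packed
    return (
        _tlv(1, struct.pack("!I", grace_period))  # Grace Period TLV
        + _tlv(2, struct.pack("!B", reason))      # Restart reason TLV
        + _tlv(3, addr)                           # IP interface address TLV
    )


body = build_grace_lsa_body(120, RestartReason.SOFTWARE_RESTART, "192.0.2.1")
print(body.hex())
```

The sketch always includes the interface address TLV for simplicity; a real implementation may include it only on certain network types.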
Classification of OSPF GR
Totally GR: If any neighbor of a router does not support GR, the entire router exits GR.
Partly GR: When a neighbor does not support GR, only the interface associated with this neighbor exits GR, whereas the other interfaces perform GR normally.
Planned GR: Commands are manually configured to restart a router or perform an active/standby switchover for the router. Before the restart or switchover, the Restarter sends a Grace-LSA.
Unplanned GR: A router restarts or performs an active/standby switchover due to a fault. The router performs the switchover, without sending a Grace-LSA in advance, and then enters the GR state after the standby board goes Up. The process of unplanned GR after the standby board goes Up is the same as that of planned GR.
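The difference between totally GR and partly GR can be sketched as a small decision function. This is a simplified model, not a vendor implementation: it assumes one neighbor per interface, and the function name, the mode strings, and the interface names are hypothetical.

```python
from typing import Dict, Set


def interfaces_exiting_gr(neighbor_supports_gr: Dict[str, bool],
                          mode: str) -> Set[str]:
    """Return the interfaces that exit GR.

    neighbor_supports_gr maps an interface name to whether the neighbor
    reached through that interface supports GR (assumption: one neighbor
    per interface). mode is "totally" or "partly".
    """
    if mode not in ("totally", "partly"):
        raise ValueError("mode must be 'totally' or 'partly'")
    unsupported = {ifname for ifname, ok in neighbor_supports_gr.items() if not ok}
    if not unsupported:
        return set()                      # every neighbor supports GR
    if mode == "totally":
        return set(neighbor_supports_gr)  # the whole router exits GR
    return unsupported                    # only the affected interfaces exit


neighbors = {"if0": True, "if1": False, "if2": True}
print(sorted(interfaces_exiting_gr(neighbors, "totally")))
print(sorted(interfaces_exiting_gr(neighbors, "partly")))
```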
GR Process
A router starts GR.
In planned GR mode, when commands are run to trigger an active/standby switchover, the Restarter sends a Grace-LSA to all neighbors to notify them of the GR start, period, and cause, and then performs the switchover.
In unplanned GR mode, the Restarter does not send a Grace-LSA before the switchover.
After the switchover, the Restarter sends a Grace-LSA as soon as the standby board goes Up, informing neighbors of the GR start, period, and cause. The Restarter then sends five consecutive Grace-LSAs to each neighbor to ensure that every neighbor receives one. Sending five consecutive Grace-LSAs is a vendor practice and is not defined by the OSPF standard.
The Grace-LSA notifies neighbors that the Restarter has entered GR. During GR, neighbors maintain their neighbor relationships with the Restarter so that other routers do not detect the switchover of the Restarter.
Figure 5-21 shows the GR process.
The router exits GR.
Table 5-16 Reasons that a router exits GR
Execution of GR | Role | Reason |
---|---|---|
Success | Restarter | Before GR times out, the Restarter re-establishes neighbor relationships with all neighbors that existed before the active/standby switchover. |
Success | Helper | After the Helper receives a Grace-LSA with an Age of 3600s from the Restarter, the neighbor relationship between the Helper and the Restarter enters the Full state. |
Failure | Restarter | GR times out before all neighbor relationships are recovered. |
Failure | Restarter | Router-LSAs or Network-LSAs sent by the Helper cause the bidirectional check to fail on the Restarter. |
Failure | Restarter | The status of the interface that functions as the Restarter changes. |
Failure | Restarter | The Restarter receives one-way Hello packets from the Helper. |
Failure | Restarter | The Restarter receives a Grace-LSA generated by another router on the same network segment. Only one router can perform GR on the same network segment at a time. |
Failure | Restarter | The Restarter's neighbors on the same network segment have different DRs or BDRs (because of topology changes). |
Failure | Helper | The Helper does not receive a Grace-LSA from the Restarter before the neighbor relationship expires. |
Failure | Helper | The status of the interface that functions as the Helper changes. |
Failure | Helper | The Helper receives LSAs from another router that are inconsistent with those in the local LSDB. This condition does not apply if the Helper is configured not to perform strict LSA check. |
Failure | Helper | The Helper receives Grace-LSAs from two routers on the same network segment at the same time. |
Failure | Helper | Neighbor relationships between the Helper and other neighbors change. |
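The Helper-side exit conditions listed in Table 5-16 can be sketched as a single check. This is an illustrative model only: the `HelperEvents` fields, the function name, and the order in which conditions are tested are assumptions, not behavior documented for any product.

```python
from dataclasses import dataclass
from typing import Optional

MAX_AGE = 3600  # seconds; the Grace-LSA Age that marks a successful exit


@dataclass
class HelperEvents:
    """Hypothetical snapshot of what the Helper has observed during GR."""
    grace_lsa_age: Optional[int] = None   # None if no Grace-LSA was received
    neighbor_expired: bool = False
    interface_changed: bool = False
    lsdb_mismatch: bool = False
    strict_lsa_check: bool = True
    grace_lsas_from_two_routers: bool = False
    other_neighbor_changed: bool = False


def helper_exit_reason(ev: HelperEvents) -> Optional[str]:
    """Return why the Helper exits GR, or None if it keeps helping."""
    if ev.grace_lsa_age is not None and ev.grace_lsa_age >= MAX_AGE:
        return "success: received Grace-LSA with Age 3600"
    if ev.grace_lsa_age is None and ev.neighbor_expired:
        return "failure: no Grace-LSA before the neighbor relationship expired"
    if ev.interface_changed:
        return "failure: Helper interface status changed"
    if ev.lsdb_mismatch and ev.strict_lsa_check:
        return "failure: received LSA inconsistent with the local LSDB"
    if ev.grace_lsas_from_two_routers:
        return "failure: Grace-LSAs from two routers on one segment"
    if ev.other_neighbor_changed:
        return "failure: relationship with another neighbor changed"
    return None


print(helper_exit_reason(HelperEvents(grace_lsa_age=3600)))
print(helper_exit_reason(HelperEvents(lsdb_mismatch=True, strict_lsa_check=False)))
```

In this model, disabling the strict LSA check only suppresses the LSDB mismatch condition; every other failure condition still ends the Helper role.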
Comparison Between GR Mode and Non-GR Mode
Switchover in Non-GR Mode |
Switchover in GR Mode |
---|---|
|
|