OSPF Graceful Shutdown

Reloading a core router in a high-availability network is always a tricky proposition. Even if you tweak the routing protocol hello timers (or use fast L2 mechanisms to detect next-hop loss), it still takes a few seconds for the routing protocol to converge. With OSPF, for example, the adjacent routers have to detect the neighbor loss, change their router LSAs and flood them (LSA flooding is rate-limited); the changed LSAs then have to propagate across the whole area, and every router in the area has to run SPF (which is also rate-limited).

It would be much better if you could gracefully take a router offline by increasing the OSPF cost on all its interfaces, thus forcing an OSPF SPF run while the router is still capable of forwarding the traffic (resulting in no packet loss).

The OSPF stub router advertisement (as this feature is officially called) documented in RFC 3137 is implemented in Cisco IOS release 12.2(4)T and 12.3. To force the router into stub status (prior to reboot/shutdown), use the max-metric router-lsa router configuration command. This command will change the OSPF metric for all non-stub interfaces in the router LSA to 65535.

The infinite metric in the router LSA does not force the other routers to ignore the path; it just nudges them into using alternate paths. The other routers in the network will thus select alternate OSPF paths (if they exist), but not the potential non-OSPF paths. Those will be selected only after the actual router reboot/shutdown.
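The "nudge, not ignore" behavior is easy to see in a toy SPF run. The following Python sketch is only an illustration, not how IOS implements SPF: the router names A to D, the link costs and the helper functions are all made up for the example. It runs Dijkstra's algorithm over a small topology before and after router B starts advertising the maximum metric on its links.

```python
import heapq

MAX_METRIC = 65535  # cost a stub router advertises on its non-stub links


def bidirectional(costs):
    """Expand {(a, b): cost} into directed links with the same cost both ways."""
    links = {}
    for (a, b), cost in costs.items():
        links[(a, b)] = cost
        links[(b, a)] = cost
    return links


def with_max_metric(links, router):
    """Copy of the topology in which `router` advertises MAX_METRIC on its own links."""
    return {(a, b): MAX_METRIC if a == router else cost
            for (a, b), cost in links.items()}


def spf(links, source):
    """Dijkstra over directed links; returns {node: (total_cost, first_hop)}."""
    adjacency = {}
    for (a, b), cost in links.items():
        adjacency.setdefault(a, []).append((b, cost))
    best = {}
    queue = [(0, source, None)]
    while queue:
        cost, node, first_hop = heapq.heappop(queue)
        if node in best:
            continue  # already settled with a lower (or equal) cost
        best[node] = (cost, first_hop)
        for neighbor, link_cost in adjacency.get(node, []):
            if neighbor not in best:
                hop = neighbor if node == source else first_hop
                heapq.heappush(queue, (cost + link_cost, neighbor, hop))
    return best


# Hypothetical topologies: a square with two paths from A to D, and a chain
# in which B is the only way to reach D.
square = bidirectional({("A", "B"): 10, ("B", "D"): 10,
                        ("A", "C"): 20, ("C", "D"): 20})
chain = bidirectional({("A", "B"): 10, ("B", "D"): 10})

before = spf(square, "A")["D"]
after = spf(with_max_metric(square, "B"), "A")["D"]
only_path = spf(with_max_metric(chain, "B"), "A")["D"]

print(before)     # (20, 'B')    -> best path to D goes through B
print(after)      # (40, 'C')    -> traffic moves to the alternate path via C
print(only_path)  # (65545, 'B') -> no alternate path, so B is still used
```

Note that only the costs advertised by B change; A's own cost toward B stays at 10, so B itself remains reachable the whole time.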

This is a sample router LSA after the max-metric router-lsa has been configured:

b1#show ip ospf data router 172.16.0.21

OSPF Router with ID (172.16.0.21) (Process ID 1)

Router Link States (Area 0)

Exception Flag: Announcing maximum link costs
LS age: 18
Options: (No TOS-capability, DC)
LS Type: Router Links
Link State ID: 172.16.0.21
Advertising Router: 172.16.0.21
LS Seq Number: 80000003
Checksum: 0x88B2
Length: 72
Number of Links: 4

Link connected to: a Stub Network
(Link ID) Network/subnet number: 172.16.0.21
(Link Data) Network Mask: 255.255.255.255
Number of TOS metrics: 0
TOS 0 Metrics: 1

Link connected to: another Router (point-to-point)
(Link ID) Neighboring Router ID: 172.16.0.11
(Link Data) Router Interface address: 172.16.1.2
Number of TOS metrics: 0
TOS 0 Metrics: 65535

Link connected to: a Stub Network
(Link ID) Network/subnet number: 172.16.1.0
(Link Data) Network Mask: 255.255.255.252
Number of TOS metrics: 0
TOS 0 Metrics: 50

Link connected to: a Transit Network
(Link ID) Designated Router address: 192.168.0.6
(Link Data) Router Interface address: 192.168.0.5
Number of TOS metrics: 0
TOS 0 Metrics: 65535
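The pattern in the printout above (stub networks keep their cost, point-to-point and transit links are raised to 65535) can be modelled in a few lines of Python. This is a rough sketch for illustration only: the Link class and apply_max_metric function are hypothetical, and the pre-change costs of the point-to-point and transit links (50 and 10) are assumptions, since the printout only shows the values after the change.

```python
from dataclasses import dataclass, replace

MAX_METRIC = 65535  # value seen on the non-stub links in the printout


@dataclass(frozen=True)
class Link:
    kind: str     # "stub", "point-to-point" or "transit"
    link_id: str  # Link ID field of the router LSA
    metric: int   # TOS 0 metric


def apply_max_metric(links):
    """Return the links as a stub router would advertise them.

    Stub networks keep their configured cost; every other link type
    is advertised with MAX_METRIC.
    """
    return [link if link.kind == "stub" else replace(link, metric=MAX_METRIC)
            for link in links]


# The four links from the sample LSA. The costs of the point-to-point and
# transit links before the change are assumed, not taken from the printout.
normal = [
    Link("stub", "172.16.0.21", 1),
    Link("point-to-point", "172.16.0.11", 50),  # assumed original cost
    Link("stub", "172.16.1.0", 50),
    Link("transit", "192.168.0.6", 10),         # assumed original cost
]

advertised = apply_max_metric(normal)
print([link.metric for link in advertised])  # [1, 65535, 50, 65535]
```

The resulting metrics match the four "TOS 0 Metrics" lines in the printout.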

14 comments:

  1. I have read a few of the topics here and found them quite useful. I have a question: someone once asked me how I would go about a routing protocol conversion. Say the current routing protocol is EIGRP and the goal is to convert it to pure OSPF with the minimal amount of network disruption in both the HQ and the branch offices. That was an awkward and interesting question at the same time: interesting because it poses a problem without showing what the current network looks like, and awkward because it needs a lot of thinking, understanding, and analysis of the current network design and goals, and therefore cannot be answered on the spot. I am sure there are some general guidelines and best practices, and I would appreciate it if you could shed some light on this topic.

    Thank you very much!
  2. To the first commenter: check out Vijay Gill's presentation on how ATDN converted their backbone from OSPF to IS-IS: http://www.nanog.org/mtg-0310/gill.html

    On our tier 1 ISP network, prior to a router reload, we set the IS-IS overload bit, which is similar to the OSPF command noted above. We also shut down all eBGP sessions. We then reboot the router. After the router comes back and IS-IS and MPLS-TE are reestablished, we bring eBGP back up.
  3. There's a similar technique for IS-IS. What about EIGRP?
  4. There is no similar functionality for EIGRP. The only thing you could do is to lower the bandwidth on all EIGRP-enabled interfaces.
  5. A side remark for the first commenter: I just wrote an article describing a potential routing protocol migration scenario (it's a bit different from the ATDN one, as it involves EIGRP-to-OSPF migration). If you'd like to receive it before it's published, go to my bio page and send me a message.
  6. Will the max-metric also work for external routes (thinking about the default here)?

    Thanks!
  7. This is a great question (and a tricky one). Graceful shutdown (setting interface metric to 64K) will only influence E1 routes, as their cost is added to the internal cost. If you announce the default as an E2 route, the internal cost to the ASBR is ignored.

    When I get some time, I'll produce router printouts documenting this.
  8. How do you do this gracefully with BGP? I know I can shut the interface, but there's still a lag.
  9. Configure an outbound prefix-list that filters all prefixes and do "clear ip bgp * soft out". This will ensure all prefixes advertised by the BGP router are withdrawn.
  10. How can we enable graceful shutdown using OSPF or EIGRP?
  11. OSPF:
    Read this wiki page ... http://wiki.nil.com/OSPF_graceful_shutdown
    ... and this article ... http://www.nil.si/ipcorner/OSPFGracefulShutdown/

    EIGRP has no similar functionality. You could use "distribute-list out" with EIGRP.
  12. EIGRP has graceful shutdown if you set all the K values to 255.
  13. No, it doesn't. Setting all K values to 255 disrupts all adjacencies, resulting in potential packet loss until the network converges, while changing the OSPF metric does not disrupt the existing forwarding path while the network reconverges.