Improve BGP Startup Time on Cisco IOS
I like using Cisco IOS for my routing protocol virtual labs[1]. It uses a trivial amount of memory[2] and boots relatively fast. There was just one thing that kept annoying me: Cisco IOS release 15.x takes forever to install local routes in the BGP table and even longer to select the best routes and propagate them[3].
I finally found the culprit: the bgp update-delay nerd knob. Here’s what the documentation has to say about it:
When BGP is started, it waits a specified period of time for its neighbors to be established themselves and to begin sending their initial updates. Once that period is complete, or when the time expires, the best path is calculated for each route, and the software starts sending advertisements out to its peers.
Why would you want to do that? Back to documentation…
This behavior improves convergence time because, if the software were to start sending advertisements out immediately, it would have to send extra advertisements if it later received a better path for the prefix from another peer.
That makes perfect sense. It would be ridiculous to waste CPU cycles selecting the best routes based on incomplete information and telling everyone else about your wonderful “discovery” while the BGP table is still being updated with incoming messages. It’s much better to keep collecting incoming updates until the neighbors tell you they’re done (assuming they know what an End-of-RIB marker is)[4] and then do your job.
However, that procedure works well only when a single router is restarted. As netlab starts lab devices in parallel and configures BGP on them at the same time, all BGP routers wait for everyone else to send them the updates. Obviously nothing happens until the bgp update-delay timer expires, everyone gives up, selects the current best routes (which happen to be locally originated routes), and forwards them to the neighbors… and we have fully synchronized BGP tables in a second.
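The difference between the two startup scenarios can be illustrated with a toy model. The Python sketch below is not how any BGP implementation works internally; it is a minimal illustration under loudly stated assumptions: sessions come up instantly, messages take no time to travel, the timer starts when the router starts, and a router counts as "done sending" the moment it runs best-path selection. The names release_times, waits, and FULL_MESH are hypothetical helpers invented for this example.

```python
from typing import Dict, List


def release_times(start: Dict[str, float],
                  peers: Dict[str, List[str]],
                  update_delay: float) -> Dict[str, float]:
    """Return the time at which each router runs best path and starts advertising.

    A router releases at the earlier of:
      * its start time plus update_delay (the timer expires), or
      * the moment every peer has released (it has heard from everyone).
    """
    # Start from the worst case: everybody waits for the timer to expire.
    release = {r: t + update_delay for r, t in start.items()}
    changed = True
    while changed:
        changed = False
        for r in start:
            # Earliest moment at which all peers have finished their initial
            # updates, but never before the router itself is up.
            all_peers_done = max([start[r]] + [release[p] for p in peers[r]])
            if all_peers_done < release[r]:
                release[r] = all_peers_done
                changed = True
    return release


# Assumed three-router full-mesh lab topology (illustration only).
FULL_MESH = {"r1": ["r2", "r3"], "r2": ["r1", "r3"], "r3": ["r1", "r2"]}


def waits(start: Dict[str, float], update_delay: float) -> Dict[str, float]:
    """Seconds each router spends waiting after it starts."""
    release = release_times(start, FULL_MESH, update_delay)
    return {r: release[r] - start[r] for r in start}


if __name__ == "__main__":
    # All routers started in parallel, default 120-second timer.
    print(waits({"r1": 0, "r2": 0, "r3": 0}, 120))
    # Only r3 restarted long after r1 and r2 converged.
    print(waits({"r1": 0, "r2": 0, "r3": 1000}, 120))
    # Parallel start with the timer lowered to 5 seconds.
    print(waits({"r1": 0, "r2": 0, "r3": 0}, 5))
```

In this model the parallel start makes every router wait the full 120 seconds, the lone restarted router waits 0 seconds because its peers are already done, and lowering the timer to 5 seconds cuts the parallel-start wait to 5 seconds.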
The default value of bgp update-delay is 120 seconds – not unreasonable if you expect to receive the full Internet routing table from a few BGP neighbors, but definitely way too long for a virtual lab. Adding bgp update-delay 5 to the netlab Cisco IOS BGP configuration template turned a major annoyance back into a lovely experience.
Finally, keep in mind that what works best in a lab might not be suitable for production deployment. Don’t tweak this nerd knob in a production network unless you have an extremely good reason to do it; see also the comment by Mehdi SFAR.
[1] … although I made a decision to use Arista vEOS/cEOS whenever possible when creating blog posts. Repeatability is crucial – anyone can download Arista VM/container software while Cisco keeps hiding IOSv as if it were its crown jewels.
[2] … as opposed to resource hogs like Nexus OS or several Junos platforms.
[3] … compared to what I’d been experiencing 25 years ago when I was still teaching my BGP course (the precursor of CBCR course) at Cisco Europe.
[4] Or wait for two keepalives; see the comment by Jeff Tantsura.
What about BIRD on Linux? I started using it for my "vendor-neutral" posts, and I have been very positively surprised about its versatility.
Please note - (somewhat vendor dependent) when GR is not negotiated - EoR is not sent; most BGP implementations will use 1st (well, technically 2nd) keepalive as an indicator that the sender is done and best path can be run
yes - I have just experienced this with IOSXR and it is very annoying as we wanted to track EoR to measure the BGP convergence time throughout major faults and build some stats/data to then analyse.
In a major router vendor, EoR is only available for GR and Enhanced Route Refresh (clear bgp soft).
Having said that, RFC 4724 says:
" Although the End-of-RIB marker is specified for the purpose of BGP graceful restart, it is noted that the generation of such a marker upon completion of the initial update would be useful for routing convergence in general, and thus the practice is recommended. "
As you correctly mentioned, it's crucial to be cautious when modifying this timer as doing so may result in unexpected consequences.
For example, in a multihomed design where the CE receives the default route from both PEs, if the primary PE restarts and establishes the CE-PE peering, it may start sending the Default route before VPNv4 convergence is complete, potentially causing traffic to be blackholed for some time.
Therefore, depending on the number of peerings and routes involved, maintaining or even increasing the update-delay timer may be necessary in certain use cases.
BTW, a quick point regarding the update-delay timer: The update-delay timer waits after the FIRST neighbor is established before starting its calculation.
I just wanted to share this use case with the readers.
FRR has two modes for this:
datacenter mode where the update delay is set to 0 (just start doing the work) and
traditional where the update delay is 120 seconds as you have experienced.
update-delay is 0 in FRR regardless of which mode is selected.
Another production use-case for which this timer is useful is to avoid sending aggregates before sending all of their contributors too, this way avoiding blackholing.