Routing Protocols: Use the Best Tool for the Job

When I wrote about my sample OSPF+BGP hands-on lab on LinkedIn, someone couldn’t resist asking:

I’m still wondering why people use two routing protocols and do not have clean redistribution points or tunnels.

Ignoring for the moment the fact that he missed the point of the blog post (completely), the idea of “using tunnels or redistribution points instead of two routing protocols” hints at the potential applicability of RFC 1925 rule 4.

As anyone who ever had to configure two-way route redistribution knows, it’s one of the hardest things to get right and one of the most “exciting” things to troubleshoot on Sunday at 2 AM. It’s so bad that I recommended running BGP in parallel with OSPF to anyone who had to deal with MPLS Service Providers offering BGP as the only PE-CE routing protocol.

Coming back to the original question: does it make sense to run BGP on top of OSPF instead of just using one or the other? As always, the correct answer is “it depends,” this time on (A) what problem you’re trying to solve and (B) what the best tools are to solve that problem.

The “two napkins” team designed BGP to be a global endpoint reachability protocol. It does that job wonderfully and scales to millions of routes exchanged between tens of thousands of autonomous systems. It was, however, never designed to be fast or to be focused on topology discovery.

OSPF is just the opposite. It auto-discovers network topology, reacts (relatively) quickly to changes in network topology, and does its best to propagate the news as soon as possible. Compare that to the conservative approach taken by BGP:

  • Receive the changes
  • Recalculate the local routing table
  • Compose the outbound update based on the changes in the local routing table
  • Inform the upstream neighbors.
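
The four steps above can be sketched as a toy model. This is not how any real BGP implementation is structured: the `ToyBgpSpeaker` class, its method names, and the shortest-AS-path-only selection rule are all assumptions made for illustration, and loop prevention, timers, and policy are left out. It only shows that an outbound update is derived from changes in the local table, not from the inbound update itself.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBgpSpeaker:
    """Hypothetical, heavily simplified BGP speaker (illustration only)."""
    asn: int
    # Routes learned from each neighbor: neighbor -> prefix -> AS path
    adj_rib_in: dict = field(default_factory=dict)
    # Best path per prefix: prefix -> (neighbor, AS path)
    loc_rib: dict = field(default_factory=dict)

    def receive(self, neighbor, announced=None, withdrawn=()):
        """Step 1: store the changes received from one neighbor."""
        table = self.adj_rib_in.setdefault(neighbor, {})
        touched = set()
        for prefix in withdrawn:
            table.pop(prefix, None)
            touched.add(prefix)
        for prefix, as_path in (announced or {}).items():
            table[prefix] = tuple(as_path)
            touched.add(prefix)
        return touched

    def recalculate(self, touched):
        """Step 2: rerun best-path selection for the touched prefixes.

        Assumption: shortest AS path wins, neighbor name breaks ties.
        Returns only the prefixes whose best path actually changed.
        """
        changed = {}
        for prefix in sorted(touched):
            candidates = [
                (len(table[prefix]), nbr, table[prefix])
                for nbr, table in self.adj_rib_in.items()
                if prefix in table
            ]
            best = None
            if candidates:
                _, nbr, path = min(candidates)
                best = (nbr, path)
            if self.loc_rib.get(prefix) != best:
                changed[prefix] = best
                if best is None:
                    self.loc_rib.pop(prefix, None)
                else:
                    self.loc_rib[prefix] = best
        return changed

    def compose_update(self, changed):
        """Step 3: build the outbound update from local-table changes."""
        update = {"announce": {}, "withdraw": []}
        for prefix, best in changed.items():
            if best is None:
                update["withdraw"].append(prefix)
            else:
                # Prepend our own AS number to the selected path
                update["announce"][prefix] = (self.asn,) + best[1]
        return update

    def process(self, neighbor, announced=None, withdrawn=()):
        """Steps 1-3; the result is what step 4 would send to neighbors."""
        touched = self.receive(neighbor, announced, withdrawn)
        return self.compose_update(self.recalculate(touched))


speaker = ToyBgpSpeaker(asn=65000)
u1 = speaker.process("A", announced={"10.0.0.0/24": [65001, 65010]})
# A longer path from B does not change the best path, so nothing is sent
u2 = speaker.process("B", announced={"10.0.0.0/24": [65002, 65020, 65010]})
# Losing the path via A promotes the path via B, which triggers an update
u3 = speaker.process("A", withdrawn=["10.0.0.0/24"])
```

In this sketch the second inbound update produces an empty outbound update, because the local best path did not change. Compare that with OSPF, which floods the received change before worrying about what it means for the local routing table.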

Why don’t we use OSPF everywhere? It can’t carry too many routes (redistributing the BGP table into OSPF is great fun), and you cannot use it to implement a hop-by-hop policy – all routers in the same area unconditionally share the same information.
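
A minimal sketch of the policy difference, assuming nothing more than sets and lists as stand-ins for an OSPF link-state database and a BGP routing table. The function names (`flood_in_area`, `advertise_with_policy`), router names, and prefixes are hypothetical and chosen only for illustration.

```python
def flood_in_area(routers, lsa):
    """OSPF-style: every router in the area installs the same LSA.

    There is no per-router hook to filter what gets flooded.
    """
    for lsdb in routers.values():
        lsdb.add(lsa)


def advertise_with_policy(best_routes, export_policy):
    """BGP-style: each neighbor gets only what its export policy permits."""
    return {
        neighbor: [prefix for prefix in best_routes if permit(prefix)]
        for neighbor, permit in export_policy.items()
    }


# Three routers in one area end up with identical databases
routers = {"R1": set(), "R2": set(), "R3": set()}
flood_in_area(routers, ("10.1.0.0/16", "R1"))
flood_in_area(routers, ("10.2.0.0/16", "R2"))

# Two BGP neighbors can be told different things about the same table
best_routes = ["10.1.0.0/16", "10.2.0.0/16"]
export_policy = {
    "customer": lambda prefix: True,
    "peer": lambda prefix: prefix.startswith("10.1."),
}
views = advertise_with_policy(best_routes, export_policy)
```

After running this, all three toy databases are equal, while the `customer` and `peer` views differ. That difference is what makes per-neighbor policy possible in BGP and impossible within an OSPF area.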

But didn’t the hyperscalers prove that you can build BGP-only data center fabrics? Sure they did, and Petr Lapukhov had good reasons to build a BGP-only data center fabric at Microsoft. You can do the same, but it’s not trivial unless you’re using an operating system designed for that particular use case (Cumulus Linux with FRR). Does it make sense? Maybe not – you’re not Microsoft or Facebook, and your network might not have the same scaling problems, regardless of what the vendors would like you to believe.

Add “minor” details like vendors that love running EVPN over IBGP, and all of a sudden, you’re in the twisted IBGP-over-EBGP territory just because someone insisted on not using two routing protocols where they should have.

There’s a reason every craftsmaster has a toolbox full of various weird (sometimes even homemade) tools – there’s a best tool for every job. The difference between a craftsmaster and an amateur wannabe trying to solve everything with a hammer (or a Swiss Army Knife) is that one of them thinks about the tool to use for every job they face[1]. It’s only in networking that some people think using a single protocol to solve every challenge they face makes them heroes.

More to Explore

I did a series of podcasts with routing protocol gurus trying to figure out whether or not BGP is the best answer when looking for a data center routing protocol:

Prefer webinars over podcasts? Start with the Routing Protocols part of How Networks Really Work, and explore the best ways to use BGP and OSPF in leaf-and-spine fabrics and in EVPN networks.

Need even more information? Explore our extensive BGP in Data Center Fabrics series.


  1. They both know what the best tool for the job is, though. One of them would always pick a hammer.

2 comments:

  1. There is always a limit to every maxim. If you only use BGP, you only need to know BGP well. If you use OSPF and BGP, you need to learn both of them and be sufficiently savvy with them. Using a different tool for every problem may be the first step to create a huge mess. If OSPF is only marginally better than BGP for your problem, BGP should be the solution.

  2. @Vincent: I should set up a "just say in data center" Twitter account along the lines of https://twitter.com/justsaysinmice and use it every time someone says "you should use only BGP for your routing".

    On a more serious note, while I agree with you in principle, the only time I'd recommend someone to use BGP only in data center would be if they're using Cumulus Linux. Every other implementation is too overloaded with nerd knobs.

    Also, using the best tool for the job doesn't mean creating a NASCAR slide of tools you use just to boost your CV ;)... and yet again, craftsmasters intuitively get that, it's only in networking (or maybe in IT in general) that we have to argue about common sense.
