Leaf-and-Spine Fabric Myths (Part 2)

The next set of Leaf-and-Spine Fabric Myths listed by Evil CCIE focused on BGP:

BGP is the best choice for leaf-and-spine fabrics.

I wrote about this particular one here. If you’re not a BGP guru, don’t overcomplicate your network: OSPF, IS-IS, and EIGRP are good enough for most environments. Also, don’t ever turn BGP into RIP with AS-path length serving as hop count.

BGP is a good choice as it allows granular policy control

As anyone who has ever tried to implement a consistent QoS policy across a large network knows, maintaining a policy and adjusting it to changing conditions quickly becomes a huge operational burden. In most data center environments it’s cheaper to buy more bandwidth.

Also, don’t believe in the magic powers of intent-based networking. I explained the drawbacks of this idea in the Network Automation Concepts webinar, here’s the TL&DW summary: it will probably cost you more than buying faster leaf-to-spine links.

Finally, the next time a $vendor Sales Engineer makes this argument, quote RFC 1925 Rule 4 and ask him when he last maintained a complex policy in a production environment for more than a few years (and how painful it was).

BGP has less churn

Assuming you care. Most enterprise environments need layer-2 fabrics anyway, which usually translates into VXLAN overlay these days. It doesn’t matter whether you use static VXLAN ingress replication lists or EVPN – the core OSPF process doesn’t need more than a single router LSA per switch.
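If you build the underlay with OSPF, the configuration stays trivial. Here’s a minimal FRR-style sketch (the interface names and loopback address are made up for illustration): all fabric interfaces sit in the backbone area, so each switch originates a single router LSA.

```
! Hypothetical FRR-style OSPF underlay on one fabric switch
interface lo
 ip address 10.0.0.1/32
!
interface swp1
 ip ospf area 0.0.0.0
 ip ospf network point-to-point
!
interface swp2
 ip ospf area 0.0.0.0
 ip ospf network point-to-point
!
router ospf
 passive-interface lo
```

Point-to-point network type on the leaf-to-spine links avoids DR/BDR election and the extra network LSAs that would come with it.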

BGP scales far better

In principle, that’s true. In practice, most environments don’t need that scale. I asked the same question several times in the RIFT and OpenFabric podcasts, and we agreed that decent OSPF implementations shouldn’t have a problem with a few hundred switches in the same area. Dinesh Dutt also had quite a few things to say about the BGP-is-my-hammer-where’s-the-nail approach.

Now What?

As far as I can see, you have (approximately) four options when selecting the underlay routing protocol:

  • You don’t need more than two switches, making all design discussions moot;
  • You select OSPF, IS-IS or EBGP based on the last blog post you found on the Internet… and get the fabric you deserve ;)
  • You use EBGP because you trust your $vendor;
  • You figure out how data center fabrics really work (if you do them right, they are nothing more than simple IP networks), and then use the routing and switching knowledge you already have to design them.

If you decide to take the red pill (last option), you might find the Leaf-and-Spine Fabric Architectures and EVPN Technical Deep Dive webinars useful. They are both part of Standard ipSpace.net subscription.

Comments

  1. The “OSPF cannot support large CLOS fabrics” claim is one of the most frustrating myths out there.
    Building and running an OSPF fabric that scales beyond 10K devices is not that difficult.
    No nerd knobs are required, just some careful design choices.

  2. Hi Ivan! Would you use an EBGP-only solution with Cumulus Linux in a 2-tier, 8-rack small fabric with an L2 extension (VXLAN/EVPN) requirement?
    1. Obviously, using my own argument, the answer is NO. However (like any other vendor) Cumulus has tested certain scenarios more thoroughly than others, and they started with EBGP-only, preferably on unnumbered interfaces, so I'd expect them to have fewer bugs in that particular design.

      Also, compared to most everyone else, BGP is a breeze to configure in FRR. Specify neighbors by interfaces and whether they're internal or external, and you're done.
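      To illustrate the point, here’s a minimal FRR-style EBGP-unnumbered sketch (the interface names and AS number are hypothetical): neighbors are specified by interface, and the remote AS is simply declared “external”.

      ```
      ! Hypothetical FRR EBGP-unnumbered configuration on a leaf switch
      router bgp 65001
       neighbor swp1 interface remote-as external
       neighbor swp2 interface remote-as external
       !
       address-family ipv4 unicast
        network 10.0.0.1/32
      ```

      No neighbor IP addresses to manage – FRR discovers the peer over the IPv6 link-local address on each interface.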
    2. So I sensed you correctly. Thanks.