Here’s a question I got from an attendee of my Building Next-Generation Data Center online course:
As far as I understood […] it is obsolete nowadays to build a new DC fabric with routing on the host using BGP, the proper way to go is to use IGP + SDN overlay. Is my understanding correct?
Ignoring for the moment the fact that nothing is ever obsolete in IT, the right answer is it depends… this time on the answers to two seemingly simple questions: “what services are we offering?” and “what connectivity problem are we trying to solve?”
You can find plenty of material on the topic in the Define the Services and Requirements part of the Building Next-Generation Data Center online course. While you have to be enrolled in the course to access every single video in that module, some of them are accessible with a Standard ipSpace.net subscription, and I included plenty of links to free resources.
Let’s start with the services.
Your data center infrastructure might have to support VM connectivity including on-demand virtual networks, BYOA (Bring Your Own Addresses), and IP mobility. These requirements were traditionally implemented with VLANs, requiring layer-2 fabrics using STP and/or MLAG, or fabric-side encapsulation into PBB or VXLAN. Large(r) deployments typically use hypervisor-side overlay virtual networking, resulting in the need for a single IP address per hypervisor host or per host uplink interface.
Alternatively, you might be providing connectivity for Docker containers that are hidden behind a single IP address (per host) or residing within an IP prefix advertised by the host – that’s how some ipvlan deployments work, and how Docker implemented IPv6 connectivity the last time I checked.
Obviously you’d need more than just connectivity – every container solution implements some sort of traffic filtering to limit inter-container connectivity. ipSpace.net subscribers can find more details in Containers and Docker webinars.
If you need a prefix per host (effectively reinventing CLNS), running a routing protocol on the host is the most convenient thing to do… but what if you need a single IP address per host?
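If you do go down the prefix-per-host path, the host-side configuration can be as simple as announcing that prefix to the ToR switches. Here’s a minimal sketch using FRR with BGP unnumbered on the host; the prefix, interface names, and AS number are purely illustrative:

```
! FRR configuration on the host (all values are assumptions)
router bgp 65101
 ! eBGP sessions to both ToR switches over unnumbered uplinks
 neighbor eth0 interface remote-as external
 neighbor eth1 interface remote-as external
 address-family ipv4 unicast
  ! Announce the container prefix assigned to this host
  network 192.0.2.0/26
```

Keep in mind that FRR advertises the prefix only if a matching route is present in the host’s routing table, which is typically the case once the prefix is assigned to a local container bridge or ipvlan parent interface.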
In the ideal world, every host uplink would have a different IP address belonging to a different subnet, and all we’d have to do on the infrastructure side would be to build a simple layer-3 fabric.
Unfortunately, the hypervisor vendors still have problems spelling networking regardless of how much they talk about software-defined networking, and so every overlay virtual networking solution I’ve seen so far expects to have either a single IP address per host (reachable over multiple uplinks) or multiple IP addresses within the same subnet (yeah, like that’s an improvement).
ipSpace.net subscribers can find way more details in VMware NSX Technical Deep Dive and Leaf-and-Spine Fabrics webinars; if you want to add interactive discussions and mentoring to your learning process, go for the Designing and Building Data Center Fabrics online course.
The challenge we have to solve is thus: how do you tell the network that it can reach the same IP address over multiple interfaces? Two options come to mind: a LAG toward the host (requiring MLAG or an equivalent technology between the two ToR switches), or a routing protocol on the host advertising the overlay networking VTEP IP address to the network as a host route.
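In the second option, the VTEP IP address would sit on a loopback interface and be advertised as a host route over both uplinks. A minimal sketch, assuming FRR and BGP on the host (addresses, interface names, and the AS number are assumptions):

```
! FRR configuration on the hypervisor host (illustrative values)
interface lo
 ip address 192.0.2.1/32
!
router bgp 65101
 ! eBGP sessions to both ToR switches
 neighbor eth0 interface remote-as external
 neighbor eth1 interface remote-as external
 address-family ipv4 unicast
  ! Advertise the VTEP loopback as a host route
  network 192.0.2.1/32
```

The fabric now sees paths to the VTEP address over both uplinks; if an uplink fails, BGP withdraws that path, and the traffic shifts to the surviving one without any MLAG machinery on the ToR switches.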
Now we know when it makes sense to run a routing protocol on the host. Next question: which one should you use? I’d never use OSPF or IS-IS, because a single misconfigured host participating in the link-state flooding can mess up the whole network. What’s left? RIP or BGP. I would go for BGP ;)
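The same caution that rules out OSPF and IS-IS applies to BGP: never trust what the hosts advertise. On the ToR switch you could limit the damage a misconfigured host can do with an inbound prefix filter and a prefix limit; here’s a sketch in FRR syntax, with illustrative names and numbers:

```
! ToR switch configuration (illustrative values)
ip prefix-list HOST-ROUTES seq 10 permit 192.0.2.0/24 ge 32
!
route-map FROM-HOST permit 10
 match ip address prefix-list HOST-ROUTES
!
router bgp 65000
 neighbor swp1 interface remote-as external
 address-family ipv4 unicast
  ! Accept only host routes from the expected range...
  neighbor swp1 route-map FROM-HOST in
  ! ...and tear down the session if the host advertises too many
  neighbor swp1 maximum-prefix 10
```

That containment is exactly what a link-state protocol cannot give you: an OSPF or IS-IS speaker is a full participant in the flooding domain, while a BGP speaker only tells you what you let it tell you.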