Here’s a question I got from an attendee of my Building Next-Generation Data Center online course:
As far as I understood […] it is obsolete nowadays to build a new DC fabric with routing on the host using BGP, the proper way to go is to use IGP + SDN overlay. Is my understanding correct?
Ignoring for the moment the fact that nothing is ever obsolete in IT, the right answer is it depends… this time on the answers to two seemingly simple questions: “what services are we offering?” and “what connectivity problem are we trying to solve?”
You can find plenty of material on the topic in the Define the Services and Requirements part of the Building Next-Generation Data Center online course. While you have to be enrolled in the course to access every single video in that module, some of them are accessible with a Standard ipSpace.net subscription, and I included plenty of links to free resources.
Let’s start with the services.
Your data center infrastructure might have to support VM connectivity including on-demand virtual networks, BYOA (Bring Your Own Addresses), and IP mobility. These requirements were traditionally implemented with VLANs, requiring layer-2 fabrics using STP and/or MLAG, or fabric-side encapsulation into PBB or VXLAN. Large(r) deployments typically use hypervisor-side overlay virtual networking, resulting in the need for a single IP address per hypervisor host or per host uplink interface.
Alternatively, you might be providing connectivity for Docker containers that are hidden behind a single IP address (per host) or residing within an IP prefix advertised by the host – that’s how some ipvlan deployments work, and how Docker implemented IPv6 connectivity the last time I checked.
Obviously you’d need more than just connectivity – every container solution implements some sort of traffic filtering to limit inter-container connectivity. ipSpace.net subscribers can find more details in Containers and Docker webinars.
If you need a prefix per host (effectively reinventing CLNS), running a routing protocol on the host is the most convenient thing to do… but what if you need a single IP address per host?
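If you do go down the prefix-per-host path, the host-side configuration can be as simple as announcing that prefix to the ToR switches. Here’s a minimal sketch using FRR with BGP unnumbered on the host; the prefix, interface names, and AS number are purely illustrative:

```
! FRR configuration on the host (all values are assumptions)
router bgp 65101
 ! eBGP sessions to both ToR switches over unnumbered uplinks
 neighbor eth0 interface remote-as external
 neighbor eth1 interface remote-as external
 address-family ipv4 unicast
  ! Announce the container prefix assigned to this host
  network 192.0.2.0/26
```

Keep in mind that FRR advertises the prefix only if a matching route is present in the host’s routing table, which is typically the case once the prefix is assigned to a local container bridge or ipvlan parent interface.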
In the ideal world, every host uplink would have a different IP address belonging to a different subnet, and all we’d have to do on the infrastructure side would be to build a simple layer-3 fabric.
Unfortunately, the hypervisor vendors still have problems spelling networking regardless of how much they talk about software-defined networking, and so every overlay virtual networking solution I’ve seen so far expects to have either a single IP address per host (reachable over multiple uplinks) or multiple IP addresses within the same subnet (yeah, like that’s an improvement).
ipSpace.net subscribers can find way more details in VMware NSX Technical Deep Dive and Leaf-and-Spine Fabrics webinars; if you want to add interactive discussions and mentoring to your learning process, go for the Designing and Building Data Center Fabrics online course.
The challenge we have to solve is thus: how do you tell the network that it can reach the same IP address over multiple interfaces? Two options come to mind: a LAG toward the host (requiring MLAG or an equivalent technology between the two ToR switches), or a routing protocol on the host advertising the overlay networking VTEP IP address to the network as a host route.
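In the second option, the VTEP IP address would sit on a loopback interface and be advertised as a host route over both uplinks. A minimal sketch, assuming FRR and BGP on the host (addresses, interface names, and the AS number are assumptions):

```
! FRR configuration on the hypervisor host (illustrative values)
interface lo
 ip address 192.0.2.1/32
!
router bgp 65101
 ! eBGP sessions to both ToR switches
 neighbor eth0 interface remote-as external
 neighbor eth1 interface remote-as external
 address-family ipv4 unicast
  ! Advertise the VTEP loopback as a host route
  network 192.0.2.1/32
```

The fabric now sees paths to the VTEP address over both uplinks; if an uplink fails, BGP withdraws that path, and the traffic shifts to the surviving one without any MLAG machinery on the ToR switches.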
Now we know when it makes sense to run a routing protocol on the host. Next question: which one should you use? I’d never use OSPF or IS-IS, because a single misconfigured host participating in the link-state flooding can mess up the whole network. What’s left? RIP or BGP. I would go for BGP ;)
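The same caution that rules out OSPF and IS-IS applies to BGP: never trust what the hosts advertise. On the ToR switch you could limit the damage a misconfigured host can do with an inbound prefix filter and a prefix limit; here’s a sketch in FRR syntax, with illustrative names and numbers:

```
! ToR switch configuration (illustrative values)
ip prefix-list HOST-ROUTES seq 10 permit 192.0.2.0/24 ge 32
!
route-map FROM-HOST permit 10
 match ip address prefix-list HOST-ROUTES
!
router bgp 65000
 neighbor swp1 interface remote-as external
 address-family ipv4 unicast
  ! Accept only host routes from the expected range...
  neighbor swp1 route-map FROM-HOST in
  ! ...and tear down the session if the host advertises too many
  neighbor swp1 maximum-prefix 10
```

That containment is exactly what a link-state protocol cannot give you: an OSPF or IS-IS speaker is a full participant in the flooding domain, while a BGP speaker only tells you what you let it tell you.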