Overlay-to-Underlay Network Interactions: Document Your Hidden Assumptions
If you listen to the marketing departments of overlay virtual networking vendors, it looks like the world is a simple place: you deploy their solution on top of any IP fabric, and it all works.
You’ll hear a totally different story from the physical hardware vendors: they’ll happily serve you a healthy portion of FUD, hoping you swallow it whole, and describe in gory detail all the mishaps you might encounter on your virtualization quest.
The funny thing is they’re all right (not to mention the really fun part when FUDders change sides ;).
Let’s set aside the obvious stunts performed by VMware vice presidents on the VMworld stage, telling us VXLAN is the perfect DCI solution (clearly, whoever dreamed up those claims never encountered the laws of physics or even glanced at the fallacies of distributed computing). The fact remains that things aren’t as simple (or as complex) as they look, because we’re all working with hidden assumptions, and it’s impossible to debug a problem (or design a good solution) until you document all of them.
Most overlay virtual networking solutions assume they can ignore the underlay network and pretend it’s a uniform IP transport fabric with plenty of bandwidth:
Assumption #1: Endpoints are equidistant (any two endpoints get the same connectivity between them).
Corollary #1: It doesn’t matter where one places a new VM.
Assumption #2: The fabric has enough bandwidth.
Corollary #2: We can ignore QoS.
A leaf-and-spine fabric fits these requirements nicely, assuming (see, even more assumptions) that:
- Leaf-to-spine oversubscription is not too high (see the back-of-the-envelope sketch after this list);
- Elephant flows (example: backup) don’t interfere with mice flows (user traffic).
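To put a number on “not too high”, here’s a back-of-the-envelope sketch. The port counts and speeds are illustrative assumptions (a generic 10GbE leaf with 40GbE uplinks, not any particular switch); ratios around 3:1 are commonly cited as tolerable for general workloads.

```python
# Back-of-the-envelope leaf-to-spine oversubscription calculation.
# Port counts and speeds below are illustrative assumptions, not vendor data.

def oversubscription(server_ports, server_speed_gbps, uplinks, uplink_speed_gbps):
    """Ratio of southbound (server-facing) to northbound (spine-facing) bandwidth."""
    southbound = server_ports * server_speed_gbps
    northbound = uplinks * uplink_speed_gbps
    return southbound / northbound

# 48 x 10GbE server ports, 4 x 40GbE uplinks -> 480:160 = 3:1
print(oversubscription(48, 10, 4, 40))  # 3.0
```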
Tangential note on QoS: the only scalable way to do QoS is to mark traffic in the end-user VMs or on the virtual switches, and we know how well that works in most environments. Within a single leaf-and-spine fabric it’s easier, and probably cheaper in the long term, to throw more bandwidth at the problem.
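For the sake of illustration, here’s what “marking traffic in end-user VMs” boils down to: a minimal Python sketch that sets the DSCP bits on a socket (the AF21 value and the peer address are arbitrary assumptions, and the code assumes Linux). Getting every application to do this consistently, and getting the fabric to trust the result, is exactly the part that doesn’t work in most environments.

```python
import socket

# Minimal sketch: mark an application's traffic with DSCP AF21 (decimal 18)
# from inside the VM. The DSCP value is an arbitrary example.
DSCP_AF21 = 18

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# IP_TOS carries DSCP in its six high-order bits, hence the shift by 2.
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, DSCP_AF21 << 2)
# sock.connect(("192.0.2.10", 8080))  # placeholder peer (documentation-range address)
sock.close()
```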
If you deploy an overlay virtual networking solution on an equidistant IP fabric, and manage to separate the storage traffic from the VM traffic (EVO:RAIL uses different uplinks for VSAN and VM traffic for a reason), you’ll do just fine. If you transport storage and VM traffic on the same server uplinks, the hidden assumptions start breaking down, although some baseline QoS on the server uplinks will probably save the day.
Before you start claiming that transporting VM and VSAN traffic on separate uplinks creates two SPOFs, dig into the vSphere manuals and figure out how you can solve that challenge with the failover policy on port groups.
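In case the manuals are too far away, here’s the gist of that trick as a conceptual Python sketch. The NIC names and port-group layout are made up; the real mechanism is the explicit active/standby failover order you configure in each port group’s teaming policy, not any API shown here.

```python
# Conceptual model of per-port-group explicit failover order (not a vSphere API).
# Each port group prefers a different physical uplink but can fall back to the
# other one, so losing either NIC degrades bandwidth instead of connectivity.

FAILOVER_ORDER = {
    "VM-Traffic":   ["vmnic0", "vmnic1"],   # active uplink first, then standby
    "VSAN-Traffic": ["vmnic1", "vmnic0"],
}

def effective_uplink(port_group: str, failed: set) -> str:
    """First healthy uplink in the port group's failover order (None = outage)."""
    return next((nic for nic in FAILOVER_ORDER[port_group] if nic not in failed), None)

print(effective_uplink("VM-Traffic", set()))          # vmnic0 -- steady state
print(effective_uplink("VSAN-Traffic", {"vmnic1"}))   # vmnic0 -- NIC failure, no SPOF
```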
Traditional three-tier data center architectures aren’t equidistant, so expect some problems there… and do I even have to mention how well overlay virtual networks stretched over a DCI link work in practice? See also Bad Ideas.
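If you want numbers instead of sarcasm, a quick physics check shows how badly a DCI link breaks the equidistance assumption (the 100 km distance is an illustrative assumption; light in fiber travels at roughly 200,000 km/s, or about 5 microseconds per kilometer one way):

```python
# Why stretched overlays break the equidistance assumption: propagation delay.
# 100 km DCI distance below is an illustrative assumption.

FIBER_US_PER_KM = 5  # approximate one-way propagation delay in fiber

def rtt_ms(distance_km):
    """Round-trip propagation delay in milliseconds (fiber only, no queuing)."""
    return 2 * distance_km * FIBER_US_PER_KM / 1000

print(rtt_ms(0.1))   # ~0.001 ms across a rack -- the fabric the overlay expects
print(rtt_ms(100))   # ~1 ms across a 100 km DCI -- a thousand times "farther"
```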
If Everything Else Fails, Read the Manual
The hidden assumptions are not so hidden. They’re pretty well documented in the NSX for Multiple Hypervisors Design Guide and the NSX for vSphere Design Guide, but I guess one cannot expect keynote speakers, marketers, and other people with loud opinions to bother with technical details.