Let’s Pretend We Run Distributed Storage over a Thick Yellow Cable
One of my friends wanted to design a nice-and-easy layer-3 leaf-and-spine fabric for a new data center, and got blindsided by a hyperconverged vendor. Here’s what he wrote:
We wanted to have a spine/leaf L3 topology for an NSX deployment but can’t do that because the Nutanix servers require L2 between their nodes so they can be in the same cluster.
I wanted to check his claims, but Nutanix doesn't publish their documentation (I would consider that a red flag), so I'm assuming he's right until someone proves otherwise (note: a whitepaper is not proof of anything ;).
Update 2017-11-22: VSAN release 6.6 no longer needs IP multicast.
Anyway, VMware VSAN had the same limitation, then relaxed it to requiring IP multicast within the cluster, and finally got it right in VSAN release 6.6. Not everyone can upgrade the moment a new software release comes out; I happen to know someone who's running NSX (with VXLAN) on top of another layer of VXLAN (on Nexus 9000 switches) just to meet these stupid physical L2 requirements.
Interestingly, at least some comparable open-source solutions work happily without layer-2 connectivity or IP multicast (or you wouldn’t be able to deploy them in AWS).
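To make that concrete: all the "cluster networking" a storage node really needs is a configured list of peer names or addresses and unicast IP reachability. Here's a minimal sketch (the node names, port number, and function are mine, not from any product) of peer discovery that works across any routed fabric, including AWS:

```python
import socket

# Hypothetical seed list and port: in a real product these would come
# from the cluster configuration (the moral equivalent of Ceph's
# mon_host list in ceph.conf).
SEED_NODES = ["storage-1.example.com",
              "storage-2.example.com",
              "storage-3.example.com"]
CLUSTER_PORT = 7000

def discover_peers(seeds, port, timeout=2.0):
    """Return the reachable peers using plain unicast TCP and DNS.

    Works across any routed (L3) fabric; all it needs is IP
    reachability and working name resolution.
    """
    reachable = []
    for name in seeds:
        try:
            addr = socket.gethostbyname(name)  # DNS lookup, no multicast magic
            with socket.create_connection((addr, port), timeout=timeout):
                reachable.append((name, addr))
        except OSError:
            pass  # node down or unreachable; retry logic omitted
    return reachable

if __name__ == "__main__":
    print(discover_peers(SEED_NODES, CLUSTER_PORT))
```

Ceph, for example, works along these lines: its daemons and clients find the cluster monitors through an explicit mon_host address list, with no L2 adjacency or multicast anywhere in sight.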
Speaking of leaf-and-spine fabrics and VXLAN: hundreds of networking engineers watched webinars describing them in detail, and you'll find tons of background information, designs, and even hands-on exercises in the new Designing and Building Data Center Fabrics online course. If you want to know whether hyperconverged infrastructure and distributed storage make sense, there's no better source than Howard Marks' presentation from the Building Next-Generation Data Center online course.
Back to thick yellow cable devotees. My friend couldn’t help but wonder:
The overall question would be: why would hyperconverged manufacturers have to rely on L2 to build clusters…?
Because they don't understand networking (or don't care) and don't trust DNS? Because they think autodiscovery with IP multicast or proprietary broadcast-like protocols is better than properly configuring a storage cluster?
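For contrast, here's roughly what multicast-based autodiscovery looks like; a minimal sketch (group address, port, and message format are made up), not any vendor's actual implementation. Note the TTL of 1: the announcements never cross a router, which is exactly why such schemes quietly assume all cluster nodes sit in the same L2 segment:

```python
import socket
import struct

MCAST_GROUP = "239.1.1.1"   # made-up multicast group
MCAST_PORT = 9999           # made-up discovery port

def announce(node_id):
    """Shout 'I am here' at the local multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # TTL 1 = link-local scope: the packet never crosses a router,
    # hence the implicit everything-in-one-VLAN assumption.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
    sock.sendto(node_id.encode(), (MCAST_GROUP, MCAST_PORT))

def listen_for_peers():
    """Collect announcements from other nodes on the same segment."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", MCAST_PORT))
    # Join the multicast group on all interfaces.
    mreq = struct.pack("4sl", socket.inet_aton(MCAST_GROUP),
                       socket.INADDR_ANY)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    while True:
        data, addr = sock.recvfrom(1024)
        print(f"discovered node {data.decode()} at {addr[0]}")
```

Making something like this work across a routed fabric means enabling IP multicast routing (IGMP and PIM) end-to-end, which most data center fabrics never do; the unicast sketch above needs nothing but an IP address and a DNS entry.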
Their main selling point is that they are "ahead" of the game with their solution, but I only see drawbacks from a networking standpoint…
Keep in mind that they don't talk to networking people when selling their solution. Once the solution is sold, the networking engineer who asks "what were they smoking when they designed this stuff" and "why didn't you involve the networking team before making the purchase" (after taming the MacGyver reflex) becomes the bad guy hindering progress.
The two scenarios above are published as "validated" by Nutanix at http://next.nutanix.com/t5/Nutanix-Connect-Blog/Nutanix-Validates-Two-Crucial-Deployment-Scenarios-with-VMware/ba-p/7580

Thanks, Erik
Two weeks ago I was part of a so-called "design" session with a major VMware and storage guy from the biggest system house in our country. They told me we need layer-2 connectivity for VMware vMotion. So now I've bought two Trident 2+ stackable switches for each rack of those two data centers to get VXLAN up and running.