Let’s Pretend We Run Distributed Storage over a Thick Yellow Cable

One of my friends wanted to design a nice-and-easy layer-3 leaf-and-spine fabric for a new data center, and got blindsided by a hyperconverged vendor. Here’s what he wrote:

We wanted to have a spine/leaf L3 topology for an NSX deployment but can’t do that because the Nutanix servers require L2 between their nodes so they can be in the same cluster.

I wanted to check his claims, but Nutanix doesn’t publish their documentation (I would consider that a red flag), so I’m assuming he’s right until someone proves otherwise (note: a whitepaper is not proof of anything ;).

Update 2017-11-22: VSAN release 6.6 no longer needs IP multicast.

Anyway, VMware VSAN had the same limitations, then relaxed that to IP multicast within the cluster and finally got it right in VSAN release 6.6. Not everyone can upgrade the moment new software releases come out; I happen to know someone who's running NSX (with VXLAN) on top of another layer of VXLAN (on Nexus 9000) just to meet the stupid physical L2 requirements.

Interestingly, at least some comparable open-source solutions work happily without layer-2 connectivity or IP multicast (or you wouldn’t be able to deploy them in AWS).
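As an illustration (Ceph being just one example of such a system): Ceph monitors are found through an explicitly configured list of routable IP addresses reached over unicast TCP, so cluster nodes can happily sit in different subnets. A minimal ceph.conf sketch, with made-up addresses, one monitor per rack subnet in an L3 leaf-and-spine fabric:

```ini
[global]
# Monitors are listed explicitly -- no L2 adjacency and no IP multicast
# needed for cluster membership. Addresses below are hypothetical,
# one monitor per rack subnet.
mon_host = 10.0.1.10, 10.0.2.10, 10.0.3.10
```

No autodiscovery magic, just proper configuration.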

Speaking of leaf-and-spine fabrics and VXLAN: hundreds of networking engineers watched webinars describing them in detail, and you’ll find tons of background information, designs, and even hands-on exercises in the new Designing and Building Data Center Fabrics online course. If you want to know whether hyperconverged infrastructure and distributed storage make sense, there’s no better source than Howard Marks’ presentation from the Building Next-Generation Data Center online course.

Back to thick yellow cable devotees. My friend couldn’t help but wonder:

The overall question would be: why would hyperconverged manufacturers have to rely on L2 to build clusters…?

Because they don't understand networking (or don’t care) and don’t trust DNS? Because they think autodiscovery with IP multicast or proprietary broadcast-like protocols is better than properly configuring the storage cluster?

Their main selling point is that they are “ahead of the game” with their solution, but I only see drawbacks from a networking standpoint…

Keep in mind that they don't talk to networking people when selling their solution. Once the solution is sold and the networking engineer asks "what were they smoking when they were designing this stuff" and “why didn’t you involve the networking team before making the purchase” (after taming the MacGyver reflex), he's the bad guy hindering progress.


  1. Nutanix has "published" that it can be used over NSX. The option to not need L2 in the underlay is called "Scenario 2 - NSX for the Nutanix CVM and User VMs" in http://next.nutanix.com/t5/Nutanix-Connect-Blog/VMware-NSX-on-Nutanix-Build-a-Software-Defined-Datacenter/ba-p/7590

    The two scenarios above are published as "validated" by Nutanix at http://next.nutanix.com/t5/Nutanix-Connect-Blog/Nutanix-Validates-Two-Crucial-Deployment-Scenarios-with-VMware/ba-p/7580

    1. I would consider that "just because you could doesn't mean that you should" scenario. In any case, it's just moving the problem by creating another layer of indirection. RFC 1925 has plenty to say about that as does RFC 6670.
    2. "Scenario 2" is not supported by VMware because it places a storage vmkernel adapter on an NSX Logical Switch (VXLAN).
    3. So, I am the guy who talked about this with Ivan; we had that option, but VMware would heavily recommend against it… :)
    4. Nutanix doesn't need storage VMK adapters. There is lots of misinformation/ignorance here about what Nutanix needs vs. what NSX needs.
  2. Nutanix also uses IPv6 for cluster discovery on that shared L2 segment.
  3. As of the 6.6 release, vSAN no longer requires multicast https://pubs.vmware.com/Release_Notes/en/vsan/66/vmware-virtual-san-66-release-notes.html
    1. That's what you get for relying on VMware Technical Marketing documents :( Thank you - fixed.
  4. When I built an L3 leaf-spine pod with one subnet per rack, L2 still worked within each rack. Since hyperconverged clusters are unlikely to be larger than one rack, most deployments might not have to worry about these L2/L3 issues.
  5. >The overall question would be: why would hyperconverged manufacturers have to rely on L2 to build clusters…?
    Two weeks ago I was part of a so-called "design" session with a senior VMware and storage guy from the biggest system house in our country. They told me we need layer-2 connectivity for VMware vMotion. So now I’ve bought two Trident 2+ stackable switches for each rack in those two data centers to get VXLAN up and running.
  6. Btw, vMotion has not required L2 since vSphere 6, as long as you provide the proper routing configuration for the vMotion VMkernel interfaces on the hosts that need to move VMs around.
  7. Curious what the use case is for creating a cluster spanning an L3 domain. How big are the clusters you’re planning to build?
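To illustrate the vMotion-over-L3 point from the comments above: since vSphere 6 you can place vMotion VMkernel interfaces on the dedicated vMotion TCP/IP stack and give that stack its own default gateway, so vMotion traffic gets routed across an L3 fabric. A rough esxcli sketch (the interface name, port group name, and addresses are made up; verify the exact flags against your ESXi release):

```shell
# Create a VMkernel interface on the dedicated vMotion TCP/IP stack
# (vmk2 and the "vMotion" port group are hypothetical names)
esxcli network ip interface add --interface-name=vmk2 \
  --portgroup-name=vMotion --netstack=vmotion

# Assign it an address from this rack's vMotion subnet (made-up addressing)
esxcli network ip interface ipv4 set --interface-name=vmk2 \
  --ipv4=10.1.1.10 --netmask=255.255.255.0 --type=static

# Give the vMotion netstack its own default gateway, so vMotion traffic
# to hosts in other racks/subnets gets routed instead of needing L2 adjacency
esxcli network ip route ipv4 add --network=default \
  --gateway=10.1.1.1 --netstack=vmotion
```

The key point is the separate netstack with its own routing table: the management network’s default gateway is no longer forced onto vMotion traffic.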