I love listening to the Packet Pushers (the best networking podcast there is) on my way to work, and I know what to expect in every SDN-focused episode: an “I’m sick and tired of using CLI to manually provision VLANs” rant. Sure, we’re all in the same boat, but did you ever do something to get rid of that problem?
After all, you don’t need more than a few tens of VLANs in a typical enterprise data center or private cloud (clouds with thousands of tenants are obviously a totally different story). Most vendors (with two notable exceptions: the biggest one and the one lamenting the lack of networking innovation) offer some sort of VMware-focused automatic edge-port VLAN provisioning, from on-switch solutions like VM Tracer (Arista) or Automatic Migration of Port Profiles (Brocade) to network management applications (like Junos Space). Are you using them? If not, why not? What’s stopping you?
I’m describing various VM-aware networking solutions in numerous webinars, including Introduction to Virtual Networking, VMware Networking Technical Deep Dive, Cloud Computing Networking and Data Center Fabric Architectures.
Update 2016-02-20: The two vendors I mentioned above implemented hypervisor-aware solutions in the meantime. More details in the Data Center Fabric Architectures webinar.
But let’s assume you’re unfortunate and use switches that have no hypervisor integration tools (which is about three quarters of the market unless you want to bite into the ultimate lock-in and deploy VM-FEX). Would it be THAT hard to write an application that would read the LLDP or CDP tables on ToR switches (populated by LLDP or CDP updates from the vSphere hosts), build a connectivity table, and allow server/hypervisor administrators to provision their own VLANs (within limits) on server-facing switch ports? I know that any one of our interns could do it in a week (given reasonably complete functional specs), but we never did it, because doing automatic VLAN provisioning in our IT infrastructure is simply not worth the effort.
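To show just how little code the intern project above would actually take, here’s a minimal Python sketch of the connectivity-table and VLAN-validation logic. Everything in it is hypothetical: the `LldpNeighbor` structure, the function names, and the allowed VLAN range are made up for illustration, and in a real deployment the neighbor data would come from the switches via SNMP (LLDP-MIB), NETCONF, or a vendor API rather than a static list.

```python
# Hypothetical sketch of the "read LLDP tables, build a connectivity
# table, provision VLANs within limits" application described above.
from dataclasses import dataclass

# Assumption: server admins may only provision VLANs in this range.
ALLOWED_VLANS = range(100, 200)

@dataclass
class LldpNeighbor:
    local_port: str    # ToR switch port, e.g. "Ethernet1/1"
    remote_name: str   # system name advertised by the vSphere host

def build_connectivity_table(neighbors):
    """Map each hypervisor host to the switch ports it's attached to."""
    table = {}
    for n in neighbors:
        table.setdefault(n.remote_name, []).append(n.local_port)
    return table

def provision_vlan(table, host, vlan):
    """Return (port, vlan) pairs to configure, enforcing the limits."""
    if vlan not in ALLOWED_VLANS:
        raise ValueError(f"VLAN {vlan} is outside the allowed range")
    if host not in table:
        raise KeyError(f"host {host} not found in LLDP data")
    return [(port, vlan) for port in table[host]]

# Static stand-in for data that would normally come from the switches:
neighbors = [
    LldpNeighbor("Ethernet1/1", "esx01"),
    LldpNeighbor("Ethernet1/2", "esx01"),
    LldpNeighbor("Ethernet1/3", "esx02"),
]
table = build_connectivity_table(neighbors)
print(provision_vlan(table, "esx01", 150))
# → [('Ethernet1/1', 150), ('Ethernet1/2', 150)]
```

The hard part was never this logic — it’s the functional specs (who’s allowed to provision what, and where) and, as the rest of this post argues, the blast radius when the part that pushes the config to the switch goes wrong.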
Assuming we’re truly sick-and-tired of manual VLAN provisioning in enterprise data centers, there must be other reasons we’re not deploying the vendor-offered features or rolling out our own secret sauce. It might have to do with the critical impact of the networking gear.
Let’s assume you manage to mess up a server configuration with Puppet: you lose a single server, and hopefully you’re using a cluster or a scale-out application, so the impact is negligible.
If a vSphere host crashes, you lose all the VMs running on it. That could be 50-100 VMs if you’re using a recent high-end server, but if you care about their availability, you have an HA cluster and they get restarted automatically.
Now imagine the vendor-supplied or home-brewed pixie dust badly misconfigures or crashes a ToR switch. In the worst case (the switch hangs but the server-facing links stay up), you lose connectivity to tens of physical servers, which could mean a few thousand VMs; in the best case, those same VMs lose half their bandwidth.
Faced with this reality, it’s understandable we’re scared of software automatically configuring our networking infrastructure. Now please help me understand how that’s going to change with third-party SDN applications.