VLANs are the wrong abstraction for virtual networking
Are you old enough to remember the days when operating systems had no file system? Fortunately I never had to deal with storing files on one of those (I was using punch cards), but miraculously you can still find the JCL DLBL/EXTENT documentation online.
On the other hand, you probably remember the days when a SCSI LUN actually referred to a physical disk connected to a computer, not an extensible virtual entity created through a point-and-click exercise on a storage array.
You might wonder what this ancient history has to do with virtual networking. Don’t worry, we’re getting there in a second ;)
When VMware started creating its first attempt at server virtualization software, it had a readily available storage abstraction (the file system) and CPU abstraction (including MS-DOS support under Windows, with the ideas going all the way back to the VM operating system on IBM mainframes).
Creating virtual storage and CPU environments was thus a no-brainer, as all the hard problems were already solved. Most server virtualization solutions use the file system recursively (virtual disk = file on a file system) and abstract the CPU by trapping and emulating privileged instructions (things got way easier with modern CPUs supporting virtualization in hardware). There was no readily available networking abstraction, so they chose the simplest possible option: VLANs (after all, it’s simple to insert a 12-bit tag into a packet and pretend it’s no longer your problem).
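To see just how thin this abstraction is: an 802.1Q VLAN tag really is nothing more than four bytes spliced into the Ethernet header after the source MAC address, with the VLAN ID squeezed into 12 of those 32 bits. A minimal sketch in Python (the function name is mine; the layout follows 802.1Q for an initially untagged frame):

```python
import struct

def tag_frame(frame: bytes, vid: int, pcp: int = 0) -> bytes:
    """Insert an 802.1Q tag (TPID 0x8100 + 16-bit TCI) after the
    source MAC address, i.e. at byte offset 12 of an untagged
    Ethernet frame. The VLAN ID field is only 12 bits wide --
    hence the ~4K VLAN limit."""
    if not 0 <= vid <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    tci = (pcp << 13) | vid  # PCP (3 bits) | DEI (1 bit) | VID (12 bits)
    tag = struct.pack("!HH", 0x8100, tci)
    return frame[:12] + tag + frame[12:]

# Minimal untagged frame: dst MAC, src MAC, EtherType (IPv4), dummy payload
frame = bytes(6) + bytes(6) + struct.pack("!H", 0x0800) + b"payload"
tagged = tag_frame(frame, vid=100)
```

Four bytes in, and the rest of the frame (including the original EtherType) shifts right untouched; the hypervisor gets to pretend segmentation is now the physical network’s problem.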
The “only” problem with using VLANs is that they aren’t the right abstraction. Instead of being like files on a file system, VLANs are more like LUNs on storage arrays – someone has to provision them. You can probably imagine how successful server virtualization would have been if you had to ask the storage administrators for a new LUN every time you needed a virtual disk for a new VM.
So every time I see how the “Software-Defined Data Center [...] provides unprecedented automation, flexibility, and efficiency to transform the way you deliver IT” I can’t help but read “it took us more than a decade to figure out the right abstraction.” Virtual networking is nothing but another application riding on top of IP (storage and voice people got there years before).
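To illustrate the “just another application riding on top of IP” point: a VXLAN tunnel endpoint does little more than prepend an 8-byte header to the original Ethernet frame and hand the result to UDP, exactly the way iSCSI or VoIP hand their payloads to the IP stack. A minimal sketch of the header format (per RFC 7348; the function name is mine):

```python
import struct

def vxlan_encap(inner_frame: bytes, vni: int) -> bytes:
    """Prepend a VXLAN header (RFC 7348) to an inner Ethernet frame.
    The result is an ordinary UDP payload -- the overlay network is
    just an application on top of IP. Header layout (8 bytes):
    flags (8 bits, 0x08 = VNI valid), reserved (24 bits),
    VNI (24 bits), reserved (8 bits)."""
    if not 0 <= vni <= 0xFFFFFF:
        raise ValueError("VNI must fit in 24 bits")
    header = struct.pack("!II", 0x08 << 24, vni << 8)
    return header + inner_frame

udp_payload = vxlan_encap(b"...inner Ethernet frame...", vni=5000)
```

Note the 24-bit VNI: the overlay also happens to fix the 12-bit VLAN numbering problem as a side effect, but the real win is that the transport fabric sees nothing but IP packets.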
More information
If you’re attending Interop Las Vegas, drop by my Overlay Virtual Networking Explained session (and use the DISPEAKER marketing code to get a 25% discount on registration fees), or register for the Network Infrastructure for Cloud Computing workshop. If not, don’t worry – there will be an overlay networking webinar in the September/October timeframe.
802.1Q VLANs are the wrong abstraction because they tightly couple virtual constructs (virtual network = file) with physical reality (VLAN = LUN).
Setting aside the numbering limitation (4K is not enough), if we had created tools and protocols that hid all this, perhaps VLANs would have been just fine. Kind of like what is happening now with overlays :-)
No amount of translation, mapping and provisioning will change the basic facts: just as SANs and storage arrays have no business being involved in file creation and directory lookups, networking devices and transport fabrics have no business being tightly coupled to inter-hypervisor communication.
Ivan - an IP phone call is a VLAN that gets priority routing because the time budget is so small. Are you saying VMs should speak via an API? The problems show up because you can have a failover VM in another data center with the same IP, or because the server can move to another part of the network a la 'vmotion'. The network needs to catch up, or the problem needs to get fixed – DNS resolution, ARP cache, or whatever the issue is when the VM moves. Maybe the applications just need to do federation so it will matter less where they are. Maybe just make the app less IP-bound.
As for "priority routing" - do you get that when writing a critical file content onto your disk? Do you care? Why not? ... and how is that relevant to virtual networks within a cloud-scale data center?
The issue many virtual environments struggle with is that network solutions refuse to integrate. I would love to create a "virtual switch" and publish it to the physical network. Need a new segment (VLAN), FW policy, route, etc. to support a new workload? Why not provision it as part of the VM deployment?
VLANs may or may not be the right solution, but they are what we have at this time. IPv6 will change some of this, but segmentation is required by legacy security controls.
If we look at networks as a highway/road system, then the controls lie at the edge (home/driveway) with monitors in the flow. Conversely, today's networks assume the edge to be dumb, with intelligence centralized.