VMware vSwitch – the baseline of simplicity

If you’re looking for a simple virtual switch, look no further than VMware’s venerable vSwitch. It runs very few control protocols (just CDP or LLDP, no STP or LACP), has no dynamic MAC learning, and only a few knobs and moving parts – ideal for simple deployments. Of course you have to pay for all that ease-of-use: designing a scalable vSwitch-based solution is tough (but then it all depends on what kind of environment you’re building).

How did it all start?

As always, there’s a bit of history here. Like many other disruptive technologies (including NetWare and Windows networking), VMware entered enterprise networks “under the radar”: geeks played with it and implemented totally undercover solutions.

It was important in those days to be able to connect an ESX host to the network with minimum disruption (even if you had sub-zero networking skills). The decision to avoid STP and implement split-horizon switching made perfect sense; running STP in an ESX host would get you banned in a microsecond. vSwitch is also robust enough that you can connect it to a network that was “designed” by someone who got all his networking skillz through Linksys Web UI … and it would still work.
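The forwarding behavior described above can be sketched as a toy model. This is a simplified illustration, not VMware's implementation: the class name `ToyVSwitch`, the port names, and the "always use the first uplink" choice are assumptions made for the example, and VLANs are ignored. The points it tries to capture are the static MAC table (the hypervisor knows the VM MAC addresses, so nothing is learned from traffic) and split horizon (a frame received on an uplink is never sent to another uplink, so the vSwitch cannot create a forwarding loop and does not need STP).

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class ToyVSwitch:
    """Toy model: static MAC table, split-horizon forwarding, no STP."""

    def __init__(self, uplinks):
        self.uplinks = list(uplinks)   # physical NIC names (hypothetical)
        self.vm_ports = {}             # VM MAC -> virtual port, set at VM attach time

    def attach_vm(self, mac, port):
        # The MAC is registered when the VM is connected, not learned from frames.
        self.vm_ports[mac] = port

    def forward(self, in_port, dst_mac):
        """Return the list of ports a frame is sent to."""
        from_uplink = in_port in self.uplinks
        if dst_mac == BROADCAST:
            out = [p for p in self.vm_ports.values() if p != in_port]
            if not from_uplink:
                # Assumption: VM traffic leaves through a single uplink.
                out.append(self.uplinks[0])
            return out
        if dst_mac in self.vm_ports:
            port = self.vm_ports[dst_mac]
            return [] if port == in_port else [port]
        # Unknown unicast: dropped if it came from the network (split horizon),
        # sent to one uplink if it came from a VM.
        return [] if from_uplink else [self.uplinks[0]]


sw = ToyVSwitch(["vmnic0", "vmnic1"])
sw.attach_vm("00:50:56:00:00:01", "vm1")
sw.attach_vm("00:50:56:00:00:02", "vm2")
print(sw.forward("vmnic0", "00:50:56:00:00:99"))  # unknown unicast from the network
print(sw.forward("vm1", "00:50:56:00:00:99"))     # unknown unicast from a VM
```

In this model no path ever leads from one uplink to another, which is why a misconfigured upstream network cannot be looped through the host.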


vSwitch is like the bike I had as a kid: simple, robust and virtually indestructible ... but it didn't get you very far.

The growing pains

In the meantime, VMware has grown from a tiny disruptive startup to a major IT company and THE major virtualization vendor (it literally created that market), and has become part of almost every virtualized data center … but the vSwitch has failed to grow up.

vSwitch got some scalability enhancements (distributed vSwitch), but only on the management plane; apart from a few features that are enabled in vDS and not in vSwitch, the two products use the same control/data plane. There’s some basic QoS (per-VM policing and 802.1p marking) and some support for network management and troubleshooting (NetFlow, SPAN, remote SPAN). Still no STP or LACP.

Lack of LACP is a particularly tough nut. Once you try to do anything a bit more complex, like proper per-session load balancing, or achieving optimum traffic flow in an MLAG environment, you have to carefully configure vSwitch and pSwitch just right. You can eventually squeeze the vSwitch into those spots and get it to work, but it will definitely be a tight fit, and it won’t be nearly as reliable as it could have been were vSwitch to support proper control-plane protocols.

Is it just VMware?

Definitely not. Other virtual switches fare no better, and the Open vSwitch is no more intelligent without an external OpenFlow controller. At the moment, VMware’s vSwitch is probably still the most intelligent vSwitch shipping with a hypervisor.

The only reason XenServer supports LACP is that LACP support is embedded in the underlying Linux kernel … but even then the LACP-based bonding is not officially supported.

Multi-tenant support

vSwitch’s multi-tenant support reflects its typical use case (virtualized enterprise data center). The only virtual networking technology it supports is 802.1Q-based VLANs (using a single VLAN tag), limiting you to roughly 4000 logical networks (assuming the physical switches can support that many VLANs). There’s also no communication between the virtual switches and adjacent physical switches: a vSwitch embedded in a vSphere host cannot tell the adjacent physical switch which VLANs it needs.
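The "roughly 4000" limit follows from the layout of the single 802.1Q tag: a 16-bit TPID (0x8100) followed by a 16-bit TCI holding a 3-bit priority (the 802.1p bits), a 1-bit DEI flag and a 12-bit VLAN ID, with VLAN IDs 0 and 4095 reserved. A minimal sketch of decoding such a tag, where the function name `parse_dot1q` is made up for this illustration:

```python
import struct

TPID_8021Q = 0x8100            # EtherType value identifying an 802.1Q tag
USABLE_VLANS = 2**12 - 2       # 12-bit VLAN ID, 0 and 4095 are reserved


def parse_dot1q(tag: bytes) -> dict:
    """Decode the 4-byte 802.1Q tag (TPID + TCI) into its fields."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return {
        "pcp": tci >> 13,          # 802.1p priority, 3 bits
        "dei": (tci >> 12) & 0x1,  # drop eligible indicator, 1 bit
        "vid": tci & 0x0FFF,       # VLAN ID, 12 bits
    }


print(parse_dot1q(bytes.fromhex("8100a064")))
print(USABLE_VLANS)
```

With only 12 bits available for the VLAN ID, no amount of configuration gets you past 4094 usable segments per bridged domain.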

vCDNI and VXLAN (both scale much better and offer a wider range of logical networks) are not part of vSwitch. vCDNI is an add-on module using the VMsafe API, and VXLAN currently exists only within Nexus 1000V.

On top of all that, vSwitch assumes a “friends and family” environment. BPDUs generated by a VM can easily escape into the wild and trigger BPDU guard on upstream switches; it’s also possible to send tagged packets from VMs into the network (implementing VLAN hopping would take a few extra steps and a misconfigured physical network), and there’s no per-VM broadcast storm control. Using a vSwitch in a potentially hostile cloud environment is a risky proposition.

Scalability? No thanks.

There is an easy way to deploy vSwitch in the worst-case “any VM can be started on any hypervisor host” scenario – configure all VM-supporting VLANs on all switch-to-server access trunks, effectively turning the whole data center into a single broadcast domain. As hypervisor NICs operate in promiscuous mode, every hypervisor receives and processes every flooded packet, regardless of its VLAN and its actual target.
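To get a feel for what "every hypervisor receives every flooded packet" means, here is a back-of-the-envelope sketch. The VM counts, the per-VM broadcast rate and the function name `flooded_pps_per_host` are all hypothetical; the only thing taken from the text is the model itself, in which a host sees the flooded traffic of every VLAN configured on its access trunk.

```python
def flooded_pps_per_host(vms_per_vlan, bcast_pps_per_vm, trunk_vlans=None):
    """Flooded packets per second arriving at one hypervisor.

    trunk_vlans=None models the "all VLANs on all access trunks" design;
    otherwise only VLANs present on the trunk contribute.
    """
    total = 0.0
    for vlan, vm_count in vms_per_vlan.items():
        if trunk_vlans is None or vlan in trunk_vlans:
            total += vm_count * bcast_pps_per_vm
    return total


# Hypothetical data center: 1000 VMs spread across three VLANs,
# each VM generating 0.5 flooded packets per second.
vms_per_vlan = {10: 400, 20: 300, 30: 300}

print(flooded_pps_per_host(vms_per_vlan, 0.5))                    # all VLANs on the trunk
print(flooded_pps_per_host(vms_per_vlan, 0.5, trunk_vlans={10}))  # only VLAN 10 on the trunk
```

In the first case the per-host load depends on the total number of VMs in the data center, not on how they are split into VLANs, which is the sense in which the design behaves like a single broadcast domain.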


Who says you can't scale bikes as public transport? It might get messy, though ...

There are three factors that limit the scalability of such a design:

Reliance on bridging, which usually implies reliance on STP. STP is not necessarily a limiting factor; you can create bridged networks with thousands of ports without having a single blocked link if you have a well-designed spine & leaf architecture, and large core switches. Alternatively, you could trust emerging technologies like FabricPath or QFabric.

Single broadcast domain. I don’t want to be the one telling you how many hosts you can have in a broadcast domain, so let’s turn to the TRILL Problem and Applicability Statement (RFC 5556). Its section 2.6 (Problems Not Addressed) is very clear: a single bridged LAN supports around 1000 hosts. Because the physical NICs are in promiscuous mode and all VLANs are enabled on all access trunks, VLAN segmentation doesn’t help us; effectively we still have a single broadcast domain. We’re thus talking about ~1000 VMs (regardless of the number of VLANs they reside in).

I’m positive I’ll get comments along the lines of “I’m running 100,000 VMs in a single bridged domain and they work just fine”. Free soloing (rock climbing with zero protection) also works great until the first fall. Seriously, I would appreciate all data points you're willing to share in the comments.

Number of VLANs. Although vSphere supports the full 12-bit VLAN range, many physical switches don’t. The number of VLANs doesn’t matter in a traditional virtualized data center, with only a few (or maybe a few tens) security zones, but it’s a major showstopper in a public cloud deployment. Try telling your boss that your solution supports only around 1000 customers (assuming each customer wants to have a few virtual subnets) … after replacing all the switches you bought last year.
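The "around 1000 customers" figure is simple division. The sketch below assumes that "a few virtual subnets" means four VLANs per customer, and the 1000-VLAN switch is a made-up example of a platform that does not support the full range; neither number comes from a specific product.

```python
def max_tenants(vlan_limit: int, vlans_per_tenant: int) -> int:
    """Number of customers that fit when each one needs its own VLANs."""
    return vlan_limit // vlans_per_tenant


FULL_RANGE = 4094        # usable 802.1Q VLAN IDs
LIMITED_SWITCH = 1000    # hypothetical switch with a smaller active-VLAN limit

print(max_tenants(FULL_RANGE, 4))      # best case: full 12-bit range everywhere
print(max_tenants(LIMITED_SWITCH, 4))  # bound by the weakest switch in the path
```

The effective limit is set by the most constrained switch in the path, which is why the remark about replacing last year's switches stings.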

Conclusions

The vSwitch is either the best thing ever invented (if you’re running a small data center with a few VLANs) or a major showstopper (if you’re building an IaaS cloud). Use it in environments it was designed for and you’ll have a fantastically robust solution.

There are also a few things you can do in the physical network to improve the scalability of vSwitch-based networks; I’ll describe them in the next post.


6 comments:

  1. Hey Ivan,
    With Dell Force10 switches you auto-provision VLANs on demand to the vSwitch.
    http://www.force10networks.com/products/OpenAutomationFramework.asp

    Cheers,
    Brad

  2. Nice childhood bike. That could get me to work and back 8-)

  3. Not to be critical...but at first I missed the juicy little nugget of "Automated VLAN Provisioning in virtual environments" since it was surrounded with marketing speak..and I usually gloss over that.

    Great info, thanks Brad.

  4. Dell/Force10 looks compelling.. seriously... Stop making me think about a third vendor Brad. I only have so many brain cells left.

  5. Hi!
    If you use Vmware's vShield Edge, app etc, this will not become a problem?

  6. I guess for one data center I'll just break out the ole lattisnet 1k hub. :)



Ivan Pepelnjak, CCIE#1354, is the chief technology advisor for NIL Data Communications. He has been designing and implementing large-scale data communications networks as well as teaching and writing books about advanced technologies since 1990. See his full profile, contact him or follow @ioshints on Twitter.