vSphere 5.0 new networking features: disappointing

I was sort of upset that my vacation was making me miss the VMware vSphere 5.0 launch event (on the other hand, being limited to half an hour of Internet access served with an early-morning cappuccino is not necessarily a bad thing), but after I got home, I realized I hadn’t really missed much. Let me rephrase that – VMware launched a major release of vSphere and the networking features are barely worth mentioning (or maybe they’ll launch them when the vTax brouhaha subsides).

I had a really hard time finding anything networking-related in the very long list of new features and enhancements, and VMware’s very slim white paper tells you how serious VMware is about improving its networking support. Their community pages complete the picture – while other blogs have exploded with detailed descriptions of new vSphere 5.0 goodies, the last entry in the VMware Networking Blog is a month old and describes Cisco Nexus 1000v. Anyhow, let’s look at the morsels we’ve got:

LLDP support ... years after everyone else had it, including Cisco.

NetFlow support. This one might actually excite people who care about inter-VM traffic flows. Everyone else will probably continue using NetFlow probes at the DC edge.

Port mirroring. Good one, particularly the ability to send the mirrored traffic to a VM on another host.

NetIOC enhancements. Now you can define your own traffic types that you can later use in queuing/shaping configuration. If my failing memory still serves me, we were able to configure ACL-based custom queuing in 1990.

802.1p tagging. Finally. 13 years after the standard was ratified.

And last but definitely not least, vShield Edge got static routing. A Linux-based VM that is positioned as an L3 appliance providing NAT, DHCP and a few other features now supports static routing. Why is that a new feature?
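
For readers wondering what 802.1p tagging actually puts on the wire: the priority travels in the three-bit PCP field of the 802.1Q VLAN tag, next to the one-bit DEI (formerly CFI) field and the 12-bit VLAN ID. The Python snippet below is a minimal sketch of that bit layout only. The function names build_vlan_tag and parse_vlan_tag are made up for illustration and say nothing about how the vSphere vSwitch implements it.

```python
import struct

# EtherType value that marks an 802.1Q-tagged Ethernet frame
TPID_8021Q = 0x8100


def build_vlan_tag(pcp: int, vlan_id: int, dei: int = 0) -> bytes:
    """Return the 4-byte 802.1Q tag (TPID + TCI) in network byte order.

    Hypothetical helper for illustration; not a VMware or vendor API.
    """
    if not 0 <= pcp <= 7:
        raise ValueError("PCP (802.1p priority) must be 0..7")
    if dei not in (0, 1):
        raise ValueError("DEI must be 0 or 1")
    if not 0 <= vlan_id <= 4095:
        raise ValueError("VLAN ID must be 0..4095")
    # TCI layout: 3 bits PCP | 1 bit DEI | 12 bits VLAN ID
    tci = (pcp << 13) | (dei << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)


def parse_vlan_tag(tag: bytes) -> tuple[int, int, int]:
    """Return (pcp, dei, vlan_id) extracted from a 4-byte 802.1Q tag."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return tci >> 13, (tci >> 12) & 0x1, tci & 0x0FFF


if __name__ == "__main__":
    tag = build_vlan_tag(pcp=5, vlan_id=100)
    print(tag.hex())            # tag bytes as hex
    print(parse_vlan_tag(tag))  # (pcp, dei, vlan_id)
```

A frame sent with priority 5 in VLAN 100 would carry the tag built in the last few lines. The upstream switch is then free to map the PCP value to whatever queue its QoS configuration says.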

Interestingly, some VMware features that use the network transport got significantly better – HA was completely rewritten, vMotion supports multiple NICs and slows down hyper-active VMs, Intel’s software FCoE initiator is supported and ESXi has a firewall protecting the management plane – but the lack of networking innovation is stunning. Where’s EVB, SR-IOV or hypervisor pass-through like VM-FEX, not to mention MAC-over-IP? How about something as trivial as link aggregation with LACP? Is everyone but me (and maybe two other bloggers) happy configuring a wide range of VLANs spanning all ESXi hosts in the data center or is VMware simply not listening to the networking engineers? It looks like some people still believe every server has a very important storage adapter and a cumbersome NIC appendix.

More information

The details of VMware networking and its integration with the rest of the data center network are described in the VMware Networking Deep Dive webinar (register here or buy a recording). If you want to learn more about modern data center architectures, buy a recording of my Data Center 3.0 for Networking Engineers webinar. Both webinars are also part of the yearly subscription package.

12 comments:

  1. Lack of LACP support has been a puzzle to me for such a long time. But most lazy ESX admins don't even try to do anything beyond load-balance based on virtual port ID anyway...

  2. Port mirroring - Only between VMs :) What about traffic going from and to the real world?

  3. Ivan Pepelnjak (22 July, 2011 18:19)

    The way I understood it, you can mirror the traffic going INTO and/or OUT OF a VM, which does include all the real world traffic.

  4. Askar Kopbayev (24 July, 2011 13:01)

    LACP support is not always important. For instance, with HP Virtual Connect you can't use it on your blades with ESXi. Another good example is Load Based Teaming in the vDS, introduced in vSphere 4.1, which provides better load balancing than "route based on IP hash" and keeps your configuration simple.

  5. Ivan Pepelnjak (24 July, 2011 16:17)

    Some engineers obviously found LAG-like load sharing useful (example: single VM generates more than a single uplink's worth of data) or VMware wouldn't have implemented IP-hash-based load balancing. Having LAG-like functionality and forcing everyone to use static port channel is "somewhat suboptimal".

  6. Because HP can't deliver LACP means it isn't important? < LOL

  7. Askar Kopbayev (25 July, 2011 08:42)

    Brad, if thousands of HP clients choose the HP VC solution for vSphere, that probably means LACP support is not really important for them, doesn't it?
    Could you please provide real-life examples where LACP support is an important feature for vSphere?
    Ivan has already provided an example where one VM would generate more traffic than a single uplink can carry, which is getting less important now with more and more companies moving to 10G.

  8. Ivan Pepelnjak (25 July, 2011 11:13)

    Askar, if thousands of engineers decide to use L2 inter-DC solution, it still doesn't make it a sound design, does it 8-)

  9. Askar Kopbayev (25 July, 2011 11:27)

    Sure, Ivan. I am not saying it is the best solution, but if thousands of engineers decided to do that they might have some reasons for it and I would like to know these.

    Anyway, I am glad to learn real life situations where LACP is really important.

  10. Ivan Pepelnjak (25 July, 2011 11:50)

    Most often "decided to do that" boils down to "it's the only thing that works" not "it's what's best for my network".

    Another reason we would really need LACP is described here:
    http://blog.ioshints.info/2011/01/vswitch-in-multi-chassis-link.html

  11. If you are connected to a Cisco switch with LACP (mis)configured on the ESX-facing ports, the member ports will transition to standalone ports since ESX does not support LACP. Independent mode needs to be explicitly disabled (I believe it's "port-channel standalone-disable") to avoid the negative impact of this accidental mismatch, which can be severe: MAC flapping and MAC learning suspension on the switch if the load-balancing algorithm on the vSwitch is flow-based. In an environment where non-ESX servers use LACP, this sort of operational mistake happens and can be quite bad.

  12. Another misconfiguration occurs when a multi-homed ESX host using flow-based hashing is cabled to the wrong switch ports. LACP's System ID and Port ID checks would prevent such incorrect cabling from going unnoticed. This issue involves multiple operational mistakes, but fates tend to collide eventually.



Ivan Pepelnjak, CCIE#1354, is the chief technology advisor for NIL Data Communications. He has been designing and implementing large-scale data communications networks as well as teaching and writing books about advanced technologies since 1990. See his full profile, contact him or follow @ioshints on Twitter.