Cisco & VMware: Merging the Virtual and Physical NICs

Virtual (soft) switches present in almost every hypervisor significantly reduce the performance of high-bandwidth virtual machines (measurements done by Cisco a while ago indicate you could get up to 38% more throughput if you tie VMs directly to hardware NICs). However, as I argued in my “Soft Switching Might Not Scale, But We Need It” post, we need hypervisor switches to isolate the virtual machines from the vagaries of the physical NICs.

Engineering gurus from Cisco and VMware have yet again proven me wrong – you can combine VMDirectPath and vMotion if you use VM-FEX.


VM-FEX architecture

This is (approximately) how that marvel of engineering works (and you’ll find more details in this presentation):

  • You have to configure VM-FEX (which means you can use this trick only if you have a UCS system with the Palo chipset in the server blades).
  • The Palo chipset emulates the registers and data structures used by the VMXNET3 paravirtualized device driver (and most VMs use VMXNET3 today due to its performance benefits). You can thus link a VM using the VMXNET3 device driver directly to the physical hardware presented to the server by the Palo chipset (using VMDirectPath, for example).

Cisco was using VMDirectPath in the VM-FEX performance measurements; in most VM-FEX deployments you’d use the passthrough VEM to enable vMotion of the VMs using VM-FEX.

  • vSphere 5 introduced support for vMotion combined with VMDirectPath for VM-FEX NICs. This enhancement is crucial, as it allows a VM using a VM-FEX NIC without a VEM to be vMotioned to another host.
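On the switch side, the VM-FEX setup boils down to enabling the virtualization feature set and defining port profiles that get pushed to vCenter as port groups. A rough sketch in NX-OS-style syntax (a hedged illustration only: exact commands differ between UCS Manager and Nexus 5500 software releases, and the profile name and VLAN are invented):

```
! Hypothetical sketch -- verify syntax against the platform documentation.
install feature-set virtualization
feature-set virtualization

port-profile type vethernet VM-DATA
  switchport mode access
  switchport access vlan 100
  state enabled
```

Each VM vNIC attached to such a port profile appears as a virtual Ethernet (vethernet) interface on the upstream switch, which is what makes per-VM features like SPAN possible.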

The trick VMware’s engineers used is conceptually very simple (although I’m positive there are numerous highly convoluted implementation details): once you get a request to vMotion a VM, you freeze the VM, copy the physical registers of the VM-FEX VIC into the data structures used by the hypervisor kernel implementation of the VMXNET3 device, disconnect the VM from the physical hardware, and let it continue working through the virtual VMXNET3 device and the VEM. Once the VM has been moved to another ESX host, the contents of the VMXNET3 virtual device registers are copied to the physical NIC, and the VM yet again regains full access to the physical hardware.
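Conceptually, the handoff is nothing more than a state copy between two implementations of the same device model. A toy Python sketch of the idea (all class and register names are invented for illustration; the real code lives deep in the vmkernel and has to deal with ring buffers, in-flight DMA, interrupts and much more):

```python
class PhysicalVic:
    """Palo/VIC hardware emulating VMXNET3 registers in silicon."""
    def __init__(self):
        # Hypothetical register contents, for illustration only
        self.registers = {"rx_ring_base": 0x1000, "tx_ring_base": 0x2000}

class SoftVmxnet3:
    """The hypervisor's software model of the same VMXNET3 device."""
    def __init__(self):
        self.registers = {}

def handoff_to_software(vic, soft_dev):
    """At the freeze point: copy hardware register state into the software
    device model, then detach the VM from the physical VIC. The guest driver
    keeps seeing an identical VMXNET3 device, now emulated by the VEM."""
    soft_dev.registers = dict(vic.registers)
    vic.registers = {}   # the VM no longer owns the physical NIC
    return soft_dev

vic = PhysicalVic()
soft = handoff_to_software(vic, SoftVmxnet3())
```

The reverse copy (software registers back into the destination host’s VIC) is what restores hypervisor-bypass operation after the move.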

Was it all just alphabet soup?

Check out my virtualization webinars – they will help you get a decent foothold in the brave new world of server and network virtualization.

13 comments:

  1. Given the speed and CPU-usage reduction of this solution, what would be the compelling reason for using a Nexus 1000v instead, provided you are using UCS? Inter-VM/intra-host traffic? Would the gains of VMDirectPath + VM-FEX give you performance comparable to intra-host VM traffic?

  2. Ivan Pepelnjak, 02 April 2012, 08:24

    Good question ... can't see too many benefits, as the VM vNIC gets directly connected to the 6100.

    Intra-host traffic is obviously faster than traffic going through VM-FEX (and back). The question is: how often and how much intra-host traffic would you have in your environment? It all depends on the applications.

  3. - All the virtual firewalls that have to secure all the other virtual machines are a good example of the need for intra-host VM-to-VM communication.

    - The whole virtual ADC market is another use case.

    - Actually, all the intermediate VMs that traffic passes through before reaching the application VMs.

  4. Jonathan Topping, 03 April 2012, 16:48

    Cisco isn't too great at describing this with a non-UCS Cisco architecture (Nexus 5000s); is it even supported? Is it supported from an N5k through an N2k FEX to VM-FEX?

  5. Ivan Pepelnjak, 03 April 2012, 16:51

    Adapter FEX works on Nexus 5000. VM-FEX not yet (UCS only).

  6. VM-FEX capability is not limited to the UCS 6100 or 6200; with a C210 server and a P81E card connected to a Nexus 5500, VM-FEX can be configured.

  7. Ivan Pepelnjak, 04 April 2012, 13:33

    You're probably talking about Adapter FEX support introduced in NX-OS 5.1(3)N1(1). If Nexus 5000 really supports VM-FEX I'd appreciate a link to the corresponding documentation. Thank you!

  8. VM-FEX is supported by the Cisco Nexus 5500 Platform running Cisco NX-OS Release 5.1(3)N1(1) or later.

    http://www.cisco.com/en/US/docs/switches/datacenter/nexus5000/sw/layer2/513_n1_1/b_Cisco_n5k_layer2_config_gd_rel_513_N1_1_chapter_010101.html

    It won't be too much longer before VM-FEX is being implemented on non-Cisco NICs, and later on non-Cisco servers...

  9. You can use the 1000v for more than just switching: SPAN & NetFlow, DSCP-to-CoS marking (for CUCM, for example), plus all of the add-ons to the 1000v such as the Virtual Security Gateway & vWAAS, to name a few. Third-party extensions to the 1000v are coming as well.

    Even with VM-FEX you still have to install a VEM on ESXi. :)

  10. Ivan Pepelnjak, 06 April 2012, 06:57

    Missed that one. Properly impressed ;) Thank you!

  11. Ivan Pepelnjak, 06 April 2012, 07:01

    SPAN & Netflow: wouldn't they work with VM-FEX as well, configured on NX5K? Same for DSCP-to-CoS marking? vPath-based services are obviously a totally different story; there you need the Nexus 1000v.

    Finally, I was told the VM-FEX VEM is not exactly the same thing as Nexus 1000v VEM (but you do need a kernel module because you have to modify VDS behavior).

  12. Yes, no and maybe...

    A VM-FEX virtual Ethernet interface can be configured as a SPAN source and destination just like a physical interface can, but the 5Ks and FIs have limitations on SPAN, and depending on the SPAN requirements this could cause a problem. With 1000v, a VSM supports 64 SPAN/ERSPAN sessions across all installed VEMs. You can send this traffic to the physical network or keep it "virtual" by using a virtual traffic analyzer such as the NAM virtual service blade on the Nexus 1010 "appliance".

    As for NetFlow, the 5Ks and FIs don't support it, so NetFlow requirements are another use case for the 1000v.

    QoS on the 5Ks and FIs doesn't support matching/marking L3 DSCP values because they are L2 switches. Once L3 capability is added (available now for the 55XX, soon for the 62XX), DSCP matching/marking becomes possible. I learned this while prepping UCS for a Unified Communications install: the UC applications mark traffic with DSCP values, and Cisco recommends honoring those values at L2 by translating them to CoS using the 1000v.
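    The DSCP-to-CoS translation mentioned above would look roughly like this on the 1000v (an MQC-style sketch only; the class and policy names are invented, so verify the exact commands against the 1000v QoS configuration guide):

```
! Hypothetical sketch -- names invented, syntax approximate
class-map type qos match-all UC-VOICE
  match dscp ef
policy-map type qos DSCP-TO-COS
  class UC-VOICE
    set cos 5
```

    The policy would then be attached to the relevant port profile with a service-policy statement.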

    I am not 100% certain, but from when I looked at this for UCS a few months back I am pretty sure they are the same VEM. To verify, I'll download the latest releases of the 1000v and VM-FEX and see how they compare.

  13. Great Feature.
    Complete VM-FEX setup videos at http://ucsguru.com



Ivan Pepelnjak, CCIE#1354, is the chief technology advisor for NIL Data Communications. He has been designing and implementing large-scale data communications networks as well as teaching and writing books about advanced technologies since 1990. See his full profile, contact him or follow @ioshints on Twitter.