6WIND: Solving the Virtual Appliance Performance Issues

We all know that the performance of virtual networking appliances (firewalls, load balancers, routers ... running inside virtual machines) really sucks, right? Some vendors managed to offload the packet-intensive processing into the hypervisor kernel, getting way more bang for the buck, but that’s a pretty R&D-intensive undertaking.

We also know that The Real Men use The Real Hardware (ASICs and FPGAs) to get The Real Performance, right? Wrong!


Real Men use Real Hardware

It turns out the generic Intel chipset is extremely powerful ... assuming you know how to use it, and that's what 6WIND has been doing for the last decade – among other things, they have acquired extensive experience with the Intel Data Plane Development Kit (DPDK) and have written a full-blown TCP/IP stack on top of it.
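
To give you a feeling for what "knowing how to use it" means: DPDK replaces the interrupt-driven kernel path with dedicated cores that busy-poll the NIC's receive rings and process packets in bursts. Here's a minimal sketch of that run-to-completion pattern – the textbook DPDK model, not 6WIND's actual code (port and buffer-pool setup are omitted, and PORT_ID is a placeholder):

```c
#include <stdlib.h>
#include <rte_eal.h>
#include <rte_debug.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define BURST_SIZE 32
#define PORT_ID    0            /* placeholder: first DPDK-managed NIC */

/* Busy-poll the RX ring forever -- no interrupts, no kernel, no copies */
static int fast_path_loop(void *arg)
{
    struct rte_mbuf *pkts[BURST_SIZE];

    for (;;) {
        uint16_t nb_rx = rte_eth_rx_burst(PORT_ID, 0, pkts, BURST_SIZE);

        /* ... real L2/L3 processing would happen here ... */

        uint16_t nb_tx = rte_eth_tx_burst(PORT_ID, 0, pkts, nb_rx);
        while (nb_tx < nb_rx)           /* free what the TX ring refused */
            rte_pktmbuf_free(pkts[nb_tx++]);
    }
    return 0;
}

int main(int argc, char **argv)
{
    if (rte_eal_init(argc, argv) < 0)   /* hugepages, NIC probing, etc. */
        rte_exit(EXIT_FAILURE, "EAL init failed\n");
    /* rte_eth_dev_configure / queue setup / dev_start omitted for brevity */
    return fast_path_loop(NULL);
}
```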

So, what’s the big deal? It turns out nobody (including VMware) ever wanted to spend too much time getting the networking part of servers and operating systems right, so we’re burdened with a lasagna-like hodgepodge of just-good-enough-to-work layers, from device drivers through the whole Linux TCP/IP stack.

The 6WINDGate software complements the Linux networking stack with a high-performance multi-core implementation that supports all the data-plane and control-plane protocols you'll ever need (unless you're still stuck with AppleTalk). Interestingly enough, it can coexist with the TCP/IP stack in the Linux kernel – you don't have to patch or rebuild the kernel to improve the performance; the 6WINDGate software installs its own parallel stack and intercepts API calls (for example, Netlink calls).
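
In case you're wondering how a parallel stack can keep its forwarding tables in sync with the kernel's control plane without kernel patches: Linux announces every routing change over a Netlink socket, and anyone can subscribe. A hypothetical sketch of that mechanism (illustrating the Netlink API, not 6WIND's actual implementation):

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

int main(void)
{
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    if (fd < 0) { perror("socket"); return 1; }

    /* Subscribe to IPv4 routing-table change notifications */
    struct sockaddr_nl sa = {
        .nl_family = AF_NETLINK,
        .nl_groups = RTMGRP_IPV4_ROUTE,
    };
    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
        perror("bind"); return 1;
    }

    char buf[8192];
    for (;;) {
        int len = recv(fd, buf, sizeof(buf), 0);
        if (len <= 0) break;
        for (struct nlmsghdr *nh = (struct nlmsghdr *)buf;
             NLMSG_OK(nh, len); nh = NLMSG_NEXT(nh, len)) {
            if (nh->nlmsg_type == RTM_NEWROUTE)
                printf("route added -> mirror it into the fast-path FIB\n");
            else if (nh->nlmsg_type == RTM_DELROUTE)
                printf("route deleted -> purge it from the fast-path FIB\n");
        }
    }
    close(fd);
    return 0;
}
```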

Not surprisingly, you get amazing performance once you manage to squeeze the most out of the available hardware. They claim they can increase the performance of a Linux-based networking appliance by a factor of 10, and that the performance scales linearly with the number of CPU cores allocated to the 6WINDGate fast path (which does make sense – packet/flow processing is a highly parallelizable activity). They say they can share some performance results without an NDA – just contact them.
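
The linear scaling isn't magic: a multi-queue NIC hashes each flow onto one of several receive queues (receive-side scaling), every core owns exactly one queue, and the cores share no state and take no locks. A hedged sketch of that setup, using older DPDK API names (queue and pool sizes are arbitrary):

```c
#include <rte_eal.h>
#include <rte_ethdev.h>
#include <rte_lcore.h>
#include <rte_mbuf.h>

/* Each core polls its own RX queue -- no locks, no cache-line bouncing */
static int per_core_loop(void *arg)
{
    uint16_t queue = (uint16_t)(uintptr_t)arg;
    struct rte_mbuf *pkts[32];

    for (;;) {
        uint16_t n = rte_eth_rx_burst(0, queue, pkts, 32);
        uint16_t sent = rte_eth_tx_burst(0, queue, pkts, n);
        while (sent < n)
            rte_pktmbuf_free(pkts[sent++]);
    }
    return 0;
}

int main(int argc, char **argv)
{
    rte_eal_init(argc, argv);
    uint16_t nb_cores = rte_lcore_count();

    struct rte_mempool *pool = rte_pktmbuf_pool_create("mbufs", 8192,
            256, 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());

    /* Ask the NIC to spread flows across one RX queue per core (RSS) */
    struct rte_eth_conf conf = {
        .rxmode = { .mq_mode = ETH_MQ_RX_RSS },
        .rx_adv_conf = { .rss_conf = { .rss_hf = ETH_RSS_IP } },
    };
    rte_eth_dev_configure(0, nb_cores, nb_cores, &conf);
    for (uint16_t q = 0; q < nb_cores; q++) {
        rte_eth_rx_queue_setup(0, q, 512, rte_socket_id(), NULL, pool);
        rte_eth_tx_queue_setup(0, q, 512, rte_socket_id(), NULL);
    }
    rte_eth_dev_start(0);

    /* Worker cores take queues 1..n-1, the main core takes queue 0 */
    uintptr_t q = 1;
    unsigned lcore;
    RTE_LCORE_FOREACH_SLAVE(lcore)
        rte_eal_remote_launch(per_core_loop, (void *)q++, lcore);
    return per_core_loop((void *)0);
}
```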

Summary: 6WIND claims you can get up to 10x more performance out of existing commodity hardware by adding the 6WINDGate stack to a bare-metal Linux-based appliance. But can they repeat that feat in the virtual space?

Entering the Virtual World

David Le Goff, 6WIND’s product manager, was kind enough to walk me through 6WIND’s recent Cloud Infrastructure announcement. Once you manage to get past the numerous layers of marketing messages, the solution is blindingly simple: if you want to have the performance you need, you have to bypass the hypervisor:

Step#1 – Install the 6WINDGate software in the virtual appliance VM, and either use its stack directly or integrate your own protocols with its fast path API;

Step#2 – Bypass the virtual switch and connect the data-path NICs of the virtual machine directly to the physical NICs using VMDirectPath or a Xen/Hyper-V equivalent.

Obviously you need plenty of physical NICs in the server running the virtual appliances (or you use UCS blades with the Palo chipset and Adapter FEX), and you can't vMotion the appliance, but the performance gains should more than offset any inconveniences.


[Figure: you could use Cisco's Adapter FEX or an equivalent solution from other vendors (EVB/VEPA?)]

Tying physical NICs to guest VMs and bypassing the hypervisor vSwitch usually results in a loss of functionality. For example, you have to do your own VLAN tagging, either using the capabilities of the VM's software or by configuring a VLAN access port on the first-hop switch (external switch tagging – EST – in VMware lingo). Fortunately, 6WINDGate does include VLAN and MPLS tagging as well as GRE tunneling, and I wouldn't mind a few extra configuration steps in exchange for a tenfold increase in performance.
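
For the curious: "doing your own VLAN tagging" amounts to inserting four bytes – the 0x8100 TPID plus the tag control information – after the source MAC address of every outgoing frame. A minimal sketch of the idea (my illustration, not 6WINDGate code):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>          /* htons() */

#define ETH_ALEN    6
#define TPID_8021Q  0x8100

/* Insert an 802.1Q tag after the source MAC. The buffer must have at
 * least 4 bytes of headroom beyond frame_len. Returns the new length. */
static size_t vlan_tag_insert(uint8_t *frame, size_t frame_len,
                              uint16_t vlan_id, uint8_t priority)
{
    const size_t tag_off = 2 * ETH_ALEN;        /* after dst + src MAC */
    uint16_t tpid = htons(TPID_8021Q);
    uint16_t tci  = htons(((uint16_t)priority << 13) | (vlan_id & 0x0FFF));

    /* Shift EtherType + payload right by 4 bytes to open the gap */
    memmove(frame + tag_off + 4, frame + tag_off, frame_len - tag_off);
    memcpy(frame + tag_off,     &tpid, 2);
    memcpy(frame + tag_off + 2, &tci,  2);
    return frame_len + 4;
}

int main(void)
{
    uint8_t frame[1518 + 4] = {0};  /* dst MAC, src MAC, EtherType, data */
    size_t len = vlan_tag_insert(frame, 60, 100, 0);   /* tag as VLAN 100 */
    return len == 64 ? 0 : 1;
}
```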

Finally, 6WIND can go one step further: multiple VMs with the 6WINDGate stack running in the same hypervisor can bypass the hypervisor vSwitch when talking to each other. A VM using 6WINDGate can still communicate through the vSwitch with other VMs or with the orchestration/management software, while having high-speed communication with the other 6WINDGate-equipped VMs.
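
The announcement doesn't spell out how the VM-to-VM bypass works; my guess (and it's only a guess) would be some form of shared-memory packet ring mapped into both guests. DPDK's lockless rte_ring – shown here inside a single process, purely to illustrate the producer/consumer semantics – is the kind of primitive such a channel would be built on:

```c
#include <rte_eal.h>
#include <rte_lcore.h>
#include <rte_ring.h>

int main(int argc, char **argv)
{
    rte_eal_init(argc, argv);

    /* Single-producer/single-consumer ring: lock-free on the hot path */
    struct rte_ring *vm2vm = rte_ring_create("vm2vm", 1024,
            rte_socket_id(), RING_F_SP_ENQ | RING_F_SC_DEQ);

    static char dummy_pkt[64];          /* stand-in for a packet buffer */
    void *pkt = dummy_pkt;

    rte_ring_enqueue(vm2vm, pkt);       /* the "sending" VM side */
    rte_ring_dequeue(vm2vm, &pkt);      /* the "receiving" VM side */
    return 0;
}
```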

Intrigued? Interested? See a perfect fit between your cloudy needs and 6WIND’s software? Go talk to David.

11 comments:

  1. > A VM using 6WINDGate can still communicate through vSwitch with other VMs or with the orchestration/management software while having high-speed communication with other VMs using 6WINDGate.

    What about intercepting 6W to 6W communications, for example, for IPS/IDS purposes?
  2. Hi!
    Good question!
    Actually, we have all the open & standard APIs to integrate with whatever OSS you use for network communication troubleshooting or security purposes. In the meantime, we're working with some of the players in those particular layers to hairpin the logs.
    Thanks!
  3. This would be much more interesting if the network connectivity of the other VMs was being funneled through a 6WIND virtual appliance. Otherwise, this has very limited usefulness.
  4. And this is exactly what we are doing... what did you understand?
  5. Interesting...

    There's another company working on similar kinds of network acceleration. Linerate Systems. http://lineratesystems.com/
  6. I thought so based on your comment on an earlier post, but this post seems to say that the 6WIND stack should be installed directly into each VM. It's not all that clear either way, but having a very strong virtualization background, I'm reading into it to try to figure out the details, and a good portion of the wording had me leaning towards it being installed into every VM. That would make it very limiting, considering the dependency on physical NICs as well as how that would break VM mobility.

    Anyway, if VM traffic is being funneled through a 6WIND virtual appliance of sorts, then this gets interesting. I imagine such an appliance would need to run on every host, sort of like an extension of the virtual switch. I've seen that setup before, where each host has multiple vSwitches: one with the physical uplinks and the appliance, and the other with the appliance (providing switching/routing capabilities) and the production VMs.

    Does this setup interfere with support for features such as vMotion/live migration?
  7. The way I understood David, you have to install the 6WIND stack into each VM (was that not clear from the article?)

    Obviously having direct connectivity between VM and physical HW (NIC) breaks VM mobility.

    A 6WIND appliance would be a cool idea, but it would still require a 6WIND stack in the "fast path" VMs. A decent stack in the hypervisor would be the right solution ;)
  8. To make sure we are clear here: you need our stack only for high-performance purposes (virtual appliances such as firewalls, ADCs, gateways...). We still keep the communication with the other VMs that have no 6WINDGate (via e1000, VMXNET3, ...).
    There are two concerns with SW appliances running over virtualization:
    - the guest OS kernel does not scale linearly over multiple cores;
    - the hypervisor kernel brings performance penalties as well.

    We bring performance improvements in both areas. Obviously vMotion may be more difficult to implement, but the question is: do you need a virtual appliance vMotioned when those VAs could (and should) be deployed with HA instead?
  9. Illustration:
    A use case with 15 VMs protected by a virtual firewall on a host would require only one 6WINDGate stack – on the vFirewall VA – that's all. We don't touch the other VMs. :)
    Hope that makes it clearer!
  10. "A decent stack in the hypervisor would be the right solutions" > 4 hypervisors on the market = 4 virtual firewall. Regular updates from the hypervisor network APIs > bunch of virtual firewalls =-X
  11. Dedicated and optimised network CPUs exist too

    http://www.tilera.com/products/processors/TILE-Gx_Family

    and they are supported by ... 6WIND!

    http://www.tilera.com/about_tilera/press-releases/6wind-releases-packet-processing-software-optimized-tilera%E2%80%99s-tilepro64-p