VXLAN and EVB questions
Wim (@fracske) De Smet sent me a whole set of very good VXLAN- and EVB-related questions that might be relevant to a wider audience.
If I understand you correctly, you think that VXLAN will win over EVB?
I wouldn’t say they are competing directly from the technology perspective. There are two ways you can design your virtual networks: (a) smart core with simple edge (see also: voice and Frame Relay switches) or (b) smart edge with simple core (see also: Internet). EVB makes option (a) more viable, VXLAN is an early attempt at implementing option (b).
When discussing virtualized networks I consider the virtual switches in the hypervisors the network edge and the physical switches (including top-of-rack switches) the network core.
Historically, option (b) (smart edge with simple core) has been proven to scale better ... the largest example of such architecture is allowing you to read my blog posts.
Is it correct that EVB isn't implemented yet?
Actually it is – IBM has just launched its own virtual switch for VMware ESX (a competitor to Nexus 1000V) that has limited EVB support (the way I understand the documentation, it seems to support VDP, but not the S-component).
But VXLAN has its limitations – for example, only VXLAN-enabled VMs will be able to speak to each other.
Almost correct. VMs are not aware of VXLAN (they are thus not VXLAN-enabled). From VM NIC perspective the VM is connected to an Ethernet segment, which could be (within the vSwitch) implemented with VLANs, VXLAN, vCDNI, NVGRE, STT or something else.
At the moment, the only implemented VXLAN termination point is the Nexus 1000V, which means that only VMs residing within ESX hosts running Nexus 1000V can communicate over VXLAN-implemented Ethernet segments. Some vendors are hinting they will implement VXLAN in hardware (switches), and Cisco already has the required hardware in the Nexus 7000 (because VXLAN uses the same header format as OTV).
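Since VXLAN shares its header format with OTV, the encapsulation itself is easy to picture. As an illustration (a sketch of the header layout from the VXLAN IETF draft, not any vendor's implementation), the 8-byte VXLAN header can be built like this:

```python
import struct

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header (layout from the VXLAN draft).

    The flags byte has the I bit (0x08) set to mark a valid VNI;
    the 24-bit VNI sits in bits 8-31 of the second word; all other
    bits are reserved and set to zero.
    """
    assert 0 <= vni < 2**24, "VNI is a 24-bit value"
    flags = 0x08 << 24              # I flag in the top byte, 24 reserved bits
    return struct.pack("!II", flags, vni << 8)

# VNI 5000 (0x1388) lands in bytes 4-6 of the header
hdr = vxlan_header(5000)
```

The header is then carried inside UDP over the IP core, which is exactly why any existing IP transport works as the underlay.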
VXLAN encapsulation will also take some CPU cycles (thus impacting your VM performance).
While VXLAN encapsulation will not impact VM performance per se, it will eat CPU cycles that could be used by VMs. If your hypervisor host has spare CPU cycles, the VXLAN overhead shouldn't matter; if you're pushing the host to its limits, you might experience a performance impact.
However, the elephant in the room is the TCP offload. It can drastically improve I/O performance (and reduce CPU overhead) of network-intensive VMs. The moment you start using VXLAN, TCP offload is gone (most physical NICs can’t insert the VXLAN header during TCP fragmentation), and the overhead of the TCP stack increases dramatically.
If your VMs are CPU-bound you might not notice; if they generate lots of user-facing data, lack of TCP offload might be a killer.
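An easy-to-miss consequence of the encapsulation is the extra per-packet overhead, which also shrinks the MTU available to the guest. A back-of-the-envelope sketch (assuming an IPv4 underlay and untagged outer frames):

```python
# Rough per-packet overhead of VXLAN over an IPv4 underlay (no 802.1Q
# tags), and the guest-visible IP MTU that falls out of it.
OUTER_IP  = 20   # outer IPv4 header
OUTER_UDP = 8    # outer UDP header
VXLAN_HDR = 8    # VXLAN header (flags + 24-bit VNI)
INNER_ETH = 14   # encapsulated Ethernet header of the VM frame

ENCAP_OVERHEAD = OUTER_IP + OUTER_UDP + VXLAN_HDR + INNER_ETH  # 50 bytes

def guest_mtu(underlay_mtu: int = 1500) -> int:
    """IP MTU a guest VM can use without fragmenting underlay packets."""
    return underlay_mtu - ENCAP_OVERHEAD

print(guest_mtu())      # 1450 on a stock 1500-byte core
print(guest_mtu(9000))  # a jumbo-frame core leaves plenty of headroom
```

Every encapsulated packet carries those 50 bytes, and every one of them has to pass through the hypervisor's TCP stack without hardware assistance.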
I personally see VXLAN as an end-to-end solution where we can no longer interact with the network infrastructure. For example, how would these VMs be able to connect to the first-hop gateway?
Today you can use VXLAN to implement “closed” virtual segments that can interact with the outside world only through VMs with multiple NICs (a VXLAN-backed NIC and a VLAN-backed NIC), which makes it perfect for environments where firewalls and load balancers are implemented with VMs (example: VMware’s vCloud with vShield Edge and vShield App). As said above, VXLAN termination points might appear in physical switches.
With EVB we would still have full control and could do the same things we're doing today on the network infrastructure, and the network would be able to automatically provision the correct VLANs on the correct ports.
That’s a perfect summary. EVB enhances today’s VLAN-backed virtual networking infrastructure, while VXLAN/vCDNI/NVGRE/STT completely change the landscape.
Is then the only advantage of VXLAN that you can scale better because you don't have the VLAN limitation?
VXLAN and other MAC-over-IP solutions have two advantages: they allow you to break through the VLAN barrier (but so do vCDNI, Q-in-Q or Provider Backbone Bridging), but they also scale better because the core network uses routing, not bridging. With MAC-over-IP solutions you don’t need novel L2 technologies (like TRILL, FabricPath, VCS Fabric or SPB), because they run over IP core that can be built with existing equipment using well-known (and well-tested) designs.
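To put the "VLAN barrier" in numbers, a quick illustration of the 12-bit VLAN ID space versus the 24-bit VXLAN network identifier (VNI):

```python
# 802.1Q carries a 12-bit VLAN ID; VXLAN carries a 24-bit VNI.
VLAN_ID_BITS = 12
VNI_BITS = 24

usable_vlans = 2**VLAN_ID_BITS - 2   # IDs 0 and 4095 are reserved
vxlan_segments = 2**VNI_BITS         # ~16 million virtual segments

print(usable_vlans)                  # 4094
print(vxlan_segments)                # 16777216
```

The segment count alone doesn't make the solution scalable, though; it's the routed IP core that removes the bridging-domain limits.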
If you need to know more about network virtualization and data center technologies, you might find these webinars relevant:
- Start with Introduction to Virtualized Networking;
- Generic data center technologies and designs are described in Data Center 3.0 for Networking Engineers; large-scale network designs are covered in the Data Center Fabric Architectures webinar.
- Learn everything there is to know about VMware’s vSwitch and other VMware-related networking solutions in VMware Networking Deep Dive.
- Want to know more about virtual network scalability? Check out the Cloud Computing Networking webinar.
And don’t forget: you get access to all these webinars (and numerous others) if you buy the yearly subscription.
All in all, it doesn't seem like a stable, working concept to me right now, except in the niche cases you've mentioned (virtual firewalls).
There seem to be more problems than just TCP offload:
1. Number of multicast groups in the physical network. The number of VXLAN segments you support increases the number of multicast groups your networking gear needs to support.
2. When you are using multicast, the convergence time after a VM move is still a function of your physical network.
3. Secure group joins and PIM-BiDir support are missing from the majority of networking gear today. I think the security part will be swept under the carpet until it becomes a real issue, and PIM-BiDir support will become common only if VXLAN catches on.
4. TCP offload details. Each of the features that save CPU cycles is gone, or you need a new NIC:
a. LSO and LRO
b. IP, UDP and TCP checksums, both generation and verification
Again, my guess is that this will be swept under the carpet.
c. Path MTU. This will probably be dealt with by pre-configuring a lower MTU in the guest VMs, and likewise swept under the carpet.
5. VXLAN still aspires to provide multiple VLAN-like constructs to guest VMs running on multiple hosts. The details of how the network is simulated, and which networking protocols are required to be supported, are left open to interpretation.
6. This one has been addressed now by Embrane, but there was a lack of load balancers and firewalls that need to go along with the VXLAN solution; an IPsec gateway is another example. However, I think these are opportunities if the market really catches onto this trend.
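To illustrate point 1 above: a VXLAN termination point has to map each VNI onto a multicast group for flooded traffic, and that mapping determines how much (*,G) state the core has to carry. A hypothetical mapping function (the base group address and pool size are made-up values, not from any spec):

```python
import ipaddress

def vni_to_group(vni: int, base: str = "239.1.0.0",
                 pool_size: int = 256) -> str:
    """Map a 24-bit VNI onto a pool of multicast groups (hypothetical).

    With a 1:1 mapping (pool_size >= number of VNIs), every VXLAN
    segment needs its own multicast state in the core.  Hashing many
    VNIs into a small pool reduces that state, but causes unnecessary
    flooding between segments that happen to share a group.
    """
    base_int = int(ipaddress.IPv4Address(base))
    return str(ipaddress.IPv4Address(base_int + vni % pool_size))

print(vni_to_group(5000))        # 5000 % 256 == 136 -> 239.1.0.136
print(vni_to_group(5000 + 256))  # collides with VNI 5000's group
```

Either way, the trade-off (per-segment multicast state versus cross-segment flooding) lands on the physical network.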
The VDP-based solution avoids most of these issues, so I'm not sure why someone would want to use VXLAN in an already-deployed data center when it will result in lower performance and throughput.
I see that STT avoids some of the TCP offload issues, but it seems like a clever hack. NVGRE avoids the reliance on multicast in the network, but still has the same TCP offload problems.
I think that without a NIC that supports VXLAN (Cisco will surely do this to differentiate their servers and disrupt the market), moving to VXLAN will be a disaster for customers.
Again, this is opinionated, but I would like to know your thoughts on each of these...
Keep in mind that VXLAN can be implemented in physical switches. This way, you can continue to use your paravirtualized TCP-offload NIC, and still get the scalability benefits of VXLAN.
VXLAN improves scalability in several ways: it gets you past the 4096-VLAN limit, avoids the scaling limits of core MAC address tables, provides a multipath fabric, avoids spanning tree, and reduces the scope of broadcasts.
Finally, to route out of a VXLAN segment, you can either go through a multi-VNIC guest (as identified in the article), or, your friendly neighborhood top-of-rack switch can serve as the default gateway for a VXLAN and route unencapsulated traffic up and out, for extremely high performance. Of course, if you need FW/LB/NAT, then your friendly neighborhood top-of-rack switch might need an L4-7 education.
CTO and SVP Software Engineering
Arista Networks, Inc.
Am I right in understanding that your "VXLAN in physical switches helps you retain TCP offload" statement refers to a design where the hypervisor hosts would use VLANs and the VXLAN encapsulation would be done in the switches? That's definitely an interesting proposal, but faces the same "lack of control plane" problems as any other non-EVB proposal.
And I'm anxiously waiting for a public announcement of VXLAN support in physical switches 8-)
Just to let you know that EVB (with VEPA and VDP support) has been implemented in Junos 12.1 from Juniper Networks.