If you watched the Network Field Day videos, you might have noticed an interesting (somewhat one-sided) argument I had with Sunay Tripathi, CTO and co-founder of Pluribus Networks (start watching at around 32:00 to get the context). Let’s try to set the record straight.
Data plane performance of overlay virtual networks run with well-implemented virtual switches does not vary based on number of tenants, remote hypervisors (VTEPs), or virtual machines (MAC addresses or IP host routes). Hypervisor-based overlay virtual networks might have other scalability concerns, but forwarding performance is not one of them.
The crucial part of my argument was the difference between performance (how fast can you do something in software versus hardware) and scalability (how large can something grow). That nuance somehow got lost in the translation.
Early host-based VXLAN implementations actually had significant performance limitations (when someone calls 1Gbps ludicrous speed in 2014 you have to wonder how bad it was before that), and VMware clearly documented them in their technical white paper… but that topic will have to wait for another blog post.
Overlay Virtual Networking Scalability
In November 2014 I did a 2-hour public webinar on scalability challenges of overlay virtual networks, so you might want to watch that one (or at least the Distributed Data Plane video) first.
Disclosure: As you’ll notice in the introduction to each video, Nuage Networks sponsored the webinar, but I would never agree to work on a sponsored webinar if I didn’t believe in what I would be telling you. It’s impossible to buy integrity once you compromise it.
The scalability aspect of overlay virtual networking we’re discussing here is the scalability of the hypervisor data plane: how many MACs, IPs and remote VTEPs can a hypervisor have, and what’s the impact of a large-scale environment on forwarding performance.
Layer-2 forwarding in hardware or software is extremely simple:
- Extract destination MAC address from the packet;
- Look up destination MAC address in a hash table. Hash table lookups take almost constant time when the table is sparsely populated, which is easier to achieve in software than in hardware;
- Send the packet to the output port specified in the hash table entry, potentially adding tunnel encapsulation (VXLAN, PBB, VPLS…).
With a proper implementation, the number of MAC addresses has absolutely no impact on the forwarding performance until the MAC hash table overflows, and the number of VTEPs doesn’t matter at all (the VTEP information needed to build encapsulation headers is referenced from the MAC address entries).
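To make the three steps above a bit more tangible, here's a minimal Python sketch of hash-based layer-2 forwarding. It's an illustration under my own assumptions, not the code of any real virtual switch: the names `Vtep`, `MacEntry` and `forward_l2`, the dictionary-based frame, and the flooding of unknown destinations are all made up for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Vtep:
    """Hypothetical remote tunnel endpoint (a remote hypervisor)."""
    ip: str


@dataclass(frozen=True)
class MacEntry:
    """Hypothetical MAC table entry: output port plus optional VTEP reference."""
    port: str
    vtep: Optional[Vtep] = None  # None means the destination is a local VM


def forward_l2(mac_table: dict, frame: dict) -> Tuple[str, Optional[str]]:
    """Return (output port, tunnel destination IP or None) for a frame."""
    # Step 1: extract the destination MAC address.
    dst_mac = frame["dst_mac"]
    # Step 2: a single hash lookup, independent of how many VTEPs exist.
    entry = mac_table.get(dst_mac)
    if entry is None:
        # Assumption: unknown destinations are flooded.
        return ("flood", None)
    # Step 3: send to the output port, adding encapsulation if needed.
    if entry.vtep is None:
        return (entry.port, None)
    return (entry.port, entry.vtep.ip)


remote = Vtep(ip="192.0.2.10")
mac_table = {
    "00:00:5e:00:53:01": MacEntry(port="vnic1"),
    "00:00:5e:00:53:02": MacEntry(port="uplink", vtep=remote),
}
```

Notice that the lookup never iterates over VTEPs. Many MAC entries can point to the same `Vtep` object, so adding remote hypervisors changes the memory footprint, not the number of steps per packet.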
Layer-3 forwarding is similar to layer-2 forwarding, but requires more complex data structures… or not. In the case of distributed L3 forwarding one could combine ARP entries and connected subnets into host routes (that’s what most ToR switches do these days) and do a simple hash-based lookup on destination IP addresses. Longest-prefix matches (for non-connected destinations) would still require a walk down an optimized tree structure.
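Here's a rough sketch of that idea: try an exact-match hash lookup on host routes first, and fall back to a longest-prefix match only for non-connected destinations. The `L3Table` class and its methods are hypothetical names of mine, the example is IPv4-only, and the one-bit-per-level binary trie is deliberately naive. Real implementations use far more optimized tree structures.

```python
import ipaddress


class L3Table:
    """Hypothetical per-tenant L3 table: host-route hash plus a binary trie."""

    def __init__(self):
        self.host_routes = {}  # int(IPv4 address) -> next hop
        self.trie = {}         # nested dicts keyed by bit (0/1); "nh" = next hop

    def add_host(self, ip, next_hop):
        # ARP entries and connected subnets collapse into exact-match host routes.
        self.host_routes[int(ipaddress.IPv4Address(ip))] = next_hop

    def add_prefix(self, prefix, next_hop):
        net = ipaddress.IPv4Network(prefix)
        addr = int(net.network_address)
        node = self.trie
        for i in range(net.prefixlen):
            bit = (addr >> (31 - i)) & 1
            node = node.setdefault(bit, {})
        node["nh"] = next_hop

    def lookup(self, ip):
        addr = int(ipaddress.IPv4Address(ip))
        # Fast path: one hash lookup for directly connected hosts.
        hit = self.host_routes.get(addr)
        if hit is not None:
            return hit
        # Slow path: walk the trie, remembering the longest match seen so far.
        node = self.trie
        best = node.get("nh")
        for i in range(32):
            node = node.get((addr >> (31 - i)) & 1)
            if node is None:
                break
            best = node.get("nh", best)
        return best


table = L3Table()
table.add_host("10.1.1.5", "vm-port-3")
table.add_prefix("0.0.0.0/0", "default-gw")
table.add_prefix("10.0.0.0/8", "gw-a")
table.add_prefix("10.2.0.0/16", "gw-b")
```

The trie walk is bounded by the address length (32 steps in this naive version), not by the number of prefixes in the table.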
It’s obvious that the number of tenants present on a hypervisor has zero impact on performance (every tenant has an independent forwarding table), the number of hosts in a tenant virtual network has almost no impact on performance (see the host routes and layer-2 forwarding above), and the longest-prefix match can usually be done in two to four lookups (or more for IPv6). In many implementations the number of lookups doesn’t depend on the size of the forwarding table.
From the forwarding performance standpoint a properly implemented virtual switch remains (almost) infinitely scalable. Suboptimal implementations might have scalability challenges, and every implementation eventually runs into controller scalability issues, which some vendors like Juniper (Contrail), Nuage (VSP) and Cisco (Nexus 1000V) solved with a scale-out controller architecture.
Scalability of hypervisor-based overlay virtual networking might have been an issue in the early days of the technology. Talking about its challenges in 2015 is mostly FUD (physical-to-virtual connectivity is a different story).
Finally, the hardware table sizes (primarily the MAC and ARP table sizes) limit the scalability of hardware-based forwarding. Software-based forwarding has significantly higher limits (how many MAC addresses can you cram into 1GB of RAM?).
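A back-of-the-envelope answer to that question, with loudly stated assumptions: I'm guessing 64 bytes per MAC table entry (MAC address, output port, VTEP reference and hash table overhead) and a half-empty hash table to keep it sparsely populated. Both numbers are my own guesses, not measurements of any product.

```python
GIB = 2**30         # 1 GB of RAM, taken as 1 GiB
ENTRY_BYTES = 64    # assumed size of one MAC entry including overhead
LOAD_FACTOR = 0.5   # assumed: keep the hash table half empty


def mac_capacity(ram_bytes=GIB, entry_bytes=ENTRY_BYTES, load_factor=LOAD_FACTOR):
    """Estimate how many MAC entries fit into the given amount of RAM."""
    slots = ram_bytes // entry_bytes
    return int(slots * load_factor)


print(f"{mac_capacity():,}")  # prints 8,388,608
```

Even if those guesses are off by a factor of ten, the result is still several hundred thousand MAC addresses per hypervisor.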
Want to know more?
- Read all the linked-to blog posts. Repeat for 2-3 levels of indirection ;)
- Watch the VMware NSX Architecture and Scaling Overlay Virtual Networks webinar;
- For even more details, watch the Overlay Virtual Networking webinar (which includes packet walks for all major hypervisor-based overlay virtual networking solutions).
More disclosure: Pluribus Networks presented @ NFD9. Presenting companies indirectly cover part of my travel expenses, but that never stopped me from expressing my own opinions.