Layer-2 and Layer-3 Switching in VMware NSX

All overlay virtual networking solutions look similar from far away: many provide layer-2 segments, most of them have some sort of distributed layer-3 forwarding, gateways to the physical world are ubiquitous, and you might find security features in some products.

The implementation details (usually hidden behind the scenes) vary widely, and I’ll try to document at least some of them in a series of blog posts, starting with VMware NSX.

Layer-2 forwarding

VMware NSX supports traditional layer-2 segments with proper flooding of BUM (Broadcast, Unknown unicast, Multicast) frames. The NSX controller downloads forwarding entries to individual virtual switches, either through OpenFlow (NSX for multiple hypervisors) or a proprietary protocol (NSX for vSphere). The forwarding entries map destination VM MAC addresses to destination hypervisor (or gateway) IP addresses.

On top of static forwarding entries downloaded from the controller, virtual switches perform dynamic MAC learning for MAC addresses reachable through layer-2 gateways.
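The layer-2 behavior described above can be sketched as a small lookup table. This is a toy model, not NSX code: the class name `VirtualSwitch`, its methods, and the rule that controller-installed entries take precedence over learned ones are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


def is_multicast(mac):
    # The group bit is the least-significant bit of the first octet;
    # the broadcast address ff:ff:ff:ff:ff:ff has it set as well.
    return int(mac.split(":")[0], 16) & 1 == 1


@dataclass
class VirtualSwitch:
    """Toy MAC table for one layer-2 segment on one hypervisor (hypothetical)."""
    # Entries pushed by the controller: VM MAC -> hypervisor/gateway underlay IP.
    static: dict = field(default_factory=dict)
    # Entries learned from traffic arriving through layer-2 gateways.
    learned: dict = field(default_factory=dict)
    # Underlay IPs of the other nodes participating in the segment.
    flood_list: list = field(default_factory=list)

    def install(self, mac, underlay_ip):
        """Controller-driven entry."""
        self.static[mac.lower()] = underlay_ip

    def learn(self, src_mac, underlay_ip):
        """Dynamic learning; assumed not to override controller entries."""
        src_mac = src_mac.lower()
        if src_mac not in self.static:
            self.learned[src_mac] = underlay_ip

    def lookup(self, dst_mac):
        """Return the list of underlay IPs the frame should be sent to."""
        dst_mac = dst_mac.lower()
        if is_multicast(dst_mac):              # broadcast or multicast
            return list(self.flood_list)
        for table in (self.static, self.learned):
            if dst_mac in table:
                return [table[dst_mac]]
        return list(self.flood_list)           # unknown unicast
```

A known MAC address resolves to a single underlay IP address, while broadcast, multicast and unknown unicast frames all fall back to the flood list, which is the BUM flooding behavior mentioned earlier.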

Layer-3 forwarding

NSX implements a distributed forwarding model with shared gateway IP and MAC addresses, very similar to optimal IP forwarding offered by Arista or Enterasys. NSX virtual switches aren’t independent devices, so they don’t need independent IP addresses like physical ToR switches.
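To illustrate what shared gateway IP and MAC addresses mean in practice, here is a minimal sketch in which every hypervisor answers ARP requests for the gateway address locally and with the same MAC address. The names `LogicalGateway`, `Hypervisor` and `answer_arp` are hypothetical and do not correspond to any NSX component.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalGateway:
    """One tenant subnet's default gateway, identical on every host."""
    ip: str
    mac: str


class Hypervisor:
    """Hypothetical host that holds a copy of every logical gateway."""

    def __init__(self, name, gateways):
        self.name = name
        self.gateways = {gw.ip: gw for gw in gateways}

    def answer_arp(self, target_ip):
        """Reply locally if the target is a shared gateway address."""
        gw = self.gateways.get(target_ip)
        return gw.mac if gw else None


gateway = LogicalGateway(ip="10.1.1.1", mac="02:00:00:00:00:01")
host_a = Hypervisor("host-a", [gateway])
host_b = Hypervisor("host-b", [gateway])
```

Because `host_a` and `host_b` return the same MAC address for 10.1.1.1, a VM that moves from one host to the other keeps a valid ARP entry for its default gateway.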

Layer-3 lookup is always performed by the ingress node (hypervisor host or gateway); packet forwarding from ingress node to egress node and destination host uses layer-2 forwarding. Every ingress node thus needs (for every tenant):

  • IP routing table;
  • ARP entries for all of the tenant’s hosts;
  • MAC-to-underlay-IP mappings for all of the tenant’s hosts (see layer-2 forwarding above).
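The three tables listed above chain together on the ingress node roughly as follows. This is a simplified sketch under stated assumptions: `TenantTables` and its methods are made-up names, the routing lookup is a naive longest-prefix match, and returning `None` stands in for whatever a real implementation does on a miss.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class TenantTables:
    """Per-tenant state an ingress node needs (hypothetical structure)."""
    routes: list = field(default_factory=list)           # (network, next hop or None)
    arp: dict = field(default_factory=dict)              # IP -> MAC
    mac_to_underlay: dict = field(default_factory=dict)  # MAC -> underlay IP

    def add_route(self, prefix, next_hop=None):
        # next_hop=None marks a directly connected subnet.
        self.routes.append((ipaddress.ip_network(prefix), next_hop))

    def route(self, dst_ip):
        """Longest-prefix match over the tenant's routing table."""
        addr = ipaddress.ip_address(dst_ip)
        matches = [(net, nh) for net, nh in self.routes if addr in net]
        if not matches:
            return None
        return max(matches, key=lambda m: m[0].prefixlen)

    def forward(self, dst_ip):
        """Return (destination MAC, underlay IP) or None on any miss."""
        match = self.route(dst_ip)                 # step 1: IP routing table
        if match is None:
            return None
        _, next_hop = match
        target_ip = next_hop or dst_ip
        mac = self.arp.get(target_ip)              # step 2: ARP entry
        if mac is None:
            return None
        underlay = self.mac_to_underlay.get(mac)   # step 3: layer-2 forwarding
        if underlay is None:
            return None
        return mac, underlay
```

Once the ingress node has rewritten the destination MAC address, the remainder of the path is plain layer-2 forwarding toward the returned underlay IP address.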

NSX for vSphere implements layer-3 forwarding in a separate vSphere kernel module. The User World Agent (UWA) running within the vSphere host uses the proprietary protocol (mentioned above) to report local IP-to-MAC mappings and get layer-3 forwarding information (routing tables) from the controller cluster. ARP entries are cached in the layer-3 forwarding kernel module; the controller is queried on local ARP cache misses, and the ARP request might finally get flooded if the controller cannot provide the answer.
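The ARP resolution order described in the previous paragraph (local cache, then controller, then flooding) can be sketched like this. `ArpResolver`, `query_controller` and `flood_arp_request` are hypothetical names; the real kernel module and the UWA protocol are not public APIs, so the two callbacks merely stand in for them.

```python
class ArpResolver:
    """Toy three-stage ARP resolution: cache, controller, flood."""

    def __init__(self, query_controller, flood_arp_request):
        self.cache = {}                             # IP -> MAC
        self.query_controller = query_controller    # callable: ip -> MAC or None
        self.flood_arp_request = flood_arp_request  # callable: ip -> None

    def resolve(self, ip):
        """Return (mac, source); mac is None while a flooded request is pending."""
        mac = self.cache.get(ip)
        if mac is not None:
            return mac, "cache"
        mac = self.query_controller(ip)             # local cache miss
        if mac is not None:
            self.cache[ip] = mac
            return mac, "controller"
        self.flood_arp_request(ip)                  # last resort
        return None, "flooded"

    def on_arp_reply(self, ip, mac):
        """Cache the answer to a previously flooded request."""
        self.cache[ip] = mac


# Usage: a dict plays the controller, a list records flooded requests.
controller_db = {"10.1.1.10": "00:50:56:00:00:10"}
flooded = []
resolver = ArpResolver(controller_db.get, flooded.append)
```

The first `resolve("10.1.1.10")` call is answered by the controller stand-in, every later call for the same address is answered from the cache, and an address the controller does not know ends up in `flooded`.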

NSX for multiple hypervisors implements the layer-3 forwarding data plane in the OVS kernel module, but does not use OpenFlow to install forwarding entries.

A separate layer-3 daemon (running in user mode on the hypervisor host) receives forwarding information from the NSX controller cluster through the OVSDB protocol, and handles all ARP processing (sending ARP requests, caching responses …) locally.

More information


  1. I would be interested in a blog post comparing VMware NSX and Hyper-V HNV/NVGRE: the pros and cons of each, and so on...