Virtual Packet Forwarding in Hyper-V Network Virtualization

Last week I explained how layer-2 and layer-3 packet forwarding works in VMware NSX – a solution that closely emulates traditional L2 and L3 networks. Hyper-V Network Virtualization (HNV) is different – it’s almost a layer-3-only solution with only a few ties to layer-2.

HNV Architecture

Hyper-V Network Virtualization started as an add-on module (NDIS lightweight filter) for the Hyper-V 3.0 extensible switch (it is fully integrated with the extensible switch in Windows Server 2012 R2 – more about that in another blog post).

This blog post describes HNV packet forwarding in Windows Server 2012. The follow-up blog post documents the HNV behavior in Windows Server 2012 R2.

The Hyper-V extensible switch is a layer-2-only switch; the Hyper-V Network Virtualization module is a layer-3-only solution – an interesting mix with some unexpected side effects.

A distributed layer-3 forwarding architecture could use a single IP routing table to forward traffic between IP hosts. Similar to traditional IP routing solutions, the end-user would configure directly connected IP subnets and prefix routes (with IP next hops), and the virtual networking controller (or the orchestration system) would add host routes for every reachable host. Forwarding within the virtual domain would use host routes; forwarding toward external gateways would use configured IP next hops (which would be recursively resolved from host routes).
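To make the idea concrete, here's a minimal Python sketch of such a single-table lookup. It illustrates the hypothetical pure layer-3 design described above, not HNV itself; the addresses, the table names, and the resolve function are all made up for illustration.

```python
import ipaddress
from typing import Optional

# Hypothetical host routes installed by the controller:
# VM IP address -> transport IP address of the hypervisor hosting it.
HOST_ROUTES = {
    "10.0.1.10": "192.168.2.1",
    "10.0.2.20": "192.168.2.2",
    "10.0.1.254": "192.168.2.3",  # external gateway, reachable like any other host
}

# Hypothetical prefix routes configured by the end-user: prefix -> IP next hop.
PREFIX_ROUTES = {
    "0.0.0.0/0": "10.0.1.254",
}


def resolve(dst: str, depth: int = 0) -> Optional[str]:
    """Return the transport IP address for dst, or None if it is unreachable."""
    # Forwarding within the virtual domain uses host routes.
    if dst in HOST_ROUTES:
        return HOST_ROUTES[dst]
    if depth >= 4:  # guard against next hops that never resolve to a host route
        return None
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, next_hop in PREFIX_ROUTES.items():
        net = ipaddress.ip_network(prefix)
        if addr.version == net.version and addr in net:
            if best is None or net.prefixlen > best[0].prefixlen:
                best = (net, next_hop)
    if best is None:
        return None
    # The IP next hop is resolved recursively from the host routes.
    return resolve(best[1], depth + 1)
```

With these tables, resolve("10.0.2.20") hits a host route directly, while resolve("8.8.8.8") follows the default route to 10.0.1.254 and then resolves that next hop through its host route.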

Hyper-V network virtualization cannot use a pure layer-3 solution due to layer-2 forwarding within the extensible switch – two VMs connected to the same VLAN within the same hypervisor would communicate directly (without HNV involvement) and would exchange MAC addresses through ARP requests. The same communication path has to exist after one of them is moved to a different hypervisor with Hyper-V live migration – HNV must thus support a mix of layer-2 and layer-3 forwarding.

Control plane setup

A distributed layer-2 + layer-3 forwarding architecture needs at least three tables to forward traffic:

  • IP routing table;
  • ARP table (mapping of IP addresses into MAC addresses);
  • MAC reachability information – outbound ports in pure layer-2 world or destination transport IP addresses in overlay virtual networks.

The IP routing table is installed in the Hyper-V hosts with the New-NetVirtualizationCustomerRoute PowerShell cmdlet; the ARP table and the MAC reachability table are installed as CustomerIP-MAC-TransportIP triplets with the New-NetVirtualizationLookupRecord cmdlet.
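The way a single triplet feeds two of the three tables can be sketched in a few lines of Python. This is only a model of the idea: the LookupRecord class, the build_tables function, and the sample addresses are assumptions made for illustration, not HNV data structures or cmdlet parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LookupRecord:
    """One CustomerIP-MAC-TransportIP triplet (illustrative model only)."""
    customer_ip: str
    mac: str
    transport_ip: str


def build_tables(records):
    """Derive the ARP table and the MAC reachability table from the triplets."""
    arp_table = {}         # customer IP address -> MAC address
    mac_reachability = {}  # MAC address -> transport IP address
    for record in records:
        arp_table[record.customer_ip] = record.mac
        mac_reachability[record.mac] = record.transport_ip
    return arp_table, mac_reachability


# Made-up sample data. The first VM has an IPv4 and an IPv6 address,
# so it needs two separate records.
RECORDS = [
    LookupRecord("10.0.1.10", "00:15:5d:00:00:0a", "192.168.2.1"),
    LookupRecord("2001:db8:1::10", "00:15:5d:00:00:0a", "192.168.2.1"),
    LookupRecord("10.0.1.20", "00:15:5d:00:00:14", "192.168.2.2"),
]

ARP_TABLE, MAC_REACHABILITY = build_tables(RECORDS)
```

Three records produce three ARP entries but only two MAC reachability entries, because both addresses of the first VM map to the same MAC address.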

Hyper-V Network Virtualization supports IPv4 and IPv6. An IP address mentioned in this blog post means an IPv4 or IPv6 address – but do keep in mind that you have to configure IPv4 and IPv6 network virtualization lookup records independently.

Intra-subnet packet forwarding

When the Hyper-V extensible switch receives a packet from a VM, it has to decide where to send it. At this point the extensible switch uses layer-2 forwarding rules:

  • If the destination MAC address exists within the same segment, send the packet to the destination VM;
  • Flood multicast or broadcast frames to all VMs and the uplink interface;
  • Send frames with unknown destination MAC addresses to the uplink interface.
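The three rules above could be sketched like this. It's a toy model of a layer-2 forwarding decision, not the extensible switch code; the port names, the switch_forward function, and the choice not to flood a frame back to its ingress port are my assumptions.

```python
UPLINK = "uplink"


def is_group_address(mac: str) -> bool:
    """True for multicast and broadcast MACs (lowest bit of the first octet set)."""
    return int(mac.split(":")[0], 16) & 1 == 1


def switch_forward(dst_mac: str, local_ports: dict, ingress_port: str) -> list:
    """Return the output ports for a frame received on ingress_port.

    local_ports maps the MAC addresses known within the segment to VM ports.
    """
    # Rule 1: known destination MAC address within the same segment.
    if dst_mac in local_ports:
        return [local_ports[dst_mac]]
    # Rule 2: flood multicast and broadcast frames to all VMs and the uplink.
    # Assumption: the frame is not sent back to the port it arrived on.
    if is_group_address(dst_mac):
        vm_ports = sorted(p for p in local_ports.values() if p != ingress_port)
        return vm_ports + [UPLINK]
    # Rule 3: unknown destination MAC address goes to the uplink.
    return [UPLINK]
```

Everything that ends up on the uplink port is what the HNV module gets to see.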

Hyper-V network virtualization module intercepts packets forwarded by the extensible switch toward the uplink interface and performs layer-3 forwarding and local ARP processing:

  • All ARP requests are answered locally using the information installed with the New-NetVirtualizationLookupRecord cmdlet;
  • IP packets are forwarded to the destination hypervisor based on their destination IP address (not destination MAC address);
  • Flooded frames, frames sent to unknown MAC addresses, and non-IP frames are dropped.
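Here's a rough Python sketch of that decision process for frames arriving from the extensible switch. It's a simplified model under several assumptions: the frame is a plain dictionary, the LOOKUP table and hnv_process function are hypothetical, IPv6 neighbor discovery is not modeled, and IP packets are matched purely on destination IP address (the unknown-MAC check is not modeled separately).

```python
ETHERTYPE_ARP = 0x0806
ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_IPV6 = 0x86DD

# Hypothetical lookup records: customer IP -> (MAC address, transport IP).
LOOKUP = {
    "10.0.1.20": ("00:15:5d:00:00:14", "192.168.2.2"),
}


def is_group_address(mac: str) -> bool:
    """True for multicast and broadcast MACs (lowest bit of the first octet set)."""
    return int(mac.split(":")[0], 16) & 1 == 1


def hnv_process(frame: dict) -> tuple:
    """Return (action, detail) for a frame sent toward the uplink.

    Expected keys: ethertype, dst_mac, plus target_ip (ARP) or dst_ip (IP).
    """
    # ARP requests are answered locally from the lookup records.
    if frame["ethertype"] == ETHERTYPE_ARP:
        record = LOOKUP.get(frame["target_ip"])
        if record is None:
            return ("drop", "no lookup record")
        return ("arp_reply", record[0])
    # All other flooded frames are dropped.
    if is_group_address(frame["dst_mac"]):
        return ("drop", "flooded frame")
    # Non-IP frames are dropped.
    if frame["ethertype"] not in (ETHERTYPE_IPV4, ETHERTYPE_IPV6):
        return ("drop", "non-IP frame")
    # IP packets are forwarded based on destination IP address, not MAC address.
    record = LOOKUP.get(frame["dst_ip"])
    if record is None:
        return ("drop", "no lookup record")
    return ("forward_to", record[1])
```

Note that the ARP check has to come first: an ARP request is a broadcast frame, so it would otherwise be caught by the flooded-frame rule.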

Inter-subnet packet forwarding

Traffic between IP subnets is intercepted by HNV module based on the default gateway destination MAC address (which belongs to HNV). Hyper-V extensible switch sends the traffic toward the default gateway MAC address to the uplink interface (unknown destination MAC address rule), where it’s intercepted by HNV, which performs layer-3 lookup.

The true difference between intra-subnet and inter-subnet layer-3 forwarding is thus the destination MAC address:

  • Intra-subnet IP packets are sent to the MAC address of the destination VM, intercepted by HNV module, and forwarded based on destination IP address;
  • Inter-subnet IP packets are sent to the MAC address of the default gateway (virtual MAC address shared by all HNV modules), also intercepted by HNV module, and forwarded based on destination IP address (when the HNV module has a lookup record, installed with New-NetVirtualizationLookupRecord, for the destination IP address) or destination IP prefix (when there’s no lookup record for the destination IP address).
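The two cases can be put side by side in a short Python sketch. Treat it as my reading of the behavior described above, not as the HNV implementation: the gateway MAC address is a made-up placeholder, the table contents are invented, and I'm assuming that the prefix fallback applies only to packets sent to the gateway MAC address and that the next hop of a prefix route is resolved through its own lookup record.

```python
import ipaddress
from typing import Optional, Tuple

GATEWAY_MAC = "02:00:00:00:00:01"  # hypothetical placeholder for the shared virtual MAC

# Hypothetical lookup records: customer IP -> transport IP.
LOOKUP_RECORDS = {
    "10.0.1.20": "192.168.2.2",
    "10.0.2.30": "192.168.2.3",
    "10.0.1.254": "192.168.2.9",  # next hop used by the default route
}

# Hypothetical customer routes: prefix -> IP next hop.
CUSTOMER_ROUTES = {"0.0.0.0/0": "10.0.1.254"}


def longest_prefix_next_hop(dst_ip: str) -> Optional[str]:
    """Return the next hop of the most specific matching customer route."""
    addr = ipaddress.ip_address(dst_ip)
    best = None
    for prefix, next_hop in CUSTOMER_ROUTES.items():
        net = ipaddress.ip_network(prefix)
        if addr.version == net.version and addr in net:
            if best is None or net.prefixlen > best[0].prefixlen:
                best = (net, next_hop)
    return best[1] if best else None


def hnv_l3_forward(dst_mac: str, dst_ip: str) -> Tuple[str, Optional[str]]:
    """Return (traffic kind, transport IP or None) for an intercepted IP packet."""
    # The destination MAC address is the only thing that tells the cases apart.
    kind = "inter-subnet" if dst_mac == GATEWAY_MAC else "intra-subnet"
    # Both cases start with a lookup on the destination IP address.
    if dst_ip in LOOKUP_RECORDS:
        return kind, LOOKUP_RECORDS[dst_ip]
    if kind == "inter-subnet":
        # No lookup record: fall back to the destination IP prefix.
        next_hop = longest_prefix_next_hop(dst_ip)
        if next_hop is not None:
            return kind, LOOKUP_RECORDS.get(next_hop)
    return kind, None
```

A packet for 10.0.2.30 sent to the gateway MAC address is forwarded straight to 192.168.2.3 based on its lookup record; a packet for 8.8.8.8 has no lookup record, so it follows the default route to the hypervisor hosting 10.0.1.254.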

Summary: Even though it looks like Hyper-V Network Virtualization in Windows Server 2012 works like any other L2+L3 solution, it’s a layer-3-only solution between hypervisors and a layer-2+layer-3 solution within a hypervisor.

More information

Overlay Virtual Networking webinar describes architectures from numerous vendors, including Cisco, VMware, Microsoft, IBM, and Midokura.

5 comments:

  1. For orchestration PowerShell can be exposed via web services...http://msdn.microsoft.com/en-us/library/hh880865(v=vs.85).aspx

  2. Hi,

    Thank you for this interesting post!

    I have a question on a setup:

    PA: 192.168.2.1,
    VM1 IP: 10.0.0.8
    Gateway: 10.0.0.1

    VM2 IP vNIC1: 10.0.0.1 - connected to virtual switch
    VM2 IP vNIC2: Public IP - connected to internet

    VM2 is running as a router.

    I set up the necessary routing policy to make sure 10.0.0.1 is pingable from VM1. What if the VM needs to talk to the outside world? There are two gateways in this setup: 192.168.2.1 and 10.0.0.1. Which gateway will the VM use if its packets need to reach the public internet?

    Is this setup correct?

    Thank You,
    Margrita

  3. You say: >>>two VMs connected to the same VLAN within the same hypervisor would communicate directly (without HNV involvement) and would exchange MAC addresses through ARP requests.

    This is because, by default, the Hyper-V Virtual Switch is configured in Trunk Mode. Assign a VLAN ID on the Virtual Switch and then you will see the difference.

    Another point you make is this "Flooded frames, frames sent to unknown MAC addresses, and non-IP frames are dropped."

    Are you sure about this? VM gateway will handle non-IP packets.

    A.

  4. >>>Send frames with unknown destination MAC addresses to the uplink interface.

    By default, this is not the case unless you have a Type: WildcardGateway set in the IP routing table. This is how the Cisco 1000v and the HNV Gateway work. A WildcardGateway entry is created for each VSID.

    A

  5. >>>Even though it looks like Hyper-V Network Virtualization in Windows Server 2012 works like any other L2+L3 solutions, it’s a layer-3-only solution between hypervisors and layer-2+layer-3 solution within a hypervisor.

    It has to be as you said. Without this, you cannot route packets between different VSIDs.

    A



Ivan Pepelnjak, CCIE#1354, is the chief technology advisor for NIL Data Communications. He has been designing and implementing large-scale data communications networks as well as teaching and writing books about advanced technologies since 1990. See his full profile, contact him or follow @ioshints on Twitter.