John Jackson wrote an interesting comment on my Rearchitecting L3-Only Networks blog post:
What the host has configured for its default gateway doesn't really matter, correct? Because the default gateway in traditional L2 access networks really isn't about the gateway's IP address, but the gateway's MAC address. The destination IP address in the packet header is always the end destination IP address, never the default gateway.
He totally got the idea, however there are a few minor details to consider.
You might want to read this first
This is the fourth blog post in this series. You might want to read the first three before starting this one.
- Why is IPv6 layer-2 security so complex?
- What is layer-2 and why do we need it?
- Rearchitecting L3-Only Networks
Now for those pesky details
Host-side ARP/ND. Even though the L3 switch (aka router) doesn’t need L2 information to route IP/IPv6 packets (someone even suggested sending them to L2 broadcast address), the host still thinks it has to deal with traditional L2/L3 forwarding environment. The L3 switch must thus reply to every incoming ARP request.
We don’t want to go down the path of configuring IPv4 /31 or IPv6 /64 prefixes on host-to-switch links. This approach kills traditional VM mobility, wastes IPv4 address space (pundits claim there’s plenty of IPv6 address space, so who cares), and explodes L3 switch configuration.
MAC address used in ARP/ND replies. As explained in the previous paragraph, the L3 switch must reply to all host ARP/ND request. What IP address should it use in the reply? Most hardware implementations and Juniper Contrail use the anycast MAC address (MAC address shared across all L3 switches) in ARP/ND replies. Hyper-V Network Virtualization and Amazon VPC use MAC address of the destination host (if the destination host is in the same subnet) in ARP/ND replies to enable end-to-end reachability checks done with unicast ARP or unicast Neighbor Discovery messages.
Replying with MAC address of destination host is not possible if the first-hop router doesn’t have that information. Hyper-V can do it because the orchestration system distributes IP, MAC and VTEP (remote hypervisor) addresses to all hypervisor hosts in a routing domain (tenant/VRF).
Using multicast or broadcast MAC address in ARP replies is a non-starter – it’s not permitted by one of the RFCs, making it hard to make Cisco switches work properly with multicast-based Microsoft Network Load Balancing.
Forwarding pipeline in hardware switches. Even though the first-hop hardware switch might be able to reply with the MAC address of the destination host, it might not be able to ignore the MAC address when doing L3 forwarding – its hardware forwarding pipeline might require a hit on destination MAC address before sending the incoming packet to L3 forwarding. Replying to ARP requests with anycast MAC address is thus the only sensible thing to do in physical networks.
- Cisco DFA forwarding behavior is briefly described in Data Center Fabric Architectures webinar.
- Microsoft Hyper-V Network Virtualization packet forwarding is detailed in the Overlay Virtual Networking webinar (which includes step-by-step packet traces for most major overlay virtual networking solutions).
- There might be more – check the other webinars you get with the yearly subscription.