ARP Processing in Layer-3-Only Networks

John Jackson wrote an interesting comment on my Rearchitecting L3-Only Networks blog post:

What the host has configured for its default gateway doesn't really matter, correct? Because the default gateway in traditional L2 access networks really isn't about the gateway's IP address, but the gateway's MAC address. The destination IP address in the packet header is always the end destination IP address, never the default gateway.

He totally got the idea, however there are a few minor details to consider.

You might want to read this first

This is the fourth blog post in this series. You might want to read the first three before starting this one.

Now for those pesky details

Host-side ARP/ND. Even though the L3 switch (aka router) doesn’t need L2 information to route IP/IPv6 packets (someone even suggested sending them to L2 broadcast address), the host still thinks it has to deal with traditional L2/L3 forwarding environment. The L3 switch must thus reply to every incoming ARP request.

We don’t want to go down the path of configuring IPv4 /31 or IPv6 /64 prefixes on host-to-switch links. This approach kills traditional VM mobility, wastes IPv4 address space (pundits claim there’s plenty of IPv6 address space, so who cares), and explodes L3 switch configuration.

MAC address used in ARP/ND replies. As explained in the previous paragraph, the L3 switch must reply to all host ARP/ND request. What IP address should it use in the reply? Most hardware implementations and Juniper Contrail use the anycast MAC address (MAC address shared across all L3 switches) in ARP/ND replies. Hyper-V Network Virtualization and Amazon VPC use MAC address of the destination host (if the destination host is in the same subnet) in ARP/ND replies to enable end-to-end reachability checks done with unicast ARP or unicast Neighbor Discovery messages.

Replying with MAC address of destination host is not possible if the first-hop router doesn’t have that information. Hyper-V can do it because the orchestration system distributes IP, MAC and VTEP (remote hypervisor) addresses to all hypervisor hosts in a routing domain (tenant/VRF).

Using multicast or broadcast MAC address in ARP replies is a non-starter – it’s not permitted by one of the RFCs, making it hard to make Cisco switches work properly with multicast-based Microsoft Network Load Balancing.

Forwarding pipeline in hardware switches. Even though the first-hop hardware switch might be able to reply with the MAC address of the destination host, it might not be able to ignore the MAC address when doing L3 forwarding – its hardware forwarding pipeline might require a hit on destination MAC address before sending the incoming packet to L3 forwarding. Replying to ARP requests with anycast MAC address is thus the only sensible thing to do in physical networks.

Tangential information

Add comment