On Generating EVPN MAC/IP Routes
Naveen Kumar Devaraj was reading my Integrated Routing and Bridging (IRB) with EVPN MAC-VRF Instances lab exercise and spotted this detail:
Arista EOS originates MAC-IP routes with and without IP addresses, effectively doubling the size of the EVPN BGP table
He kindly wrote a LinkedIn comment explaining that behavior:
This is by design since the triggers for these two types of routes are fundamentally different. A MAC table update triggers and aligns with a MAC-Only route, while an ARP table update triggers and aligns with a MAC+IP route. They are kept separate in EVPN for the exact same reason ARP and MAC tables are separate, as we know traditionally.
You know I had to check that against RFC 7432, right? Here’s what it says in Section 9.2.1 (ARP and ND):
Thanks a million, Naveen (I also fixed the exercise description)!
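The two triggers Naveen describes can be sketched in a few lines of Python. This is only a toy model, not how EOS is implemented: the `Vtep` class, its method names, and the `MacIpRoute` structure are all hypothetical, and the route carries none of the real type-2 fields (RD, ESI, labels, and so on).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MacIpRoute:
    """Heavily simplified EVPN type-2 route (hypothetical structure)."""
    mac: str
    ip: Optional[str] = None  # None models a MAC-only route


class Vtep:
    """Toy model: each table owns its own flavor of type-2 route."""

    def __init__(self) -> None:
        self.routes: set[MacIpRoute] = set()

    def on_mac_learned(self, mac: str) -> None:
        # MAC table update -> MAC-only route
        self.routes.add(MacIpRoute(mac))

    def on_mac_aged(self, mac: str) -> None:
        self.routes.discard(MacIpRoute(mac))

    def on_arp_learned(self, mac: str, ip: str) -> None:
        # ARP table update -> MAC+IP route
        self.routes.add(MacIpRoute(mac, ip))

    def on_arp_aged(self, mac: str, ip: str) -> None:
        self.routes.discard(MacIpRoute(mac, ip))


vtep = Vtep()
vtep.on_mac_learned("00:00:5e:00:53:01")
vtep.on_arp_learned("00:00:5e:00:53:01", "192.0.2.10")
print(len(vtep.routes))  # 2: one host, two routes

vtep.on_arp_aged("00:00:5e:00:53:01", "192.0.2.10")
print(len(vtep.routes))  # 1: the MAC-only route is unaffected
```

Because each route is tied to exactly one table, an entry aging out of the ARP table withdraws only the MAC+IP route and leaves the MAC-only route alone, which is the separation the comment argues for.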
Thanks for getting down to this level of detail Ivan, it's interesting to evaluate the tradeoff between table size &
Back in the bad old days of RSTP & IRB, it was critical to ensure that ARP aging timers & MAC aging timers were set to the same value on router-flavored switches in order to avoid unknown unicast flooding.
EVPN handles BUM traffic a lot better, but I'd imagine it's still good practice to align these two timers to avoid similar issues, and you can add IPv6 neighbor discovery stale timers to the list too...
Also worth noting that this technique might double the number of EVPN RIB & FIB entries, but MAC entries generally don't consume as many hardware resources as full type 5 routes so the overhead to add the additional MAC routes is lower than it may appear at first glance.
One of these days, I have to check how various implementations deal with ARP/MAC aging. I know that at least Cisco IOS keeps ARPing things it has in the ARP cache, so the cache never times out as long as the device is there (https://blog.ipspace.net/2007/06/ar/).
As for RIB & FIB entries, separate MAC and MAC/IP routes definitely consume twice the RIB memory, but there's no extra overhead on the hardware side -- you need MAC and ARP tables no matter what.
Hi Blake & Ivan,
Sharing some information to supplement your comments:
1.1. Aligning ARP and MAC Timers
Aligning these timers is important, but simply setting them to the same value may not solve the problem because the ARP and MAC timers could start at different times! Generally, what we do in Arista deployments is ensure the ARP timer < MAC timer. This is to ensure that the ARP refresh is forced on those "router-flavored" switches / IRB VTEPs / EVPN-PE devices. For context, the EOS default ARP timeout is 4 hours, and MAC aging is 5 minutes. We generally set MAC aging to something > 4 hours (the actual values are a debatable topic, so I won't get into the details here).
1.2. EOS Defaults & Control Plane Optimization
I don't know the history behind why these specific EOS default values were chosen, but here is my reasoning for why the default ARP timeout > MAC aging. (Default values were built into EOS well before EVPN was conceived/implemented on Arista.)
ARP timeout influences two things:
- ARP expiration
- ARP refreshes
Unlike MAC learning and aging, both of these are control plane operations initiated on the switch's CPU. So, perhaps this is the reason for having a higher timeout value for ARP - it serves as a control plane optimization. As a next-level optimization, the ARP refresh timer on EOS includes a randomness factor to ensure ARP refreshes are staggered.
Ultimately, as long as you set the ARP timer < MAC timer, you're good. There won't be unicast flooding, as the ARP refresh will naturally force MAC learning and keep it refreshed.
P.S. I call it “Recommended Practices” instead of the usual “Best Practices” terminology people may be used to. “Best” is very subjective IMHO!
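The "ARP timer < MAC timer" rule from section 1.1 can be checked with a back-of-the-envelope model. The sketch below rests on loud assumptions that are not in the comment: the host is silent and talks only when it answers an ARP refresh, the refresh happens exactly at the ARP timeout (no randomness factor), and only the first timer cycle is modeled. The function and parameter names are hypothetical.

```python
def flooding_window(arp_timeout_s: int, mac_aging_s: int,
                    mac_head_start_s: int = 0) -> int:
    """Seconds in which the ARP entry is still valid but the MAC entry is gone.

    mac_head_start_s is how long before the ARP entry the MAC entry was last
    refreshed, modeling timers that do not start at the same moment.
    """
    # All times are relative to the moment the ARP entry was learned.
    mac_expires_at = mac_aging_s - mac_head_start_s
    arp_refresh_at = arp_timeout_s
    return max(0, arp_refresh_at - mac_expires_at)


print(flooding_window(14400, 300))        # 14100: EOS defaults quoted above
print(flooding_window(14400, 14400, 30))  # 30: equal timers, different start
print(flooding_window(14400, 18000))      # 0: ARP timer < MAC timer
```

In this model, a non-zero result is the time during which traffic routed toward the silent host would be flooded as unknown unicast. The second call shows why setting both timers to the same value is not enough when they start at different times.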
2.1. EVPN Route Types & Hardware Tables
Type-2 MAC-IP routes are treated as host routes (/32 routes with IPv4). As a result, they consume hardware resources, specifically entries in the LEM (Large Exact Match) table, similar to how Type-5 routes are typically populated in the LPM (Longest Prefix Match) table.
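As a rough illustration of that mapping, the following sketch tallies which hardware table each route would land in. It is a simplification under stated assumptions: the `hardware_table` helper is hypothetical, placing MAC-only type-2 routes in the MAC table is an assumption added for completeness, and real platforms can make different placement decisions.

```python
from collections import Counter
from typing import Optional


def hardware_table(route_type: int, ip: Optional[str]) -> str:
    """Return the table a route would consume in this simplified model."""
    if route_type == 2:
        # MAC+IP routes become host routes (LEM); MAC-only routes are
        # assumed here to use only the MAC table.
        return "LEM" if ip else "MAC"
    if route_type == 5:
        return "LPM"  # IP prefix routes
    raise ValueError(f"route type {route_type} is not modeled")


routes = [
    (2, None),               # MAC-only route
    (2, "192.0.2.10/32"),    # MAC+IP route for the same host
    (5, "198.51.100.0/24"),  # IP prefix route
]
usage = Counter(hardware_table(rtype, ip) for rtype, ip in routes)
print(usage["MAC"], usage["LEM"], usage["LPM"])  # 1 1 1
```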
Scaling is a fascinating topic for me personally, across both the control plane and data plane. There are a lot of cool features on EOS to address scaling. If there's interest in this topic, I'd be happy to share more in my next comment (or maybe a dedicated article!).
2.2. Hardware Table Optimization (DCI)
As for hardware table optimization with overlays, especially in Data Center Interconnect (DCI) scenarios, you'll find this patent interesting: US12375401B2 (https://patents.google.com/patent/US12375401B2/en). It is a simple idea marrying route summarization and route-type conversion to save valuable hardware and software resources.
Let me know your thoughts!
Cheers, Naveen