EVPN: All that Glitters Is Not Gold
Cumulus Linux 3.2 shipped with a rudimentary EVPN implementation and everyone got really excited, including smaller ASIC manufacturers that finally got a control plane for their hardware VTEP functionality.
However, while it’s nice to have EVPN support in Cumulus Linux, the claims of its benefits are sometimes greatly exaggerated.
For example, David Iles from Mellanox claims that “… with EVPN, we have an industry standardized control plane for VTEP orchestration using an extension of BGP” and “… EVPN will deliver many of the promises of TRILL, FabricPath, VCS, and other data center fabrics but in a scalable, non-proprietary, way”.
Let’s get a few things straight:
- TRILL is as standard as EVPN (if not more). Mentioning TRILL in the same list as VCS and FabricPath is disingenuous.
- As much as I love EVPN, TRILL has fewer options and thus a higher chance of having at least two interoperable implementations.
Finally, EVPN can be implemented in so many ways that I started calling it “the SIP of networking”. For example, at the moment (in release 3.2.1) Cumulus Linux implements only Type-3 routes (similar to what Cisco Nexus 1000V was doing) and relies on dynamic MAC learning, while Cisco’s and Juniper’s implementations on ToR switches use BGP-based MAC+IP address propagation with Type-2 routes.
2017-02-23: Cumulus Linux already supports type-2 (MAC/IP) routes. That was fast ;) However, I got a hefty dose of “these are all the things that can go wrong” information a few days back, and it's even worse than what I thought. More details in a future blog post.
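To make the Type-3 versus Type-2 difference concrete, here is a minimal Python sketch of a toy VTEP. It is not how Cumulus Linux, Cisco, or Juniper implement anything: the `ToyVtep` class and its method names are made up for illustration. It only assumes the standard meaning of the two route types: Type-3 (inclusive multicast) routes tell a VTEP where to send flooded traffic, while Type-2 (MAC/IP advertisement) routes tell it where a specific MAC address lives.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVtep:
    """Hypothetical VTEP model; names are illustrative, not a vendor API."""
    name: str
    flood_list: set = field(default_factory=set)   # remote VTEPs from Type-3 routes
    mac_table: dict = field(default_factory=dict)  # MAC address -> remote VTEP

    def receive_type3(self, remote_vtep):
        # Type-3 route: add the remote VTEP to the flooding (BUM) list.
        self.flood_list.add(remote_vtep)

    def receive_type2(self, mac, remote_vtep):
        # Type-2 route: control-plane MAC learning via BGP.
        self.mac_table[mac] = remote_vtep

    def learn_from_data_plane(self, src_mac, remote_vtep):
        # Dynamic MAC learning from the source MAC of a received frame.
        self.mac_table[src_mac] = remote_vtep

    def forward(self, dst_mac):
        # Known MAC: send to a single VTEP. Unknown MAC: flood to all of them.
        if dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]
        return sorted(self.flood_list)


leaf1 = ToyVtep("leaf1")
leaf1.receive_type3("leaf2")
leaf1.receive_type3("leaf3")
flooded = leaf1.forward("00:00:5e:00:53:02")   # unknown MAC, flooded
leaf1.receive_type2("00:00:5e:00:53:02", "leaf2")
unicast = leaf1.forward("00:00:5e:00:53:02")   # known MAC, single target
```

With only Type-3 routes the toy VTEP keeps flooding until it learns the MAC address from return traffic; with Type-2 routes the MAC address is known before the first packet is sent.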
I’m also hearing rumors that symmetrical IRB (Cisco) and asymmetrical IRB (Juniper) implementations (you’ll find more details in the Mixed Layer-2 + Layer-3 Designs part of Leaf-and-Spine Architectures webinar) don’t work well together… or maybe that’s just FUD - I would love to hear from someone who got EVPN working between a Nexus 9000 and QFX 5100 or QFX 10K.
It can be difficult enough to get different types of optics to talk to each other. Is Ethernet always Ethernet? Yes, mostly, but there are things to consider there as well. How good is interop in MPLS? It works, mostly, probably.
Don't get me wrong, I like EVPN, but sometimes even running protocols within the same vendor is challenging. An RFC can always be interpreted in different ways. When we do interop we often have to settle for getting the common features working, which is less than either device might be capable of.
Another concern: what if everyone started building networks exactly the same way with exactly the same components? What would happen the next time we see something like the Intel SoC issue? The consequences would be disastrous. So while some ideas are nice in theory, not everyone can use them, and too much homogeneity would not be good.
Think you have some good blogging material above :)
Now you are talking about the old rosy "vendor lock-in" ....
I know many networks here in Denmark that went with a particular vendor or technology to circumvent vendor lock-in, and they all ended up with a mess of hacks and instability.
From my point of view vendor lock-in is great as long as you build networks and compute like "Lego pods": some pods run blue Lego, other pods run green Lego.
All glued together with a fairly well-tested multi-vendor protocol: vanilla BGP.
I guess my point is that it will never make sense to build a leaf & spine topology consisting of Cisco, Juniper, Arista and Brocade all in the same topology. They would be different parts of the network and interconnected in some way. Different pods as you say.
Oh, and you should talk with those vendors and ask them why they want to pile so many unnecessary protocols on top of one another ;))
The answer is pretty simple. Tens of thousands of customers world-wide want thousands of different protocols, different flavors, different behaviour, and different knobs.
I'll give you a basic example: MPLS. There are customers who insist on solving every problem with MPLS. And then there are other customers who insist on keeping MPLS out of their network at all costs. What is a vendor supposed to do? If they want to make a living, the only solution is to implement all those protocols and knobs. Otherwise they will have very few customers. Can you blame them?
The right answer would be "you COULD run EVPN on top of IBGP on top of IGP and use PIM to build multicast trees but you COULD also run EVPN with EBGP and source-node replication if you want to keep your DC fabric simple"
No surprise: you are getting "circumcised" implementations from vendors without solid BGP and system know-how. I highly respect the person who did EVPN for Quagga, but it is too much work and too short a time to build a fully functional code base.
Would you expect them to? I'd have thought the node running symmetric IRB should be able to understand the routes sent by the asymmetric node (since it's just regular EVPN forwarding between PEs), but I wouldn't expect the asymmetric node to understand the routes sent by the symmetric node (since the forwarding is "between VRFs" in a sense).
Is there a sensible use case that involves both symmetric and asymmetric IRB in the same DC?
Use case: using Cisco and Juniper gear in the same network ;)
On the data plane, traffic from "asymmetric" to "symmetric" should have no problem (assuming the route is accepted). The "symmetric" node could have forwarded L2 traffic (but chooses to drop the route on the control plane), but L3 forwarding will definitely fail the route lookup.
It might be possible to do "asymmetric" routing on spine switches with something else in pure L2 mode as leaf nodes. But this is not exactly "interoperability"; it is simply one of the properties of the "asymmetric" implementation.
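As a rough aid to the discussion above, the following Python sketch shows which VNI ends up in the VXLAN header for routed (inter-subnet) traffic in each IRB model. It is a simplification based on the generally described behavior of the two models, not on any vendor's code: the function name `vni_on_wire` and the VNI numbers are invented for illustration, and the sketch says nothing about whether a particular pair of implementations accepts each other's routes.

```python
def vni_on_wire(model, dst_l2_vni, l3_vni=None):
    """Return the VNI an ingress VTEP would use for routed traffic.

    Hypothetical helper for illustration only.
    - asymmetric: ingress routes, then bridges into the destination
      subnet's L2 VNI, so it must have that VNI configured locally.
    - symmetric: ingress routes into a per-VRF transit (L3) VNI and the
      egress VTEP routes again into the destination subnet.
    """
    if model == "asymmetric":
        return dst_l2_vni
    if model == "symmetric":
        if l3_vni is None:
            raise ValueError("symmetric IRB needs an L3 VNI for the VRF")
        return l3_vni
    raise ValueError(f"unknown IRB model: {model}")


# Host A sits in VNI 10010, host B in VNI 10020 (made-up numbers).
a_to_b_asym = vni_on_wire("asymmetric", dst_l2_vni=10020)
b_to_a_asym = vni_on_wire("asymmetric", dst_l2_vni=10010)
a_to_b_sym = vni_on_wire("symmetric", dst_l2_vni=10020, l3_vni=50000)
b_to_a_sym = vni_on_wire("symmetric", dst_l2_vni=10010, l3_vni=50000)
```

In the asymmetric case the two directions use different VNIs (hence the name), while in the symmetric case both directions use the same transit VNI. A node that expects one encapsulation has to be explicitly prepared to handle the other.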
And I've upstreamed some Ryu patches to get Ryu to play along as well.
Talking about MAC-IP: these can be injected into the fabric, but in my experiments Cisco opted to ignore the IP address for L2 EVPN and instead used unicast flooding ("ingress replication"), while it could have simply accepted the BGP update and imported the IP-MAC binding into the ARP table.
There's bound to be an explanation (and yes, I have read the relevant RFCs, ugh), but I'm struggling to understand the downside of using the IP from the update.
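For readers who want to picture the alternative being described, here is a minimal Python sketch of a VTEP that stores the IP-MAC binding from a Type-2 update and answers ARP requests locally, falling back to flooding only when it has no binding. This is a toy model under my own assumptions: the class `ArpAwareVtep` and its methods are hypothetical and do not describe what Cisco or any other vendor actually does.

```python
class ArpAwareVtep:
    """Hypothetical VTEP that imports IP-MAC bindings from Type-2 routes."""

    def __init__(self, flood_list):
        self.flood_list = list(flood_list)  # remote VTEPs used for flooding
        self.neighbors = {}                 # IP address -> MAC address

    def receive_type2(self, mac, ip=None):
        # The IP address in a Type-2 route is optional; store it if present.
        if ip is not None:
            self.neighbors[ip] = mac

    def handle_arp_request(self, target_ip):
        # Known binding: reply locally. Unknown: flood to every remote VTEP.
        mac = self.neighbors.get(target_ip)
        if mac is not None:
            return ("reply", mac)
        return ("flood", self.flood_list)


vtep = ArpAwareVtep(flood_list=["leaf2", "leaf3"])
before = vtep.handle_arp_request("192.0.2.10")
vtep.receive_type2("00:00:5e:00:53:0a", ip="192.0.2.10")
after = vtep.handle_arp_request("192.0.2.10")
```

Before the update arrives the request is flooded to both remote VTEPs; afterwards it is answered from the local table.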
There are significant efforts to make protocols interoperable. How long did it take to get MPLS working between multiple vendors? Jeff and Ivan can probably write a book about it :-)
EVPN is still in its early days. While some vendors have been shipping products for 2 or 3 years, many others have just started the journey. The original contributors to EVPN had different use cases in mind, so different implementations were chosen, all compliant with the IETF RFCs or drafts.
For the record: symmetric/asymmetric IRB is one of the differences, and there are reasons why one was chosen over the other. There are other variations in implementation, and all are compliant with the RFCs/drafts.
Disclaimer: I’m working for Cisco and I’m very involved with the EVPN solution.
I am on the verge of validating this statement in my environment. Let me know if anyone has done this testing already.
Based on what I've heard from someone who tried to deploy a pretty complex EVPN setup on QFX10K with an early version of the code: test everything you want to use before committing to a deployment.