Category: VXLAN
EVPN/VXLAN Complexity
We have school holidays this week, so I’m reposting wonderful comments that would otherwise be lost somewhere in the page margins. Today: Minh Ha on complexity of emulating layer-2 networks with VXLAN and EVPN.
Dmytro Shypovalov is a master networker who has a sophisticated grasp of some of the most advanced topics in networking. He doesn’t write often, but when he does, he writes exceptional content, both deep and broad. Have to say I agree with him 300% on “If an L2 network doesn’t scale, design a proper L3 network. But if people want to step on rakes, why discourage them.”
Worth Reading: Switching to IP fabrics
Namex, an Italian IXP, decided to replace their existing peering fabric with a fully automated leaf-and-spine fabric using VXLAN and EVPN running on Cumulus Linux.
They documented the design, deployment process, and automation scripts they developed in an extensive blog post that’s well worth reading. Enjoy ;)
Why Would You Need VXLAN Transport?
It’s amazing how people fond of sharing their opinions and buzzwords on social media sometimes can’t answer simple questions. Today’s blog post is based on a true story… a “senior network architect” fully engaged in a recent hype cycle couldn’t answer a simple question:
Why exactly would you need VXLAN and EVPN?
We could spend a day (or a week) discussing the nuances of that simple question, but all I have at the moment is a single web page, so here we go…
Should I Go with VXLAN or MLAG with STP?
TL&DR: It’s 2020, and VXLAN with EVPN is all the rage. Thank you, you can stop reading.
On a more serious note, I got this question from Johannes Spanier after he read my Do We Need Complex Data Center Switches for NSX Underlay blog post:
Would you agree that for smaller NSX designs (~100 hypervisors) a much simpler layer-2-based access-distribution design with MLAG is feasible? One would have two distribution switches and redundant access switches MLAGed together.
I would still prefer VXLAN for a number of reasons:
The Never-Ending "My Overlay Is Better Than Yours" Saga
I published a blog post describing how complex the underlay supporting VMware NSX still has to be (because someone keeps pretending a network is just a thick yellow cable), and the tweet announcing it admittedly looked like a clickbait.
[Blog] Do We Need Complex Data Center Switches for VMware NSX Underlay
Martin Casado quickly replied NO (probably before reading the whole article), starting a whole barrage of overlay-focused neteng-versus-devs fun.
Getting More Bang for Your VXLAN Bucks
A little while ago I explained why you can’t use more than 4K VXLAN segments on a ToR switch (at least with most ASICs out there). Does that mean that you’re limited to a total of 4K virtual Ethernet segments?
Of course not.
You could implement overlay virtual networks in software (on hypervisors or container hosts), although even there the enterprise products rarely give you more than a few thousand logical switches (to use NSX terminology)… but that’s a product limitation, not a technology limitation. Large public cloud providers use the same (or similar) technology to run gazillions of tenant segments.
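For a rough sense of where that 4K number comes from, here’s a back-of-the-envelope sketch (plain Python arithmetic, not tied to any particular ASIC): the 802.1Q VLAN ID most switches use as the internal bridge-domain handle is 12 bits wide, while the VNI is 24 bits wide.

```python
# Back-of-the-envelope numbers, not tied to any specific ASIC:
# an 802.1Q VLAN ID is 12 bits, a VXLAN Network Identifier (VNI) is 24 bits.
VLAN_ID_BITS = 12
VNI_BITS = 24

usable_vlans = 2**VLAN_ID_BITS - 2   # VLAN 0 and 4095 are reserved -> 4094
vni_space = 2**VNI_BITS              # 16,777,216 possible segments

print(f"usable VLAN IDs per switch: {usable_vlans}")
print(f"VXLAN segment identifiers:  {vni_space}")

# A ToR switch that maps every VNI into a switch-wide VLAN/bridge domain can
# therefore never carry more than ~4K VXLAN segments at the same time, no
# matter how large the VNI space is.
```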
EVPN Route Targets, Route Distinguishers, and VXLAN Network IDs
Got this interesting question from one of my readers:
A BGP EVPN message carries both the VNI and the RT. When importing a route, is it enough to have either the VNI or the RT to import it into the respective VRF? When importing routes into a VRF, which is considered first, the RT or the VNI?
A bit of terminology first (which you’d be very familiar with if you ever had to study how MPLS/VPN works):
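To put that terminology into context, here’s a toy model of the usual behavior (plain Python, no real BGP library; the class and attribute names are mine): route targets decide whether an EVPN route gets imported into a MAC-VRF at all, and the VNI carried in the route then identifies the segment the MAC address lives in. Treat it as a sketch of the common case, not any vendor’s implementation.

```python
from dataclasses import dataclass, field

@dataclass
class EvpnRoute:
    rd: str              # route distinguisher, e.g. "10.0.0.1:100" -- keeps
                         # overlapping routes unique, plays no role in import
    route_targets: set   # extended communities, e.g. {"65000:100"}
    vni: int             # VXLAN Network Identifier carried in the route
    mac: str
    next_hop: str        # remote VTEP address

@dataclass
class MacVrf:
    name: str
    import_rts: set
    vni: int
    mac_table: dict = field(default_factory=dict)

def import_route(vrfs, route):
    for vrf in vrfs:
        # Step 1: route-target match decides *whether* the route is imported...
        if vrf.import_rts & route.route_targets:
            # Step 2: ...the VNI then identifies the segment the MAC belongs to.
            if route.vni == vrf.vni:
                vrf.mac_table[route.mac] = route.next_hop

# usage: a matching RT gets the MAC address into the "blue" MAC-VRF
blue = MacVrf(name="blue", import_rts={"65000:100"}, vni=10100)
import_route([blue], EvpnRoute(rd="10.0.0.1:100",
                               route_targets={"65000:100"},
                               vni=10100,
                               mac="00:11:22:33:44:55",
                               next_hop="10.0.0.1"))
print(blue.mac_table)   # {'00:11:22:33:44:55': '10.0.0.1'}
```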
The EVPN Dilemma
I got an interesting set of questions from a networking engineer who got stuck with the infamous “let’s push the **** down the stack” challenge:
So, I am a rather green network engineer trying to solve the typical layer two stretch problem.
I could start the usual “friends don’t let friends stretch layer-2” or “your business doesn’t need that” windmill fight, but let’s focus on how the vendors are trying to sell him the “perfect” solution:
Can We Really Use Millions of VXLAN Segments?
One of my readers sent me a question along these lines…
The VXLAN Network Identifier is 24 bits long, giving us 16 million separate segments. However, we have to map VNIs into VLANs on most switches. How can we scale up to 16 million segments when we run out of VLAN IDs? Can we create a separate VTEP on the same switch?
VXLAN is just an encapsulation format and does not imply any particular switch architecture. What really matters in this particular case is the implementation of the MAC forwarding table in the switching ASIC.
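To illustrate why this is an implementation detail rather than a protocol limit, here’s a hypothetical sketch (my own toy model, not any vendor’s data structure; the port names are made up): if the ASIC keys its VLAN-to-VNI mapping on the ingress port instead of using one switch-wide table, the same 4K VLAN IDs can be reused on every port for different segments.

```python
# Toy model: per-port VLAN-to-VNI mapping. A switch-wide mapping limits the
# box to ~4K VNIs; keying the bridge-domain lookup on (ingress port, VLAN ID)
# lets every port reuse the same 4K VLAN IDs for different VXLAN segments.
vlan_to_vni = {
    # (port, vlan) -> VNI
    ("Ethernet1", 10): 10010,
    ("Ethernet2", 10): 20010,   # same customer VLAN, different segment
    ("Ethernet3", 10): 30010,
}

def lookup_vni(port, vlan):
    """Return the VXLAN segment a tagged frame belongs to (or None)."""
    return vlan_to_vni.get((port, vlan))

print(lookup_vni("Ethernet1", 10))   # 10010
print(lookup_vni("Ethernet2", 10))   # 20010
```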
VMware NSX-T and Geneve Q&A
A Network Artist left a lengthy comment on my Brief History of VMware NSX blog post. He raised a number of interesting topics, so I decided to write my replies as a separate blog post.
Using Geneve is an interesting choice to be made, and while the approach has its own pros and cons, I would like to stick to VXLAN if I were to recommend one to someone, for a few good reasons.
The main reason I see for NSX-T using Geneve instead of VXLAN is the need for additional header fields to carry metadata around, and to implement Network Service Header (NSH) for east-west service insertion.
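To make the “additional header fields” point concrete, here’s a rough sketch of the two headers as defined in RFC 7348 (VXLAN) and RFC 8926 (Geneve): VXLAN is a fixed 8-byte header carrying little more than the 24-bit VNI, while Geneve has a variable-length options area that can carry metadata. The byte packing below is my simplified rendering for illustration, not a production encapsulation routine.

```python
import struct

def vxlan_header(vni):
    """RFC 7348: fixed 8-byte header -- flags, reserved, 24-bit VNI, reserved."""
    return struct.pack("!B3xI", 0x08, vni << 8)   # I flag set, VNI in upper 24 bits

def geneve_header(vni, options=b""):
    """RFC 8926: 8-byte base header plus variable-length options (metadata TLVs)."""
    assert len(options) % 4 == 0
    ver_optlen = (0 << 6) | (len(options) // 4)   # version 0, opt length in 4-byte words
    flags = 0x00                                  # O and C bits cleared
    proto = 0x6558                                # Transparent Ethernet Bridging
    return struct.pack("!BBHI", ver_optlen, flags, proto, vni << 8) + options

print(len(vxlan_header(10100)))                # 8 bytes, nothing but the VNI
print(len(geneve_header(10100, b"\x00" * 8)))  # 16 bytes -- room for metadata
```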
Don't Base Your Design on Vendor Marketing
Remember how Arista promoted VXLAN coupled with deep buffer switches as the perfect DCI solution a few years ago? Someone took Arista’s marketing too literally, ran with the idea and combined VXLAN-based DCI with traditional MLAG+STP data center fabric.
While I love that they wrote a blog post documenting their experience (if only more people would do that), it doesn’t change the fact that the design contains the worst of both worlds.
Here are just a few things that went wrong:
Building Fabric Infrastructure for an OpenStack Private Cloud
An attendee in my Building Next-Generation Data Center online course was asked to deploy numerous relatively small OpenStack cloud instances and wanted to select the optimal virtual networking technology. Not surprisingly, every $vendor had just the right answer, including Arista:
We’re considering moving from hypervisor-based overlays to ToR-based overlays using Arista’s CVX for approximately 2000 VLANs.
As I explained in Overlay Virtual Networking, Networking in Private and Public Clouds, and Designing Private Cloud Infrastructure (plus several presentations), you have three options to implement virtual networking in private clouds:
Private VLANs With VXLAN
I got this remark from a reader after he read the VXLAN and Q-in-Q blog post:
Another area with a feature gap in EVPN/VXLAN is private VLANs with VXLAN. They’re not supported on either Nexus or Juniper switches.
I have one word on using private VLANs in 2019: Don’t. They are messy and complicated to maintain (not to mention how exciting it gets to combine virtual and physical switches).
Loop Avoidance in VXLAN Networks
Antonio Boj sent me this interesting challenge:
Is there any way to avoid, prevent, or at least mitigate bridging loops when using VXLAN with EVPN? Spanning tree is not supported when using VXLAN encapsulation, so I was hoping to use EVPN duplicate MAC detection.
MAC move dampening (or anything similar) doesn’t help if you have a forwarding loop. You might be able to use it to identify there’s a loop, but that’s it… and while you’re doing that your network is melting down.
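For readers who haven’t looked at how duplicate MAC detection works, here’s a minimal sketch of the move-counting idea behind it (RFC 7432 describes flagging a MAC that moves N times within M seconds; the thresholds and function names below are my assumptions, not any vendor’s defaults). Note that, as described above, this only tells you something is wrong; it doesn’t stop the frames that are already looping.

```python
import time
from collections import defaultdict

N_MOVES = 5        # this many moves ...
M_SECONDS = 180    # ... within this window flag the MAC as duplicate

move_history = defaultdict(list)

def mac_moved(mac, now=None):
    """Record a MAC move; return True when the MAC should be flagged.

    This only *detects* a probable loop or duplicate address -- the frames
    caught in a forwarding loop keep flooding until someone breaks the loop.
    """
    now = time.time() if now is None else now
    move_history[mac].append(now)
    # keep only the moves inside the detection window
    move_history[mac] = [t for t in move_history[mac] if now - t <= M_SECONDS]
    return len(move_history[mac]) >= N_MOVES

# usage: five moves in quick succession flag the MAC
for i in range(5):
    flagged = mac_moved("00:11:22:33:44:55", now=100.0 + i)
print(flagged)   # True
```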
Q-in-Q Support in Multi-Site EVPN
One of my subscribers sent me a question along these lines (heavily abridged):
My customer runs a colocation business and has to provide L2 connectivity between racks, sometimes even across multiple data centers. They were using Q-in-Q to deliver that in a traditional fabric and would like to replace it with a multi-site EVPN fabric with ~100 ToR switches in each data center. However, Cisco doesn’t support Q-in-Q with multi-site EVPN. Any ideas?
As Lukas Krattiger explained in his part of the Multi-Site Leaf-and-Spine Fabrics section of the Leaf-and-Spine Fabric Architectures webinar, multi-site EVPN (VXLAN-to-VXLAN bridging) is hard. Don’t expect miracles like Q-in-Q over VNI any time soon ;)
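For readers who haven’t dealt with Q-in-Q: the customer wants to carry double-tagged (802.1ad) frames, with a provider service tag in front of each tenant’s own 802.1Q tag, across the fabric. Here’s a rough sketch of the double-tag layout (illustration only, no real frame parsing; the VLAN numbers are made up):

```python
import struct

ETH_P_8021AD = 0x88A8   # outer service tag (S-tag), added by the provider
ETH_P_8021Q  = 0x8100   # inner customer tag (C-tag)

def qinq_tags(s_vlan, c_vlan):
    """802.1ad double tag: S-tag (provider) followed by C-tag (customer)."""
    return struct.pack("!HHHH",
                       ETH_P_8021AD, s_vlan & 0x0FFF,
                       ETH_P_8021Q,  c_vlan & 0x0FFF)

# Each colo tenant gets a service VLAN; their own VLANs ride inside it.
print(qinq_tags(s_vlan=200, c_vlan=10).hex())
```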