Category: Bridging
The EVPN Dilemma
I got an interesting set of questions from a networking engineer who got stuck with the infamous “let’s push the **** down the stack” challenge:
So, I am a rather green network engineer trying to solve the typical layer two stretch problem.
I could start the usual “friends don’t let friends stretch layer-2” or “your business doesn’t need that” windmill fight, but let’s focus on how the vendors are trying to sell him the “perfect” solution:
Can We Really Use Millions of VXLAN Segments?
One of my readers sent me a question along these lines…
VXLAN Network Identifier is 24 bit long, giving 16 us million separate segments. However, we have to map VNI into VLANs on most switches. How can we scale up to 16 million segments when we have run out of VLAN IDs? Can we create a separate VTEP on the same switch?
VXLAN is just an encapsulation format and does not imply any particular switch architecture. What really matters in this particular case is the implementation of the MAC forwarding table in switching ASIC.
Stretched Layer-2 Subnets in Azure
Last Thursday morning I found this gem in my Twitter feed (courtesy of Stefan de Kooter)
Greg Cusanza in #BRK3192 just announced #Azure Extended Network, for stretching Layer 2 subnets into Azure!
As I know a little bit about how networking works within Azure, and I’ve seen something very similar a few times in the past, I was able to figure out what’s really going on behind the scenes in a few seconds… and got reminded of an old Russian joke I found somewhere on Quora:
Auto-MLAG and Auto-BGP in Cumulus Linux
When I first met Cumulus Networks engineers (during NFD9) their focus on simplifying switch configurations totally delighted me (video).
After solving the BGP configuration challenge (could you imagine configuring BGP in a leaf-and-spine fabric with just a few commands in 2015), they did the same thing with EVPN configuration, where they decided to implement the simplest possible design (EBGP-only fabric running EBGP EVPN sessions on leaf-to-spine links), resulting in another round of configuration simplicity.
Disaster Recovery Test Faking: Another Use Case for Stretched VLANs
The March 2019 Packet Pushers Virtual Design Clinic had to deal with an interesting question:
Our server team is nervous about full-scale DR testing. So they have asked us to stretch L2 between sites. Is this a good idea?
The design clinic participants were a bit more diplomatic (watch the video) than my TL&DR answer which would be: **** NO!
Let’s step back and try to understand what’s really going on:
Real-Life Data Center Meltdown
A good friend of mine who prefers to stay A. Nonymous for obvious reasons sent me his “how I lost my data center to a broadcast storm” story. Enjoy!
Small-ish data center with several hundred racks. Row of racks supported by an end-of-row stack. Each stack with 2 x L2 EtherChannels, one EC to each of 2 core switches. The inter-switch link details don’t matter other than to highlight “sprawling L2 domains."
VLAN pruning was used to limit L2 scope, but a few VLANs went everywhere, including the management VLAN.
Automation Solution: Find Source of STP Topology Changes
Topology changes are a bane of large STP-based networks, and when they become a serious challenge you could probably use a tool that could track down what’s causing them.
I’m sure there’s a network management tool out there that can do just that (please write a comment if you know one); Eder Gernot decided to write his own while working on a hands-on assignment in the Building Network Automation Solutions online course. Like most course attendees he published the code on GitHub and might appreciate pull requests ;)
Wonder what else course attendees created in the past? Here’s a small sample.
Commentary: We’re stuck with 40 years old technology
One of my readers sent me this email after reading my Loop Avoidance in VXLAN Networks blog post:
Not much has changed really! It’s still a flood/learn bridged network, at least in parts. We count 2019 and talk a lot about “fabrics” but have 1980’s networks still.
The networking fundamentals haven’t changed in the last 40 years. We still use IP (sometimes with larger addresses and augmentations that make it harder to use and more vulnerable), stream-based transport protocol on top of that, leak addresses up and down the protocol stack, and rely on technology that was designed to run on 500 meters of thick yellow cable.
How Common Are Data Center Meltdowns?
We all know about catastrophic headline-generating failures like AWS East-1 region falling apart or a major provider being down for a day or two. Then there are failures known only to those who care, like losing a major exchange point. However, I’m becoming more and more certain that the known failures are not even the tip of the iceberg – they seem to be the climber at the iceberg summit.
Loop Avoidance in VXLAN Networks
Antonio Boj sent me this interesting challenge:
Is there any way to avoid, prevent or at least mitigate bridging loops when using VXLAN with EVPN? Spanning-tree is not supported when using VXLAN encapsulation so I was hoping to use EVPN duplicate MAC detection.
MAC move dampening (or anything similar) doesn’t help if you have a forwarding loop. You might be able to use it to identify there’s a loop, but that’s it… and while you’re doing that your network is melting down.