Category: WAN
VXLAN/EVPN Layer-3 Handoff (L3Out) on Arista EOS
A while ago, I published a blog post describing how to establish a LAN/WAN L3 boundary in VXLAN/EVPN networks using Cisco NX-OS. At that time, I promised similar information for Arista EOS. Here it is, coming straight from Massimo Magnani. The useful part of what follows is his; all errors were introduced during my editing process.
In the cases I have dealt with so far, implementing the LAN-WAN boundary has the main benefit of limiting the churn blast radius to the local domain, trying to impact the remote ones as little as possible. To achieve that, we decided to go for a hierarchical solution where you create two domains, local (default) and remote, and maintain them as separate as possible.
Layer-3 WAN Handoff (L3Out) in VXLAN/EVPN Fabrics
I got a question from a few of my students regarding the best way to implement end-to-end EVPN across multiple locations. Obviously there’s the multi-pod and multi-site architecture for people believing in the magic powers of stretching VLANs across the globe, but I was looking for something that I could recommend to people who understand that you have to have a L3 boundary if you want to have multiple independent failure domains (or availability zones).
Worth Reading: MP-TCP in Hybrid Access Networks
Wouldn’t it be nice if your home router (CPE) could use DSL (or slow-speed fibre) and LTE connection at the same time? Even better: run a single TCP session over both links? The answer to both questions is YES, of course it could do that, if only your service provider would be interested in giving you that option.
We solved similar problems with multilink PPP in the networking antiquity, today you could use a CPE with an MP-TCP proxy combined with a Hybrid Access Gateway in the service provider network. For more details, read the excellent Increasing broadband reach with Hybrid Access Networks article by prof. Olivier Bonaventure and his team.
Configuring Linux Traffic Control in a Sane Way
Smart engineers were forever using Linux (in particular, its traffic control/queue discipline functionality) to simulate WAN link impairment. Unfortunately, there’s a tiny hurdle you have to jump across: the tc CLI is even worse than iptables.
A long while ago someone published a tc wrapper that simulates shitty network connections and (for whatever reason) decided to call it Comcast. It probably does the job, but I would prefer to have something in Python. Daniel Dib found just that – tcconfig – and used it to simulate WAN link behavior on VMware vSphere.
CE-to-CE IBGP Session in a Multihomed Site
One of my readers sent me a question along these lines:
Do I have to have an IBGP session between Customer Edge (CE) routers in a multihomed site if they run EBGP with the upstream provider(s)?
Let’s start with a simple diagram and a refactoring of the question:
Worth Exploring: OMNI and AERO
Do you ever feel like we don’t have enough overlay networking technologies? Don’t worry, there’s always another one, for example Overlay Multilink Network Interface (OMNI) with Asymmetric Extended Route Optimization (AERO) services. Want to know more? Fred Templin described it in a series of overview articles on APNIC blog.
Feedback Appreciated: Next-Generation Metro Area Networks
Etienne-Victor Depasquale, a researcher at University of Malta, is trying to figure out what technologies service providers use to build real-life metro-area networks, and what services they offer on top of that infrastructure.
If you happen to be involved with a metro area network, he’d love to hear from you – please fill in this survey – and he promised that he’ll share the results of the survey with the participants.
Worth Reading: Latency Matters When Migrating Workloads
It’s so refreshing to find someone who understands the impact of latency on application performance, and develops a methodology that considers latency when migrating a workload into a public cloud: Adding latency: one step, two step, oops by Lawrence Jones.
Twilight Zone: File Transfer Never Completes
Ages ago when we were building networks using super-expensive 64kbps WAN links, a customer sent us a weird bug report:
Everything works fine, but we cannot transfer one particular file between two locations – the file transfer stalls and eventually times out. At the same time, we’re seeing increased number of CRC errors on the WAN link.
My chat with the engineer handling the ticket went along these lines:
Twilight Zone: File Transfer Causes Link Drop
Long long time ago, we built a multi-protocol WAN network for a large organization. Everything worked great, until we got the weirdest bug report I’ve seen thus far:
When trying to transfer a particular file with DECnet to the central location, the WAN link drops. That does not happen with any other file, or when transferring the same file with TCP/IP. The only way to recover is to power cycle the modem.
Try to figure out what was going on before reading any further ;)
VXLAN-to-VXLAN Bridging in DCI Environments
Almost exactly a decade ago I wrote that VXLAN isn’t a data center interconnect technology. That’s still true, but you can make it a bit better with EVPN – at the very minimum you’ll get an ARP proxy and anycast gateway. Even this combo does not address the other requirements I listed a decade ago, but maybe I’m too demanding and good enough works well enough.
However, there is one other bit that was missing from most VXLAN implementations: LAN-to-WAN VXLAN-to-VXLAN bridging. Sounds weird? Supposedly a picture is worth a thousand words, so here we go.
Is X.25 Still Alive?
Enrique Vallejo asked an interesting question a while ago:
When was X.25 official declared dead? Note that the wikipedia claims that it is still in use in parts of the world.
Wikipedia is probably right, and had several encounters with X.25 that would corroborate that claim. If you happen to have more up-to-date information, please leave a comment.
Must Watch: How NOT to Measure Latency
A while ago someone pointed me to an interesting talk explaining why 99th percentile represents a pretty good approximation of user-experienced latency on a typical web page (way longer version: Understanding Latency and Application Responsiveness, also How I Learned to Stop Worrying and Love Misery)
If you prefer reading instead of watching videos, there’s also everything you know about latency is wrong.
Automation Example: Drain a Circuit
One of the attendees of our Building Network Automation Solutions online course asked an interesting question in the course Slack team:
Has anyone wrote a playbook for putting a circuit into maintenance mode — i.e. adjusting metrics to drain traffic away from a circuit that is going to be taken down for maintenance?
As always, you have to figure out what you want to do before you can start automating stuff.
Video: Bandwidth Is Neither Infinite Nor Cheap
After decades of riding Moore’s law curve, the networking bandwidth should be (almost) infinite and (almost) free, right? WRONG, as I explained in the Bandwidth Is (Not) Infinite and Free video (part of How Networks Really Work webinar).
There are still pockets of Internet desert where mobile- or residential users have to deal with traffic caps. If you decide to move your applications into any public cloud you better check how much bandwidth those applications consume, or you’ll be the next victim of the Great Bandwidth Swindle, for more details, watch the video.