I got a question from a few of my students regarding the best way to implement end-to-end EVPN across multiple locations. Obviously there’s the multi-pod and multi-site architecture for people believing in the magic powers of stretching VLANs across the globe, but I was looking for something that I could recommend to people who understand that you have to have a L3 boundary if you want to have multiple independent failure domains (or availability zones).
Wouldn’t it be nice if your home router (CPE) could use DSL (or slow-speed fibre) and LTE connection at the same time? Even better: run a single TCP session over both links? The answer to both questions is YES, of course it could do that, if only your service provider would be interested in giving you that option.
We solved similar problems with multilink PPP in the networking antiquity, today you could use a CPE with an MP-TCP proxy combined with a Hybrid Access Gateway in the service provider network. For more details, read the excellent Increasing broadband reach with Hybrid Access Networks article by prof. Olivier Bonaventure and his team.
Smart engineers were forever using Linux (in particular, its traffic control/queue discipline functionality) to simulate WAN link impairment. Unfortunately, there’s a tiny hurdle you have to jump across: the tc CLI is even worse than iptables.
A long while ago someone published a tc wrapper that simulates shitty network connections and (for whatever reason) decided to call it Comcast. It probably does the job, but I would prefer to have something in Python. Daniel Dib found just that – tcconfig – and used it to simulate WAN link behavior on VMware vSphere.
One of my readers sent me a question along these lines:
Do I have to have an IBGP session between Customer Edge (CE) routers in a multihomed site if they run EBGP with the upstream provider(s)?
Let’s start with a simple diagram and a refactoring of the question:
Do you ever feel like we don’t have enough overlay networking technologies? Don’t worry, there’s always another one, for example Overlay Multilink Network Interface (OMNI) with Asymmetric Extended Route Optimization (AERO) services. Want to know more? Fred Templin described it in a series of overview articles on APNIC blog.
Etienne-Victor Depasquale, a researcher at University of Malta, is trying to figure out what technologies service providers use to build real-life metro-area networks, and what services they offer on top of that infrastructure.
If you happen to be involved with a metro area network, he’d love to hear from you – please fill in this survey – and he promised that he’ll share the results of the survey with the participants.
It’s so refreshing to find someone who understands the impact of latency on application performance, and develops a methodology that considers latency when migrating a workload into a public cloud: Adding latency: one step, two step, oops by Lawrence Jones.
Ages ago when we were building networks using super-expensive 64kbps WAN links, a customer sent us a weird bug report:
Everything works fine, but we cannot transfer one particular file between two locations – the file transfer stalls and eventually times out. At the same time, we’re seeing increased number of CRC errors on the WAN link.
My chat with the engineer handling the ticket went along these lines:
Long long time ago, we built a multi-protocol WAN network for a large organization. Everything worked great, until we got the weirdest bug report I’ve seen thus far:
When trying to transfer a particular file with DECnet to the central location, the WAN link drops. That does not happen with any other file, or when transferring the same file with TCP/IP. The only way to recover is to power cycle the modem.
Try to figure out what was going on before reading any further ;)
Almost exactly a decade ago I wrote that VXLAN isn’t a data center interconnect technology. That’s still true, but you can make it a bit better with EVPN – at the very minimum you’ll get an ARP proxy and anycast gateway. Even this combo does not address the other requirements I listed a decade ago, but maybe I’m too demanding and good enough works well enough.
However, there is one other bit that was missing from most VXLAN implementations: LAN-to-WAN VXLAN-to-VXLAN bridging. Sounds weird? Supposedly a picture is worth a thousand words, so here we go.
Enrique Vallejo asked an interesting question a while ago:
When was X.25 official declared dead? Note that the wikipedia claims that it is still in use in parts of the world.
Wikipedia is probably right, and had several encounters with X.25 that would corroborate that claim. If you happen to have more up-to-date information, please leave a comment.
A while ago someone pointed me to an interesting talk explaining why 99th percentile represents a pretty good approximation of user-experienced latency on a typical web page (way longer version: Understanding Latency and Application Responsiveness, also How I Learned to Stop Worrying and Love Misery)
If you prefer reading instead of watching videos, there’s also everything you know about latency is wrong.
One of the attendees of our Building Network Automation Solutions online course asked an interesting question in the course Slack team:
Has anyone wrote a playbook for putting a circuit into maintenance mode — i.e. adjusting metrics to drain traffic away from a circuit that is going to be taken down for maintenance?
As always, you have to figure out what you want to do before you can start automating stuff.
After decades of riding the Moore’s law curve the networking bandwidth should be (almost) infinite and (almost) free, right? WRONG, as I explained in the Bandwidth Is (Not) Infinite and Free video (part of How Networks Really Work webinar).
There are still pockets of Internet desert where mobile- or residential users have to deal with traffic caps, and if you decide to move your applications into any public cloud you better check how much bandwidth those applications consume or you’ll be the next victim of the Great Bandwidth Swindle. For more details, watch the video.
After the “shocking” revelation that a network can never be totally reliable, I addressed another widespread lack of common sense: due to laws of physics, the client-server latency is never zero (and never even close to what a developer gets from the laptop’s loopback interface).