Blog Posts in February 2020
A pretty good summary of the topic by Drew Conry-Murray: the market is not going to correct itself, it’s very hard to hold manufacturers or developers accountable for security defects in their products, and nothing much will change until someone dies.
And just in case you wonder how "innovative forwarding-looking disruptive knowledge-focused" companies could produce such ****, I can highly recommend The Stupidity Paradox.
After the “shocking” revelation that a network can never be totally reliable, I addressed another widespread lack of common sense: due to laws of physics, the client-server latency is never zero (and never even close to what a developer gets from the laptop’s loopback interface).
Starting with a short message to anyone interested in our on-site events in Switzerland: on March 10th we’re running our first 2020 workshop, focusing on Docker and containers.
I totally reworked the material, adding tons of new Docker networking examples (including deep dive into iptables) and a few fun things like building an Ansible container, or starting the whole NetBox stack with a single command. Even if you don’t plan to deploy containers in your production network, you might drop by just for that part.
And now for the upcoming webinars:
Every now and then someone tries to justify the “wisdom” of migrating VMs from on-premises data center into a public cloud (without renumbering them) with the idea of “scaling out into the public cloud” aka “cloud bursting”. My usual response: this is another vendor marketing myth that works only in PowerPoint.
To be honest, that statement is too harsh. You can easily scale your application into a public cloud assuming that:
While running the Using VXLAN And EVPN To Build Active-Active Data Centers workshop in early December 2019 I got the usual set of questions about using BGP as the underlay routing protocol in EVPN fabrics, and the various convoluted designs like IBGP-over-EBGP or EBGP-between-loopbacks over directly-connected-EBGP that some vendors love so much.
I got a question along the same lines from one of the readers of my latest EPVN rant who described how convoluted it is to implement the design he’d like to use with the gear he has (I won’t name any vendor because hazardous chemical substances get mentioned when I do).
Imagine you followed the steps taken by Anne Baretta and stored network inventory into a database. What could you do with that information (apart from creating reports)? How about adding a web UI to help less-skilled network operators perform automated tasks?
- While we won’t tell you how to build a web UI in our network automation course, we will tell you how to build a system out of numerous components (and what components you might need).
Considering VMware’s enrapturement with vMotion the following news (reported by Salman Naqvi in a comment to my blog post) was clearly inevitable:
I was surprised to learn that LIVE vMotion is supported between on-premise and Vmware on AWS Cloud
What’s more interesting is how did they manage to do it?
Urban legends claim that Sir Isaac Newton started thinking about gravity when an apple dropped on his head. Regardless of its origins, his theory successfully predicted planetary motions and helped us get people to the moon… there was just this slight problem with Mercury’s precession.
Likewise, his laws of motion worked wonderfully until someone started crashing very small objects together at very high speeds, or decided to see what happens when you give electrons two slits to go through.
Then there was the tiny problem of light traveling at the same speed in all directions… even on objects moving in different directions.
I published a blog post describing how complex the underlay supporting VMware NSX still has to be (because someone keeps pretending a network is just a thick yellow cable), and the tweet announcing it admittedly looked like a clickbait.
[Blog] Do We Need Complex Data Center Switches for VMware NSX Underlay
Martin Casado quickly replied NO (probably before reading the whole article), starting a whole barrage of overlay-focused neteng-versus-devs fun.
What’s the next logical automation step after you cleaned up device configurations and started using configuration templates? It obviously depends on your pain points; for Anne Baretta it was a network inventory database stored in SQL tables (and thus readily accessible from his other projects).
- I’m always amazed that we have to solve simple problems decades after the glitzy slide decks from network management vendors proclaimed them solved;
- I’m also saddened that it’s often really hard to get data out of a network management product;
- Check out our network automation course when you’re ready to start your own automation journey.
Dinesh Dutt, a pragmatic IP routing guru, the mastermind behind great concepts like simplified BGP configuration, and one of the best ipSpace.net authors, finally decided to start blogging. His first article: describing the impact of having 256 100GE ports in a single ASIC (Tomahawk 4). Hope you’ll enjoy his musings as much as I did ;)
After my response to the BGP is a hot mess topic, Corey Quinn graciously invited me to discuss BGP issues on his podcast. It took us a long while to set it up, but we eventually got there… and the results were published last week. Hope you’ll enjoy our chat.
Every now and then I find an IT professional claiming we should not be worried about split-brain scenarios because you have redundant links.
I might understand that sentiment coming from software developers, but I also encountered it when discussing stretched clusters or even SDN controllers deployed across multiple data centers.
Finally I found a great analogy you might find useful. A reader of my blog pointed me to the awesome Why Must Systems Be Operated blog post explaining the same problem from the storage perspective, so the next time you might want to use this one: “so you’re saying you don’t need backup because you have RAID disks”. If someone agrees with that, don’t walk away… RUN!
Got this question from one of ipSpace.net subscribers:
Do we really need those intelligent datacenter switches for underlay now that we have NSX in our datacenter? Now that we have taken a lot of the intelligence out of our underlying network, what must the underlying network really provide?
Reading the marketing white papers the answer would be IP connectivity… but keep in mind that building your infrastructure based on information from vendor white papers usually gives you the results your gullibility deserves.
January 2020 was one of the busiest months we ever had:
- Elisa Jasinska and Nick Buraglio started the year with the first part of Surviving in Internet Default-Free Zone webinar.
- Krzysztof Szarkowicz completed the EVPN-with-MPLS topic describing the intricacies of layer-2 and layer-3 integration, and extending the EVPN webinar to almost 11 hours (with three hours of it covering EVPN in SP/MPLS world).
- Matthias Luft continued the Cloud Security saga, covering logging, monitoring, testing and auditing.
- Preparing for our public cloud course, I described how you can automate AWS infrastructure deployments.
- Christopher Werny wrapped up the month with the second part of IPv6 Enterprise Security talk, this time focusing on layer-2 security.
You can get immediate access to all these webinars with Standard or Expert ipSpace.net subscription.
During a recent workshop I made a comment along the lines “be careful with feature X from vendor Y because it took vendor Z two years to fix all the bugs in a very similar feature”, and someone immediately asked “are you saying it doesn’t work?”
My answer: “I never said that, I just drew inferences from other people’s struggles.”
A Step Back
Networking operating systems are probably some of the most complex pieces of software out there. Distributed systems are hard. Real-time distributed systems are even harder. Real-time distributed systems running on top of eventually-consistent distributed databases are extra fun.
After introducing the fallacies of distributed computing in the How Networks Really Work webinar, I focused on the first one: the network is (not) reliable.
While that might be understood by most networking professionals (and ignored by many developers), here’s an interesting shocker: even TCP is not always reliable (see also: Joel Spolsky’s take on Leaky Abstractions).
A journey of a thousand miles begins with one step they say… but what should that first step be if you want to start a network automation journey (and have no idea how to do it)?
Aldrin wrote a well-thought-out comment to my EVPN Dilemma blog post explaining why he thinks it makes sense to use Juniper’s IBGP (EVPN) over EBGP (underlay) design. The only problem I have is that I forcefully disagree with many of his assumptions.
He started with an in-depth explanation of why EBGP over directly-connected interfaces makes little sense:
If you’re an ipSpace.net subscriber, you might have noticed how busy the last month has been (more about that later). February won’t be much better:
- Later today we’ll have David Barroso talk about safely managing network automation secrets.
- On February 6th I’ll describe the tools you can use to automate Azure deployments, including simple CLI scripts, Ansible, Terraform, and Azure Resource Manager templates.
- We’re starting the Networking in Public Cloud Deployments online course on February 11th.
- David Peñaloza Seijas will talk about Cisco (Viptela) SD-WAN on February 13th;
Finally, I’ll run a day-long workshop in Zurich on March 10th describing containers and Docker.
Unless you’re working for a cloud-only startup, you’ll always have to connect applications running in a public cloud with existing systems or databases running in a more traditional environment, or connect your users to public cloud workloads.
Public cloud providers love stable and robust solutions, and they took the same approach when implementing their legacy connectivity solutions: you could use routed Ethernet connections or IPsec VPN, and run BGP across them, turning the problem into a well-understood routing problem.
In January 2020 Doug Heckaman documented his experience with VeloCloud SD-WAN. He tried to be positive, but for whatever reason this particular bit caught my interest:
Edge Gateways have a limited number of tunnels they can support […]
WTF? Wasn’t x86-based software packet forwarding supposed to bring infinite resources and nirvana? How badly written must your solution be to have a limited number of IPsec tunnels on a decent x86 CPU?