Blog Posts in March 2020
One of the first hands-on exercises in our Networking in Public Cloud Deployments asks the attendees to automate something. They can choose the cloud provider they want to work with and the automation tool they prefer… but whatever they do has to be automated.
Most solutions include a simple CloudFormation, Azure Resource Manager, or Terraform template with a line or two of README.MD, but Erik Auerswald totally astonished me with a detailed and precise writeup. Enjoy!
With webinars being the only way to deliver training content these days, we’ll run one every week in April 2020:
Starting on April 2nd I’ll talk about one of my favorite topics: switching, bridging and routing, covering almost everything ever invented from virtual circuits and source route bridging to so-called routing at layer-2 and IP forwarding based on host routes;
I was planning to update the Introduction to Containers and Docker material for ages… but then had to move the December 2019 workshop to March 2020, only to cancel it a week before the coronavirus exploded for real in Switzerland. I hope I’ll manage to deliver the online version on April 9th ;)
Dinesh Dutt is back on April 16th with an update of Network Automation Tools webinar, in which he’ll cover (among other things) the new network automation tools launched since we did the original webinar in 2016.
On April 23rd Pete Lumbis plans to dive as deep into the intricacies of switching ASICs as he can without violating an NDA ;)
When I’ve seen my good friends Christopher Werny and Enno Rey talk about IPv6 security at RIPE78 meeting, another bit of one of my puzzles fell in place. I was planning to do an update of the IPv6 security webinar I’d done with Eric Vyncke, and always wanted to get it done by a security practitioner focused on enterprise networks, making Christopher a perfect fit.
As it was almost a decade since we did the original webinar, Christopher started with an overview of IPv6 security challenges (TL&DR: not much has changed).
A lot of people are confused about the roles of network layers (some more than others), the interactions between MAC addresses, IP addresses, and TCP/UDP port numbers, the differences between routing and bridging… and why it’s so bad to bridge across large distances (or in large networks).
I tried to explain most of those topic in How Networks Really Work webinar (next session coming on April 2nd), but as is usually the case someone did a much better job: you MUST READ the poetic and hilariously funny World in which IPv6 was a good design by Avery Pennarun.
A reader of my blog was “blessed” with hands-on experience with SD-WAN offered by large service providers. Based on that experience he sent me his views on whether that makes sense. Enjoy ;)
We all have less-than-stellar opinions on service providers and their offerings. It is well known that those services are expensive and usually lacking quality, experience, or simply, knowledge. This applies to regular MPLS/BGP techniques as to - currently, the new challenge - SD-WAN.
One of the attendees of our Building Network Automation Solutions online course asked an interesting question in the course Slack team:
Has anyone wrote a playbook for putting a circuit into maintenance mode — i.e. adjusting metrics to drain traffic away from a circuit that is going to be taken down for maintenance?
As always, you have to figure out what you want to do before you can start automating stuff.
Found this “gem” describing the differences between layer-2 and layer-3 on an unnamed $vendor web site.
Layer 2 is mainly concerned with the local delivery of data frames between network devices on the same network or local area network (LAN).
So far so good…
Years ago I figured out that I’d eventually have to migrate my blog from Blogger to something more independent, and based on my previous experience with Wordpress I wasn’t exactly enthusiastic to go down that path.
In 2015 I’ve seen Scott Lowe going from Wordpress to Jekyll and then to Hugo, and decided it might make sense to recreate ipSpace.net blog with a tool that generates static web pages… but never found the time to do it.
Defining service availability using the famous X nines (and all the hacks like “planned downtime doesn’t count”) is pretty useless in a highly distributed system where the only thing that really matters is the user experience, not ping response times. One should ask what precisely should we be measuring, and how could we make sure we can act on the measurements
One of the first roadblocks you’ll hit in your “let’s master Ansible” journey will be a weird error deep inside a Jinja2 template. Can we manage that complexity somehow… or as one of the participants in our Building Network Automation Solutions online course asked:
Is there any recommendation/best practices on Jinja templates size and/or complexity, when is it time to split single template into function portions, what do you guys do? And what is better in terms of where to put logic - into jinja or playbooks
One of my friends described the challenge as “Debugging Ansible is one of the most terrible experiences one can endure…” and debugging Jinja2 errors within Ansible playbooks is even worse, but there are still a few things you can do.
Johannes Weber published a PCAP file containing samples of over 50 different L2 - L7 protocols. Enjoy ;)
After decades of riding the Moore’s law curve the networking bandwidth should be (almost) infinite and (almost) free, right? WRONG, as I explained in the Bandwidth Is (Not) Infinite and Free video (part of How Networks Really Work webinar).
There are still pockets of Internet desert where mobile- or residential users have to deal with traffic caps, and if you decide to move your applications into any public cloud you better check how much bandwidth those applications consume or you’ll be the next victim of the Great Bandwidth Swindle. For more details, watch the video.
Stumbled upon an article by Tom Limoncelli. He starts with a programming question (skip that) but then goes into an interesting discussion of what’s really important.
Being focused primarily on networking this is the bit I liked most (another case of Latency Matters):
I once observed a situation where a developer was complaining that an operation was very slow. His solution was to demand a faster machine. The sysadmin who investigated the issue found that the code was downloading millions of data points from a database on another continent. The network between the two hosts was very slow. A faster computer would not improve performance.
The solution, however, was not to build a faster network, either. Instead, we moved the calculation to be closer to the data.
Another interesting question I got from an ipSpace.net subscriber:
Assuming we can simplify the physical network when using overlay virtual network solutions like VMware NSX, do we really need datacenter switches (example: Cisco Nexus instead of Catalyst product line) to implement the underlay?
Let’s recap what we really need to run VMware NSX:
For instance, many companies conclude that they need to be more innovative. To increase their rates of innovation, they look at firms well known for being innovative, such as Google, then dispatch their executives to Silicon Valley to visit tech companies’ corporate campuses in the hope that they will learn something.
Not surprisingly, the book authors observed the same behavior in those companies as I did a while ago when I was still teaching SDN workshops:
They often ignore the fact that Google is an entirely different sector to them, and the lessons in view probably of limited value. They also overlook that even if they do learn something, actually implementing it within their organization is likely to be difficult, if not impossible.
Finally a warning: that book will make you laugh or cry hysterically (or both), so take it in small daily doses.
Zero-Touch Provisioning (ZTP) is a solved problem if you believe the networking vendors… and yet numerous network automation projects involve at least some ZTP functionality. It seems that smart organizations investing in premium people (instead of premium vendors) prefer the Unix way of solving problems: take a number of small versatile tools, and put them together to build a solution that fits your requirements.
Anne Baretta did exactly that and combined Oxidized, FreeZTP, Ansible and custom web UI to build a ZTP solution that addresses the needs of his organization.
After covering configuration and performance optimizations introduced in recent FRRouting releases, Donald Sharp focused on some of the recent usability enhancements, including BGP BestPath explanations, BGP Hostname, BGP Failed Neighbors, and improved debugging.
TL&DR: It’s 2020, and VXLAN with EVPN is all the rage. Thank you, you can stop reading.
On a more serious note, I got this questions from an Johannes Spanier after he read my do we need complex data center switches for NSX underlay blog post:
Would you agree that for smaller NSX designs (~100 hypervisors) a much simpler Layer2 based access-distribution design with MLAGs is feasible? One would have two distribution switches and redundant access switches MLAGed together.
I would still prefer VXLAN for a number of reasons:
AI is the new SDN, and we’re constantly bombarded with networking vendor announcements promising AI-induced nirvana, from reinventing Clippy to automatic anomaly- and threat identifications.
If you still think these claims are realistic, it’s time you start reading what people involved in AI/ML have to say about hype in their field. I posted a few links in the past, and the Packet Pushers Human Infrastructure magazine delivered another goodie into my Inbox.
As a response to my Live vMotion into VMware-on-AWS Cloud blog post Nico Vilbert pointed me to his blog post explaining the details of cross-Atlantic vMotion into AWS.
Today I will not go into yet another rant pointing out all the things that can go wrong, but focus on a minor detail: “no ping was dropped in the process.”
The vMotion is instantaneous and lossless myth has been propagated since the early days of vMotion when sysadmins proudly demonstrated what seemed to be pure magic to amazed audiences… including the now-traditional terminal window running ping and not losing a single packet.
Anne Baretta got pretty far in his automation story: after starting with configuration templates and storing network inventory into a database, he tackled the web UI. What’s next? How about a few auto-generated network diagrams?