Blog Posts in April 2020
After setting the stage clarifying the current Cisco SD-WAN deployment scenarios, David Penaloza focused on definitions and fundamentals that must be considered before dealing with solutions that hide and abstract complexity like overlays, routing, and network virtualization from the network administrator.
One of the attendees of our Building Next-Generation Data Center online course submitted a picture-perfect solution to scalable layer-2 fabric design challenge:
- VXLAN/EVPN based data center fabric;
- IGP within the fabric;
- EBGP with the WAN edge routers because they’re run by a totally different team and they want to have a policy enforcement point between the two;
- EVPN over IBGP within the fabric;
- EVPN over EBGP between the fabric and WAN edge routers.
The only seemingly weird decision he made: he decided to run the EVPN EBGP session between loopback interfaces of core switches (used as BGP route reflectors) and WAN edge routers.
Being stuck at home like most everyone else we’re continuing the increased pace of content production in May 2020:
- I’ll continue the Introduction to Containers and Docker update. We got through the basics the last time and will cover Linux namespaces and how Docker uses them on May 5th.
- We finally found an independent guest speaker familiar with Cisco ACI. Mario Rosi will start the Cisco ACI series with an introductory webinar on May 12th.
- Dinesh Dutt will continue his Network Automation Tools update on May 19th.
- Hoping to get through Introduction to Docker on May 5th, I plan to do a deep dive into Docker Networking on May 26.
- Finally, I’m positive I won’t cover all the bridging and routing material I created in today’s webinar, so we’ll continue with routing protocol basics on May 28th.
Imagine that you just stumbled upon the hammer Thor carelessly dropped, and you’re so proud of your new tool that everything looks like a nail even though it might be a lightbulb or an orange.
That happens to some people when they get the network automation epiphany: all of a sudden CLI and manual configuration should be banned, and everything can be solved by proper incantation of Git and Ansible commands or whatever other workflow you might have set up… even though the particular problem might have nothing to do with what you have just automated.
Kode Vicious (aka George V. Neville-Neil ) wrote another brilliant article on reducing risk in systems that can do serious harm. Here are just two of the gems:
The risks involved in these systems come from three major areas: marketing, accounting, and management.
There is a wealth of literature on safety-critical systems, much of which points in the same direction: toward simplicity. With increasing complexity comes increasing risk …
Ben Friedman and his team (the video crew producing all the Tech Field Day events) published a number of interviews about the impact of COVID-19 on IT.
Among other things we discussed how busy networking engineers are trying to cope with unexpected demand, and how public cloud isn’t exactly infinitely elastic.
This podcast introduction was written by Nick Buraglio, the host of today’s podcast.
As private overlays are becoming more and more prevalent and as SD-WAN systems and technologies advance, it remains critical that we continue to investigate how we think about internetworking. Even with platforms such as Slack Nebula, Zerotier, or the wireguard based TailScale becoming a mainstream staple of many businesses, the question of “what is next” is being asked by an ambitious group of researchers.
A reader of my blog sent me this question:
Do you think we can trust DSCP marking on servers (whether on DC or elsewhere - Windows or Linux )?
As they say “not as far as you can throw them”.
Does that mean that the network should do application recognition and marking on the ingress network node? Absolutely not, although the switch- and router vendors adore the idea of solving all problems on their boxes.
One of the hands-on exercises in our Networking in Public Cloud Deployments online course asks the attendees to deploy a full-blown virtual networking solution with a front-end (web) server in a public subnet, and back-end (database) server in a private subnet.
The next (optional) exercise asks them to add IPv6 to the mix for a full-blown dual-stack deployment.
Two weeks ago I started with a seemingly simple question:
If a BGP speaker R is advertising a prefix A with next hop N, how does the network know that N is actually alive and can be used to reach A?
… and answered it for the case of directly-connected BGP neighbors (TL&DR: Hope for the best).
Jeff Tantsura provided an EVPN perspective, starting with “the common non-arguable logic is reachability != functionality”.
Now let’s see what happens when we add route reflectors to the mix. Here’s a simple scenario:
We started March 2020 with the second part of Cisco SD-WAN webinar by David Peñaloza Seijas, continued with Upcoming Internet Challenges update, and concluded with 400 GE presentation by Lukas Krattiger and Mark Nowell.
You can access all these webinars with Standard or Expert ipSpace.net subscription. The Cisco SD-WAN presentation is already available with free ipSpace.net subscription, which will also include the edited 400 GE videos once we get them back from our video editor.
However, there’s a slight gotcha if you use Git with Markdown: whenever you change something, the whole line (and using tools like IA Writer a whole paragraph is a single line) is marked as changed, for example:
Maybe it’s time to build our own network monitoring systems from open-source components instead of paying vendors big bucks for slick PowerPoint slides.
David Penaloza decided to demystify Cisco’s SD-WAN, provide real world experience beyond marketing hype, and clear confusing and foggy messages around what can or cannot be done with Cisco SD-WAN.
One of the attendees of my Building Network Automation Solutions online course quickly realized a limitation of Ansible (by far the most popular network automation tool): it stores all the information in random text files. Here’s what he wrote:
I’ve been playing around with Ansible a lot, and I figure that keeping random YAML files lying around to store information about routers and switches is not very uh, scalable. What’s everyone’s favorite way to store all the things?
He’s definitely right (and we spent a whole session in the network automation course discussing that).
Let’s agree for a millisecond that you can’t find any other way to migrate your workload into a public cloud than to move the existing VMs one-by-one without renumbering them. Doing a clumsy cloud migration like this will get you the headaches and the cloud bill you deserve, but that’s a different story. Today we’ll talk about being clumsy the right and the wrong way.
There are two ways of solving today’s challenge:
I’d like to bring back EVPN context. The discussion is more nuanced, the common non-arguable logic here - reachability != functionality.
Numerous online companies are using the COVID-19 crisis to make their products better known (PacketPushers collected some of them). Nothing wrong with that - they’re investing into providing free- or at-cost resources, and hope to get increased traction in the market. Pretty fair and useful.
Then there are others… Here’s a recent email I got:
Way too many people still believe in Security Fairy (the mythical entity that makes your application magically secure), fueling the whole industry of security researchers who happily create excruciatingly detailed talks of how you can use whatever security oversight to wreak havoc (even when the limitations of a technology are clearly spelled out in an RFC).
In the Networks Are Not Secure (part of How Networks Really Work webinar) I described why we should never rely on the network infrastructure to provide security but have to implement it higher up in the application stack.
Andrea Dainese added REST (Web) API to his Automation for Cisco NetDevOps article. You might love his explanation of the screen scraping methods used by legacy implementations. He was too polite to throw around any names, but I could immediately think of NETCONF or RESTCONF implementation on Cisco IOS.
One of our subscribers sent me this email when trying to use ideas from Ansible for Networking Engineers webinar to build BGP route reflector configuration:
I’m currently discovering Ansible/Jinja2 and trying to create BGP route reflector configuration from Jinja2 template using Ansible playbook. As part of group_vars YAML file, I wish to list all route reflector clients IP address. When I have 50+ neighbors, the YAML file gets quite unreadable and it’s hard to see data model anymore.
Whenever you hit a roadblock like this one, you should start with the bigger picture and maybe redefine the problem.
How does the network know that a VTEP is actually alive? (1) from the point of view of the control plane and (2) from the point of view of the data plane? And how do you ensure that control and data plane liveness monitoring has the same view? BFD for BGP is a possible solution for (1) but it’s not meant for 3rd party next hops, i.e. it doesn’t address (2).
Let’s stop right there (or you’ll stop reading in the next 10 milliseconds). I will also try to rephrase the question in more generic terms, hoping Aldrin won’t mind a slight detour… we’ll get back to the original question in another blog post.
Over the last weekend I almost got pulled into yet-another CLI-or-automation Twitter spat. The really sad part: I thought we were past that point. After all, I’ve been ranting about that topic for almost seven years… and yet I’m still hearing the same arguments I did in those days.
Just for the giggles I collected a few old blog posts on the topic (not that anyone evangelizing their opinions on Twitter would ever take the time to read them ;).
Most of us are in some sort of lockdown (or quarantine or shelter-in-place or whatever it’s called) at the moment. Some have their hands full balancing work and homeschooling their kids (hang in there!), others are getting bored and looking for networking-related content (or you wouldn’t be reading this blog).
If you’re in the latter category you might want to browse some of the free ipSpace.net content: almost 3500 blog posts, dozens of articles, over a hundred podcast episodes, over 20 free webinars, and another 30+ webinars with sample videos that you can access with free subscription.
Need more? Standard subscription includes 260 hours of video content and if you go for Expert subscription and select the network automation course as part of the subscription, you’ll get another 60 hours of content plus hands-on exercises, support, access to Slack team… hopefully enough to last you way past the peak of the current pandemic.
As I explained in How Networks Really Work and Upcoming Internet Challenges webinars, routing security, and BGP security in particular remain one of the unsolved challenges we’ve been facing for decades (see also: what makes BGP a hot mess).
Fortunately, due to enormous efforts of a few persistent individuals BGP RPKI is getting traction (NTT just went all-in), and Flavio Luciani and Tiziano Tofoni decided to do their part creating an excellent in-depth document describing BGP RPKI theory and configuration on Cisco- and Juniper routers.
There are only two things you have to do:
- Read the document;
- Implement RPKI in your network.
Thank you, the Internet will be grateful.
Every time I’m complaining about the stupidities $vendors are trying to sell us, someone from vendorland saddles a high horse and starts telling me how I got it all wrong, for example:
It is a duty of a pre-sales, consultant, vendor representative to inform the customer about the risk.
When you stop laughing (maybe it was just April Fools’ joke ;), here’s how the reality of that process looks like (straight from one of my readers):
I remember when the VM guys and their managers were telling me (like they had discovered the solutions to all of ours problems) about “with VXLAN we can move a machine from one country to another, and keep having service with the same IP” … while looking at me with the “I’m so smart” face… and me thinking shit… I’m doomed :) … I don’t even want to start explaining … but in the long run I had to anyway.