What Are You Going to Test in Network Automation CI/CD Pipeline?
Network automation CI/CD pipeline seems to be the next hot thing, with vendors and bloggers describing in detail how you could get it done. How realistic is that idea for an average environment that’s barely starting its automation journey?
TL&DR: it will take a long time to get there, and lack of tests is the first showstopper.
Highlights: Multi-Threaded Routing Daemons
The multi-threaded routing daemons blog post generated numerous in-depth comments here and on LinkedIn. As always, thanks a million for keeping me honest and providing more details or additional perspectives. Here are some of the best bits.
Jeff Tantsura provided the first dose of reality:
All modern routing protocols implementations are multi-threaded, with a minimum separation of adjacency handling, route calculations and update generation. Note - writing multi-threaded code for complex tasks is a non trivial exercise (you could search for thread safety and similar artifacts and what happens when not implemented correctly). Moving to a multi-threaded code in early 2010s resulted in a multi-release (year) effort and 100s of related bugs all around.
Dr. Tony Przygienda added his hands-on experience (he’s been developing routing protocol software for ages):
… updated on Sunday, January 16, 2022 07:58 UTC
Building a BGP Anycast Lab
The Anycast Works Just Fine with MPLS/LDP blog post generated so much interest that I decided to check a few similar things, including running BGP-based anycast over a BGP-free core, and using BGP Labeled Unicast (BGP-LU).
The Big Picture
We’ll use the same physical topology we used in the OSPF+MPLS anycast example: a leaf-and-spine fabric (admittedly with a single spine) with three anycast servers advertising 10.42.42.42/32 attached to two of the leafs:
Worth Reading: Xen on AWS Nitro NICs
If you find smart NICs interesting, you’ll like the latest blog post by James Hamilton explaining how AWS emulated Xen environment on Nitro hardware to keep old VM instances running on new hardware.
Video: Machine Learning 101
After a brief overview of the AI/ML hype, Javier Antich continued the AI and ML in Networking webinar with the basics of underlying technologies, starting with the machine learning fundamentals.
Optimal BGP Path Selection with BGP Additional Paths
A month ago I explained how using a BGP route reflector in a large-enough non-symmetrical network could result in suboptimal routing (or loss of path diversity or multipathing). I also promised to explain how Advertisement of Multiple Paths in BGP functionality1 solves that problem. Here we go…
I extended the original lab with another router to get a scenario where one route reflector (RR) client should use equal-cost paths to an external destination while another RR client should select a best path that is different from what the route reflector would select.
Scalable Policy Routing
More than a decade ago (before SD-WAN was even a thing) I wrote an article describing how easy it is to route different applications onto different links (MPLS/VPN versus IPsec tunnels) using a distance vector routing protocol (preferably BGP, although even RIP would work).
You might find it interesting that it’s possible to solve tough problems with good network design instead of proprietary unicorn dust, so I salvaged the article from some dusty archive, cleaned it up, polished it, and published it on ipSpace.net.
Dynamic Negotiation of BGP Capabilities
I wanted to write a blog post explaining the intricacies of Advertisement of Multiple Paths in BGP, got into a yak-shaving exercise when discussing the need to exchange BGP capabilities to enable this feature, and decided to turn it into a separate prerequisite blog post. The optimal path selection with BGP AddPath post is coming in a few days.
The Problem
Whenever you want to use BGP for something else than simple IPv4 unicast routing the BGP neighbors must agree on what they are willing to do – be it multiprotocol extensions and individual additional address families, graceful restart, route refresh… (IANA has the complete BGP Capability Codes registry).
Mikrotik RouterOS and VyOS Added to netsim-tools
Stefano Sasso took my “Don’t complain, submit a PR” advice seriously and did a wonderful job adding support for Mikrotik RouterOS and VyOS to netsim-tools, increasing the number of supported platforms to twelve. His additions are available in release 1.0.2 which also includes:
- Automatic creation of groups based on BGP AS numbers
- Group-wide node attributes
- Experimental support for Python plugins
Interested? Start with tutorials and installation guide which includes lab building instructions.
Git as a Source of Truth for Network Automation
In Git as a source of truth for network automation, Vincent Bernat explained why they decided to use Git-managed YAML files as the source of truth in their network automation project instead of relying on a database-backed GUI/API product like NetBox.
Their decision process was pretty close to what I explained in Data Stores and Source of Truth parts of Network Automation Concepts webinar: you need change logging, auditing, reviews, and all-or-nothing transactions, and most IPAM/CMDB products have none of those.
On a more positive side, NetBox (and its fork, Nautobot) has change logging (HT: Leo Kirchner) and things are getting much better with Nautobot Version Control plugin. Stay tuned ;)
Worth Reading: Load Balancing on Network Devices
Christopher Hart wrote a great blog post explaining the fundamentals of how packet load balancing works on network devices. Enjoy.
For more details, watch the Multipath Forwarding part of Advanced Routing Protocol Topics section of How Networks Really Work webinar.
Lesson Learned: Some Services Are Not Worth Delivering
Here’s one of the secrets to AWS’s unprecedented scale and financial success: they quickly figured out that some services are not worth delivering. Most everyone else believes in building snowflake single-customer solutions to solve imaginary problems, effectively losing money while doing so.
Circular Dependencies, VMware NSX-T Edition
A friend of mine sent me a link to a lengthy convoluted document describing the 17-step procedure (with the last step having 10 micro-steps) to follow if you want to run NSX manager on top of N-VDS, or as they call it: Deploy a Fully Collapsed vSphere Cluster NSX-T on Hosts Running N-VDS Switches1.
You might not be familiar with vSphere networking and the way NSX-T uses that (in which case I can highly recommend vSphere and NSX webinars), so here’s a CliffsNotes version of it: you want to put the management component of NSX-T on top of the virtual switch it’s managing, and make it accessible only through that virtual switch. What could possibly go wrong?
Anycast Fundamentals
I got into an interesting debate after I published the Anycast Works Just Fine with MPLS/LDP blog post, and after a while it turned out we have a slightly different understanding what anycast means. Time to fall back to a Wikipedia definition:
Anycast is a network addressing and routing methodology in which a single destination IP address is shared by devices (generally servers) in multiple locations. Routers direct packets addressed to this destination to the location nearest the sender, using their normal decision-making algorithms, typically the lowest number of BGP network hops.
Based on that definition, any transport technology that allows the same IP address or prefix to be announced from several locations supports anycast. To make it a bit more challenging, I would add “and if there are multiple paths to the anycast destination that could be used for multipath forwarding1, they should all be used”.
… updated on Friday, November 26, 2021 15:35 UTC
Multi-Threaded Routing Daemons
When I wrote the Why Does Internet Keep Breaking? blog post a few weeks ago, I claimed that FRR still uses single-threaded routing daemons (after a too-cursory read of their documentation).
Donald Sharp and Quentin Young politely told me I was an idiot I should get my facts straight, I removed the offending part of the blog post, promised to write another one going into the details, and Quentin improved the documentation in the meantime, so here we are…