Feedback: Ansible for Networking Engineers
One of my subscribers sent me a nice email describing his struggles to master Ansible:
Some time ago I started to hear about Ansible as the new power tool for network engineer, my first reaction was “What the hell is this?” I searched the web and found many blah blahs about it… until I landed on your pages.
He found Ansible for Networking Engineers material sufficient to start an automation project:
VXLAN and EVPN on Hypervisor Hosts
One of my readers sent me a series of questions regarding a new cloud deployment where the cloud implementers want to run VXLAN and EVPN on the hypervisor hosts:
I am currently working on a leaf-and-spine VXLAN+ EVPN PoC. At the same time, the systems team in my company is working on building a Cloudstack platform and are insisting on using VXLAN on the compute node even to the point of using BGP for inter-VXLAN traffic on the nodes.
Using VXLAN (or GRE) encap/decap on the hypervisor hosts is nothing new. That’s how NSX and many OpenStack implementations work.
Worth Reading: The Fragile Engineers
Ethan Banks wrote an awesome blog post on the characteristics of fragile engineers (most of them probably being expert beginners). I can’t help but ponder how often I behave like one…
Leaf-and-Spine Fabric Myths (Part 1)
Apart from the “they have no clue what they’re talking about” observation, Evil CCIE left a long list of leaf-and-spine fabric myths he encountered in the wild in a comment on one of my blog posts. He started with:
Clos fabric (aka Leaf And Spine fabric) is a non-blocking fabric
That was obviously true in the days when Mr. Clos designed the voice switching solution that still bears his name. In the original Clos network every voice call would get a dedicated path across the fabric, and the number of voice calls supported by the fabric equaled the number of alternate end-to-end paths.
Network Automation Development Environments
Building the network automation lab environment seems to be one of the early showstoppers on everyone’s network automation journey. These resources might help you get started:
- I wrote an installation script that installs the myriad dependencies needed by Ansible and NAPALM in just the right order on a Ubuntu VM (step-by-step instructions for ipSpace.net subscribers).
- Carl Buchmann open-sourced a full-blown infrastructure-as-code development environment he uses for his automation projects.
- Jaap de Vos described how he creates a Docker image containing Ansible, NAPALM and Nornir.
Hint: after setting up your environment, you might want to enroll into the Spring 2019 network automation course ;)
Network Troubleshooting Guidelines
It all started with an interesting weird MLAG bugs discussion during our last Building Next-Generation Data Center online course. The discussion almost devolved into “when in doubt reload” yammering when Mark Horsfield stepped in saying “while that may be true, make sure to check and collect these things before reloading”.
I loved what he wrote so much that I asked him to turn it into a blog post… and he made it even better by expanding it into generic network troubleshooting guidelines. Enjoy!
Don't Make a Total Mess When Dealing with Exceptions
A while ago I had the dubious “privilege” of observing how my “beloved” airline Adria Airways deals with exceptions. A third-party incoming flight was 2.5 hours late and in their infinite wisdom (most probably to avoid financial impact) they decided to delay a half-dozen outgoing flights for 20-30 minutes while waiting for the transfer passengers.
Not surprisingly, when that weird thingy landed and they started boarding the outgoing flights (now all at the same time), the result was a total mess with busses blocking each other (this same airline loves to avoid jet bridges).
Prepare for Job Interview with ipSpace.net Subscription
Did you know that many networking engineers use ipSpace.net webinars (and subscription) to prepare for the job interviews?
Here’s one of their success stories (name changed for obvious reasons):
Implications of Valley-Free Routing in Data Center Fabrics
As I explained in a previous blog post, most leaf-and-spine best-practices (as in: what to do if you have no clue) use BGP as the IGP routing protocol (regardless of whether it’s needed) with the same AS number shared across all spine switches to implement valley-free routing.
This design has an interesting consequence: when a link between a leaf and a spine switch fails, they can no longer communicate.
Infrastructure-as-Code Tools
This is the fourth blog post in “thinking out loud while preparing Network Infrastructure as Code presentation for the network automation course” series. Previous posts: Network-Infrastructure-as-Code Is Nothing New, Adjusting System State and NETCONF versus REST API.
Dmitri Kalintsev sent me a nice description on how some popular Infrastructure-as-Code (IaC) tools solve the challenges I described in The CRUD Hell section of Infrastructure-as-Code, NETCONF and REST API blog post:
Upcoming Webinars and Events: October 2018
The fast pace of webinars continues in October 2018:
- Rachel Traylor will talk about graph theory and its relevance to reliable network design on October 8th;
- The Amazon Web Services Networking webinar will start on October 11th. The second session is planned for October 25th;
- On October 16th we’ll have the third session of VMware NSX technical deep dive (unless I manage to finish on time later today… not likely).
There are no on-site events planned until early December:
- We’ll run another on-site workshop in Zurich on December 5th . This time we’ll focus on using VXLAN and EVPN to build multi-site fabrics;
- I’ll talk about making SDN better with IPv6 on December 6th.
You can attend all upcoming webinars with an ipSpace.net webinar subscription. Online courses and on-site events require separate registration.
VXLAN Broadcast Domain Size Limitations
One of the attendees of my Building Next-Generation Data Center online course tried to figure out whether you can build larger broadcast domains with VXLAN than you could with VLANs. Here’s what he sent me:
I’m trying to understand differences or similarities between VLAN and VXLAN technologies in a view of (*cast) domain limitation.
There’s no difference between the two on the client-facing side. VXLAN is just an encapsulation technology and doesn’t change how bridging works at all (read also part 2 of that story).
Smart or Dumb NICs on Software Gone Wild
Hardware vendors are always making their silicon more complex and feature-rich. Is that a great idea or a disaster waiting to happen? We asked Luke Gorrie, the lead developer of Snabb Switch (an open-source user-land virtual switch written in Lua) about his opinions on the topic.
TL&DL version: Give me a dumb NIC, software can do everything else.
If you want to know more, listen to Episode 93 of Software Gone Wild.
Using CSR1000V in AWS Instead of Automation or Orchestration System
As anyone starting their journey into AWS quickly discovers, cloud is different (or as I wrote in the description of my AWS workshop you feel like Alice in Wonderland). One of the gotchas: when you link multiple routing domains (Virtual Private Clouds – the other VPC) you have to create static routing table entries on both ends. Even worse, there’s no transit VPC – you have to build a full mesh of relationships.
The correct solution to this challenge is automation:
Infrastructure-as-Code, NETCONF and REST API
This is the third blog post in “thinking out loud while preparing Network Infrastructure as Code presentation for the network automation course” series. You might want to start with Network-Infrastructure-as-Code Is Nothing New and Adjusting System State blog posts.
As I described in the previous blog post, the hardest problem any infrastructure-as-code (IaC) tool must solve is “how to adjust current system state to desired state described in state definition file(s)”… preferably without restarting or rebuilding the system.
There are two approaches to adjusting system state: