Blog Posts in January 2019
I got some interesting feedback from one of my readers on Segment Routing with IPv6 extension headers:
Some people position SRv6 as the universal underlay and overlay due to its capabilities for network programming by means of feature+locator SRH separation.
Stupid me replied “SRv6 is NOT an overlay solution but a source routing solution.”
Remember how earlier releases of Nexus-OS started dropping configuration commands if you were typing them too quickly (and how it was declared a feature ;)?
Mark Fergusson had a similar experience on Cisco IOS. All he wanted to do was to use Ansible to configure a VRF, an interface in the VRF, and OSPF routing process on Cisco CSR 1000v running software release 15.5(3).
Here’s what he was trying to deploy. Looks like a configuration straight out of an MPLS book, right?
The crazy pace of webinar sessions continued last week. Howard Marks continued his deep dive into Hyper-Converged Infrastructure, this time focusing on go-to-market strategies, failure resiliency with replicas and local RAID, and the eternal debate (if you happen to be working for a certain $vendor) whether it’s better to run your HCI code in a VM and not in hypervisor kernel like your competitor does. He concluded with the description of what major players (VMware VSAN, Nutanix and HPE Simplivity) do.
On Thursday I started my Ansible 2.7 Updates saga, describing how network_cli plugin works, how they implemented generic CLI modules, how to use SSH keys or usernames and passwords for authentication (and how to make them secure), and how to execute commands on network devices (including an introduction into the gory details of parsing text outputs, JSON or XML).
The last thing I managed to cover was the cli_command module and how you can use it to execute any command on a network device… and then I ran out of time. We’ll continue with sample playbooks and network device configurations on February 12th.
You can get access to both webinars with Standard ipSpace.net subscription.
However, what really caught my eye was this bit:
The law of leaky abstractions means that whenever somebody comes up with a wizzy new code-generation tool that is supposed to make us all ever-so-efficient, you hear a lot of people saying “learn how to do it manually first, then use the wizzy tool to save time.”
Sadly, the Law of Leaky Abstractions blog post was written in 2002… and nothing changed in the meantime, at least not for the better.
I know many networking engineers who went into networking because they didn’t want to write code the rest of their lives. I also know a few awesome engineers who decided to keep coding while designing networks.
Andrea Dainese (author of UNetLab – the tool you might know as EVE-NG) is one of the latter and practiced network automation for years, dealing with all sorts of crappy device configuration and monitoring mechanisms, from screen- and web scraping to broken REST APIs.
One of my subscribers sent me a question along these lines (heavily abridged):
My customer is running a colocation business, and has to provide L2 connectivity between racks, sometimes even across multiple data centers. They were using Q-in-Q to deliver that in a traditional fabric, and would like to replace that with multi-site EVPN fabric with ~100 ToR switches in each data center. However, Cisco doesn’t support Q-in-Q with multi-site EVPN. Any ideas?
As Lukas Krattiger explained in his part of Multi-Site Leaf-and-Spine Fabrics section of Leaf-and-Spine Fabric Architectures webinar, multi-site EVPN (VXLAN-to-VXLAN bridging) is hard. Don’t expect miracles like Q-in-Q over VNI any time soon ;)
In summer 2018 Juniper started talking about another forward-looking concept: Network Reliability Engineering. We wanted to find out whether that’s another unicorn driving DeLorean with flux capacitors or something more tangible, so we invited Matt Oswalt, the author of Network Reliability Engineer’s Manifesto to talk about it in Episode 97 of Software Gone Wild.
In the first part of his interview with Christoph Jaggi Kristian Larsson talked about the basics of CI testing. Now let’s see how you can use these concepts in network automation (and you’ll learn way more in Kristian’s talk on April 9th… if you register for our network automation course).
How does CI testing fit into an overall testing environment?
Traditionally, in particular in the networking industry, it's been rather common to have proof of concepts (POC) delivered by vendors for various networking technologies and then people have sat down and manually tested that the POC meets some set of requirements.
As I’m doing occasional consulting for large enterprises redesigning their data centers, I encounter a wide range of network automation readiness, from “we don’t need that” to “how could we automate as much as possible”.
Based on the pervasiveness of “we don’t need that” responses it looks like many enterprise network engineers still have to go through the five stages of automation grief.
One of the attendees of the Building Next-Generation Data Center online course solved the build small data center fabric challenge with Virtual Chassis Fabric (VCF). I pointed out that I would prefer not to use VCF as it uses centralized control plane and is thus a single failure domain.
Here are his arguments for using VCF:
Every now and then someone tells me I should write more about the basic networking concepts like I did years ago when I started blogging. I’m probably too old (and too grumpy) for that, but fortunately I’m no longer on my own.
Over the years ipSpace.net slowly grew into a small community of networking experts, and we got to a point where you’ll see regular blog posts from other community members, starting with Using BGP as High-Availability protocol written by Nicola Modena, member of ExpertExpress team.
In spring 2019 Building Network Automation Solutions course we’ll have Kristian Larsson diving into continuous integration and his virtual networking lab product (you might want to listen to the Software Gone Wild episode we did with him to get a taste of what he’ll be talking about). Christoph Jaggi did a short interview with him starting with the obvious question:
What is CI testing and how does it differ from other testing methods?
CI is short for Continuous Integration and refers to a way of developing software where changes written by individual developers are frequently (or "continuously") integrated together into a master branch/trunk, thus continuous integration.
One of my readers sent me a description of their automation system that manages firewall rulesets on Fortigate firewalls using NAPALM to manage device configurations.
In his own words:
We are now managing thousands of address objects, services and firewall policies using David Barroso’s FortiOS Napalm module. This works very well and with a few caveats (such as finding a way to enforce the ordering of firewall policies) we are able to manage all the configuration of our firewalls from a single Ansible playbook.
The did the right thing and implemented an abstracted data model using GitOps to manage it:
You might have noticed that our Winter 2019 webinar schedule got crazily busy with seven live sessions in the first two months of the year (another first)… but that’s not all, there are two more live sessions that we haven’t announced yet as we always schedule a single live session of a particular webinar.
Wondering what’s coming during the rest of 2019? Starting with committed ideas:
Here’s some feedback I got from a subscriber who got pulled into an SD-WAN project:
I realized (thanks to you) that it’s really important to understand the basics of how things work. It helped me for example at my work when my boss came with the idea “we’ll start selling SD-WAN and this is the customer wish list”. Looked like business-as-usual until I realized I’ve never seen so big a difference between reality, customer wishes and what was promised to customer by sales guys I never met. And the networking engineers are supposed to save the day afterwards…
How did your first SD-WAN deployment go? Please write a comment!
One of the attendees of my Building Network Automation Solutions online course sent me this suggestion:
Stick to JUST Ansible - no GitHub, Vagrant, Docker or even Python - all of which come with their own significant learning curves.
While I understand how overwhelming the full-blown network automation landscape is to someone who never touched programming, you have to make a hard choice when you decide to start the learning process: do you want to master a single tool, or understand a whole new technology area and be able to select the best tool for the job on as-needed basis.
Supposedly it was a problem with the management network used by their optical gear, but it looks a lot like a layer-2 network spanning 15 data centers and no control-plane policing on the managed devices… proving yet again that large-scale layer-2 networks are a really bad idea.