Do We Need LFA or FRR for Fast Failover in ECMP Designs?
One of my readers sent me a question along these lines:
Imagine you have a router with four equal-cost paths to prefix X, two toward upstream-1 and two toward upstream-2. Now let’s suppose that one of those links goes down and you want to have link protection. Do I really need Loop-Free Alternate (LFA) or MPLS Fast Reroute (FRR) to get fast (= immediate) failover or could I rely on multiple equal-cost paths to get the job done? I’m getting different answers from different vendors…
Please note that we’re talking about a very specific question of whether in scenarios with equal-cost layer-3 paths the hardware forwarding data structures get adjusted automatically on link failure (without CPU reprogramming them), and whether LFA needs to be configured to make the adjustment happen.
MUST READ: How to troubleshoot routing protocols session flaps
Did you ever experience an out-of-the-blue BGP session flap after you were running that peering for months? As Dmytro Shypovalov explains in his latest blog post, it’s always MTU (just kidding, of course it’s always DNS, but MTU blackholes nonetheless result in some crazy behavior).
New Ansible for Networking Engineers Content
When restructuring our online courses we decided to make the video content that was previously part of Ansible online course available with Standard ipSpace.net Subscription.
If you haven’t enrolled into our automation online course (which always included the extra bits) you’ll find the following additional content in our Ansible for Networking Engineers webinar:
Video: Cisco SD-WAN Onboarding Process
After describing Cisco SD-WAN architecture and routing capabilities, David Penaloza focused on the onboarding process and tasks performed by the Cisco SD-WAN solution (encryption, tunnel establishment, and device onboarding) in it’s so-called Orchestration Plane.
Don't Lift-and-Shift Your Enterprise Spaghetti into a Public Cloud
Jon Kadis spent most of his life working on enterprise networks, and sadly found out that even changing jobs and moving into a public cloud environment can’t save you from people trying to lift-and-shift enterprise IT kludges into a greenfield environment.
Here’s what he sent me:
Building Secure Layer-2 Data Center Fabric with Cisco Nexus Switches
One of my readers is designing a layer-2-only data center fabric (no SVI interfaces on switches) with stringent security requirements using Cisco Nexus switches, and he wondered whether a host connected to such a fabric could attack a switch, and whether it would be possible to reach the management network in that way.
Do you think it’s possible to reach the MANAGEMENT PLANE from the DATA PLANE? Is it valid to think that there is a potential attack vector that someone can compromise to source traffic from the front of the device (ASIC) through the PCI bus across the CPU to the across the PCI bus to the Platform Controller Hub through the I/O card to spew out the Management Port onto that out-of-band network?
My initial answer was “of course there’s always a conduit from the switching ASIC to the CPU, how would you handle STP/CDP/LLDP otherwise”. I also asked Lukas Krattiger for more details; here’s what he sent me:
Worth Reading: The Shared Irresponsibility Model in the Cloud
A long while ago I wrote a blog post along the lines of “it’s ridiculous to allow developers to deploy directly to a public cloud while burdening them with all sorts of crazy barriers when deploying to an on-premises infrastructure,” effectively arguing for self-service approach to on-premises deployments.
Not surprisingly, the reality is grimmer than I expected (I’m appalled at how optimistic my predictions are even though I always come across as a die-hard grumpy pessimist), as explained in The Shared Irresponsibility Model in the Cloud by Dan Hubbard.
For more technical details, watch cloud-focused ipSpace.net webinars, in particular the Cloud Security one.
Yet Again: CLI or API
Carl Montanari recently published an interesting blog post on the punditry of network APIs (including hilarious fact that “SNMP is also an API”), and as someone sent me a link to that post he commented “it reminds me of a few blog posts you wrote a while ago”.
Speaking of those blog posts… last July I was getting bored and put together a list of interesting blog posts I published on that topic. Enjoy!
Podcast: State of Multi-Cloud Networking
In mid-September Ethan Banks invited me to chat about multi-cloud networking in the Day Two Cloud podcast. It was just a few weeks after Corey Quinn published a fantastic Multi-Cloud is the Worst Practice rant, which perfectly matched my observations, so I came well prepared ;)
Weird: Wrong Subnet Mask Causing Unicast Flooding
When I still cared about CCIE certification, I was always tripped up by the weird scenario with (A) mismatched ARP and MAC timeouts and (B) default gateway outside of the forwarding path. When done just right you could get persistent unicast flooding, and I’ve met someone who reported average unicast flooding reaching ~1 Gbps in his data center fabric.
One would hope that we wouldn’t experience similar problems in modern leaf-and-spine fabrics, but one of my readers managed to reproduce the problem within a single subnet in FabricPath with anycast gateway on spine switches when someone misconfigured a subnet mask in one of the servers.
Validate Ansible YAML Data with JSON Schema
When I published the Optimize Network Data Models series a long while ago, someone made an interesting comment along the lines of “You should use JSON Schema to validate the data model.”
It took me ages to gather the willpower to tame that particular beast, but I finally got there. In the next installment of the Data Models saga I described how you can use JSON Schema to validate Ansible inventory data and your own YAML- or JSON-based data structures.
To learn more about data validation, error handling, unit- and system testing, and CI/CD pipelines in network automation, join our automation course.
Worth Exploring: bgpstuff.net
Darren O’Connor put together a BGP looking glass with web GUI. Nothing fancy so far… but he also offers REST API interface (because REST API sounds so much better than HTTP).
The REST API calls return text results, so you can use them straight in a Bash script. For example, here’s a simple script to print a bunch of details about your current IP address:
New on ipSpace.net: Virtualizing Network Devices Q&A
A few weeks ago we published an interesting discussion on network operating system details based on an excellent set of questions by James Miles.
Unfortunately we got so far into the weeds at that time that we answered only half of James’ questions. In the second Q&A session Dinesh Dutt and myself addressed the rest of them including:
- How hard is it to virtualize network devices?
- What is the expected performance degradation?
- Does it make sense to use containers to do that?
- What are the operational implications of running virtual network devices?
- What will be the impact on hardware vendors and networking engineers?
And of course we couldn’t avoid the famous last question: “Should network engineers program network devices?”
You’ll need Standard or Expert ipSpace.net subscription to watch the videos.
Worth Reading: Does your hammer own you?
My friend Marjan Bradeško wrote a great article describing how we tend to forget common sense and rely too much on technology. I would strongly recommend you read it and start thinking about the choices you make when building a network with magic software-intent-defined-intelligent technology from your preferred vendor.
Zero-Touch Provisioning with Nornir
In early 2018 I described how Hans Verkerk implemented zero-touch provisioning with Ansible. Recently he rewrote his scripts as a Python-only solution using Nornir. Enjoy!