Appreciating the Networking Fundamentals
When I started creating the How Networks Really Work series, I wondered whether our subscribers (mostly seasoned networking engineers) would find it useful. Turns out at least some of them do; this is what a long-time subscriber sent me:
How Networks Really Work is great, it’s like looking from a plane and seeing how all the roads are connected to each other. I know networking just enough to design and manage a corporate network, but there are many things I have learned, used and forgotten along the way.
So, getting a broad vision helps me remember why I chose something and maybe solve my bad choices. There are many things that I may never use, but with the movement of all things in the cloud it’s great to know, or at least understand, how things really work.
Fast Failover: The Challenge
Sometimes you’re asked to design a network that will reroute around a failure in milliseconds. Is that feasible? Maybe. Is it simple? Absolutely not.
In this series of blog posts we’ll start with the basics, explore the technologies that you can use to reach that goal, and discover one or two unexpected rabbit holes.
New Content: VMware NSX-T 3.0 Update
When VMware NSX-T 3.0 came out, I planned to do an update session of the VMware NSX Technical Deep Dive webinar along the lines of what I did for AWS Networking a few weeks ago. However, it turned out that most of the new features didn’t take more than a bullet or two on an existing slide, or at most a new slide.
Covering them in a live session and then slicing-and-dicing the resulting recording simply didn’t make sense, so I updated the videos in summer 2020 (the last batch was published in early August).
Worth Reading: The Trap of The Premature Senior
Here’s another riff on the “when you’re the smartest person in the room, change the room” theme: The Trap of The Premature Senior by inimitable Charity Majors. Enjoy!
Video: NetQ and Cumulus Linux Data Models
In the last part of his Cumulus Linux 4.0 Update Pete Lumbis talked about using NetQ to capture streaming telemetry and increase network observability, and the new model-driven configuration approach (including all the usual buzzwords like NETCONF, RPC, YAML, JSON, and OpenConfig) coming in 2020.
Renumbering Public Cloud Address Space
Got this question from one of the networking engineers “blessed” with rampant clueless-rush-to-the-cloud.
I plan to peer multiple VNet from different regions. The problem is that there is not any consistent deployment in regards to the private IP subnets used on each VNet to the point I found several of them using public IP blocks as private IP ranges. As far as I recall, in Azure we can’t re-ip the VNets as the resource will be deleted so I don’t see any other option than use NAT from offending VNet subnets to use my internal RFC1918 IPv4 range. Do you have a better idea?
The way I understand Azure, while you COULD have any address range configured as VNet CIDR block, you MUST have non-overlapping address ranges for VNet peering.
Do We Need LFA or FRR for Fast Failover in ECMP Designs?
One of my readers sent me a question along these lines:
Imagine you have a router with four equal-cost paths to prefix X, two toward upstream-1 and two toward upstream-2. Now let’s suppose that one of those links goes down and you want to have link protection. Do I really need Loop-Free Alternate (LFA) or MPLS Fast Reroute (FRR) to get fast (= immediate) failover or could I rely on multiple equal-cost paths to get the job done? I’m getting different answers from different vendors…
Please note that we’re talking about a very specific question of whether in scenarios with equal-cost layer-3 paths the hardware forwarding data structures get adjusted automatically on link failure (without CPU reprogramming them), and whether LFA needs to be configured to make the adjustment happen.
MUST READ: How to troubleshoot routing protocols session flaps
Did you ever experience an out-of-the-blue BGP session flap after you were running that peering for months? As Dmytro Shypovalov explains in his latest blog post, it’s always MTU (just kidding, of course it’s always DNS, but MTU blackholes nonetheless result in some crazy behavior).
New Ansible for Networking Engineers Content
When restructuring our online courses we decided to make the video content that was previously part of Ansible online course available with Standard ipSpace.net Subscription.
If you haven’t enrolled into our automation online course (which always included the extra bits) you’ll find the following additional content in our Ansible for Networking Engineers webinar:
Video: Cisco SD-WAN Onboarding Process
After describing Cisco SD-WAN architecture and routing capabilities, David Penaloza focused on the onboarding process and tasks performed by the Cisco SD-WAN solution (encryption, tunnel establishment, and device onboarding) in it’s so-called Orchestration Plane.
Don't Lift-and-Shift Your Enterprise Spaghetti into a Public Cloud
Jon Kadis spent most of his life working on enterprise networks, and sadly found out that even changing jobs and moving into a public cloud environment can’t save you from people trying to lift-and-shift enterprise IT kludges into a greenfield environment.
Here’s what he sent me:
Building Secure Layer-2 Data Center Fabric with Cisco Nexus Switches
One of my readers is designing a layer-2-only data center fabric (no SVI interfaces on switches) with stringent security requirements using Cisco Nexus switches, and he wondered whether a host connected to such a fabric could attack a switch, and whether it would be possible to reach the management network in that way.
Do you think it’s possible to reach the MANAGEMENT PLANE from the DATA PLANE? Is it valid to think that there is a potential attack vector that someone can compromise to source traffic from the front of the device (ASIC) through the PCI bus across the CPU to the across the PCI bus to the Platform Controller Hub through the I/O card to spew out the Management Port onto that out-of-band network?
My initial answer was “of course there’s always a conduit from the switching ASIC to the CPU, how would you handle STP/CDP/LLDP otherwise”. I also asked Lukas Krattiger for more details; here’s what he sent me:
Grasp the Fundamentals before Spreading Opinions
I should have known better, but I got pulled into another stretched VLANs for disaster recovery tweetfest. Surprisingly, most of the tweets were along the lines of you really shouldn’t be doing that and that would never work well, but then I guess I was only exposed to a small curated bubble of common sense… until this gem appeared in my timeline:

Interestingly, that’s exactly how IP works:
Worth Reading: The Shared Irresponsibility Model in the Cloud
A long while ago I wrote a blog post along the lines of “it’s ridiculous to allow developers to deploy directly to a public cloud while burdening them with all sorts of crazy barriers when deploying to an on-premises infrastructure,” effectively arguing for self-service approach to on-premises deployments.
Not surprisingly, the reality is grimmer than I expected (I’m appalled at how optimistic my predictions are even though I always come across as a die-hard grumpy pessimist), as explained in The Shared Irresponsibility Model in the Cloud by Dan Hubbard.
For more technical details, watch cloud-focused ipSpace.net webinars, in particular the Cloud Security one.
Yet Again: CLI or API
Carl Montanari recently published an interesting blog post on the punditry of network APIs (including hilarious fact that “SNMP is also an API”), and as someone sent me a link to that post he commented “it reminds me of a few blog posts you wrote a while ago”.
Speaking of those blog posts… last July I was getting bored and put together a list of interesting blog posts I published on that topic. Enjoy!