Blog Posts in October 2020
Video: Cisco SD-WAN Onboarding Process
After describing Cisco SD-WAN architecture and routing capabilities, David Penaloza focused on the onboarding process and tasks performed by the Cisco SD-WAN solution (encryption, tunnel establishment, and device onboarding) in it’s so-called Orchestration Plane.
Don't Lift-and-Shift Your Enterprise Spaghetti into a Public Cloud
Jon Kadis spent most of his life working on enterprise networks, and sadly found out that even changing jobs and moving into a public cloud environment can’t save you from people trying to lift-and-shift enterprise IT kludges into a greenfield environment.
Here’s what he sent me:
Building Secure Layer-2 Data Center Fabric with Cisco Nexus Switches
One of my readers is designing a layer-2-only data center fabric (no SVI interfaces on switches) with stringent security requirements using Cisco Nexus switches, and he wondered whether a host connected to such a fabric could attack a switch, and whether it would be possible to reach the management network in that way.
Do you think it’s possible to reach the MANAGEMENT PLANE from the DATA PLANE? Is it valid to think that there is a potential attack vector that someone can compromise to source traffic from the front of the device (ASIC) through the PCI bus across the CPU to the across the PCI bus to the Platform Controller Hub through the I/O card to spew out the Management Port onto that out-of-band network?
My initial answer was “of course there’s always a conduit from the switching ASIC to the CPU, how would you handle STP/CDP/LLDP otherwise”. I also asked Lukas Krattiger for more details; here’s what he sent me:
Grasp the Fundamentals before Spreading Opinions
I should have known better, but I got pulled into another stretched VLANs for disaster recovery tweetfest. Surprisingly, most of the tweets were along the lines of you really shouldn’t be doing that and that would never work well, but then I guess I was only exposed to a small curated bubble of common sense… until this gem appeared in my timeline:
Interestingly, that’s exactly how IP works:
New on ipSpace.net: Graph Algorithms
After a bit more than a year, we ran another math-focused webinar last week: Rachel Traylor came back to talk about graph algorithms, focusing on tree-, path- and center problems.
In her lecture you’ll find:
- maximum branching algorithms (and I couldn’t stop wondering why we don’t use them for OSPF- or IS-IS flooding)
- path algorithms including the ones used in OSPF, IS-IS, or BGP, as well as algorithms that find K shortest paths
- center problems (for example: where do I put my streaming server or my BGP route reflector)
You’ll need Standard or Expert ipSpace.net subscription to watch the videos.
Worth Reading: The Shared Irresponsibility Model in the Cloud
A long while ago I wrote a blog post along the lines of “it’s ridiculous to allow developers to deploy directly to a public cloud while burdening them with all sorts of crazy barriers when deploying to an on-premises infrastructure,” effectively arguing for self-service approach to on-premises deployments.
Not surprisingly, the reality is grimmer than I expected (I’m appalled at how optimistic my predictions are even though I always come across as a die-hard grumpy pessimist), as explained in The Shared Irresponsibility Model in the Cloud by Dan Hubbard.
For more technical details, watch cloud-focused ipSpace.net webinars, in particular the Cloud Security one.
Yet Again: CLI or API
Carl Montanari recently published an interesting blog post on the punditry of network APIs (including hilarious fact that “SNMP is also an API”), and as someone sent me a link to that post he commented “it reminds me of a few blog posts you wrote a while ago”.
Speaking of those blog posts… last July I was getting bored and put together a list of interesting blog posts I published on that topic. Enjoy!
Podcast: State of Multi-Cloud Networking
In mid-September Ethan Banks invited me to chat about multi-cloud networking in the Day Two Cloud podcast. It was just a few weeks after Corey Quinn published a fantastic Multi-Cloud is the Worst Practice rant, which perfectly matched my observations, so I came well prepared ;)
Weird: Wrong Subnet Mask Causing Unicast Flooding
When I still cared about CCIE certification, I was always tripped up by the weird scenario with (A) mismatched ARP and MAC timeouts and (B) default gateway outside of the forwarding path. When done just right you could get persistent unicast flooding, and I’ve met someone who reported average unicast flooding reaching ~1 Gbps in his data center fabric.
One would hope that we wouldn’t experience similar problems in modern leaf-and-spine fabrics, but one of my readers managed to reproduce the problem within a single subnet in FabricPath with anycast gateway on spine switches when someone misconfigured a subnet mask in one of the servers.
Validate Ansible YAML Data with JSON Schema
When I published the Optimize Network Data Models series a long while ago, someone made an interesting comment along the lines of “You should use JSON Schema to validate the data model.”
It took me ages to gather the willpower to tame that particular beast, but I finally got there. In the next installment of the Data Models saga I described how you can use JSON Schema to validate Ansible inventory data and your own YAML- or JSON-based data structures.
To learn more about data validation, error handling, unit- and system testing, and CI/CD pipelines in network automation, join our automation course.
Worth Exploring: bgpstuff.net
Darren O’Connor put together a BGP looking glass with web GUI. Nothing fancy so far… but he also offers REST API interface (because REST API sounds so much better than HTTP).
The REST API calls return text results, so you can use them straight in a Bash script. For example, here’s a simple script to print a bunch of details about your current IP address:
New on ipSpace.net: Virtualizing Network Devices Q&A
A few weeks ago we published an interesting discussion on network operating system details based on an excellent set of questions by James Miles.
Unfortunately we got so far into the weeds at that time that we answered only half of James’ questions. In the second Q&A session Dinesh Dutt and myself addressed the rest of them including:
- How hard is it to virtualize network devices?
- What is the expected performance degradation?
- Does it make sense to use containers to do that?
- What are the operational implications of running virtual network devices?
- What will be the impact on hardware vendors and networking engineers?
And of course we couldn’t avoid the famous last question: “Should network engineers program network devices?”
You’ll need Standard or Expert ipSpace.net subscription to watch the videos.
Worth Reading: Does your hammer own you?
My friend Marjan Bradeško wrote a great article describing how we tend to forget common sense and rely too much on technology. I would strongly recommend you read it and start thinking about the choices you make when building a network with magic software-intent-defined-intelligent technology from your preferred vendor.
Zero-Touch Provisioning with Nornir
In early 2018 I described how Hans Verkerk implemented zero-touch provisioning with Ansible. Recently he rewrote his scripts as a Python-only solution using Nornir. Enjoy!
Video: Simplify Device Configurations with Cumulus Linux
The designers of Cumulus Linux CLI were always focused on simplifying network device configurations. One of the first features along these lines was BGP across unnumbered interfaces, then they introduced simplified EVPN configurations, and recently auto-MLAG and auto-BGP.
You can watch a short description of these features by Dinesh Dutt and Pete Lumbis in Simplify Network Configuration with Cumulus Linux and Smart Datacenter Defaults videos (part of Cumulus Linux section of Data Center Fabrics webinar).
Automation Win: Recreating Cisco ACI Tenants in Public Cloud
This blog post was initially sent to the subscribers of our SDN and Network Automation mailing list. Subscribe here.
Most automation projects are gradual improvements of existing manual processes, but every now and then the stars align and you get a perfect storm, like what Adrian Giacommetti encountered during one of his automation projects.
The customer had well-defined security policies implemented in Cisco ACI environment with tenants, endpoint groups, and contracts. They wanted to recreate those tenants in a public cloud, but it took way too long as the only migration tool they had was an engineer chasing GUI screens on both platforms.
Must Read: Redistributing Full BGP Feed into OSPF
The idea of redistributing the full Internet routing table (840.000 routes at this moment) into OSPF sounds as ridiculous as it is, but when fat fingers strike, it should be relatively easy to recover, right? Just turn off redistribution (assuming you can still log into the offending device) and move on.
Wrong. As Dmytro Shypovalov explained in an extensive blog post, you might have to restart all routers in your OSPF domain to recover.
And that, my friends, is why OSPF is a single failure domain, and why you should never run OSPF between your data center fabric and servers or VM appliances.
Validating Data in GitOps-Based Automation
Anyone using text files as a poor man’s database eventually stumbles upon the challenge left as a comment in Automating Cisco ACI Environments blog post:
The biggest challenge we face is variable preparation and peer review process before committing variables to Git. I’d be particularly interested on how you overcome this challenge?
We spent hours describing potential solutions in Validation, Error Handling and Unit Tests part of Building Network Automation Solutions online course, but if you never built a network automation solution using Ansible YAML files as source-of-truth the above sentence might sound a lot like Latin, so let’s make it today’s task to define the problem.
New: AWS Networking Update
In last week’s update session we covered the new features AWS introduced since the creation of AWS Networking webinar in 2019:
- AWS Local Zones, Wavelengths, and Outposts
- VPC Sharing
- Bring Your Own Addresses
- IP Multicast support
- Managed Prefix Lists in security groups and route tables
- VPC Traffic Mirroring
- Web Application Firewall
- AWS Shield
- VPC Ingress Routing
- Inter-region VPC peering with Transit Gateways
The videos are already online; you need Standard or Expert ipSpace.net subscription to watch them.
Worth Reading: Don't Become A Developer, But Use Their Tools
I was telling you there’s no need to become a programmer over six years ago, but of course nobody ever listens to grumpy old engineers… which didn’t stop Ethan Banks from writing another excellent advice on the same theme: Don’t Become A Developer, But Use Their Tools.
Worth Reading: IP Fragmentation Considered Fragile
We all knew it for a long time, now it’s finally official: IP fragmentation is broken, or as the ever-so-diplomatic IETF likes to call it, IP Fragmentation is Considered Fragile.
Faucet Deep Dive on Software Gone Wild
This podcast introduction was written by Nick Buraglio, the host of today’s podcast.
In the original days of this podcast, there were heavy, deep discussions about this new protocol called “OpenFlow”. Like many of our most creative innovations in the IT field, OpenFlow came from an academic research project that aimed to change the way that we as operators managed, configured, and even thought about networking fundamentals.
For the most part, this project did what it intended, but once the marketing machine realized the flexibility of the technology and its potential to completely change the way we think about vendors, networks, provisioning, and management of networking, they were off to the races.
We all know what happened next.
Network Automation Products for Brownfield Deployments
Got this question from one of my long-time readers:
I am looking for commercial SDN solutions that can be deployed on top of brownfield networks built with traditional technologies (VPC/MLAG, STP, HSRP) on lower-cost networking gear, where a single API call could create a network-wide VLAN, or apply that VLAN to a set of ports. Gluware is one product aimed at this market. Are there others?
The two other solutions that come to mind are Apstra AOS and Cisco NSO. However, you probably won’t find a simple solution that would do what you want to do without heavy customization as every network tends to be a unique snowflake.
Fixing Firewall Ruleset Problem For Good
Before we start: if you’re new to my blog (or stumbled upon this blog post by incident) you might want to read the Considerations for Host-Based Firewalls for a brief overview of the challenge, and my explanation why flow-tracking tools cannot be used to auto-generate firewall policies.
As expected, the “you cannot do it” post on LinkedIn generated numerous comments, ranging from good ideas to borderline ridiculous attempts to fix a problem that has been proven to be unfixable (see also: perpetual motion).
EVPN Control Plane in Infrastructure Cloud Networking
One of my readers sent me this question (probably after stumbling upon a remark I made in the AWS Networking webinar):
You had mentioned that AWS is probably not using EVPN for their overlay control-plane because it doesn’t work for their scale. Can you elaborate please? I’m going through an EVPN PoC and curious to learn more.
It’s safe to assume AWS uses some sort of overlay virtual networking (like every other sane large-scale cloud provider). We don’t know any details; AWS never felt the need to use conferences as recruitment drives, and what little they told us at re:Invent described the system mostly from the customer perspective.
Using Ansible with Arista EOS and CloudVision
In mid-September, Carl Buchmann, Fred Hsu, and Thomas Grimonet had an excellent presentation describing Arista’s Ansible roles and collections. They focused on two collections: CloudVision integration, and Arista Validated Designs. All the videos from that presentation are available with free ipSpace.net subscription.
Want to know even more about Ansible and network automation? Join our 2-day automation event featuring network automation experts from around the globe talking about their production-grade automation solutions or tools they created, and get immediate access to automation course materials and reviewed hands-on exercises.
Must Watch: Fault Tolerance through Optimal Workload Placement
While I keep telling you that Google-sized solutions aren’t necessarily the best fit for your environment, some of the hyperscaler presentations contain nuggets that apply to any environment no matter how small it is.
One of those must-watch presentations is Fault Tolerance through Optimal Workload Placement together with a wonderful TL&DR summary by the one-and-only Todd Hoff of the High Scalability fame.
MUST READ: A Summary of High Speed Ethernet ASICs
Justin Pietsch is back with another must-read article, this time focused on high-speed Ethernet switching ASICs. I’ve rarely seen so many adjacent topics covered in a single easy-to-read article.
Network Operating Systems: Questions and Answers
James Miles got tons of really interesting questions while watching the Network Operating System Models webinar by Dinesh Dutt, and the only reasonable thing to do when he sent them over was to schedule a Q&A session with Dinesh to discuss them.
We got together last week and planned to spend an hour or two discussing the questions, but (not exactly unexpectedly) we got only halfway through the list in the time we had, so we’re continuing next week.
Network Automation Isn’t Easy
Contrary to what some evangelists would love you to believe, getting fluent in network automation is a bit harder than watching 3-minute videos and cobbling playbooks together with google-and-paste… but then nothing really worth doing is ever easy, or everyone else would be doing it already.
Here’s a typical comment from a Building Network Automation Solutions attendee:
I’m loving the class. I feel more confused than I ever have in my 23 year career… but I can already see the difference in my perspective shift in all aspects of my work.