Blog Posts in June 2019
We Are on a Break ;)
It’s high time for another summer break (I get closer and closer to burnout every year - either I’m working too hard or I’m getting older ;).
Of course we’ll do our best to reply to support (and sales ;) requests, but it might take us a bit longer than usual. I will publish an occasional worth reading or watch out blog post, but don’t expect anything deeply technical for the new two months.
We’ll be back (hopefully refreshed and with tons of new content) in early September, starting with network automation course on September 3rd and VMware NSX workshop on September 10th.
In the meantime, try to get away from work (hint: automating stuff sometimes helps ;), turn off the Internet, and enjoy a few days in your favorite spot with your loved ones!
First-hand Feedback: ipSpace.net Network Automation Course
Daniel Teycheney attended the Spring 2019 Building Network Automation Solutions online course and sent me this feedback after completing it (and creating some interesting real-life solutions on the way):
I spent a bit of time the other day reflecting on how much I’ve learn’t from the course in terms of technical skills and the amount I’ve learned has been great. I literally no idea about things like Git, Jinja2, CI testing, reading YAML files and had only briefly seen Ansible before.
I’m not an expert now, but I understand these things and have real practical experience on these subjects which has given me great confidence to push on and keep getting better.
Device Configuration Synthesis with NetComplete on Software Gone Wild
When I was still at university the fourth-generation programming languages were all the hype, prompting us to make jokes along the lines “fifth generation will implement do what I don’t know how”
The research team working in Networked Systems Group at ETH Zurich headed by prof. Laurent Vanbever got pretty close. The description of their tool says:
Impact of Controller Failures in Software-Defined Networks
Christoph Jaggi sent me this observation during one of our SD-WAN discussions:
The centralized controller is another shortcoming of SD-WAN that hasn’t been really addressed yet. In a global WAN it can and does happen that a region might be cut off due to a cut cable or an attack. Without connection to the central SD-WAN controller the part that is cut off cannot even communicate within itself as there is no control plane…
A controller (or management/provisioning) system is obviously the central point of failure in any network, but we have to go beyond that and ask a simple question: “What happens when the controller cluster fails and/or when nodes lose connectivity to the controller?”
Real-Life SD-WAN Experience
SD-WAN is the best thing that could have happened to networking according to some industry “thought leaders” and $vendor marketers… but it seems there might be a tiny little gap between their rosy picture and reality.
This is what I got from someone blessed with hands-on SD-WAN experience:
Read Network Device Information with REST API and Store It Into a Database
One of my readers sent me this question:
How can I learn more about reading REST API information from network devices and storing the data into tables?
Long story short: it’s like learning how to drive (well) - you have to master multiple seemingly-unrelated tasks to get the job done.
How Microsoft Azure Orchestration System Crashed My Demos
One of the first things I realized when I started my Azure journey was that the Azure orchestration system is incredibly slow. For example, it takes almost 40 seconds to display six routes from per-VNIC routing table. Imagine trying to troubleshoot a problem and having to cope with 30-second delay on every single SHOW command. Cisco IGS/R was faster than that.
If you’re old enough you might remember working with VT100 terminals (or an equivalent) connected to 300 baud modems… where typing too fast risked getting the output out-of-sync resulting in painful screen repaints (here’s an exercise for the youngsters: how long does it take to redraw an 80x24 character screen over a 300 bps connection?). That’s exactly how I felt using Azure CLI - the slow responses I was getting were severely hampering my productivity.
Feedback: Ansible for Networking Engineers
I always love to hear from networking engineers who managed to start their network automation journey. Here’s what one of them wrote after watching Ansible for Networking Engineers webinar (part of paid ipSpace.net subscription, also available as an online course).
This webinar helped me a lot in understanding Ansible and the benefits we can gain. It is a big area to grasp for a non-coder and this webinar was exactly what I needed to get started (in a lab), including a lot of tips and tricks and how to think. It was more fun than I expected so started with Python just to get a better grasp of programing and Jinja.
In early 2019 we made the webinar even better with a series of live sessions covering new features added to recent Ansible releases, from core features (loops) to networking plugins and new declarative intent modules.
Running OSPF in a Single Non-Backbone Area
One of my subscribers sent me an interesting puzzle:
One of my colleagues configured a single-area OSPF process in a customer VRF customer, but instead of using area 0, he used area 123 nssa. Obviously it works, but I was thinking: “What the heck, a single OSPF area MUST be in Area 0”
Not really. OSPF behaves identically within an area (modulo stub/NSSA behavior) regardless of the area number. Even there, you could argue that the only difference between area 0 and other areas is that the standard (and all compliant implementations) doesn’t allow you to set stub or NSSA bit in area 0.
Switch Buffer Sizes and Fermi Estimates
In my quest to understand how much buffer space we really need in high-speed switches I encountered an interesting phenomenon: we no longer have the gut feeling of what makes sense, sometimes going as far as assuming that 16 MB (or 32MB) of buffer space per 10GE/25GE data center ToR switch is another $vendor shenanigan focused on cutting cost. Time for another set of Fermi estimates.
Let’s take a recent data center switch using Trident II+ chipset and having 16 MB of buffer space (source: awesome packet buffers page by Jim Warner). Most of switches using this chipset have 48 10GE ports and 4-6 uplinks (40GE or 100GE).
Use Per-Link Prefixes in Network Data Models
We got pretty far in our data deduplication in network data model journey, from initial attempts to network modeled as a graph… but we still haven’t got rid of all the duplicate information.
For example, if we have multiple devices connected to the same subnet, why should we have to specify IP address and subnet mask for every device (literally begging the operators to make input errors). Wouldn’t it be better (assuming we don’t care about exact IP addresses on core links) to assign IP addresses automatically?
Repost: Automation Without Simplification
The No Scripting Required to Start Your Automation Journey blog post generated lively discussions (and a bit of trolling from the anonymous peanut gallery). One of the threads focused on “how does automation work in real life IT department where it might be challenging to simplify operations before automating them due to many exceptions, legacy support…”
Here’s a great answer provided by another reader:
As Expected: Where Have All the SDN Controllers Gone?
Roy Chua (SDx Central) published a blog post titled “Where Have All the SDN Controllers Gone” a while ago describing the gradual disappearance of SDN controller hype.
No surprise there - some of us were pointing out the gap between marketing and reality years ago.
It was evident to anyone familiar with how networking actually works that in a generic environment the drawbacks of orthodox centralized control plane SDN approach far outweigh its benefits. There are special use cases like intelligent patch panels where a centralized control plane makes sense.
Stop Using GUI to Configure SDN or Intent-Based Products
This blog post was initially sent to subscribers of my SDN and Network Automation mailing list. Subscribe here.
At the end of my vNIC 2018 keynote speech I made a statement along these lines:
The moment you start using GUI with an SDN product you’re back to square one.
That claim confused a few people – Mark left this comment on my blog:
Do Packet Drops Matter for TCP Performance?
Approximately two years ago I tried to figure out whether aggressive marketing of deep buffer data center switches makes sense, recorded a few podcasts on the topic and organized a webinar with JR Rivers.
Not surprisingly, the question keeps popping up, so it seems it’s time for another series of TL&DR articles. Let’s start with the basics:
Generalize the Network-as-Graph Data Model
Remember the avoid duplicate data in network automation data models challenge and the restructuring we did to represent a network as a graph.
Well, I was not happy with the end result - I hated the complexity of supporting Jinja2 templates that had to check left- and right nodes of a link, so I generalized the data structure a bit, and all of a sudden I could model stub interfaces, P2P links and multi-access networks.
Know Thy Environment Before Redesigning It
A while ago I had an interesting consulting engagement: a multinational organization wanted to migrate off global Carrier Ethernet VPN (with routers at the edges) to MPLS/VPN.
While that sounds like the right thing to do (after all, L3 must be better than L2, right?) in that particular case they wanted to combine the provider VPN with Internet-based IPsec VPN… and doing that in parallel with MPLS/VPN tends to become an interesting exercise in “how convoluted can I make my design before I give up and migrate to BGP”.
MUST READ: Thou shalt not commit logical fallacies
Found a fantastic list of common logical fallacies. It's a must read for anyone having at least occasional interaction with non-Vulcans... and when you stop laughing (or screaming, or both) make sure you go through the companion web site to understand bugs in your wetware that sabotage your attempts at being perfectly logical.