Building network automation solutions

9 module online course

Start now!

Blog Posts in March 2021

Intermittent Terraform Authentication Failure Using AWS Provider in a Vagrant VM

TL&DR: Client clock skew could result in AWS authentication failure when running terraform apply

When I wanted to compare AWS and Azure orchestration speeds I encountered a crazy Terraform error message when running terraform apply:

module.network.aws_vpc.My_VPC: Creating...

Error: Error creating VPC: AuthFailure: 
AWS was not able to validate the provided access credentials
	status code: 401, request id: ...

Obviously I did all the usual stuff before googling for a solution:

read more see 1 comments

Dealing with Cloud Challenges

Here’s a message I got from one of my subscribers (probably based on one of my recent public cloud rants):

I often think the cloud stuff has been sent to try us in IT – the struggle could be tough enough when we were dealing with waterfall development and monolithic projects. When products took years to develop, and years to understand.

And now we’re being asked to be agile and learn new stuff all the time about moving targets that barely have documentation at all, never mind accurate doco! We had obviously got into our comfort zone and needed shaking out of it!

Always interested to hear your experiences with the cloud networking though – it’s what I subscribed to ipspace.net for TBH as I think it’s the most complete reference source for that purpose and a vital part of enterprise networking these days!

It’s always extremely nice to hear someone finds your work valuable ;) Thanks a million!

add comment

netsim-tools: Release 0.4 Is Out

TL&DR: The new release of netsim-tools includes unnumbered interfaces, configuration modules, and OSPF configuration.

In mid-March, we enjoyed another excellent presentation by Dinesh Dutt, this time focused on running OSPF in leaf-and-spine fabrics. He astonished me when he mentioned unnumbered Ethernet interfaces being available on all major network operating systems. It was time to test things out, and I wanted to use my networking simulation builder to build the test lab.

read more add comment

Interview: Will AI Replace the Networking Engineers?

In the second half of my chat with David Bombal we focused on automation and AI in networking. Even though we discussed many things, including the dangers of doing a repeatable job, and how to make yourself unique, David chose a nice click-bait headline Will AI Replace the Networking Engineers?. According to Betteridge’s law of headlines the answer is still NO, but it’s obvious AI will replace the low-level easy-to-automate jobs (as textile workers found out almost 200 years ago).

While pondering that statement, keep in mind that AI is more than just machine learning (the overhyped stuff). According to one loose definition, “Artificial intelligence (AI) refers to the simulation of human intelligence in machines that are programmed to think like humans and mimic their actions

Full disclosure: the web site with this definition had and ad for Lego Friends set next to it, making it extra-trusty. I couldn’t find a similarly oversimplified definition on Wikipedia… probably for a good reason.
read more add comment

Relative Speed of Public Cloud Orchestration Systems

When I was complaining about the speed (or lack thereof) of Azure orchestration system, someone replied “I tried to do $somethingComplicated on AWS and it also took forever

Following the “opinions are great, data is better” mantra (as opposed to “never let facts get in the way of a good story” supposedly practiced by some podcasters), I decided to do a short experiment: create a very similar environment with Azure and AWS.

I took simple Terraform deployment configuration for AWS and Azure. Both included a virtual network, two subnets, a route table, a packet filter, and a VM with public IP address. Here are the observed times:

read more see 1 comments

Unequal-Cost Multipath in Link State Protocols

TL&DR: You get unequal-cost multipath for free with distance-vector routing protocols. Implementing it in link state routing protocols is an order of magnitude more CPU-consuming.

Continuing our exploration of the Unequal-Cost Multipath world, why was it implemented in EIGRP decades ago, but not in OSPF or IS-IS?

Ignoring for the moment the “does it make sense” dilemma: finding downstream paths (paths strictly shorter than the current best path) is a side effect of running distance vector algorithms.

For a more formal discussion of loop-free alternates and downstream paths, please read RFC 5714 and RFC 5286.
read more see 1 comments

There's No Recipe for Success

TL&DR: There cannot be a simple and easy recipe for success, or everyone else would be using it.

My recent chat with David Bombal about networking careers' future resulted in tons of comments, including a few complaints effectively saying I was pontificating instead of giving out easy-to-follow recipes. Here’s one of the more polite ones:

No tangible solutions given, no path provided, no actionable advice detailed.

I totally understand the resentment. Like a lot of other people, I spent way too much time looking for recipes for success. It was tough to admit there are none for a simple reason: if there was a recipe for easy success, everyone would be using it, and then we’d have to redefine success. Nobody would admit that being average is a success, or as Jeroen van Bemmel said in his LinkedIn comment:

Success requires differentiation, which cannot be achieved by copying others. As Steve Jobs put it: “Be hungry, stay foolish”

read more see 1 comments

Worth Reading: Splitting the Ping

I hope you’re aware that the venerable ping (and most of its variants) measures round-trip-time – how long it takes to get to the destination and back – but is there a way to measure one-way latency or find out asymmetric transit times?

Ben Cox found a way to use ICMP timestamps together with reasonably accurate NTP-derived time to do just that. More details in Splitting the ping (HT: Drew Conry-Murray).

add comment

Interview: Is Networking Dead?

A few weeks ago I enjoyed a long-overdue chat with David Bombal. David published the first part of it under the click-bait headline Is Networking Dead (he renamed it Is There any Future for Networking Engineers in the meantime).

According to Betteridge’s law of headlines the answer to his original headline is NO (and the second headline violates that law – there you go ūü§∑‚Äć‚ôāÔłŹ). If you’re still interested in the details, watch the interview.

add comment

Public Cloud Behind-the-Scenes Magic

One of my subscribers sent me this question after watching the networking part of Introduction to Cloud Computing webinar:

Does anyone know what secret networking magic the Cloud providers are doing deep in their fabrics which are not exposed to consumers of their services?

TL&DR: Of course not… and I’m guessing it would be pretty expensive if I knew and told you.

However, one can always guess based on what can be observed (see also: AWS networking 101, Azure networking 101).

read more see 4 comments

Repost: Using MP-TCP to Utilize Unequal Links

In the Does Unequal-Cost Multipathing Make Sense blog post I wrote (paraphrased):

The trick to successful utilization of unequal uplinks is to use them wisely […] It’s how multipath TCP (MP-TCP) could be used for latency-critical applications like Siri.

Minh Ha quickly pointed out (some) limitations of MP-TCP and as is usually the case, his comment was too valuable to be left as a small print at the bottom of a blog post.

Intuitively I don’t necessarily agree with all of his conclusions, but don’t know enough to have a qualified opinion.
read more see 2 comments

Using YAML Instead of Excel in Network Automation Solutions

One of the attendees of our network automation course asked a question along these lines:

In a previous Ansible-based project I used Excel sheet to contain all relevant customer data. I converted this spreadsheet using python (xls_to_fact) and pushed the configurations to network devices accordingly. I know some people use YAML to define the variables in Git. What would be the advantages of doing that over Excel/xsl_to_fact?

Whenever you’re choosing a data store for your network automation solution you have to consider a number of aspects including:

read more see 9 comments

Worth Reading: Modules, Monoliths, and Microservices

If you want to grow beyond being a CLI (or Python) jockey, it’s worth trying to understand things work… not only how frames get from one end of the world to another, but also how applications work, and why they’re structured they way they are.

Daniel Dib recently pointed out another must-read article in this category: Modules, monoliths, and microservices by Avery Pennarun – a wonderful addition to my distributed systems resources.

add comment

Video: Cisco SD-WAN Routing Design

After reviewing Cisco SD-WAN policies, it’s time to dig into the routing design. In this section, David Penaloza enumerated several possible topologies, types of transport, their advantages and drawbacks, considerations for tunnel count and regional presence, and what you should consider beforehand when designing the solution from the control plane’s perspective.

You need Free ipSpace.net Subscription to watch the video.
add comment

Azure Route Server: The Challenge

Imagine you decided to deploy an SD-WAN (or DMVPN) network and make an Azure region one of the sites in the new network because you already deployed some workloads in that region and would like to replace the VPN connectivity you’re using today with the new shiny expensive gadget.

Everyone told you to deploy two SD-WAN instances in the public cloud virtual network to be redundant, so this is what you deploy:

read more add comment

Interesting Tool: Schema Enforcer

It looks like JSON Schema is the new black. Last week I wrote about a new Ansible module using JSON Schema to validate data structures passed to it; a few weeks ago NetworkToCode released Schema Enforcer, a similar CLI tool (which means it’s easy to use it in any CI/CD pipeline).

Wondering why schema validation matters? Start here. See also: GIGO.

Here are just a few things Schema Enforcer can do:

read more add comment

Implementing Layer-2 Networks in a Public Cloud

A few weeks ago I got an excited tweet from someone working at Oracle Cloud Infrastructure: they launched full-blown layer-2 virtual networks in their public cloud to support customers migrating existing enterprise spaghetti mess into the cloud.

Let’s skip the usual does everyone using the applications now have to pay for Oracle licenses and I wonder what the lock in might be when I migrate my workloads into an Oracle cloud jokes and focus on the technical aspects of what they claim they implemented. Here’s my immediate reaction (limited to the usual 280 characters, because that’s the absolute upper limit of consumable content these days):

read more see 5 comments

MUST READ: Systems Design Explains the World

The one and only Avery Pennarun (of the world in which IPv6 was a good design fame) is back with another absolutely-must-read article explaining how various archetypes apply to real-world challenges, including:

  • Hierarchies and decentralization (and why decentralization is a myth)
  • Chicken-and-egg problem (and why some good things fail)
  • Second-system effect (or why it’s better to refactor than to rewrite)
  • Innovator’s dilemma (or why large corporations become obsolete)

If you think none of these applies to networking, you’re probably wrong… but of course please write a comment if you still feel that way after reading Avery’s article.

see 3 comments

Worth Reading: Career Advice for Young Engineers

David Bombal invited me for another short chat – this time on what I recommend young networking engineers just starting their career. As I did a bit of a research I stumbled upon some great recommendations on Quora:

I couldn’t save the pages to Internet Archive (looks like it’s not friendly with Quora), so I can only hope they won’t disappear ;)

add comment

Video: Path Discovery in Transparent Bridging and Routing

In the previous video in this series, I described how path discovery works in source routing and virtual circuit environments. I couldn’t squeeze the discussion of hop-by-hop forwarding into the same video (it would make the video way too long); you’ll find it in the next video in the same section.

The video is part of How Networks Really Work webinar and available with Free ipSpace.net Subscription.
see 2 comments

New Ansible Data Validation Module(s)

A few months ago I described how you could use JSON Schema to validate your automation data models, host/group variable files, or even Ansible inventory file.

I had to use a weird toolchain to get it done – either ansible-inventory to build a complete data model from various inventory sources, or yq to convert YAML to JSON… and just for the giggles jsonschema CLI command requires the JSON input to reside in a file, so you have to use a temporary file to get the job done.

read more see 2 comments

Chasing Anycast IP Addresses

One of my readers sent me this question:

My job required me to determine if one IP address is unicast or anycast. Is it possible to get this information from the bgp dump?

TL&DR: Not with anything close to 100% reliability. An academic research paper (HT: Andrea di Donato) documents a false-positive rate of around 10%.

If you’re not familiar with IP anycast: it’s a brilliant idea of advertising the same prefix from multiple independent locations, or the same IP address from multiple servers. Works like a charm for UDP (that’s how all root DNS servers are built) and supposedly pretty well across distant-enough locations for TCP (with a long list of caveats when used within a data center).

read more see 7 comments

Impact of Azure Subnets on High Availability Designs

Now that you know all about regions and availability zones (AZ) and the ways AWS and Azure implement subnets, let’s get to the crux of the original question Daniel Dib sent me:

As I understand it, subnets in Azure span availability zones. Do you see any drawback to this? You mentioned that it’s difficult to create application swimlanes that way. But does subnet matter if your VMs are in different AZs?

It’s time I explain the concepts of application swimlanes and how they apply to availability zones in public clouds.

read more add comment

Rant: Cisco ACI Complexity

A while ago Antti Leimio wrote a long twitter thread describing his frustrations with Cisco ACI object model. I asked him for permission to repost the whole thread as those things tend to get lost, and he graciously allowed me to do it, so here we go.


I took a 5 days Cisco DCACI course. This is all new to me. I’m confused. Who is ACI for? Capabilities and completeness of features is fantastic but how to manage this complex system?

read more see 3 comments
Sidebar