… updated on Saturday, May 1, 2021 17:44 +0200
Everything Is a Graph
One of the viewers of Rachel Traylor’s excellent Graph Algorithms in Networks webinar sent me this feedback:
I think it is too advanced for my needs. Interesting but difficult to apply. I love math and I find it interesting maybe for bigger companies, but for a small company it is not possible to apply it.
While a small company’s network might not warrant a graph-focused approach (I might disagree, but let’s not go there), keep in mind that almost everything we do in IT rides on top of some sort of graph:
Worth Reading: Understand Your Single Points of Failure
I’ve been saying the same thing for years, but never as succinctly as Alastair Cooke did in his Understand Your Single Points of Failure (SPOF) blog post:
The problem is that each time we eliminated a SPOF, we at least doubled our cost and complexity. The additional cost and complexity are precisely why we may choose to leave a SPOF; eliminating the SPOF may be more expensive than an outage cost due to the SPOF.
Obviously that assumes that you’re able to follow business objectives and not some artificial measure like uptime. Speaking of artificial measures, you might like the discussion about taxonomy of indecision.
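The trade-off in that quote can be sketched as simple arithmetic. This is a minimal illustration, not Alastair's method: the `SpofOption` fields, the device names, and every number below are hypothetical and only show the shape of the comparison between the expected cost of an outage and the cost of the redundancy that would prevent it.

```python
from dataclasses import dataclass


@dataclass
class SpofOption:
    """Hypothetical inputs describing one single point of failure."""
    name: str
    yearly_failure_probability: float  # assumed chance of an outage in a year
    outage_cost: float                 # assumed business cost of one outage
    yearly_redundancy_cost: float      # assumed extra yearly cost of redundancy


def expected_yearly_outage_cost(option: SpofOption) -> float:
    """Expected yearly loss if the SPOF is left in place."""
    return option.yearly_failure_probability * option.outage_cost


def worth_eliminating(option: SpofOption) -> bool:
    """Eliminate the SPOF only when the expected loss exceeds the redundancy cost."""
    return expected_yearly_outage_cost(option) > option.yearly_redundancy_cost


# Made-up numbers, purely for illustration.
options = [
    SpofOption("branch-office router", 0.05, 20_000, 4_000),
    SpofOption("data center core switch", 0.05, 500_000, 15_000),
]

for option in options:
    verdict = "eliminate" if worth_eliminating(option) else "accept"
    print(f"{option.name}: {verdict} the SPOF")
```

With these assumed inputs the branch-office SPOF is cheaper to live with, while the core switch is not. Real decisions would also have to account for the added complexity the quote mentions, which this sketch ignores.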
Worth Reading: The Insider's Guide To Evangelizing Good Design
Scott Berkun wrote another great article that’s equally applicable to the traditional notion of design (his specialty) and to network design. Read it, replace design with network design, and use its lessons. Here’s just a sample:
- Convincing people is a social process
- Aim for small wins, not conversions of belief systems
- Allies matter more than ideas
- Design maturity grows one step at a time
Interview: What New Technologies Should You Aim to Master?
In the last part of my chat with David Bombal we discussed interesting technologies networking engineers could focus on if they want to grow beyond pure packet switching (and voice calls, if you happen to believe VoIP is not just an application). We mentioned public clouds, automation, Linux networking, tools like Git, and for whatever reason concluded with some of my biggest blunders.
Microsoft Azure: Remember Exchange Server?
Recently I joked there’s a significant difference between AWS and Azure launching features:
- AWS launches a production-ready feature that you can consume the next day.
- Azure launches a preview that might work in 6 months.
Those with long enough memories shouldn’t be surprised. It’s not the first time Microsoft has used these tactics.
Response: Is Switching Latency Relevant?
Minh Ha left another extensive comment on my Is Switching Latency Relevant blog post. As is usually the case, it’s well worth reading, so I’m making sure it doesn’t stay in the small print (this time interspersed with a few comments of mine in gray boxes).
I found Cisco apparently manages to scale port-to-port latency down to 250ns for L3 switching, which is astonishing, and way less (sub 100ns) for L1 and L2.
I don’t know where FPGA fits into this ultra low-latency picture, because FPGA, compared to ASIC, is bigger, and a few times slower, due to the use of Lookup Table in place of gate arrays, and programmable interconnects.
Using Unequal-Cost Multipath to Cope with Leaf-and-Spine Fabric Failures
Scott submitted an interesting comment to my Does Unequal-Cost Multipath (UCMP) Make Sense blog post:
How about even Large CLOS networks with the same interface capacity, but accounting for things to fail; fabric cards, links or nodes in disaggregated units. You can either UCMP or drain large parts of your network to get the most out of ECMP.
Before I managed to write a reply (sometimes it takes months while an idea is simmering somewhere in my subconscious) Jeff Tantsura pointed me to an excellent article by Erico Vanini that describes the types of asymmetries you might encounter in a leaf-and-spine fabric: an ideal starting point for this discussion.
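The choice Scott describes (UCMP versus draining part of the fabric so ECMP stays balanced) can be illustrated with a small calculation. The sketch below is an assumption-laden toy model, not taken from the article by Erico Vanini: the spine names and capacities are hypothetical, and it only computes how much traffic a leaf could send toward one destination before the most constrained path saturates.

```python
def max_load_without_congestion(capacity_gbps, weights):
    """Largest offered load (Gbps) for which no path exceeds its capacity.

    A path with weight w carries load * w / sum(weights), so the load is
    limited by capacity * sum(weights) / w on every path that is in use.
    """
    total_weight = sum(weights.values())
    return min(
        capacity_gbps[path] * total_weight / weight
        for path, weight in weights.items()
        if weight > 0
    )


# Hypothetical fabric: three spines, spine3 lost one of its two 100G links
# toward the destination leaf, so only half of its capacity remains.
capacity = {"spine1": 200, "spine2": 200, "spine3": 100}

ecmp = {path: 1 for path in capacity}              # equal split, asymmetry ignored
drained = {"spine1": 1, "spine2": 1, "spine3": 0}  # ECMP after draining spine3
ucmp = dict(capacity)                              # weights proportional to capacity

for label, weights in (("ECMP", ecmp), ("ECMP, spine3 drained", drained), ("UCMP", ucmp)):
    print(f"{label}: {max_load_without_congestion(capacity, weights):.0f} Gbps")
```

In this toy model plain ECMP is limited by the degraded spine, draining that spine recovers some capacity, and UCMP uses everything that is left. It deliberately ignores flow-level hashing, which makes real traffic distribution less even than the weights suggest.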
Starting Network Automation for Non-Programmers
The reader asking about infrastructure-as-code in public cloud deployments also wondered whether he has any chance of mastering on-premises network automation given his lack of programming skills.
I am starting to get concerned about not knowing automation, IaC, or any programming language. I didn’t go to college, like a lot of my peers did, and they have some background in programming.
First of all, thanks a million to the “everyone needs to become a programmer” hipsters for thoroughly confusing people. Now for a tiny bit of reality.
Worth Reading: Get Better at Programming by Learning How Things Work
Who would have thought that you could get better at what you do by figuring out how the things you use really work? I probably made that argument (about networking fundamentals) too many times; Julia Evans claims the same approach applies to programming.
MUST READ: Machine Learning is a Marvelously Executed Scam
I thought I was snarky and somewhat rude (and toned down some of my blog posts on second thought), but I’m a total amateur compared to Corey Quinn. His latest masterpiece – Machine Learning is a Marvelously Executed Scam – is another MUST READ.
Video: Transparent Bridging Fundamentals
Years ago I wrote a series of blog posts comparing transparent bridging and IP routing, and creating How Networks Really Work materials seemed like a perfect opportunity to make that information more structured, starting with Transparent Bridging Fundamentals.
… updated on Saturday, May 1, 2021 18:19 +0200
Fundamentals: Is Switching Latency Relevant?
One of my readers wondered whether it makes sense to buy low-latency switches from Cisco or Juniper instead of switches based on merchant silicon like Trident-3 or Jericho (regardless of whether they are running NX-OS, Junos, EOS, or Linux).
As always, the answer is it depends, but before getting into the details, let’s revisit what latency really is. We’ll start with a simple two-node network.

The simplest possible network
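As a rough illustration of what contributes to latency between two directly connected nodes, the sketch below adds up serialization delay and propagation delay. It is a simplification under stated assumptions: the 5 ns per meter figure is an approximation for light in optical fiber, the packet size, link speeds, and cable length are arbitrary examples, and delays inside the end hosts are ignored.

```python
def serialization_delay_ns(packet_bytes: int, link_gbps: float) -> float:
    """Time needed to put the whole packet on the wire.

    Bits divided by Gbit/s conveniently yields nanoseconds.
    """
    return packet_bytes * 8 / link_gbps


def propagation_delay_ns(distance_m: float, ns_per_meter: float = 5.0) -> float:
    """Time the signal spends in the cable (about 5 ns/m in fiber, approximate)."""
    return distance_m * ns_per_meter


def one_way_latency_ns(packet_bytes: int, link_gbps: float, distance_m: float) -> float:
    """Latency of a two-node network: serialization plus propagation."""
    return serialization_delay_ns(packet_bytes, link_gbps) + propagation_delay_ns(distance_m)


# Example values (assumptions): a 1500-byte packet over 100 m of fiber.
for speed_gbps in (10, 100):
    total = one_way_latency_ns(1500, speed_gbps, 100)
    print(f"{speed_gbps} Gbps link: {total:.0f} ns")
```

Numbers like these give a baseline for judging whether a few hundred nanoseconds of difference in switching latency would be noticeable in a particular environment.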
… updated on Monday, July 12, 2021 18:12 UTC
Netsim-tools Release 0.5 Works with Containerlab
TL&DR: If you happen to like working with containers, you could use netsim-tools release 0.5 to provision your container-based Arista EOS labs.
Why does it matter? Lab setup is blindingly fast, and it’s easier to integrate your network devices with other containers, not to mention the crazy idea of running your network automation CI pipeline on Gitlab CPU cycles. Also, you could use the same netsim-tools topology file and provisioning scripts to set up a container-based or VM-based lab.
What is containerlab? A cool project that builds realistic virtual network topologies with containers. More details…
Must Read: Automate Nexus-OS Fabric Deployment
Some networking engineers breeze through our Network Automation online course, others disappear after a while… and a few of those come back years later with a spectacular production-grade solution.
Stephen Harding is one of those. He attended the automation course in spring 2019 and I hadn’t heard from him for almost two years… until he submitted one of the most mature data center fabric automation solutions I’ve seen.
Not only that, he documented the solution in a long series of must-read blog posts. Hope you’ll find them useful; I liked them so much I immediately saved them to Internet Archive (just in case).
Start Automating Public Cloud Deployments with Infrastructure-as-Code
One of my readers sent me a series of “how do I get started with…” questions including:
I’ve been doing networking and security for 5 years, and now I am responsible for our cloud infrastructure. Anything to do with networking and security in the cloud is my responsibility along with another team member. It is all good experience but I am starting to get concerned about not knowing automation, IaC, or any programming language.
No need to worry about that. What you need (to start with) is extremely simple and easy to master. Infrastructure-as-Code is a simple concept: infrastructure configuration is defined in a machine-readable format (mostly text files these days) and used by a remediation tool like Terraform. The tool compares the actual state of the deployed infrastructure with the desired state defined in the configuration files, and changes the actual state to bring it in line with the desired one.
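The compare-and-remediate idea can be sketched in a few lines of Python. This is a conceptual toy, not how Terraform is implemented: `plan_changes`, `apply_plan`, and the resource names are hypothetical, and the "infrastructure" is just a dictionary instead of a cloud API.

```python
def plan_changes(desired: dict, actual: dict) -> list:
    """Compare desired and actual state and list the changes needed."""
    plan = []
    for name, config in desired.items():
        if name not in actual:
            plan.append(("create", name, config))
        elif actual[name] != config:
            plan.append(("update", name, config))
    for name in actual:
        if name not in desired:
            plan.append(("delete", name, None))
    return plan


def apply_plan(plan: list, actual: dict) -> dict:
    """Return the new actual state after executing the plan (simulated)."""
    new_state = dict(actual)
    for action, name, config in plan:
        if action == "delete":
            new_state.pop(name)
        else:  # create and update both store the desired config
            new_state[name] = config
    return new_state


# Hypothetical desired state (what the text files say) and actual state.
desired = {
    "vpc-main": {"cidr": "10.0.0.0/16"},
    "subnet-web": {"cidr": "10.0.1.0/24"},
}
actual = {
    "vpc-main": {"cidr": "10.0.0.0/16"},
    "subnet-web": {"cidr": "10.0.9.0/24"},
    "subnet-old": {"cidr": "10.0.2.0/24"},
}

plan = plan_changes(desired, actual)
for action, name, _ in plan:
    print(action, name)

actual = apply_plan(plan, actual)
print("remaining changes:", plan_changes(desired, actual))
```

Running the comparison again after the changes are applied yields an empty plan, which is the property that makes this approach attractive: you describe how things should look and let the tool figure out the steps.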