Blog Posts in April 2021
What is Katacoda? An awesome environment that allows content authors to create scenarios running on Linux VMs accessible through a web browser. I can only hope they’ll fix the quirks and keep going – I have so many ideas about what could be done with it.
Why FRR? Not too long ago Jeroen van Bemmel sent me a link to a simple Katacoda scenario he created to demonstrate how to set up netsim-tools and containerlab. His scenario got the tools installed and set up, but couldn’t create a running network as there are almost no usable Network OS images on Docker Hub (that is accessible from within Katacoda) – the only image I could find was FRR.
During my interview with David Bombal I made a recommendation I find crucial for anyone serious about blogging:
Make sure you own your content.
There’s a simple reason for that rule: if you want to write quality content, you’ll have to invest a lot of time into it.
TL&DR: If you want to test BGP, OSPF, IS-IS, or SR-MPLS in a virtual lab, you might build the lab faster with netsim-tools release 0.6.
In the netsim-tools release 0.6 I focused on adding routing protocol functionality:
- IS-IS on Cisco IOS/IOS XE, Cisco NX-OS, Arista EOS, FRR, and Junos.
- BGP on the same set of platforms, including support for multiple autonomous systems, EBGP, IBGP full mesh, IBGP with route reflectors, next-hop-self control, and BGP/IGP interaction.
- Segment Routing with MPLS on Cisco IOS XE and Arista EOS.
You’ll also get:
I know the title sounds like buzzword-bingo-winning clickbait, but it’s true. Adrian Giacometti decided to merge the topics of two ipSpace.net online courses and automated deployment of AWS security rules using Terraform within a GitLab CI pipeline, with Slack messages serving as manual checks and approvals.
Not only did he do a great job mastering and gluing together so many diverse bits and pieces, he also documented the solution and published the source code:
- Part 1: Cloud & Network automation challenge: Deploy Security Rules in a DevOps/GitOps world with AWS, Terraform, GitLab CI, Slack, and Python (special guest FastAPI)
- Part 2: AWS, Terraform and FastAPI
- Part 3: GitLab CI, Slack, and Python
- Source code: aegiacometti/devops_cloud_challenge · GitLab
Want to build something similar? Join our Network Automation and/or Public Cloud course and get started. Need something similar in your environment? Adrian is an independent consultant and ready to work on your projects.
I think it is too advanced for my needs. Interesting but difficult to apply. I love math and I find it interesting maybe for bigger companies, but for a small company it is not possible to apply it.
While a small company’s network might not warrant a graph-focused approach (I might disagree, but let’s not go there), keep in mind that almost everything we do in IT rides on top of some sort of graph:
I’ve been saying the same thing for years, but never as succinctly as Alastair Cooke did in his Understand Your Single Points of Failure (SPOF) blog post:
The problem is that each time we eliminated a SPOF, we at least doubled our cost and complexity. The additional cost and complexity are precisely why we may choose to leave a SPOF; eliminating the SPOF may be more expensive than an outage cost due to the SPOF.
Obviously that assumes that you’re able to follow business objectives and not some artificial measure like uptime. Speaking of artificial measures, you might like the discussion about taxonomy of indecision.
Scott Berkun wrote another great article that’s equally applicable to the traditional notion of design (his specialty) and to network design. Read it, replace design with network design, and use its lessons. Here’s just a sample:
- Convincing people is a social process
- Aim for small wins, not conversions of belief systems
- Allies matter more than ideas
- Design maturity grows one step at a time.
In the last part of my chat with David Bombal we discussed interesting technologies networking engineers could focus on if they want to grow beyond pure packet switching (and voice calls, if you happen to believe VoIP is not just an application). We mentioned public clouds, automation, Linux networking, tools like Git, and for whatever reason concluded with some of my biggest blunders.
Recently I joked there’s a significant difference between the way AWS and Azure launch features:
- AWS launches a production-ready feature that you can consume the next day.
- Azure launches a preview that might work in 6 months.
Those with long enough memories shouldn’t be surprised. It’s not the first time Microsoft has used these tactics.
Minh Ha left another extensive comment on my Is Switching Latency Relevant blog post. As is usually the case, it’s well worth reading, so I’m making sure it doesn’t stay in the small print (this time interspersed with a few comments of mine in gray boxes).
I found Cisco apparently manages to scale port-to-port latency down to 250ns for L3 switching, which is astonishing, and way less (sub 100ns) for L1 and L2.
I don’t know where FPGA fits into this ultra low-latency picture, because FPGA, compared to ASIC, is bigger, and a few times slower, due to the use of Lookup Table in place of gate arrays, and programmable interconnects.
Scott submitted an interesting comment to my Does Unequal-Cost Multipath (UCMP) Make Sense blog post:
How about even Large CLOS networks with the same interface capacity, but accounting for things to fail; fabric cards, links or nodes in disaggregated units. You can either UCMP or drain large parts of your network to get the most out of ECMP.
Before I managed to write a reply (sometimes it takes months while an idea is simmering somewhere in my subconscious) Jeff Tantsura pointed me to an excellent article by Erico Vanini that describes the types of asymmetries you might encounter in a leaf-and-spine fabric: an ideal starting point for this discussion.
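To put some numbers on the trade-off Scott describes, here is a minimal sketch. It assumes an idealized model in which traffic can be split perfectly across paths (real hashing is never that even), and the path capacities are made-up values, not taken from any of the posts or articles mentioned above. ECMP sends the same share to every path, so the weakest path limits the total; UCMP weights each path by its remaining capacity.

```python
from typing import List


def ecmp_max_load(capacities: List[float]) -> float:
    """Maximum total load when every path carries an equal share.

    The path with the least capacity saturates first, so the total is
    limited to (number of paths) * (smallest capacity).
    """
    return len(capacities) * min(capacities)


def ucmp_max_load(capacities: List[float]) -> float:
    """Maximum total load when shares are proportional to capacity.

    With ideal weights every path saturates at the same time, so the
    total is the sum of all capacities.
    """
    return sum(capacities)


# Hypothetical fabric: three paths, one of which lost half of its links.
paths = [100.0, 100.0, 50.0]

ecmp_all = ecmp_max_load(paths)                     # keep the degraded path
healthy = [c for c in paths if c == max(paths)]
ecmp_drained = ecmp_max_load(healthy)               # drain the degraded path
ucmp_all = ucmp_max_load(paths)                     # weight by capacity

print(f"ECMP, all paths:       {ecmp_all}")
print(f"ECMP, degraded drained: {ecmp_drained}")
print(f"UCMP, all paths:       {ucmp_all}")
```

In this toy model draining the degraded path improves on plain ECMP, and UCMP is the only option that uses the remaining capacity of the degraded path.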
The reader asking about infrastructure-as-code in public cloud deployments also wondered whether he has any chance at mastering on-premises network automation due to lack of programming skills.
I am starting to get concerned about not knowing automation, IaC, or any programming language. I didn’t go to college, like a lot of my peers did, and they have some background in programming.
First of all, thanks a million to “everyone needs to become a programmer” hipsters for thoroughly confusing people. Now for a tiny bit of reality.
Who would have thought that you could get better at what you do by figuring out how things you use really work? I probably made that argument (about networking fundamentals) too many times; Julia Evans claims the same approach applies to programming.
I thought I was snarky and somewhat rude (and toned down some of my blog posts on second thought), but I’m a total amateur compared to Corey Quinn. His last masterpiece – Machine Learning is a Marvelously Executed Scam – is another MUST READ.
Years ago I wrote a series of blog posts comparing transparent bridging and IP routing, and creating How Networks Really Work materials seemed like a perfect opportunity to make that information more structured, starting with Transparent Bridging Fundamentals.
One of my readers wondered whether it makes sense to buy low-latency switches from Cisco or Juniper instead of switches based on merchant silicon like Trident-3 or Jericho (regardless of whether they are running NX-OS, Junos, EOS, or Linux).
As always, the answer is it depends, but before getting into the details, let’s revisit what latency really is. We’ll start with a simple two-node network.
TL&DR: If you happen to like working with containers, you could use netsim-tools release 0.5 to provision your container-based Arista EOS labs.
Why does it matter? Lab setup is blindingly fast, and it’s easier to integrate your network devices with other containers, not to mention the crazy idea of running your network automation CI pipeline on GitLab CPU cycles. Also, you could use the same netsim-tools topology file and provisioning scripts to set up a container-based or a VM-based lab.
Some networking engineers breeze through our Network Automation online course, others disappear after a while… and a few of those come back years later with a spectacular production-grade solution.
Stephen Harding is one of those. He attended the automation course in spring 2019 and I haven’t heard from him in almost two years… until he submitted one of the most mature data center fabric automation solutions I’ve seen.
Not only that, he documented the solution in a long series of must-read blog posts. Hope you’ll find them useful; I liked them so much I immediately saved them to Internet Archive (just in case).
One of my readers sent me a series of “how do I get started with…” questions including:
I’ve been doing networking and security for 5 years, and now I am responsible for our cloud infrastructure. Anything to do with networking and security in the cloud is my responsibility along with another team member. It is all good experience but I am starting to get concerned about not knowing automation, IaC, or any programming language.
No need to worry about that; what you need (to start with) is extremely simple and easy to master. Infrastructure-as-Code is a simple concept: infrastructure configuration is defined in a machine-readable format (mostly text files these days) and used by a remediation tool like Terraform that compares the actual state of the deployed infrastructure with the desired state as defined in the configuration files, and makes changes to the actual state to bring it in line with how it should look.
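The compare-and-remediate idea fits in a few lines of Python. What follows is a toy sketch of the concept only, not how Terraform is implemented; the `plan_changes` function, the resource names, and their attributes are all hypothetical.

```python
from typing import Dict, List, Tuple

Resource = Dict[str, str]


def plan_changes(
    desired: Dict[str, Resource], actual: Dict[str, Resource]
) -> List[Tuple[str, str]]:
    """Return the (action, resource name) steps needed to make the
    actual state match the desired state."""
    changes: List[Tuple[str, str]] = []
    for name, config in desired.items():
        if name not in actual:
            changes.append(("create", name))   # defined but not deployed
        elif actual[name] != config:
            changes.append(("update", name))   # deployed with wrong settings
    for name in actual:
        if name not in desired:
            changes.append(("delete", name))   # deployed but no longer defined
    return changes


# Desired state, as it would be described in configuration files (made-up data)
desired = {
    "web-sg": {"port": "443", "source": "0.0.0.0/0"},
    "db-sg": {"port": "5432", "source": "10.0.0.0/16"},
}

# Actual state, as it would be reported by the cloud provider (made-up data)
actual = {
    "web-sg": {"port": "80", "source": "0.0.0.0/0"},
    "old-sg": {"port": "22", "source": "0.0.0.0/0"},
}

plan = plan_changes(desired, actual)
for action, name in plan:
    print(action, name)
```

A real tool would then execute the plan against the provider API and also has to deal with dependencies between resources, which this sketch ignores.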
Here’s an interesting fact: cloud-based stuff often refuses to die; it might become insufferably slow, but would still respond to the health checks. The usual fast failover approach used in traditional high-availability clusters is thus of little use.
For more details, read the Fail-Fast is Failing… Fast ACM Queue article.
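A small sketch might help picture the problem. It assumes a health check that only asks “did the node answer?” and contrasts it with a hypothetical check that also looks at the response time; the `ProbeResult` type, the function names, and the 500 ms threshold are made up for illustration and are not taken from the article.

```python
from dataclasses import dataclass


@dataclass
class ProbeResult:
    responded: bool      # did the node answer the probe at all?
    latency_ms: float    # how long the answer took


def is_alive(probe: ProbeResult) -> bool:
    """Traditional liveness check: any answer counts as healthy."""
    return probe.responded


def is_usable(probe: ProbeResult, max_latency_ms: float = 500.0) -> bool:
    """Hypothetical stricter check: an answer that takes too long
    is treated the same as no answer."""
    return probe.responded and probe.latency_ms <= max_latency_ms


healthy_node = ProbeResult(responded=True, latency_ms=12.0)
limping_node = ProbeResult(responded=True, latency_ms=4800.0)
dead_node = ProbeResult(responded=False, latency_ms=0.0)

for label, probe in [
    ("healthy", healthy_node),
    ("limping", limping_node),
    ("dead", dead_node),
]:
    print(f"{label}: alive={is_alive(probe)} usable={is_usable(probe)}")
```

The limping node passes the liveness check, so a failover mechanism built on that check alone never triggers.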
Ansible and Jinja2 are not an ideal platform for data manipulation, but sometimes it’s easier to hack together something in Jinja2 than writing a Python filter. In those cases, you might find the Data Model Transformation with Jinja2 by Philippe Jounin extremely useful.
When I started the Software Gone Wild podcast in June 2014, I wanted to help networking engineers grow beyond the traditional networking technologies. It’s only fitting to conclude this project almost seven years and 116 episodes later with a similar theme Avi Freedman proposed when we started discussing podcast topics in late 2020: how do we make networking attractive to young engineers.
I was listening to an excellent container networking podcast and enjoyed it thoroughly until the guest said something along the lines of:
With Kubernetes networking policy, you no longer have to be a networking expert to do container network security.
That’s not even wrong. You didn’t have to be a networking expert to write traffic filtering rules for ages.
A junior networking engineer asked me for a list of recommended entry-level networking blogs. I have no idea (I haven’t been in that position for ages); the best I can do is to share my list of networking-related RSS feeds and the process I’m using to collect interesting blogs:
- RSS is your friend. Find a decent RSS reader. I’m using Feedly – natively in a web browser and with various front-ends on my tablet and phone (note to Google: we haven’t forgotten you killed Reader because you weren’t making enough money with it).
- If a blog doesn’t have an RSS feed I’m not interested.
A while ago, someone made a remark on my suggestions that networking engineers should focus on getting fluent with cloud networking and automation:
The running thing is, we can all learn this stuff, but not without having an opportunity.
I tend to forcefully disagree with that assertion. What opportunity do you need to test open-source tools or create a free cloud account? My response was thus correspondingly gruff:
Last week I described the new features added to netsim-tools release 0.4, including support for unnumbered interfaces and OSPF routing. Now let’s see how I used them to build a multi-vendor lab to test which platforms could be made to interoperate when running OSPF over unnumbered Ethernet interfaces.
I needed to define an unnumbered addressing pool first:
addressing:
  core:
    unnumbered: true
I wanted to run OSPF on all devices in the lab:
module: [ ospf ]
Have you ever wondered what the Kubernetes fuss is all about? Why would you ever want to use it? Stuart Charlton tried to answer that question in the introduction part of his fantastic Kubernetes Networking Deep Dive webinar.
It’s almost exactly three months since I announced ipSpace.net going on an extended coffee break. We had some ideas of what we plan to do at that time, but there were still many gray areas, and thanks to tons of discussions I had with many of my friends, subscribers, and readers, they mostly crystallized into this:
You’re trusting me to deliver. We added a “you might want to read this first” warning to the checkout process, and there was no noticeable drop in revenue. Thanks a million for your vote of confidence!