Category: worth reading

Worth Reading: Cloudflare Control Plane Outage

Cloudflare experienced a significant outage in early November 2023 and published a detailed post-mortem report. You should read the whole report; here are my CliffsNotes:

Also (unrelated to Cloudflare outage):

read more see 2 comments

Git Rebase: What Can Go Wrong?

Julia Evans wrote another must-read article (if you’re using Git): git rebase: what can go wrong?

I often use git rebase to clean up the commit history of a branch I want to merge into a main branch or to prepare a feature branch for a pull request. I don’t want to run it unattended – I’m always using the interactive option – but even then, I might get into tight spots where I can only hope the results will turn out to be what I expect them to be. Always have a backup – be it another branch or a copy of the branch you’re working on in a remote repository.

see 1 comments

Worth Reading: Taming the BGP Reconfiguration Transients

Almost exactly a decade ago I wrote about a paper describing how IBGP migrations can cause forwarding loops and how one could reorder BGP reconfiguration steps to avoid them.

One of the paper’s authors was Laurent Vanbever who moved to ETH Zurich in the meantime where his group keeps producing great work, including the Chameleon tool (code on GitHub) that can tame transient loops while reconfiguring BGP. Definitely something worth looking at if you’re running a large BGP network.

add comment

Worth Exploring: BGP from Theory to Practice

My good friend Tiziano Tofoni finally created an English version of his evergreen classic BGP from theory to practice with co-authors Antonio Prado and Flavio Luciani.

I had the Italian version of the book since the days I was running SDN workshops with Tiziano in Rome, and it’s really nice to see they finally decided to address a wider market.

Also, you know what would go well with that book? Free open-source BGP configuration labs of course 😉

add comment

How GitHub Saved My Day

I always tell networking engineers who aspire to be more than VLAN-munging CLI jockeys to get fluent with Git. I should also be telling them that while doing local version control is the right thing to do, you should always have backups (in this case, a remote repository).

I’m eating my own dog food1 – I’m using a half dozen Git repositories in ipSpace.net production2. If they break, my blog stops working, and I cannot publish new documents3.

Now for a fun fact: Git is not transactionally consistent.

read more see 1 comments

Worth Reading: Where Are the Self-Driving Cars?

Gary Marcus wrote an interesting essay describing the failure of self-driving cars to face the unknown unknowns. The following gem from his conclusions applies to AI in general:

In a different world, less driven by money, and more by a desire to build AI that we could trust, we might pause and ask a very specific question: have we discovered the right technology to address edge cases that pervade our messy really world? And if we haven’t, shouldn’t we stop hammering a square peg into a round hole, and shift our focus towards developing new methodologies for coping with the endless array of edge cases?

Obviously that’s not going to happen, we’ll keep throwing more GPU power at the problem trying to solve it by brute force.

see 1 comments

Case Study: BGP Routing Policy

Talking about BGP routing policy mechanisms is nice, but it’s even better to see how real Internet Service Providers use those tools to implement real-life BGP routing policy.

Getting that information is incredibly hard as everyone considers their setup a secret sauce. Fortunately, there are a few exceptions; Pim van Pelt described the BGP Routing Policy of IPng Networks in great details. The article is even more interesting as he’s using Bird2 configuration language that looks almost like a programming language (as compared to the ancient route-maps used by vendors focused on “industry-standard” CLI).

Have fun!

add comment

Worth Reading: Looking Inside Large Language Models

Bruce Davie published an interesting overview article about Large Language Models. It would be worth reading just for the copious links to in-depth article; I particularly like his conclusions:

We mistake performance (producing realistic text) for competence (understanding the world).

Having a model for language is different from having a model of the world.

And that’s a perfect explanation why it makes no sense to expect ChatGPT and friends to produce picture-perfect device configurations or always-working code.

see 1 comments

How GitHub Learned How Hard Distributed Systems Are

Anne Baretta found a great video describing the October 2018 GitHub failure. Here’s the TL&DW:

  • The failure was caused by a short (~ 1 minute) disconnect of the primary data center
  • The database replicas failed over to the secondary data center, but that failover was never tested and of course some stuff didn’t work.
  • In the meantime, batch jobs modified data in the primary data center, making the two replicas out-of-sync.
  • It took them over 24 hours to clean up the mess.
read more add comment
Sidebar