Video: Comparing TCP/IP and CLNP

If you were building networks in the early 1990s, you probably remember at least a half-dozen different network protocols. Only one of them survived (IPv6 came later), while another one (CLNP) provides an interesting view into a parallel universe that evolved from a different set of fundamental principles.

After introducing the network-layer addressing, I compared the two and pointed out where one or the other was clearly better.

You might think that it makes no sense to talk about protocols that were rarely used in the old days and are almost non-existent today, but as always, those who cannot remember the past are doomed to repeat it, this time by reinventing CLNP principles in IPv6-based layer-3-only data center fabrics.

You need a Free Subscription to watch the video, and a Standard Subscription to register for upcoming live sessions.

Data Plane Quirks in Virtual Network Devices

Have you noticed an interesting twist in the ICMP Redirects saga? The operating systems of some network devices might install redirect entries and use them for control-plane traffic – an interesting side effect of the architecture of most modern network devices.

A large majority of network devices run on some variant of Linux or a *BSD operating system, the only true exception being ancient operating systems like Cisco IOS. The network daemons populate various routing protocol tables and compute the best routes that somehow get merged into a single routing table that might still be just a data structure in some user-mode process.
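The merge step can be pictured with a minimal Python sketch. It assumes that each routing daemon exposes a prefix-to-next-hop table and that a per-protocol preference value (lower wins) decides which route reaches the single routing table. The protocol names, preference values, and the `merge_routes` helper are illustrative assumptions, not the behavior of any particular network operating system.

```python
# Hypothetical per-protocol preference; lower values win.
# The numbers are illustrative assumptions, not taken from any specific device.
PREFERENCE = {"connected": 0, "static": 1, "bgp": 20, "ospf": 110}


def merge_routes(protocol_tables):
    """Merge per-protocol tables into a single routing table.

    protocol_tables maps a protocol name to a {prefix: next_hop} dictionary.
    The result maps each prefix to the (protocol, next_hop) pair supplied by
    the most preferred protocol that knows about the prefix.
    """
    best = {}
    for protocol, routes in protocol_tables.items():
        for prefix, next_hop in routes.items():
            current = best.get(prefix)
            if current is None or PREFERENCE[protocol] < PREFERENCE[current[0]]:
                best[prefix] = (protocol, next_hop)
    return best
```

Calling `merge_routes({"ospf": {"10.1.0.0/16": "10.0.0.2"}, "bgp": {"10.1.0.0/16": "10.0.0.6"}})` keeps the BGP route for `10.1.0.0/16` because of the lower preference value assumed above. In this sketch the result is just a dictionary in a user-mode process; nothing is pushed into a kernel or hardware forwarding table.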


Contribute to netlab: OSPFv3

Every other blue moon I get a question along the lines of “how could I contribute to netlab”. The process is pretty streamlined and reasonably (I hope) documented in Contributor Guidelines; if you want to get started with an easy task, try implementing OSPFv3 for one of almost a dozen devices (the vSRX implementation by Stefano Sasso is a picture-perfect example).


Repost: LISP Is a False Economy

Minh Ha left this comment on the Packet Forwarding 101 blog post. As is usually the case, it’s fun reading and it would be a shame not to repost it as a standalone blog post (even though I don’t necessarily agree with all his conclusions).

I always enjoy Bela’s great insights, esp. on hardware and transport networks, but this time I beg to differ. LISP is a false economy. It was twisted from the start, unscalable right from the get-go. In Networking and OS, to name (ID) something is to locate it, and vice versa. So the name LISP itself reflects a false distinction. Due to this misconception, LISP proponents are unable to establish the right boundary conditions, leading to the size of xTRs’ RIB diverging (going unbounded). In a word, it has come full circle back to BGP, an exemplary manifestation of RFC 1925 rule 6.


Running a Ubuntu VM on a Mac M1

If you’re brand-new to Python and Ansible, you might be a bit reluctant to install a bunch of packages and Ansible collections on your production laptop to start building your automation skills. The usual recommendation I make to get past that hurdle is to create an Ubuntu virtual machine that can be destroyed every time you mess it up.

Creating a virtual machine is trivial on Linux and on macOS with an Intel CPU (install VirtualBox and Vagrant). The same toolset no longer works on newer Macs with an M1 CPU (VMware Fusion is in tech preview, so we’re getting there), but there’s an amazingly simple alternative: Multipass by Canonical.


Cache-Based Packet Forwarding

In the previous blog post in this series I described how convoluted routing table lookups could become when you have to deal with numerous layers of indirection (BGP prefix ⇨ BGP next hop ⇨ IGP next hop ⇨ link bundle ⇨ outgoing interface). Modern high-end hardware can deal with the resulting complexity; decades ago we had to use the router CPU to do multiple (potentially recursive) lookups in the IP routing table (there was no FIB at that time).

Network devices were always pushed to the bleeding edge of performance, and smart programmers always tried to optimize the CPU-intensive processes. One of the obvious packet forwarding optimizations relied on the fact that within a short timeframe most packets have to be forwarded to a small set of destinations. Welcome to the wonderful world of cache-based forwarding.
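The idea can be illustrated with a short Python sketch: a slow path that performs recursive longest-prefix-match lookups until it reaches an outgoing interface, and a per-destination cache that lets subsequent packets skip those lookups. The prefixes, next hops, interface name, and function names below are made-up assumptions for illustration; this is not how any particular router implemented its cache, and cache invalidation after a routing change is deliberately left out.

```python
import ipaddress

# Hypothetical routing table: prefix -> next-hop IP address or interface name.
ROUTES = {
    "0.0.0.0/0": "192.0.2.1",    # e.g. a BGP route pointing to a remote next hop
    "192.0.2.0/24": "10.0.0.2",  # e.g. an IGP route toward that next hop
    "10.0.0.0/30": "eth0",       # directly connected subnet
}
TABLE = [(ipaddress.ip_network(prefix), nh) for prefix, nh in ROUTES.items()]

lookups = 0  # counts routing table lookups to show what the cache saves


def longest_match(address):
    """Return the next hop of the most specific matching route, or None."""
    global lookups
    lookups += 1
    ip = ipaddress.ip_address(address)
    matches = [(net, nh) for net, nh in TABLE if ip in net]
    if not matches:
        return None
    return max(matches, key=lambda entry: entry[0].prefixlen)[1]


def resolve(address, max_depth=8):
    """Slow path: follow next hops recursively until an interface is found."""
    target = address
    for _ in range(max_depth):
        next_hop = longest_match(target)
        if next_hop is None:
            return None
        try:
            ipaddress.ip_address(next_hop)
        except ValueError:
            return next_hop  # not an IP address, so it is an interface name
        target = next_hop    # next hop is an address: look it up again
    return None


cache = {}  # destination address -> outgoing interface


def forward(address):
    """Fast path: consult the cache first, fall back to the slow path."""
    if address not in cache:
        cache[address] = resolve(address)
    return cache[address]
```

With the table above, the first `forward("198.51.100.7")` call needs three routing table lookups (default route, route to the next hop, connected subnet) before it returns `eth0`; a second call for the same destination is answered from `cache` without touching the routing table. The sketch also hints at the weak spot of the approach: traffic spread across many destinations fills the cache without ever hitting it.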


New netlab Installation Instructions

A long-time subscriber with a knack for telling me precisely why something I’m doing sucks big time sent me his opinion on netlab[1] installation instructions:

I do not want to say it is impossible to follow your instruction but I wonder why the process is not clearly defined for someone not deeply involved in such tasks with full understanding of why to install from github, etc.

Many guys do not know if they want to use libvirt. They want to use the tool simple way without studying upfront what the libvirt is - but they see libvirt WARNING - should we install libvirt then or skip the installation? But stop, this step of libvirt installation is obligatory in the 2nd Ubuntu section. So why the libvirt warning earlier?

I believe we should start really quickly to enjoy the tool before we reject it for “complexity”. Time To Play matters. Otherwise you are tired trying to understand the process before you check if this tool is right for you.

He was absolutely right – it was time to overhaul the “organically grown” installation instructions and make them goal-focused and structured. For those of you who want to see the big picture first, I also added numerous (hopefully helpful) diagrams. The new documentation is already online, and I’d love to hear your feedback. Thank you!

  1. netlab was known as netsim-tools at that time.


ICMP Redirects Considered Harmful

One of my readers sent me an intriguing challenge based on the following design:

  • He has a data center with two core switches (C1 and C2) and two Cisco Nexus edge switches (E1 and E2).
  • He’s using static default routing from core to edge switches with HSRP on the edge switches.
  • E1 is the active HSRP gateway connected to the primary WAN link.

The following picture shows the simplified network diagram:


Feedback: DMVPN Webinars

Some of my webinars are ancient (= more than a decade old). I’m refreshing some of them (the overhaul of Introduction to Virtualized Networking was completed earlier this month); others will stay as they are because the technology hasn’t changed in a long while, and it’s always nice to hear someone still finds them useful. This is recent feedback I got on the DMVPN webinars:

As with any other webinar I have viewed, this one provides the background as to why you may or may not want to do certain things and what impact that may have (positive or negative) on your network. Then it digs into the how of actually doing something. Brilliant content as always. [It] is my go-to for deep dives on existing and emerging technologies in the networking industry. No unnecessary preamble. Gets straight to the point of why you are looking at a specific technology and explains the what and the why before getting into the how.


Worth Reading: Performance Testing of Commercial BGP Stacks

For whatever reason, most IT vendors attach a “you cannot use this for performance testing and/or publish any results” caveat to their licensing agreements, so it’s really hard to get any independent test results that are not vendor-sponsored and thus suitably biased.

Justin Pietsch managed to get permission to publish test results of the Junos container implementation (cRPD) – no surprise there, Junos outperformed all open-source implementations Justin tested in the past.

What about other commercial BGP stacks? Justin did the best he could: he published Testing Commercial BGP Stacks instructions, so you can do the measurements on your own.


netsim-tools (now netlab) on the Modem Podcast

A few weeks ago, Nick Buraglio and Chris Cummings invited me for an hour-long chat about netlab on the Modem Podcast[1].

We talked about why one might want to use netlab instead of another lab orchestration solution and the high-level functionality offered by the tool. Nick particularly loved its IPAM features which got so extensive in the meantime that I had to write a full-blown addressing tutorial. But there’s so much more: you can also get a fully configured OSPFv2, OSPFv3, EIGRP, IS-IS, SRv6, or BGP lab built from more than a dozen different devices. In short (as Nick and Chris said): you can use netlab to make labbing less miserable.

  1. netlab was known as netsim-tools when we were recording that podcast.


The Impact of Jumbo Maximum Frame Size on Data Center Switches

Sander Steffann sent me an intriguing question a long while ago:

I was wondering if there are any downsides to setting “system mtu jumbo 9198” by default on every switch? I mean, if all connected devices have MTU 1500 they won’t notice that the switch could support longer frames, right?

That’s absolutely correct, and unless the end hosts get into UDP fights things will always work out (aka TCP MSS saves the day)… but there must be a reason switching vendors don’t use maximum frame sizes larger than 1514 by default (Cumulus Linux seems to be an exception, and according to Sébastien Keller Arista’s default maximum frame size is between 9214 and 10178 depending on the platform).
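The arithmetic behind “TCP MSS saves the day” can be sketched in a few lines of Python. The sketch assumes IPv4 and TCP headers without options (20 bytes each) and an untagged Ethernet header of 14 bytes with the frame check sequence not counted; the function names are made up for illustration.

```python
IPV4_HEADER = 20      # bytes, IPv4 header without options
TCP_HEADER = 20       # bytes, TCP header without options
ETHERNET_HEADER = 14  # bytes, untagged Ethernet header, FCS not counted


def tcp_mss(mtu):
    """Largest TCP payload that fits into a single IP packet of size mtu."""
    return mtu - IPV4_HEADER - TCP_HEADER


def effective_mss(mtu_a, mtu_b):
    """Segment size both hosts can use: each side advertises the MSS derived
    from its own MTU, and the sender must not exceed the peer's value."""
    return min(tcp_mss(mtu_a), tcp_mss(mtu_b))


def max_frame_size(mtu):
    """Ethernet frame size needed to carry an IP packet of size mtu."""
    return mtu + ETHERNET_HEADER
```

A host with a 1500-byte MTU advertises `tcp_mss(1500)`, which is 1460 bytes, so even a peer configured for jumbo frames keeps its TCP segments within 1500-byte IP packets (`effective_mss(1500, 9000)` is also 1460), and `max_frame_size(1500)` gives the 1514-byte figure mentioned above. UDP has no comparable negotiation, which is why mismatched host MTUs can still cause trouble there.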


Running BGP between Virtual Machines and Data Center Fabric

Got this question from one of my readers:

When adopting the BGP on the VM model (say, a Kubernetes worker node on top of vSphere or KVM or Openstack), how do you deal with VM migration to another host (same data center, of course) for maintenance purposes? Do you keep peering with the old ToR even after the migration, or do you use some BGP trickery to allow the VM to peer with whatever ToR it’s closest to?

Short answer: you don’t.

Kubernetes was designed in a way that made worker nodes expendable. The Kubernetes cluster (and all properly designed applications) should recover automatically after a worker node restart. From the purely academic perspective, there’s no reason to migrate VMs running Kubernetes.
