Blog Posts in November 2023

The BGP Multi-Exit Discriminator (MED) Saga

Martijn Van Overbeek left this comment on my LinkedIn post announcing the BGP MED lab:

It might be fixed, but I can recall in the past that there was a lot of quirkiness in multi-vendor environments, especially in how different vendors use it and deal with the setting when the attribute does exist or does not have to exist.

TL&DR: He’s right. It has been fixed (mostly), but the nerd knobs never went away.

In case you’re wondering about the root cause, it was the vagueness of RFC 1771. Now for the full story ;)

read more see 2 comments

BGP Labs: Set BGP Communities on Outgoing Updates

It’s hard to influence the behavior of someone with strong opinions (just ask any parent with a screaming toddler), and trying to persuade an upstream ISP not to send the traffic over a backup link is no exception – sometimes even AS path prepending is not a strong enough argument.

An easy solution to this problem was proposed in 1990s – what if we could attach some extra attributes (called communities just to confuse everyone) to BGP updates and use them to tell adjacent autonomous systems to lower their BGP local preference? You can practice doing that in the Attach BGP Communities to Outgoing BGP Updates lab exercise.

add comment

Why Do We Need BGP Identifiers?

A friend of mine sent me an interesting question along these lines:

We all know that in OSPF, the router ID is any 32-bit number, not necessarily an IP address of an interface. The only requirement is that it must be unique throughout the OSPF domain. However, I’ve always wondered what the role of BGP router ID is. RFC 4271 says it should be set to an IP address assigned to that BGP speaker, but where do we use it?

Also, he observed somewhat confusing behavior in the wild:

Take two routers and configure the same BGP identifier on both. Cisco IOS will not establish a session, while IOS XR and Junos will.

I decided to take the challenge and dug deep into the bowels of RFC 4271 and RFC 6286. Here’s what I brought back from that rabbit hole:

read more see 1 comments

VXLAN/EVPN Layer-3 Handoff (L3Out) on Arista EOS

A while ago, I published a blog post describing how to establish a LAN/WAN L3 boundary in VXLAN/EVPN networks using Cisco NX-OS. At that time, I promised similar information for Arista EOS. Here it is, coming straight from Massimo Magnani. The useful part of what follows is his; all errors were introduced during my editing process.

In the cases I have dealt with so far, implementing the LAN-WAN boundary has the main benefit of limiting the churn blast radius to the local domain, trying to impact the remote ones as little as possible. To achieve that, we decided to go for a hierarchical solution where you create two domains, local (default) and remote, and maintain them as separate as possible.

read more see 1 comments

Video: Outages Caused by Bugs in BGP Implementations

The previous BGP-related videos described how fat fingers and malicious actors cause Internet outages.

Today, we’ll focus on the impact of bugs in BGP implementations, from malformed AS paths to mishandled transitive attributes. The examples in the video are a few years old, but you can see similar things in the wild in 2023.

You need at least free subscription to watch videos in this webinar.
add comment

Worth Reading: Cloudflare Control Plane Outage

Cloudflare experienced a significant outage in early November 2023 and published a detailed post-mortem report. You should read the whole report; here are my CliffsNotes:

Also (unrelated to Cloudflare outage):

read more see 2 comments

Is Anyone Using netlab on Windows?

Tomas wants to start netlab with PowerShell, but it doesn’t work for him, and I don’t know anyone running netlab directly on Windows (I know people running it in a Ubuntu VM on Windows, but that’s a different story).

In theory, netlab (and Ansible) should work fine with Windows Subsystem for Linux. In practice, there’s often a gap between theory and practice – if you run netlab on Windows (probably using VirtualBox with Vagrant), I’d love to hear from you. Please leave a comment, email me, add a comment to Tomas’ GitHub issue, or fix the documentation and submit a PR. Thank you!

add comment

LAN Data Link Layer Addressing

Last week, we discussed Fibre Channel addressing. This time, we’ll focus on data link layer technologies used in multi-access networks: Ethernet, Token Ring, FDDI, and other local area- or Wi-Fi technologies.

The first local area networks (LANs) ran on a physical multi-access medium. The first one (original Ethernet) started as a thick coaxial cable1 that you had to drill into to connect a transceiver to the cable core.

Later versions of Ethernet used thinner cables with connectors that you put together to build whole network segments out of pieces of cable. However, even in that case, we were dealing with a single multi-access physical network – disconnecting a cable would bring down the whole network.

read more add comment

Git Rebase: What Can Go Wrong?

Julia Evans wrote another must-read article (if you’re using Git): git rebase: what can go wrong?

I often use git rebase to clean up the commit history of a branch I want to merge into a main branch or to prepare a feature branch for a pull request. I don’t want to run it unattended – I’m always using the interactive option – but even then, I might get into tight spots where I can only hope the results will turn out to be what I expect them to be. Always have a backup – be it another branch or a copy of the branch you’re working on in a remote repository.

see 1 comments

Open BGP Daemons: There's So Many of Them

A while ago, the Networking Notes blog published a link to my “Will Network Devices Reject BGP Sessions from Unknown Sources?” blog post with a hint: use Shodan to find how many BGP routers accept a TCP session from anyone on the Internet.

The results are appalling: you can open a TCP session on port 179 with over 3 million IP addresses.

A report on Shodan opening TCP session to port 179

A report on Shodan opening TCP session to port 179

read more see 1 comments

Rapid Progress in BGP Route Origin Validation

In 2022, I was invited to speak about Internet routing security at the DEEP conference in Zadar, Croatia. One of the main messages of the presentation was how slow the progress had been even though we had had all the tools available for at least a decade (RFC 7454 was finally published in 2015, and we started writing it in early 2012).

At about that same time, a small group of network operators started cooperating on improving the security and resilience of global routing, eventually resulting in the MANRS initiative – a great place to get an overview of how many Internet Service Providers care about adopting Internet routing security mechanisms.

read more add comment

Fibre Channel Addressing

Whenever we talk about LAN data-link-layer addressing, most engineers automatically switch to the “must be like Ethernet” mentality, assuming all data-link-layer LAN framing must somehow resemble Ethernet frames.

That makes no sense on point-to-point links. As explained in Early Data-Link Layer Addressing article, you don’t need layer-2 addresses on a point-to-point link between two layer-3 devices. Interestingly, there is one LAN technology (that I’m aware of) that got data link addressing right: Fibre Channel (FC).

read more add comment

Video: Hacking BGP for Fun and Profit

At least some people learn from others’ mistakes: using the concepts proven by some well-publicized BGP leaks, malicious actors quickly figured out how to hijack BGP prefixes for fun and profit.

Fortunately, those shenanigans wouldn’t spread as far today as they did in the past – according to RoVista, most of the largest networks block the prefixes Route Origin Validation (ROV) marks as invalid.


You need at least free subscription to watch videos in this webinar.

add comment

Worth Reading: Taming the BGP Reconfiguration Transients

Almost exactly a decade ago I wrote about a paper describing how IBGP migrations can cause forwarding loops and how one could reorder BGP reconfiguration steps to avoid them.

One of the paper’s authors was Laurent Vanbever who moved to ETH Zurich in the meantime where his group keeps producing great work, including the Chameleon tool (code on GitHub) that can tame transient loops while reconfiguring BGP. Definitely something worth looking at if you’re running a large BGP network.

add comment