On November 22nd, 2023, AMS-IX, one of the largest Internet exchanges in Europe, experienced a significant performance drop lasting more than four hours. While its peak performance is around 10 Tbps, it dropped to about 2.1 Tbps during the outage.
AMS-IX published a very sanitized and diplomatic post-mortem incident summary in which they explained the outage was caused by LACP leakage. That phrase should be a red flag, but let’s dig deeper into the details.
If you’re monitoring the industry press (or other usual hype factories), you might have heard about Ultra Ethernet, a dazzling new technology that will be developed by the Ultra Ethernet Consortium1. What is it, and does it matter to you (TL&DR: probably not2)?
As always, let’s start with What Problem Are We Solving?
Lindsay Hill described an excellent idea: all ports on your
switches routers should be in link aggregation groups even when you have a single port in a group. That approach allows you to:
- Upgrade the link speed without changing any layer-3 configuration
- Do link maintenance without causing a routing protocol flap
It also proves RFC 1925 rule 6a, but then I guess we’re already used to that ;)
Roman Pomazanov documented his thoughts on the beauties of large layer-2 domains in a LinkedIn article and allowed me to repost it on ipSpace.net blog to ensure it doesn’t disappear
First of all: “L2 is a single failure domain”, a problem at one point can easily spread to the entire datacenter.
Got this question from a networking engineer attending the Building Next-Generation Data Center online course:
Has anyone an advice on LACP fast rate? When and why should you use it instead of normal LACP?
Apart from forming link aggregation groups, you can use LACP to detect link- and node failures (more details). However:
TL&DR: Installing an Ethernet NIC with two uplinks in a server is easy1. Connecting those uplinks to two edge switches is common sense2. Detecting physical link failure is trivial in Gigabit Ethernet world. Deciding between two independent uplinks or a link aggregation group is interesting. Detecting path failure and disabling the useless uplink that causes traffic blackholing is a living hell (more details in this Design Clinic question).
Want to know more? Let’s dive into the gory details.
Sometime last autumn, I was asked to create a short “network security challenges” presentation. Eventually, I turned it into a webinar, resulting in almost four hours of content describing the interesting gotchas I encountered in the past (plus a few recent vulnerabilities like turning WiFi into a thick yellow cable).
Each webinar section started with a short “This is why we have to deal with these stupidities” introduction. You’ll find all of them collected in the Root Causes video starting the Network Security Fallacies part of the How Networks Really Work webinar.
With the widespread deployment of Ethernet-over-something technologies, it became possible to build MLAG clusters without a physical peer link, replacing it with a virtual link across the core fabric. Avaya was one of the first vendors to implement virtual peer links with Provider Backbone Bridging (PBB) transport, and some data center switching vendors (example: Cisco) offer similar functionality with VXLAN transport.
Pim van Pelt built an x86/Linux-based using Vector Packet Processor that can forwarding IP traffic at 150 Mpps/180 Gbps forwarding rates on a 2-CPU Dell server with E5-2660 (8 core) CPU.
People keep telling me how well large language models like ChatGPT work for them, so now and then, I give it another try, most often resulting in another disappointment1. It might be that I suck at writing prompts2, or it could be that I have a knack for looking in the wrong places3.
This time4 I tried to “figure out5” why we need iSCSI checksums if we have iSCSI running over Ethernet which already has checksums. Enjoy the (ChatGPT) circular arguments and hallucinations with plenty of platitudes and no clear answer.
The “beauty” (from an attacker perspective) of the original shared-media Ethernet was the ability to see all traffic sent to other hosts. While it’s trivial to steal someone else’s IPv4 address, the ability to see their traffic allowed you to hijack their TCP sessions without the victim being any wiser (apart from the obvious session timeout). Really smart attackers could go a step further, insert themselves into the forwarding path, and inject extra payload into unencrypted sessions.
A recently-discovered WiFi vulnerability brought us back to that wonderful world.
Did you know most chassis switches look like leaf-and-spine fabrics1 from the inside? If you didn’t, you might want to watch the short Chassis Architectures video by Pete Lumbis (author of ASICs for Networking Engineers part of the Data Center Fabric Architectures webinar).
An ipSpace.net subscriber sent me a question along the lines of “does it matter that EVPN uses BGP to implement dynamic MAC learning whereas in traditional switching that’s done in hardware?” Before going into those details, I wanted to establish the baseline: is dynamic MAC learning really implemented in hardware?
Hardware-based switching solutions usually use a hash table to implement MAC address lookups. The above question should thus be rephrased as is it possible to update the MAC hash table in hardware without punting the packet to the CPU? One would expect high-end (expensive) hardware to be able do it, while low-cost hardware would depend on the CPU. It turns out the reality is way more complex than that.