switching « ipSpace.net blog

Thursday, March 20, 2025 08:16 +0100

Routed Interfaces on Layer-3 Switches and Internal VLANs

In the Router Interfaces and Switch Ports blog post, I described why we have switch ports and routed interfaces on layer-3 switches. Another blog post in the same series described the conceptual architecture of a layer-3 switch:

All interfaces are connected to a VLAN-aware switch.
The switch interfaces could be access or trunk interfaces¹.
Each VLAN in a VLAN-aware switch can be connected to an internal router through a VLAN interface.
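The three points above can be sketched as a tiny data model. This is only an illustration of the conceptual architecture, not how any real switch is implemented; the class names, port names, and the IP prefix are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SwitchPort:
    """A port attached to the internal VLAN-aware switch (hypothetical model)."""
    name: str
    mode: str              # "access" or "trunk"
    vlans: frozenset[int]  # exactly one VLAN for access ports, one or more for trunks

    def __post_init__(self):
        if self.mode not in ("access", "trunk"):
            raise ValueError(f"unknown port mode: {self.mode}")
        if self.mode == "access" and len(self.vlans) != 1:
            raise ValueError("an access port belongs to exactly one VLAN")


@dataclass
class Layer3Switch:
    """Conceptual layer-3 switch: VLAN-aware switch plus an internal router."""
    ports: dict[str, SwitchPort] = field(default_factory=dict)
    # VLAN ID -> IP prefix configured on the VLAN interface of the internal router
    vlan_interfaces: dict[int, str] = field(default_factory=dict)

    def add_port(self, port: SwitchPort) -> None:
        self.ports[port.name] = port

    def add_vlan_interface(self, vlan: int, prefix: str) -> None:
        self.vlan_interfaces[vlan] = prefix

    def ports_in_vlan(self, vlan: int) -> list[str]:
        return sorted(p.name for p in self.ports.values() if vlan in p.vlans)

    def is_routed(self, vlan: int) -> bool:
        # A VLAN is reachable at layer 3 only if it has a VLAN interface
        return vlan in self.vlan_interfaces


sw = Layer3Switch()
sw.add_port(SwitchPort("eth1", "access", frozenset({10})))
sw.add_port(SwitchPort("eth2", "trunk", frozenset({10, 20})))
sw.add_vlan_interface(10, "192.0.2.1/24")  # documentation prefix, assumed for the example

print(sw.ports_in_vlan(10))               # ['eth1', 'eth2']
print(sw.is_routed(10), sw.is_routed(20))  # True False
```

In this model VLAN 20 is a pure layer-2 segment: it exists on the trunk but has no VLAN interface, so the internal router never sees it.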

However, that’s not how we configure layer-3 switches. There’s a significant gap between the conceptual configuration model and the internal architecture:

The Linux Bridge MTU Hell

It all started with an innocuous article describing the MTU basics. As the real purpose of the MTU is to prevent packet drops due to fixed-size receiver buffers, and I ~~waste~~ spend most of my time in virtual labs, I wanted to check how various virtual network devices react to incoming oversized packets.

As the first step, I created a simple netlab topology in which a single link had a slightly larger than usual MTU… and then all hell broke loose.
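To make the "fixed-size receiver buffers" argument concrete, here is a toy model of a link with mismatched MTUs. It is a sketch under simplifying assumptions (sizes are plain byte counts, layer-2 headers are ignored, and the receiver silently drops anything larger than its buffer); it does not claim to show how any particular virtual network device behaves, which is precisely what the article set out to test. The `Receiver` class and `deliver` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Receiver:
    """Receiver with a fixed-size buffer (hypothetical, simplified)."""
    buffer_size: int   # largest packet the receive buffer can hold, in bytes
    received: int = 0
    dropped: int = 0

    def accept(self, packet_len: int) -> bool:
        # A packet that does not fit into the buffer is dropped
        if packet_len > self.buffer_size:
            self.dropped += 1
            return False
        self.received += 1
        return True


def deliver(packet_sizes: list[int], sender_mtu: int, receiver: Receiver) -> int:
    """Send packets that fit the sender's MTU; return how many the receiver accepted."""
    sent = [size for size in packet_sizes if size <= sender_mtu]
    return sum(receiver.accept(size) for size in sent)


# Matching MTUs: the sender never emits anything the receiver cannot buffer
rx_ok = Receiver(buffer_size=1500)
accepted_ok = deliver([500, 1500, 1600], sender_mtu=1500, receiver=rx_ok)

# Sender MTU slightly larger than the receiver's buffer: the 1600-byte packet is lost
rx_small = Receiver(buffer_size=1500)
accepted_small = deliver([500, 1500, 1600], sender_mtu=1600, receiver=rx_small)

print(accepted_ok, rx_ok.dropped)        # 2 0
print(accepted_small, rx_small.dropped)  # 2 1
```

When both ends agree on the MTU, the oversized packet never leaves the sender, so nothing is dropped on the wire; with the mismatch, the same packet is transmitted and lost at the receiver.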

EVPN Rerouting After LAG Member Failures

In the previous two blog posts (Dealing with LAG Member Failures, LAG Member Failures in VXLAN Fabrics) we discovered that it’s almost trivial to deal with a LAG member failure in an MLAG cluster if we have a peer link between MLAG members. What about the holy grail of EVPN pundits: ESI-based MLAG with no peer link between MLAG members?

MLAG Deep Dive: LAG Member Failures in VXLAN Fabrics

In the Dealing with LAG Member Failures blog post, we figured out how easy it is to deal with a LAG member failure in a traditional MLAG cluster. The failover could happen in hardware, and even if it’s software-driven, it does not depend on the control plane.

Let’s add a bit of complexity and replace a traditional layer-2 fabric with a VXLAN fabric. The MLAG cluster members still use an MLAG peer link and an anycast VTEP IP address (more details).

MLAG Deep Dive: Dealing with LAG Member Failures

Craig Weinhold pointed me to a complex topic I managed to ignore in my MLAG Deep Dive series: how does an MLAG cluster reroute around a failure of a LAG member link?

In this blog post, we’ll focus on traditional MLAG cluster implementations using a peer link; another blog post will explore the implications of using VXLAN and EVPN to implement MLAG clusters.

We’ll also ignore the interesting question of “how is the LAG member link failure detected?”¹ and focus on “what happens next?” using the sample MLAG topology:

AMS-IX Outage: Layer-2 Strikes Again

On November 22nd, 2023, AMS-IX, one of the largest Internet exchanges in Europe, experienced a significant performance drop lasting more than four hours. While its peak performance is around 10 Tbps, it dropped to about 2.1 Tbps during the outage.

AMS-IX published a very sanitized and diplomatic post-mortem incident summary in which they explained the outage was caused by LACP leakage. That phrase should be a red flag, but let’s dig deeper into the details.

What Is Ultra Ethernet All About?

If you’re monitoring the industry press (or other usual hype factories), you might have heard about Ultra Ethernet, a dazzling new technology that will be developed by the Ultra Ethernet Consortium¹. What is it, and does it matter to you (TL&DR: probably not²)?

As always, let’s start with What Problem Are We Solving?

Worth Reading: Single-Port LAGs

Lindsay Hill described an excellent idea: all ports on your ~~switches~~ routers should be in link aggregation groups even when you have a single port in a group. That approach allows you to:

Upgrade the link speed without changing any layer-3 configuration
Do link maintenance without causing a routing protocol flap

It also proves RFC 1925 rule 6a, but then I guess we’re already used to that ;)


Friday, September 22, 2023 06:19 UTC

Repost: L2 Is Bad

Roman Pomazanov documented his thoughts on the beauties of large layer-2 domains in a LinkedIn article and allowed me to repost it on ipSpace.net blog to ensure it doesn’t disappear.

First of all: “L2 is a single failure domain”, a problem at one point can easily spread to the entire datacenter.

Are LACP Fast Timers Any Good?

Got this question from a networking engineer attending the Building Next-Generation Data Center online course:

Has anyone an advice on LACP fast rate? When and why should you use it instead of normal LACP?

Apart from forming link aggregation groups, you can use LACP to detect link- and node failures (more details). However:

Path Failure Detection on Multi-Homed Servers

TL&DR: Installing an Ethernet NIC with two uplinks in a server is easy¹. Connecting those uplinks to two edge switches is common sense². Detecting physical link failure is trivial in the Gigabit Ethernet world. Deciding between two independent uplinks or a link aggregation group is interesting. Detecting path failure and disabling the useless uplink that causes traffic blackholing is a living hell (more details in this Design Clinic question).

Want to know more? Let’s dive into the gory details.

Network Security Vulnerabilities: the Root Causes

Sometime last autumn, I was asked to create a short “network security challenges” presentation. Eventually, I turned it into a webinar, resulting in almost four hours of content describing the interesting gotchas I encountered in the past (plus a few recent vulnerabilities like turning WiFi into a thick yellow cable).

Each webinar section started with a short “This is why we have to deal with these stupidities” introduction. You’ll find all of them collected in the Root Causes video starting the Network Security Fallacies part of the How Networks Really Work webinar.


You need Free ipSpace.net Subscription to watch the video.


Friday, May 19, 2023 07:21 UTC

Video: Types of Switching ASICs

Pete Lumbis concluded his ASICs for Networking Engineers presentation with a brief overview of types of switching ASICs and a wrap-up.

You can watch his entire 90-minute presentation (sliced into shorter videos) with Free ipSpace.net Subscription.


Thursday, May 11, 2023 07:54 UTC

MLAG Clusters without a Physical Peer Link

With the widespread deployment of Ethernet-over-something technologies, it became possible to build MLAG clusters without a physical peer link, replacing it with a virtual link across the core fabric. Avaya was one of the first vendors to implement virtual peer links with Provider Backbone Bridging (PBB) transport, and some data center switching vendors (example: Cisco) offer similar functionality with VXLAN transport.

Video: 400GbE Optics

When 400GbE was still an emerging technology, Mark Nowell explained its basics in an update session of the Data Center Fabric Architectures webinar, starting with 400GbE optics.


You need Free ipSpace.net Subscription to watch the video. To watch the whole webinar, buy Standard or Expert ipSpace.net Subscription.
