Fabric « ipSpace.net blog

Wednesday, June 10, 2026 07:42 +0200… updated on Friday, June 12, 2026 19:51 +0200

Goodbye, Leaf-and-Spine Networks?

Of course not

A friend of mine sent me links to a new paper published by AWS engineers, and an associated LinkedIn post which claims:

We got lean, resilient, massive aggregation fabrics that provide 33% better throughput with 69% fewer routers, savings 27% of costs, cutting power usage by 40%, and reducing CO2 emissions.

The obvious question one should ask after reading the hyperventilated Radical Network Redesign blog post is thus: is this the end of leaf-and-spine networks? Of course not. Let’s go into the details.

Tuesday, April 21, 2026 07:44 +0200

Hmmm: Rail-Optimized Networking for AI Workloads

Phil Gervasi wrote an interesting article describing Rail-Optimized Networking for AI Training Workloads. Go read it first; I’ll wait.

Does it sound interesting? Were you able to see behind the curtain and figure out what it’s really about?

Response: The Usability of VXLAN

Wes made an interesting comment to the Migrating a Data Center Fabric to VXLAN blog post:

The benefit of VXLAN is mostly scalability, so if your enterprise network is not scaling… just don’t. The migration path from VLANs is to just keep using VLANs. The (vendor-driven) networking industry has a huge blind spot about this.

Paraphrasing Dinesh Dutt’s famous Autocon1 remark: I couldn’t disagree with you more.

Tuesday, June 11, 2024 12:41 +0200… updated on Thursday, June 13, 2024 11:02 +0200

The Mythical Use Cases: Traffic Engineering for Data Center Backups

Vendor product managers love discussing mythical use cases to justify complex functionality in their gear. Long-distance VM mobility was one of those (using it for disaster avoidance was Mission Impossible under any real-world assumptions), and high-volume network-based backups seem to be another. Here’s what someone had to say about that particular unicorn in a LinkedIn comment when discussing whether we need traffic engineering in a data center fabric.

When you’re dealing with a large cluster on a fabric, you will see things like inband backup. The most common one I’ve seen is VEEAM. Those inband backups can flood a single link, and no amount of link scheduling really solves that; depending on the source, they can saturate 100G. There are a couple of solutions; IPv6 or eBGP SID has been used to avoid these links or schedule avoidance for other traffic.

It is true that (A) in-band backups can be bandwidth intensive and that (B) well-written applications can saturate 100G server links. However:

Why Are We Using EVPN Instead of SPB or TRILL?

Dan left an interesting comment on one of my previous blog posts:

It strikes me that the entire industry lost out when we didn’t do SPB or TRILL. Specifically, I like how Avaya did SPB.

Oh, we did TRILL. Three vendors did it in different proprietary ways, but I’m digressing.

Thursday, March 14, 2024 08:55 +0100

Data Center Fabric Designs: Size Matters

The “should we use the same vendor for fabric spines and leaves?” discussion triggered the expected counterexamples. Here’s one:

I actually have worked with a few orgs that mix vendors at both spine and leaf layer. Can’t take names but they run fairly large streaming services. To me it seems like a play to avoid vendor lock-in, drive price points down and be in front of supply chain issues.

As always, one has to keep two things in mind:

Monday, August 14, 2023 07:10 UTC

Worth Reading: Networking for AI Workloads

Sharada Yeluri (Senior Director of Engineering at Juniper Networks) wrote a long article describing the connectivity requirements of AI workloads and new approaches to Ethernet fabrics. Definitely worth reading if you’re interested in these topics.

Tuesday, May 23, 2023 06:36 UTC

Dealing with Cisco ACI Quirks

Sebastian described an interesting Cisco ACI quirk they had the privilege of chasing around:

We’ve encountered VM connectivity issues after VM movements from one vPC leaf pair to a different vPC leaf pair with ACI. The issue did not occur immediately (due to ACI’s bounce entries) and only sometimes, which made it very difficult to reproduce synthetically, but due to DRS and a large number of VMs it occurred frequently enough, that it was a serious problem for us.

Here’s what they figured out:

What Happened to Leaf Switches with Four Uplinks?

The last time I spent days poring over vendor datasheets, collecting information for the overview part of the Data Center Fabrics webinar, a lot of 1RU data center leaf switches came in two form factors:

48 low-speed server-facing ports and 4 high-speed uplinks
32 high-speed ports that you could break out into four times as many low-speed ports (but not all of them)

I expected the ratios to stay the same when the industry moved from 10/40 GE to 25/100 GE switches. I was wrong – most 1RU leaf data center switches based on recent Broadcom silicon (Trident-3 or Trident-4) have between eight and twelve uplinks.
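To see why the number of uplinks matters, here is a minimal Python sketch of the leaf oversubscription arithmetic. It assumes that a recent 25/100 GE leaf still has 48 server-facing ports (the post states only the uplink counts), and the function name is made up for this illustration.

```python
from fractions import Fraction


def oversubscription(server_ports, server_gbps, uplinks, uplink_gbps):
    """Return server-facing bandwidth divided by uplink bandwidth."""
    return Fraction(server_ports * server_gbps, uplinks * uplink_gbps)


# Older form factor: 48 x 10GE server ports, 4 x 40GE uplinks
old_leaf = oversubscription(48, 10, 4, 40)

# Same port ratio at 25/100 GE (what one might have expected)
same_ratio = oversubscription(48, 25, 4, 100)

# Assumed 48 x 25GE server ports with eight or twelve 100GE uplinks
eight_uplinks = oversubscription(48, 25, 8, 100)
twelve_uplinks = oversubscription(48, 25, 12, 100)

for name, ratio in [
    ("48x10GE + 4x40GE", old_leaf),
    ("48x25GE + 4x100GE", same_ratio),
    ("48x25GE + 8x100GE", eight_uplinks),
    ("48x25GE + 12x100GE", twelve_uplinks),
]:
    print(f"{name}: {float(ratio):.1f}:1")
```

Under these assumptions, four uplinks give 3:1 oversubscription in both generations, while eight and twelve uplinks bring it down to 1.5:1 and 1:1 respectively.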

External Links on Spine Switches

A networking engineer attending the Building Next-Generation Data Center online course asked this question:

What is the best practice to connect DC fabric to outside world assuming there are 2 spine switches in the fabric and EVPN VXLAN is used as overlay? Is it a good idea to introduce edge (border) switches, or it is better to connect outside world directly to the spine?

As always, the answer is “it depends,” this time based on:

Category: Fabric