Apart from the “they have no clue what they’re talking about” observation, Evil CCIE left a long list of leaf-and-spine fabric myths he encountered in the wild in a comment on one of my blog posts. He started with:
Clos fabric (aka Leaf And Spine fabric) is a non-blocking fabric
That was obviously true in the days when Mr. Clos designed the voice switching solution that still bears his name. In the original Clos network every voice call would get a dedicated path across the fabric, and the number of voice calls supported by the fabric equaled the number of alternate end-to-end paths.
In packet-switching networks we have (at best) statistically non-blocking behavior, and only as long as no output port is congested and the ECMP algorithm running on the ingress switch distributes traffic perfectly. Fat chance… for more details read at least the TL;DR version of the CONGA article (HT: Boris Hassanov).
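To see why perfect distribution is unlikely, here is a minimal sketch of hash-based ECMP in Python. The hash function, the flow tuples, and the flow sizes are all hypothetical; real switching silicon uses its own (usually undocumented) hash. The point is only that a hash knows nothing about how much traffic each flow carries.

```python
import hashlib


def ecmp_uplink(flow, n_uplinks):
    """Pick an uplink by hashing the flow 5-tuple.

    Assumption: SHA-256 stands in for whatever hash the hardware uses.
    """
    digest = hashlib.sha256(repr(flow).encode()).digest()
    return int.from_bytes(digest[:4], "big") % n_uplinks


def uplink_load(flows, n_uplinks):
    """Sum the offered load (in Gbps) that lands on each uplink."""
    load = [0.0] * n_uplinks
    for flow, gbps in flows.items():
        load[ecmp_uplink(flow, n_uplinks)] += gbps
    return load


# Eight hypothetical 5 Gbps flows hashed across four 10 Gbps uplinks.
# Tuple layout: (source IP, destination IP, protocol, source port, destination port)
flows = {(f"10.0.0.{i}", "10.0.1.1", 6, 49152 + i, 443): 5.0 for i in range(8)}
load = uplink_load(flows, 4)
print(load)
```

A perfect distribution would put exactly 10 Gbps on every uplink. Whenever the hash puts three or more of these flows on the same uplink, that uplink is congested while another one sits partly idle, even though the total offered load fits the total uplink capacity.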
What we do have today are non-blocking switches… but even that means nothing more than that the internal switching bandwidth equals the sum of the external-facing bandwidth across all ports. As soon as an output port is congested, the switch can no longer be non-blocking.
But wait, there are the details that silicon vendors don’t want you to know (and thus they only show you their hardware documentation after you sign an NDA in blood):
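The two statements above can be written down as two trivial checks. This is a sketch, assuming that internal and external bandwidth are counted in the same direction; the 32-port switch and its 3200 Gbps figure are made up for illustration.

```python
def is_non_blocking(port_speeds_gbps, internal_gbps):
    """True when internal bandwidth covers the sum of all external-facing ports.

    Assumption: both figures are per-direction Gbps.
    """
    return internal_gbps >= sum(port_speeds_gbps)


def output_port_congested(offered_gbps, port_speed_gbps):
    """True when the traffic aimed at one output port exceeds that port's speed."""
    return sum(offered_gbps) > port_speed_gbps


# Hypothetical switch: 32 x 100GE front-panel ports, 3200 Gbps internal bandwidth
ports = [100] * 32
print(is_non_blocking(ports, 3200))            # True

# Two inputs each send 100 Gbps toward the same 100GE output port
print(output_port_congested([100, 100], 100))  # True
```

The first check says nothing about the second: a switch that passes the non-blocking test still has to queue or drop traffic the moment two line-rate inputs target one output.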
- Most switching silicon has 40GE or 100GE connections that are then split out into 10/25/50 GE front-panel ports. It seems that at least some chipsets have head-of-line blocking challenges across 25GE lanes of a single 100GE port.
- Internal fabric bandwidth is just one of the parameters. Packet forwarding performance is another one… and not all silicon can do line-rate forwarding of small packets.
- Every single packet has metadata attached to it while traversing the internal (intra-switch) fabric, as JR Rivers explained in the Networks, Buffers and Drops webinar (available with a free ipSpace.net subscription). Some chipsets might struggle with the amount of bandwidth needed to transport both packet content and metadata across the internal fabric.
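The small-packet and metadata points are easy to quantify with back-of-the-envelope arithmetic. The sketch below uses the standard 20 bytes of per-frame Ethernet overhead on the wire (preamble, start-of-frame delimiter, inter-frame gap). The per-packet metadata size is a pure assumption: actual values depend on the chipset and are not public.

```python
ETHERNET_OVERHEAD_BYTES = 20  # preamble (7) + start-of-frame delimiter (1) + inter-frame gap (12)


def line_rate_pps(port_gbps, frame_bytes):
    """Packets per second needed to fill a port with frames of the given size."""
    bits_on_wire = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return port_gbps * 1e9 / bits_on_wire


def internal_gbps_needed(port_gbps, frame_bytes, metadata_bytes):
    """Internal bandwidth needed to carry frame plus metadata at line rate.

    Assumption: the wire overhead is not carried internally, but
    metadata_bytes (a hypothetical figure) is added to every packet.
    """
    pps = line_rate_pps(port_gbps, frame_bytes)
    return pps * (frame_bytes + metadata_bytes) * 8 / 1e9


print(f"{line_rate_pps(100, 64) / 1e6:.2f} Mpps")    # 148.81 Mpps
print(f"{line_rate_pps(100, 1518) / 1e6:.2f} Mpps")  # 8.13 Mpps

# Hypothetical 32 bytes of metadata per packet on a 100GE port
print(f"{internal_gbps_needed(100, 64, 32):.1f} Gbps")
```

A 100GE port full of 64-byte frames needs roughly eighteen times more forwarding decisions per second than the same port full of 1518-byte frames. Under the assumptions above, once the metadata grows beyond the 20 bytes of wire overhead, small packets need more internal bandwidth than the nominal port speed.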
Finally, we usually build oversubscribed leaf-and-spine fabrics. The total amount of leaf-to-spine bandwidth is usually one third of the edge bandwidth (a 3:1 oversubscription ratio). Leaf-and-spine fabrics are thus almost never non-blocking, but they do provide equidistant bandwidth.
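The oversubscription ratio is simply edge bandwidth divided by uplink bandwidth. Here is a short sketch using a hypothetical leaf switch with 48 x 10GE edge ports and 4 x 40GE uplinks; the port counts are an illustrative assumption, not a recommendation.

```python
def oversubscription_ratio(edge_gbps, uplink_gbps):
    """Edge (server-facing) bandwidth divided by leaf-to-spine bandwidth."""
    return sum(edge_gbps) / sum(uplink_gbps)


# Hypothetical leaf: 48 x 10GE edge ports, 4 x 40GE uplinks
edge = [10] * 48
uplinks = [40] * 4
ratio = oversubscription_ratio(edge, uplinks)
print(f"{ratio:.0f}:1")  # 3:1
```

Any ratio above 1:1 means the fabric cannot carry all edge ports at line rate at the same time, regardless of how well the individual switches perform.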
If you want to know more about leaf-and-spine fabrics (and be able to figure out where exactly the vendor marketers cross the line between unicorn-colored reality and plain bullshit), start with the Leaf-and-Spine Fabric Architectures webinar (part of the Standard ipSpace.net subscription).
You can also go one step further and enroll in the Designing and Building Data Center Fabrics online course, which includes three design assignments reviewed by a member of the ipSpace.net ExpertExpress team.
Finally, when you want to be able to design more than just data center fabrics, check out the Building Next-Generation Data Center online course.