Leaf-and-Spine Fabrics versus Fabric Extenders

One of my readers wondered what the difference between fabric extenders and leaf-and-spine fabrics is:

We are building a new data center for DR and management is wanting me to put in recommendations to either stick with our current Cisco 7k to 2k ToR FEX solution, or prepare for what seems to be the future of DC in that spine leaf architecture.

Let’s start with “what is leaf-and-spine architecture?”

Leaf-and-Spine 101

Leaf-and-spine architecture (a marketing name for a three-stage folded Clos network) is a physical fabric architecture in which every edge node (leaf) is connected to every core node (spine).


Sample leaf-and-spine fabric

Leaf-and-spine fabrics have equidistant endpoints – any pair of endpoints gets the same average end-to-end bandwidth. We get the "equidistant endpoints" property because leaf-and-spine fabrics are perfectly symmetrical with every leaf switch being connected to every spine switch with uplinks of uniform bandwidth.
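
To make the symmetry concrete, here is a minimal Python sketch. The function names and the 4-leaf, 2-spine size are made up for illustration. It builds the full leaf-to-spine mesh and checks that every pair of leafs is connected by the same number of two-hop paths (one per spine).

```python
from itertools import combinations

def build_fabric(num_leafs, num_spines):
    """Return the set of (leaf, spine) links in a full leaf-to-spine mesh."""
    return {(f"leaf{l}", f"spine{s}")
            for l in range(1, num_leafs + 1)
            for s in range(1, num_spines + 1)}

def paths_between(links, leaf_a, leaf_b):
    """List the spines that offer a leaf_a -> spine -> leaf_b path."""
    spines_a = {s for (l, s) in links if l == leaf_a}
    spines_b = {s for (l, s) in links if l == leaf_b}
    return sorted(spines_a & spines_b)

links = build_fabric(num_leafs=4, num_spines=2)
leafs = sorted({l for (l, _) in links})
path_counts = {(a, b): len(paths_between(links, a, b))
               for a, b in combinations(leafs, 2)}

print(len(links))                 # 8 links: 4 leafs x 2 spines
print(set(path_counts.values()))  # {2}: every leaf pair has one path per spine
```

The sketch only models connectivity; it assumes all uplinks have the same bandwidth, which is what turns "same number of paths" into "same available bandwidth".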

Contrary to the original Clos networks that used circuit switching, leaf-and-spine fabrics use hop-by-hop packet forwarding (statistical multiplexing). Endpoints are thus equidistant only when the fabric transports a large enough number of small flows to make statistical multiplexing and ECMP work.
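
The dependence on flow count can be illustrated with a toy model. This is a rough sketch, not how any particular switch hashes packets: MD5 over a made-up flow name stands in for the hardware 5-tuple hash, and the flow sizes are arbitrary. It offers the same total traffic as a few large flows or as many small flows and reports how far the busiest uplink is above its ideal share.

```python
import hashlib

def ecmp_link(flow_id, num_uplinks):
    """Pick an uplink by hashing the flow identifier (stand-in for a 5-tuple hash)."""
    digest = hashlib.md5(flow_id.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_uplinks

def imbalance(flows, num_uplinks):
    """Return busiest-uplink load divided by the ideal per-uplink share."""
    load = [0.0] * num_uplinks
    for flow_id, size in flows:
        load[ecmp_link(flow_id, num_uplinks)] += size
    ideal = sum(load) / num_uplinks
    return max(load) / ideal

# Same total traffic (4000 units) offered as few large or many small flows.
few_large = [(f"flow-{i}", 1000.0) for i in range(4)]
many_small = [(f"flow-{i}", 1.0) for i in range(4000)]

# A value of 1.0 means perfectly even load; larger means a hot uplink.
print(imbalance(few_large, 4))
print(imbalance(many_small, 4))
```

With only four flows the result depends entirely on where each flow happens to hash, while thousands of small flows should spread almost evenly across the uplinks.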

Also, unless the leaf-and-spine fabric is non-oversubscribed, the endpoints connected to the same leaf switch have more bandwidth available between them than endpoints connected to different leaf switches.
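
As a quick back-of-the-envelope check, the oversubscription ratio of a leaf switch is its server-facing bandwidth divided by its spine-facing bandwidth. The port counts below are hypothetical and the function name is just for illustration.

```python
def oversubscription_ratio(server_ports, server_gbps, uplinks, uplink_gbps):
    """Server-facing bandwidth divided by spine-facing bandwidth of one leaf."""
    return (server_ports * server_gbps) / (uplinks * uplink_gbps)

# Hypothetical leaf: 48 x 10GE server ports, 4 x 40GE uplinks.
ratio = oversubscription_ratio(server_ports=48, server_gbps=10,
                               uplinks=4, uplink_gbps=40)
print(ratio)  # 3.0, i.e. 3:1 oversubscribed
```

Any ratio above 1.0 means two endpoints on the same leaf can get more bandwidth between them than two endpoints on different leafs.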

What About Fabric Extenders?

It’s obvious that a bunch of fabric extenders (leafs) connected to a pair of Nexus switches (spines) form a leaf-and-spine fabric.

However, there are several important differences between a fabric extender-based fabric and a leaf-and-spine fabric built with standard data center switches:

  • In a well-designed leaf-and-spine fabric the spine nodes are completely independent – they share no configuration, state or risk. Nexus switches configured as a vPC pair share a lot of configuration and state (and risk).
  • Leaf nodes in a traditional leaf-and-spine fabric are independent devices, whereas fabric extenders act as linecards of the spine switches. The blast radius (how many things can go wrong based on a single failure) on a fabric extender-based architecture is much larger than in a fabric built with independent switches.
  • Independent leaf nodes can do local packet switching whereas in a fabric extender environment all traffic has to traverse the spine layer.
  • Leaf-and-spine fabrics can have more than two spines, resulting in a more resilient architecture.
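
The resilience argument in the last bullet is simple arithmetic. The following sketch assumes every leaf has one uplink of the same speed to every spine; the function name is made up. It shows how much leaf-to-leaf bandwidth survives a spine failure.

```python
def capacity_after_spine_failure(num_spines, failed=1):
    """Fraction of leaf-to-leaf bandwidth left when `failed` spines are down."""
    if not 0 <= failed <= num_spines:
        raise ValueError("failed must be between 0 and num_spines")
    return (num_spines - failed) / num_spines

for spines in (2, 4, 8):
    print(spines, capacity_after_spine_failure(spines))
# 2 0.5
# 4 0.75
# 8 0.875
```

A two-spine design (which is what a FEX setup with a pair of Nexus switches amounts to) loses half its fabric bandwidth on a single spine failure; a four-spine design loses a quarter.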

Caveat Emptor

Please note that the above list makes sense only if you’re building a routed fabric – either an L3 fabric or an L2 fabric using TRILL, SPB, VXLAN or whatever proprietary vendor technology.

If you’re trying to build a leaf-and-spine fabric with MLAG technology you’re either limited to two spine switches (in an MLAG pair), a shared control plane on the spine layer (a large blast radius), or a proprietary spine-layer fabric.

4 comments:

  1. Hi Ivan,
    Nice article. Doesn't fabric extender technology have the advantages below over a leaf-spine architecture? Please correct me if I am wrong.
    ==> Single point of management (I do understand there are ACI and similar technologies which provide the same benefit)
    ==> Cost.
    ==> Wiring. [As you increase the number of spines in leaf-spine, you need to add additional cables to all the existing leafs, right? Obviously with fabric extenders we won't be able to increase the bandwidth beyond the number of uplinks on the FEX.]
    Replies
    1. Single point of management: sort-of (actually two points in a redundant setup). It's also a single point of fat-finger mistake ;)

      Cost: That's purely a vendor licensing decision. I doubt the hardware is much cheaper.

      Wiring: identical to a 2-spine leaf-and-spine fabric. Nobody said a leaf-and-spine fabric must have 4 spines.
  2. Nobody seems to have commented on the limited buffers of the FEXes and the poor effect the buffer limitations have on very bursty traffic such as storage traffic.
  3. You are making things too complex. You don't need to do L3 or "a L2 fabric using TRILL, SPB, VXLAN or whatever." Just use Spanning Tree and have it shut down three of your four uplinks.

    Oh, wait, it's not the 90's anymore. Yet we still have Spanning Tree in a lot of what we do. Play sad trombone sound here.

    Maybe we should give SPB and TRILL another chance.