Do We Need Leaf-and-Spine Fabrics?
Evil CCIE left a lengthy comment on one of my blog posts, including this interesting observation:
It's always interesting to hear all kind of reasons from people to deploy CLOS fabrics in DC in Enterprise segment typically that I deal with while they mostly don't have clue about why they should be doing it in first place. […] Usually a good justification is DC to support high amount of East-West Traffic....but really? […] Ask them if they even have any benchmarks or tools to measure that in first place :)
What he wrote proves that most networking practitioners never move beyond regurgitating vendor marketing (because that’s so much easier than taking the first step toward becoming an engineer by figuring out how technology really works).
Note that I decided to call some people working in networking practitioners rather than engineers, because I honestly can’t see where they’d be applying any engineering methodology to their work.
There’s another really good reason to use leaf-and-spine fabrics: equidistant bandwidth. Regardless of where a device is connected to the fabric, it has the same (minimum average) bandwidth to every other device connected to the same fabric, which removes all restrictions on server or workload placement… and even ends the discussions about optimal firewall or load balancer placement. Compare that to traditional 3-tier architectures (or watch the Introduction section of the Leaf-and-Spine Fabric Architectures webinar).
This argument obviously applies only when your fabric needs more than just two switches. Also note that leaf-and-spine fabrics are nothing new.
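If you want to see the equidistance claim in something more concrete than hand-waving, here's a minimal Python sketch (the topology size is an arbitrary assumption, not anything from this post) that builds a leaf-and-spine adjacency and verifies that every leaf-to-leaf path is exactly two hops:

```python
# Minimal sketch: in a leaf-and-spine topology every leaf is exactly two hops
# (leaf -> spine -> leaf) away from every other leaf. The 4x8 topology below
# is an arbitrary example.
from collections import deque
from itertools import combinations

spines = [f"spine{n}" for n in range(1, 5)]
leaves = [f"leaf{n}" for n in range(1, 9)]

# Full mesh of leaf-to-spine links; no leaf-to-leaf or spine-to-spine links
adjacency = {node: set() for node in spines + leaves}
for leaf in leaves:
    for spine in spines:
        adjacency[leaf].add(spine)
        adjacency[spine].add(leaf)

def hop_count(src, dst):
    """Breadth-first search returning the number of hops between two nodes."""
    queue, seen = deque([(src, 0)]), {src}
    while queue:
        node, hops = queue.popleft()
        if node == dst:
            return hops
        for neighbor in adjacency[node] - seen:
            seen.add(neighbor)
            queue.append((neighbor, hops + 1))

distances = {hop_count(a, b) for a, b in combinations(leaves, 2)}
print(distances)   # {2} -- every leaf pair is the same distance apart
```

Add equal-speed uplinks and ECMP across all spines, and the same symmetry gives every leaf pair the same minimum bandwidth.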
In slightly larger fabrics, a leaf-and-spine architecture with more than two spine switches allows you to use fixed-size switches instead of more complex and more expensive chassis switches. Russ White and Shawn Zandi talked about this topic (and single-SKU data center fabrics) in the Open Networking webinar.
Finally, if you build a leaf-and-spine fabric with more than two spine nodes, you decrease the performance hit you take on a spine switch failure, and (potentially) remove the need for a maintenance window when doing spine switch upgrades.
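To put a number on that performance hit, here's a back-of-the-envelope Python sketch; the uplink speed and spine counts are made-up illustration values, not anything from this post:

```python
# Back-of-the-envelope sketch: leaf uplink capacity remaining after a single
# spine failure, assuming one 40GE uplink from every leaf to every spine.
UPLINK_GBPS = 40

for spines in (2, 4, 8):
    total = spines * UPLINK_GBPS
    after_failure = (spines - 1) * UPLINK_GBPS
    print(f"{spines} spines: {total} -> {after_failure} Gbps "
          f"({100 * after_failure / total:.0f}% of leaf uplink capacity left)")
```

Going from two spines to four turns a spine failure (or a spine upgrade) from losing half of the leaf uplink capacity into losing a quarter of it.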
Next steps
Found something interesting in this blog post? Maybe you want to explore leaf-and-spine fabrics a bit further:
- Start with the Leaf-and-Spine Fabric Architectures webinar and continue with EVPN Technical Deep Dive (both webinars are part of the Standard ipSpace.net subscription);
- When you want to test your design skills, go for the Designing and Building Data Center Fabrics online course, which includes three design assignments reviewed by a member of the ipSpace.net ExpertExpress team;
- When you’re ready for the bigger picture, enroll in the Building Next-Generation Data Center online course.
Comments
For me the main point is to kill the old "everything 2N redundant" model with fire.
N+1 is the only way to scale economically.
N+1 bandwidth & N+1 redundancy (& N must be >= 2) are really nice (& IMHO operationally necessary), but economics is the driver that will get management to sign off on it.
Especially when you're replacing 2N big expensive $VENDOR chassis with N+1 1U merchant-silicon boxes.
Just being able to traceroute and see exactly which path packets take is quite helpful in finding where things break. (That assumes you are not doing tunneling/overlays, though.)
My experience is also that OSPF, at least, converges much faster after a failure than STP (even Rapid STP) does after the STP root fails.
There is often more configuration (as in lines in the switch configs) in a routed network than in a bridged network, though, especially when you add a bunch of VRFs to separate various management networks. But unnumbered links make things easier (allocating all those /31s and /127s and making sure you get the right address at the right end can be a large part of the work of adding a VRF). And if you subscribe to the automation mantra, that takes out a large part of the rest. :-)
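To illustrate the automation angle, here's a minimal Python sketch using the standard ipaddress module that carves /31 transfer networks out of an address pool and assigns both ends of every leaf-to-spine link; the fabric size and the 172.16.0.0/23 pool are made-up examples, not taken from the post or the comment:

```python
# Minimal sketch: automating /31 allocation for leaf-to-spine point-to-point
# links. The spine/leaf names and the address pool are arbitrary assumptions.
import ipaddress

SPINES = ["spine1", "spine2", "spine3", "spine4"]
LEAVES = [f"leaf{n}" for n in range(1, 9)]

fabric_pool = ipaddress.ip_network("172.16.0.0/23")
p2p_subnets = fabric_pool.subnets(new_prefix=31)    # iterator of /31 subnets

link_addresses = {}
for spine in SPINES:
    for leaf in LEAVES:
        subnet = next(p2p_subnets)
        spine_ip, leaf_ip = subnet                  # a /31 holds exactly two addresses
        link_addresses[(spine, leaf)] = {spine: f"{spine_ip}/31", leaf: f"{leaf_ip}/31"}

for link, addresses in link_addresses.items():
    print(link, addresses)
```

With unnumbered links you can skip the per-link allocation altogether, which is exactly the point made above.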