Facebook published their next-generation data center architecture a few weeks ago, resulting in the expected “revolutionary approach to data center fabrics” echoes from the industry press and blogosphere.
In reality, they did a great engineering job using an interesting twist on a pretty traditional multi-stage leaf-and-spine (or folded Clos) architecture.
They split the data center into standard pods. No surprise there: anyone aiming for an easy-to-manage scale-out architecture (i.e. not so many people) is doing that. We discussed it in Episode 8 of Software Gone Wild, and I described it in one of the data center design case studies. The second part of this video should give you a few additional ideas along the same lines.
Inside each pod they use leaf-and-spine architecture, almost identical to what Brad Hedlund described in my Leaf-and-Spine Fabric Architectures webinar… including the now-standard 3:1 oversubscription on the leaf switches (48 server-facing 10GE ports, four 40GE uplinks).
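The 3:1 figure is simple arithmetic, sketched below. The sketch assumes the server-facing ports run at 10GE (which is what the 3:1 ratio implies); the function name `oversubscription` is made up for illustration.

```python
def oversubscription(down_ports: int, down_gbps: int,
                     up_ports: int, up_gbps: int) -> float:
    """Ratio of server-facing bandwidth to uplink bandwidth on one switch."""
    downlink_bw = down_ports * down_gbps   # total server-facing bandwidth
    uplink_bw = up_ports * up_gbps         # total uplink bandwidth
    return downlink_bw / uplink_bw


# Leaf switch: 48 server-facing ports (assumed 10GE), four 40GE uplinks
leaf_ratio = oversubscription(48, 10, 4, 40)
print(f"Leaf oversubscription: {leaf_ratio:.0f}:1")   # 480 Gbps / 160 Gbps = 3:1
```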
Note that every fabric switch needs 48 leaf-facing 40GE ports. Adding the necessary pod-to-spine uplinks, they need 96-port 40GE switches to implement this design. I wouldn't be too surprised to see Arista launch a switch meeting these specs at the next Interop ;)
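Here's a back-of-the-envelope sketch of where the 96-port number comes from. It assumes the fabric switch is not oversubscribed, so it needs as many spine-facing uplinks as it has leaf-facing ports; the function and parameter names are hypothetical.

```python
import math


def fabric_switch_ports(leaf_facing: int, oversub: float = 1.0) -> int:
    """Total ports on a pod fabric switch: leaf-facing ports plus uplinks.

    oversub is the assumed downlink:uplink ratio on the fabric switch
    (1.0 = non-oversubscribed, the assumption used in this sketch).
    """
    uplinks = math.ceil(leaf_facing / oversub)
    return leaf_facing + uplinks


# 48 leaf-facing 40GE ports plus the same number of 40GE uplinks
total_ports = fabric_switch_ports(48)
print(total_ports)   # 96
```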
The interesting twist is the inter-pod connectivity. Instead of building a single non-oversubscribed core fabric and connecting leaf nodes to it (the traditional way of building multi-stage leaf-and-spine fabrics), they treat each pod fabric switch as a leaf node in another orthogonal leaf-and-spine fabric (for a total of four core fabrics). The result is a data center fabric that can potentially support over 100,000 server ports (the limiting factor is the number of ports on the spine switches).
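To see how the spine port count limits the size of the fabric, here's a rough sizing sketch. It assumes every spine switch in a core fabric connects to one fabric switch in each pod, so the number of pods can't exceed the number of spine ports. The spine port counts used below (48 and 96) are illustrative guesses, not published figures, and the class and field names are made up.

```python
from dataclasses import dataclass


@dataclass
class PodFabric:
    servers_per_leaf: int = 48   # server-facing ports per leaf switch
    leaves_per_pod: int = 48     # leaf-facing ports per pod fabric switch
    spine_ports: int = 48        # ASSUMPTION: pod-facing ports per spine switch

    def server_ports_per_pod(self) -> int:
        return self.servers_per_leaf * self.leaves_per_pod

    def max_pods(self) -> int:
        # One spine port per pod (per core fabric), so pods <= spine ports
        return self.spine_ports

    def max_server_ports(self) -> int:
        return self.server_ports_per_pod() * self.max_pods()


for ports in (48, 96):
    design = PodFabric(spine_ports=ports)
    print(f"{ports}-port spines: {design.max_pods()} pods, "
          f"{design.max_server_ports():,} server ports")
```

Under these assumptions even 48-port spine switches get you past 100,000 server ports, and doubling the spine port count doubles the fabric size.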
- Jason Edelman created a nice 2D diagram that makes the multiple layers of leaf-and-spine fabrics more evident;
- Gary Berger wrote a long blog post analyzing the new Facebook fabric including a deep-dive into the port count limitations;
- You’ll find a bit more down-to-earth designs in my Leaf-and-Spine Fabric Architectures and Designing Private Cloud Infrastructure webinars, and I’m usually available for short consulting engagements.