Plexxi has an incredibly creative data center fabric solution: they paired data center switching with CWDM optics, programmable ROADMs and controller-based traffic engineering to get something that looks almost like a distributed switched version of FDDI (or Token Ring for the FCoTR fans). Not surprisingly, the tools we use to build traditional networks don’t work well with their architecture.
In a recent blog post Marten Terpstra hinted at the shortcomings of the Shortest Path First (SPF) approach used by every single modern routing algorithm. Let’s take a closer look at why Plexxi’s engineers couldn’t use SPF.
One Ring to Rule Them All
The cornerstone of the Plexxi ring is the optical mesh that’s automatically built between the switches. Each switch can control 24 lambdas in the CWDM ring (8 lambdas pass through the switch) and uses them to establish connectivity with (not so very) adjacent switches:
- Four lambdas (40 Gbps) are used to connect to each adjacent (east and west) switch;
- Two lambdas (20 Gbps) are used to connect to each of four additional switches in each direction.
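The lambda budget and the resulting adjacencies are easy to sanity-check in a few lines of Python. Treat this as a minimal sketch, not as Plexxi's code: all names are made up for illustration, the 10 Gbps per lambda figure is inferred from "four lambdas (40 Gbps)", and the way duplicate peers are collapsed in small rings is an assumption (nothing above says what happens to the spare lambdas there).

```python
LAMBDA_GBPS = 10        # assumption: 4 lambdas = 40 Gbps
ADJACENT_LAMBDAS = 4    # to the east and to the west neighbor
CHORD_LAMBDAS = 2       # to each of the additional switches
CHORD_REACH = 4         # additional switches reached in each direction


def lambdas_per_switch():
    """Lambdas a switch terminates: both neighbors plus all chords."""
    return 2 * ADJACENT_LAMBDAS + 2 * CHORD_REACH * CHORD_LAMBDAS


def chordal_ring(n):
    """Return {switch: {peer: Gbps}} for an n-switch ring.

    Assumption: when an east and a west offset reach the same peer
    (small rings), only the link with the shortest offset is kept.
    """
    links = {i: {} for i in range(n)}
    for i in range(n):
        for offset in range(1, CHORD_REACH + 2):
            lambdas = ADJACENT_LAMBDAS if offset == 1 else CHORD_LAMBDAS
            for peer in ((i + offset) % n, (i - offset) % n):
                if peer != i and peer not in links[i]:
                    links[i][peer] = lambdas * LAMBDA_GBPS
    return links


def is_full_mesh(links):
    """True when every switch has a direct link to every other switch."""
    return all(len(peers) == len(links) - 1 for peers in links.values())


if __name__ == "__main__":
    print(lambdas_per_switch())
    for size in (10, 11, 12, 25):
        ring = chordal_ring(size)
        print(size, len(ring[0]), is_full_mesh(ring))
```

Under these assumptions a switch in a 25-node ring ends up with 10 direct peers, while any ring of up to 11 switches collapses into a full mesh.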
The CWDM lambdas established by Plexxi switches build a chordal ring. Here’s the topology you get in a 25-node network:
And here’s what a 10-node topology would look like:
The beauty of the Plexxi ring is the ease of horizontal expansion: assuming you got the wiring right, all you need to do to add a new ToR switch to the fabric is to disconnect a cable between two switches and insert a new switch between them as shown in the next diagram. You could do it in a live network if the network survives a short-term drop in fabric bandwidth while the CWDM ring is reconfigured.
Full Mesh Sucks with SPF Routing
Now imagine you’re running a shortest path routing protocol over a chordal ring topology. Smaller chordal rings look exactly like a full mesh, and we know that a full mesh is the worst possible fabric topology: the shortest path between any two nodes is the direct link between them, so the traffic between them never touches any of the other links. You need non-SPF routing to get reasonable bandwidth utilization and more than 20 (or 40) Gbps of bandwidth between a pair of nodes.
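To put rough numbers on that claim, here is a small Python sketch that runs a hop-count SPF over a ring and compares the result with the bandwidth available over two-hop detours. The link model (40 Gbps to the adjacent switch, 20 Gbps to switches two to five positions away, duplicates collapsed in small rings) is an assumption based on the lambda allocation described earlier, and the function names are purely illustrative.

```python
from collections import deque


def link_gbps(a, b, n):
    """Capacity of the direct link a-b in an n-node ring, 0 if there is none."""
    d = min((a - b) % n, (b - a) % n)   # circular distance
    if d == 0 or d > 5:
        return 0
    return 40 if d == 1 else 20


def shortest_paths(src, dst, n):
    """All minimum-hop paths from src to dst (the SPF/ECMP path set)."""
    dist = {src: 0}
    queue = deque([src])
    while queue:                        # plain BFS, every link has cost 1
        u = queue.popleft()
        for v in range(n):
            if link_gbps(u, v, n) and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)

    def walk(node):
        # Walk back toward src over links that reduce the distance by one.
        if node == src:
            return [[src]]
        return [path + [node]
                for prev in range(n)
                if link_gbps(prev, node, n) and dist.get(prev) == dist[node] - 1
                for path in walk(prev)]

    return walk(dst)


def detour_gbps(src, dst, n):
    """Bandwidth of all two-hop paths src-k-dst, each limited by its slower link."""
    return sum(min(link_gbps(src, k, n), link_gbps(k, dst, n))
               for k in range(n) if k not in (src, dst))


if __name__ == "__main__":
    print(shortest_paths(0, 5, 10), link_gbps(0, 5, 10), detour_gbps(0, 5, 10))
```

In the 10-node ring SPF finds a single path between switches 0 and 5 (the direct 20 Gbps link), while the eight two-hop detours it ignores add up to another 160 Gbps under this model. That figure is an upper bound for one pair of switches in isolation; it says nothing about what happens when every pair wants the same detours at the same time.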
There are at least two well-known solutions to the non-SPF routing challenge:
- Central controllers (well known from SONET/SDH, Frame Relay and ATM days);
- Distributed traffic engineering (thoroughly hated by anyone who had to operate a large MPLS TE network close to its maximum capacity).
Plexxi decided to use a central controller, not to provision the virtual circuits (like we did in ATM days) but to program the UCMP (Unequal Cost Multipath) forwarding entries in their switches.
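The idea behind an unequal-cost multipath entry can be illustrated with a few lines of Python. This is a generic sketch of weighted flow hashing, not a description of Plexxi's forwarding hardware or controller: the entry format, the flow tuple and the `pick_next_hop` function are all hypothetical.

```python
import hashlib
from bisect import bisect_right
from itertools import accumulate


def pick_next_hop(flow, entry):
    """Pick a next hop for a flow from a UCMP entry.

    flow  -- any hashable description of the flow (for example a 5-tuple)
    entry -- list of (next_hop, weight) pairs programmed by a controller
    The same flow always maps to the same next hop, and next hops receive
    flows roughly in proportion to their weights.
    """
    hops, weights = zip(*entry)
    bounds = list(accumulate(weights))          # e.g. [2, 3, 4] for 2:1:1
    digest = hashlib.sha256(repr(flow).encode()).digest()
    slot = int.from_bytes(digest[:8], "big") % bounds[-1]
    return hops[bisect_right(bounds, slot)]


if __name__ == "__main__":
    # Hypothetical entry: half of the flows take the direct link,
    # the rest is split across two detours.
    entry = [("direct", 2), ("via-east", 1), ("via-west", 1)]
    flow = ("10.0.0.1", "10.0.1.1", 6, 49152, 443)
    print(pick_next_hop(flow, entry))
```

Hashing whole flows (instead of spraying individual packets) keeps packets of a flow in order, which is the same trade-off ECMP makes; the only difference is that the weights don't have to be equal.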
Does that mean that we should forget all we know about routing algorithms and SPF-based ECMP and rush into controller-based fabrics? Of course not. SPF and ECMP are just tools. They have well-known characteristics and well understood use cases (for example, they work great in leaf-and-spine fabrics). In other words, don’t blame the hammer if you decided to buy screws instead of nails.
Dan Backman did a great job describing the Plexxi ring architecture during the last Data Center Fabrics update session. If you’re even remotely interested in creative data center solutions you really should watch the recording of his presentation.