Are Fixed Switches More Efficient Than Chassis Ones?
Brad Hedlund did an excellent analysis of fixed versus chassis-based switches in his Interop presentation and concluded that fixed switches offer higher port density and lower per-port power consumption than chassis-based ones. That’s true when comparing individual products, but let’s ask a different question: what does it take to implement a 384-port non-blocking fabric (equivalent to Arista’s 7508 switch) with fixed switches?
Disclaimer: In this post, I just wanted to show you how the perspective changes when you consider a port count more favorable to chassis switches. There are all sorts of other considerations; read the comments to Brad’s post for more details. Also, I would not implement my data center network with a single humongous switch, no matter what the vendor claims about its uptime and redundancy features.
To implement a non-blocking fabric with fixed switches, you have to use a Clos fabric (aka leaf/spine fabric) with no oversubscription – half of the links of every edge switch have to be used for uplinks. Assuming you have fixed switches with 64 10GE ports (the usual configuration if you use Trident+ chipset), each switch has 32 customer-facing ports and 32 fabric ports. In total, you need 384/32=12 edge switches.
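The leaf-switch arithmetic can be sketched in a couple of lines of Python (all figures are the ones used in this post):

```python
# Sizing the edge (leaf) layer of a non-blocking leaf/spine fabric
# built from 64-port 10GE fixed switches (Trident+ class, per the post)
switch_ports = 64
fabric_ports = switch_ports // 2          # half the ports become uplinks
edge_ports = switch_ports - fabric_ports  # 32 customer-facing ports per switch

target_ports = 384                        # the Arista 7508 we're approximating
edge_switches = target_ports // edge_ports
print(edge_switches)                      # 12
```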
Let’s be a bit bolder and use Z9000 switches from Dell Force10 as the edge switches. They have only QSFP+ interfaces, so you’ll need a lot of breakout cables (and create a nice spaghetti nightmare), but you only need six switches, with each switch having 64 10GE customer-facing ports and 16 40GE fabric uplinks.
Next: spine switches. Based on the number of ports you need, three Z9000 switches would be enough, but it would be impossible to get perfect load balancing. It’s somewhat hard to split 16 40GE uplinks into three equally-sized bundles, so a real-life design would need four spine switches; but then the 384-port figure I started with was heavily biased toward a particular chassis solution, so I won’t split hairs over the details.
Grand total: you need 9 Z9000 switches with 1152 10GE port-equivalents (because we’re using QSFP+) to approximate a non-blocking 384-port chassis switch. With 2RU per switch, that’s 18RUs (compared to 11RUs for 7508) and power consumption of 18.3W per customer-facing 10GE port (compared to 17W for 7508). You’d also have nine devices to configure and manage, loads of intra-fabric wiring, and an approximation of a non-blocking fabric (ECMP load balancing can have problems with large flows).
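As a sanity check, here’s the same back-of-the-envelope math in Python (switch sizes and power figures are the ones quoted above):

```python
# Totals for the Z9000-based design: 6 leaf + 3 spine switches
switches = 6 + 3
port_equivalents = switches * 32 * 4   # 32 QSFP+ ports per Z9000, 4x10GE each
rack_units = switches * 2              # each Z9000 is 2RU

print(port_equivalents)                # 1152 10GE port-equivalents
print(rack_units)                      # 18 RU (versus 11 RU for a 7508)
```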
Switch sizes and power consumption figures were taken from Brad’s post.
Summary: be careful when comparing apples and oranges. Always consider the total impact (cost, rack space or power requirements) of a solution, not datasheet figures.
You’ll find a lot more technical details in the Data Center Fabric Architectures webinar (now updated with the latest and greatest features vendors released in the last six months) and the upcoming Clos Fabrics Explained session featuring the one and only Brad Hedlund.
Again you have the issue of 32 ports divided across 6 spine switches, but since it's statistically non-blocking, that shouldn't matter too much, and ECMP won't care if some of the links terminate on the same spine switch.
You can argue that there would be a cabling mess, but I would suspect it would be less so than trying to run 384 fibers into one rack from all over the datacenter. With the 1U solution, the edge ports are distributed short copper, without expensive optics. MPO ribbon fibers and patch panels would be used from leaf to spine just as they would with 40G connectors to cut down on the number of cables needed, and you would probably divide the spine switches into just two locations in the DC.
The real benefit of the 1U solution is being able to adjust the number of leafs and spines you need based on actual traffic patterns, oversubscribing at the edge where that makes economic sense. You don't have that same flexibility with $350k+ chassis solutions (remember you need *two* of those beasts for redundancy).
Let's pick the 7050S or the Nexus 3064X, which use Trident's integrated PHYs to yield a typical power draw in the range of 120-130W. The resulting cost per "front panel" port will be significantly lower compared to a chassis box (unfortunately, I can't give exact pricing, but one can get very good numbers for fixed boxes, especially when buying in quantities). Furthermore, the power draw per "front panel" port will be around 18*130/384 = 6W.
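The per-port power arithmetic in the comment checks out; here it is spelled out (all figures are the commenter's: 18 1U boxes at roughly 130W each serving 384 edge ports):

```python
# Power per customer-facing port for an 18-box fabric of ~130W 1U switches
boxes = 18
watts_per_box = 130            # typical draw quoted for 7050S / Nexus 3064X
customer_ports = 384
watts_per_port = boxes * watts_per_box / customer_ports
print(round(watts_per_port, 1))   # 6.1
```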
Downside: lots of boxes and wiring, but that's the price you have to pay :) Managing this many boxes is not that difficult, provided that an automated provisioning/monitoring and remediation system is in place.
Conclusion: using single-chip boxes makes sense when building large fabrics cheaply. That being said, we should remember that the number of links in a butterfly/Clos fabric grows as a power law, with the switch radix as the base and the number of stages as the exponent. At some point you would rather increase the switch radix than add another stage. So let's wait for Trident2 devices to hit the market.
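To illustrate the radix-versus-stages tradeoff, here's a rough sketch using the standard folded-Clos capacity formulas (my own illustration, not from the comment):

```python
# Maximum non-blocking edge ports in a folded Clos of radix-k switches:
#   2 tiers (leaf/spine): k^2 / 2
#   3 tiers (fat tree):   k^3 / 4
def max_edge_ports(k, tiers):
    if tiers == 2:
        return k * k // 2
    if tiers == 3:
        return k ** 3 // 4
    raise ValueError("only 2- and 3-tier fabrics modelled here")

print(max_edge_ports(64, 2))    # 2048  (Trident+ class radix)
print(max_edge_ports(64, 3))    # 65536 (extra stage, many more links)
print(max_edge_ports(128, 2))   # 8192  (double the radix, same 2 tiers)
```

Doubling the radix quadruples a two-tier fabric; adding a third stage buys far more capacity but at the cost of many more links and boxes, which is exactly why a bigger-radix chip is often the better trade.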
Many vendors offer better standard hardware warranties on their non-chassis switches, and sometimes offer basic software updates for free on some of their products. Service contract costs are sometimes way lower on stackables, so TCO can be lower.
- Have you ever configured an A10500 with service contracts? Instead, take a look at the A58XX series...
A chassis-based solution is more expensive, mainly due to the VOQ design.
I wonder whether fixed switches alone can support massive data flows?
(Average power draw under 50% load at 25°C ambient is 3800W per loaded Arista 7508.)