External Links on Spine Switches
A networking engineer attending the Building Next-Generation Data Center online course asked this question:
What is the best practice for connecting a DC fabric to the outside world, assuming there are two spine switches in the fabric and EVPN/VXLAN is used as the overlay? Is it a good idea to introduce edge (border) switches, or is it better to connect the outside world directly to the spines?
As always, the answer is “it depends,” this time based on:
- Number of spine nodes. If you want to retain perfect load balancing across the spines, the external links should be connected to all of them. That’s not a big deal if you have two spines, but if you ever expand to four spines, you’ll have to decide what to do. However, if you have an order of magnitude more bandwidth inside the fabric than toward the external world, the imbalance obviously doesn’t matter (see the back-of-the-envelope sketch after this list).
- Interface speeds. Spines usually have higher-speed interfaces, so you might not be able to connect your WAN edge equipment to those interfaces. Even if that works, you’d be wasting several expensive interfaces, but that might still be better than buying two extra switches.
- Packet forwarding features. Juniper was the only vendor (that I’m aware of) that thought it made sense to put complex high-speed (expensive) ASICs into spine switches. Most ASIC vendors offer high-speed ASICs with simple packet forwarding and lower-speed ASICs with complex forwarding behavior. For example, spine switches based on Broadcom Tomahawk cannot do VXLAN routing without packet recirculation.
- Required buffer space. Switches connected to the WAN edge have to deal with significant incast (many sources sending WAN traffic) and speed disparity (traffic going from high-speed core links to low-speed edge links), which usually results in higher buffer requirements; the sketch below puts rough numbers on this. Spine switches usually have relatively small buffers, unless you’re buying QFX10Ks from Juniper.
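To put some (totally made-up) numbers on the first and the last consideration, here’s a minimal Python sketch comparing total fabric bandwidth with external bandwidth, and estimating how much buffer space a border switch needs when an incast burst arriving on high-speed fabric-facing ports drains into a much slower WAN link. All leaf/uplink/WAN counts, speeds, and burst durations are assumptions picked purely for illustration.

```python
# Back-of-the-envelope numbers for the "number of spines" and "buffer space"
# considerations. Every value below is a hypothetical assumption chosen only
# for illustration.

FABRIC_LEAFS = 16        # leaf switches in the fabric (assumed)
UPLINKS_PER_LEAF = 4     # 100GE uplinks per leaf, one per spine (assumed)
UPLINK_SPEED_GBPS = 100
WAN_LINKS = 2            # external links (assumed)
WAN_SPEED_GBPS = 10

# 1. Fabric versus external bandwidth: when the fabric has orders of
#    magnitude more bandwidth than the WAN links, imperfect load balancing
#    across the spines is lost in the noise.
fabric_bw = FABRIC_LEAFS * UPLINKS_PER_LEAF * UPLINK_SPEED_GBPS
wan_bw = WAN_LINKS * WAN_SPEED_GBPS
print(f"Fabric bandwidth {fabric_bw} Gbps vs. external bandwidth {wan_bw} Gbps "
      f"-> ratio {fabric_bw / wan_bw:.0f}:1")

# 2. Incast plus speed disparity: traffic arriving on several high-speed
#    fabric-facing ports drains into a single slower WAN link. Whatever
#    arrives faster than the WAN link can send must sit in the buffer.
INCAST_PORTS = 4         # fabric ports bursting toward the WAN link (assumed)
BURST_MS = 1             # duration of the synchronized burst (assumed)
arrival_gbit = INCAST_PORTS * UPLINK_SPEED_GBPS * BURST_MS / 1000
drain_gbit = WAN_SPEED_GBPS * BURST_MS / 1000
buffer_mbyte = (arrival_gbit - drain_gbit) * 1000 / 8   # Gbit -> Mbit -> MB
print(f"Buffer needed to survive a {BURST_MS} ms burst without drops: "
      f"~{buffer_mbyte:.0f} MB")
```

With these (made-up) numbers the load-balancing imbalance is irrelevant, while a single millisecond-long burst already needs buffer space comparable to (or larger than) the entire shared packet buffer of a typical shallow-buffer spine ASIC.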
Want a rule-of-thumb¹? In most cases it’s better to connect external links to a dedicated pair of leaf switches (assuming you can afford them), and connect network services (firewalls, load balancers…) and maybe even storage (which might require bigger buffers) to the same switches. Buying complex spine switches with large buffers is usually an exercise in wasting money.
More Information
We covered this dilemma somewhere in the Leaf-and-Spine Fabric Architectures webinar, probably as part of multi-pod and multi-site fabrics, but I can’t remember whether it was part of the materials or an answer to an attendee’s question.
¹ Also known as best practice
Depending on the routing architecture of the fabric, it may also be advisable to have direct links between the spines to run iBGP, in which case you lose even more interfaces on the spines.
You also need the spines to participate in the VXLAN overlay, which they normally don’t have to do because they just forward underlay traffic between the leafs. It all adds to the complexity and table sizes of switches that usually only need to be fast.
> connect external links to a dedicated pair of leaf switches
Is dedicated really necessary? I would have thought that connecting external links to any pair of leaf switches would achieve the same outcome. A leaf switch pair might serve a physical rack, and if your external connectivity is housed in that rack (with or without "internal" server hardware) then you would simply connect to them.
Yes, you can connect them wherever you like. Border or edge leafs are just leafs, but it’s a best practice to have a dedicated pair of border leafs. Maybe you even need different switches from the rest of your data center because the border needs a lot of copper ports. Or you have a lot of legacy servers with 1/10G copper and your border is all fiber. Then you just need different hardware. As for the configuration, it’s not that different, and every leaf can do it.