Category: fabric
Lock-In Is Inevitable – Get Used to It!
For whatever reason (subliminal messages from vendor marketing departments?), I’m constantly brooding about vendor lock-in, its inevitability, and the way supposedly disruptive companies try to use the fear of lock-in to persuade naive customers to buy their products.
vLAG Caveats in Brocade VCS Fabric
Brocade VCS fabric has one of the most flexible multichassis link aggregation group (LAG) implementations – you can terminate member links of an individual LAG on any four switches in the VCS fabric. Using that flexibility is not always a good idea.
2015-01-23: Added a few caveats on load distribution
Improving ECMP Load Balancing with Flowlets
Every time I write about unequal traffic distribution across a link aggregation group (LAG, aka Etherchannel or Port Channel) or ECMP fabric, someone asks a simple question: “Is there no way to reshuffle the traffic to make it more balanced?”
TL&DR summary: there are ways to do it, and some vendors have already implemented them.
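To illustrate the idea, here’s a minimal Python sketch of flowlet switching (my simplification, not any vendor’s actual implementation): whenever a flow has been idle longer than the maximum path delay skew, its next burst of packets (a flowlet) can be moved to a less-loaded uplink without risking packet reordering.

```python
import time

class FlowletBalancer:
    """Illustrative flowlet-based load balancing sketch (not a real ASIC's
    algorithm): a flow may be re-pinned to a different uplink whenever the
    idle gap since its last packet exceeds FLOWLET_GAP, because by then the
    previously sent packets have drained and reordering cannot occur."""

    FLOWLET_GAP = 0.0005  # 500 microseconds; real values depend on path delay skew

    def __init__(self, uplinks):
        self.uplinks = uplinks
        self.state = {}  # flow 5-tuple -> (chosen uplink, time of last packet)

    def pick_uplink(self, flow, load):
        """Return the uplink for this packet; `load` maps uplink -> queued bytes."""
        now = time.monotonic()
        link, last_seen = self.state.get(flow, (None, 0.0))
        if link is None or now - last_seen > self.FLOWLET_GAP:
            # New flowlet: free to pick the currently least-loaded uplink
            link = min(self.uplinks, key=lambda u: load[u])
        self.state[flow] = (link, now)
        return link
```

The interesting design trade-off is the gap threshold: set it too low and you risk reordering, set it too high and long-lived flows never get rebalanced.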
Load Balancing Elephant Storage Flows
Olivier Hault sent me an interesting challenge:
I cannot find any simple network-layer solution that would allow me to use total available bandwidth between a Hypervisor with multiple uplinks and a Network Attached Storage (NAS) box.
TL&DR summary: you cannot find it because there’s none.
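The root cause: LAG and ECMP hashing pins every packet of a single TCP session to one member link. A toy Python sketch (an illustrative hash, not any ASIC’s real algorithm) shows why a single NFS or iSCSI session between the hypervisor and the NAS can never use more than one uplink’s worth of bandwidth:

```python
import hashlib

def lag_member(five_tuple, num_links):
    """Toy 5-tuple hash mimicking how a LAG/ECMP hash maps every packet of
    the same flow onto the same member link (illustration only)."""
    digest = hashlib.md5(repr(five_tuple).encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_links

# A single NFS session between hypervisor and NAS is one 5-tuple, hence one
# member link, no matter how many uplinks the LAG has (addresses are made up).
nfs_session = ("10.1.1.10", "10.1.1.20", "tcp", 765, 2049)
print(lag_member(nfs_session, num_links=4))   # always the same member link
```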
Facebook Next-Generation Fabric
Facebook published their next-generation data center architecture a few weeks ago, resulting in the expected “revolutionary approach to data center fabrics” echoes from the industry press and blogosphere.
In reality, they did a great engineering job, using an interesting twist on a pretty traditional multi-stage leaf-and-spine (folded Clos) architecture.
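For a feel of the underlying arithmetic, here’s an illustrative Python back-of-the-envelope sizing of a simple two-tier folded Clos built from fixed switches (generic numbers only, not Facebook’s actual design, which adds further tiers to scale beyond a single pod):

```python
def two_tier_clos(radix, oversubscription=1):
    """Back-of-the-envelope sizing of a two-tier folded Clos built from
    fixed switches with `radix` ports each (illustrative arithmetic only)."""
    uplinks = radix // (1 + oversubscription)   # leaf ports going to spines
    downlinks = radix - uplinks                 # leaf ports facing servers
    spines = uplinks                            # one spine switch per leaf uplink
    leaves = radix                              # each spine port feeds one leaf
    return {"spines": spines, "leaves": leaves, "server_ports": leaves * downlinks}

print(two_tier_clos(radix=32))  # 32-port switches -> 16 spines, 32 leaves, 512 server ports
```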
Just Published: Juniper Data Center Switches
Want to know what the difference between Virtual Chassis and Virtual Chassis Fabric is? How Local Link Bias works? How ISSU on QFX 5100 works even though the box doesn’t have two supervisor boards? You’ll find answers to all these questions in new videos describing Juniper data center switches.
Just Published: Brocade VCS Fabric Videos
The Data Center Fabric Architectures update session in late June included a whole new section on Brocade’s VCS fabric and new features they added in Network OS 4.0. The edited videos have been published and cover these topics:
Infrastructure as Code Actually Makes Sense
When I heard people talking about “networking infrastructure as code,” I dismissed it as yet another bit of Software-Defined-Everything, one-controller-to-rule-it-all hype. Boy, was I wrong.
Unnumbered OSPF Interfaces in Quagga (and Cumulus)
Carlos Mendioroz sent me an interesting question about unnumbered interfaces in Cumulus Linux and some of the claims they make in their documentation.
TL&DR: Finally someone got it! Kudos for realizing how to use an ancient trick to make data center fabrics easier to deploy (and, BTW, the claims are exaggerated).
Trident 2 Chipset and Nexus 9500
Most recently launched data center switches use the Trident 2 chipset, and yet we know almost nothing about its capabilities and limitations. It might not work at line rate, it might have L3 lookup challenges when faced with L2 tunnels, there might be other unpleasant surprises… but we don’t know what they are, because you cannot get Broadcom’s documentation unless you work for a vendor who signed an NDA.
Interestingly, the best source of Trident 2 technical information I found so far happens to be the Cisco Live Nexus 9000 Series Switch Architecture presentation (BRKARC-2222). Here are a few tidbits I got from that presentation and Broadcom’s so-called datasheet.
Can We Just Throw More Bandwidth at a Problem?
One of my readers sent me an interesting question:
I have been reading at many places about "throwing more bandwidth at the problem." How far is this statement valid? Should the applications (servers) work with the assumption that there is infinite bandwidth provided at the fabric level?
Moore’s law works in our favor. It’s already cheaper (in some environments) to add bandwidth than to deploy QoS.
How Line-rate Is Line-rate?
During yesterday’s Data Center Fabrics Update presentation, one of the attendees sent me this question while I was talking about the Arista 7300 series switches:
Is the 7300 really non-blocking at all packet sizes? With only 2 x Trident-2 per line card it can't support non-blocking for small packets based on Trident-2 architecture.
It was an obvious example of vendor bickering, so I ignored the question during the presentation, but it still intrigued me, so I decided to do some more research.
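A back-of-the-envelope check makes the concern concrete. Assuming the commonly quoted (but not officially documented) forwarding capacity of roughly 720 Mpps per Trident 2, and a 32-port 40GE line card with two ASICs:

```python
def min_linerate_frame(ports, port_gbps, asics, asic_mpps):
    """Back-of-the-envelope check of 'line rate at all packet sizes'.
    Assumes ~720 Mpps forwarding capacity per Trident 2 (a commonly quoted,
    not officially documented figure) and 20 bytes of per-frame wire overhead
    (preamble + inter-frame gap)."""
    wire_bps = ports * port_gbps * 1e9
    budget_pps = asics * asic_mpps * 1e6
    worst_pps = wire_bps / ((64 + 20) * 8)        # load with all-64-byte frames
    breakeven = wire_bps / (budget_pps * 8) - 20  # frame size where pps budget suffices
    return worst_pps / 1e6, breakeven

worst, breakeven = min_linerate_frame(ports=32, port_gbps=40, asics=2, asic_mpps=720)
print(f"64-byte load: {worst:.0f} Mpps, line rate only above ~{breakeven:.0f}-byte frames")
# -> 64-byte load: 1905 Mpps, line rate only above ~91-byte frames
```

Under those assumptions the line card runs out of packets-per-second budget somewhere around 90-byte frames, which may or may not matter with real-life traffic mixes.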
Queuing Mechanisms in Modern Switches
A long while ago there was an interesting discussion started by Brad Hedlund (then at Dell Force10) comparing leaf-and-spine (Clos) fabrics built from fixed-configuration pizza box switches with high-end chassis switches. The comments made by other readers were all over the place (addressing pricing, wiring, power consumption) but surprisingly nobody addressed the queuing issues.
This blog post focuses on queuing mechanisms available within a switch; the next one will address end-to-end queuing issues in leaf-and-spine fabrics.
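As a taste of what’s coming, here’s a minimal Python sketch of deficit weighted round robin, one of the typical output-queue scheduling mechanisms found in switches (the generic textbook algorithm, not any specific vendor’s implementation):

```python
from collections import deque

def dwrr_schedule(queues, quanta, rounds=100):
    """Minimal deficit weighted round robin sketch.
    `queues`: list of deques of packet lengths (bytes);
    `quanta`: per-queue byte quantum added every scheduling round."""
    deficits = [0] * len(queues)
    sent = []
    for _ in range(rounds):
        for i, q in enumerate(queues):
            if not q:
                deficits[i] = 0          # empty queues don't accumulate credit
                continue
            deficits[i] += quanta[i]
            while q and q[0] <= deficits[i]:
                pkt = q.popleft()
                deficits[i] -= pkt
                sent.append((i, pkt))
    return sent

# Queue 0 gets twice the bandwidth of queue 1 for equal-sized packets
q0, q1 = deque([1500] * 10), deque([1500] * 10)
print(dwrr_schedule([q0, q1], quanta=[3000, 1500])[:6])
```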
Data Center Protocols in HP Switches
HP representatives made some pretty bold claims during Networking Tech Field Day 1, including “our switches will support EVB, FCoE, SPB and TRILL.” It took them three years to deliver on those promises (and the hardware they had at the time doesn’t exactly support all the features they promised), but their current protocol coverage is impressive.
OpenFlow Support in Data Center Switches
Good news: In the last few months, almost all major data center Ethernet switching vendors (Arista, Cisco, Dell Force 10, HP, and Juniper) released a documented GA version of OpenFlow on some of their data center switches.
Bad news: no two vendors have even remotely comparable functionality.