Supposedly it was a problem with the management network used by their optical gear, but it looks a lot like a layer-2 network spanning 15 data centers and no control-plane policing on the managed devices… proving yet again that large-scale layer-2 networks are a really bad idea.
Layer 2 Fabrics can't be extended beyond 2 Spine switches. I had a long argument with a $vendor guys on this. They don't even count SPB as Layer 2 fabric and so forth.
The root cause of this myth is the lack of understanding of what layer-2, layer-3, bridging and routing means. You might want to revisit a few of my very old blog posts before moving on: part 1, part 2, what is switching, layer-3 switches and routers.
Got this response to my Stretched Layer-2 Revisited blog post. It’s too good not to turn it into a blog post ;)
Recently I feel like it's really vendors pushing layer 2 solutions, rather than us (enterprise customer) demanding it.
I had that feeling for years. Yes, there are environment with legacy challenges (running COBOL applications on OS/370 with emulated TN3270 terminals comes to mind), but in most cases it’s the vendors trying to peddle unique high-priced non-interoperable warez.
One of my friends sent me this question:
Do you remember if VLANs came first or was it VRFs?
I remember VLANs using ISL (pre-802.1q encapsulation) on early Cisco Ethernet switches (mid 90s), the earliest reference I could track down on Wikipedia is from 1988.
One of my readers was wondering about the stability and scalability of large layer-2 domains implemented with VXLAN. He wrote:
If common BUM traffic (e.g. ARP) is being handled/localized by the network (e.g. NSX or ACI), and if we are managing what traffic hosts can send with micro-segmentation style filtering blocking broadcast/multicast, are large layer-2 domains still a recipe for disaster?
There are three major (fundamental) problems with large L2 domains:
There are two reasonable ways of building a layer-2 leaf-and-spine fabric: use VXLAN (the direction almost everyone in the industry is taking at the moment), or routing-on-layer-2 technology like TRILL or SPB.
Got this comment to one of my L2-over-VXLAN blog posts:
I found the Avaya SPBM solution "right on the money" to build that L2+ fabric. Would you deploy Avaya SPBM?
Interestingly, I got that same question during one of the ExpertExpress engagements and here’s what I told that customer:
One of my readers sent me a lengthy email asking my opinion about his ideas for new data center design (yep, I pointed out there’s a service for that while replying to his email ;). He started with:
I have to design a DR solution for a large enterprise. They have two data centers connected via Fabric Path.
There’s a red flag right there…
Strange how inter-DC clustering failure is considered a certainty in this blog.
Call it experience or exposure to a larger dataset. Anything you build will eventually fail; just because you haven’t experienced the failure yet doesn’t mean that the system will never fail but only that you were lucky so far.
A few months ago I met a number of great engineers from Avaya and they explained to me how they creatively use Shortest Path Bridging (SPB) to create layer-2, layer-3, L2VPN, L3VPN and even IP Multicast fabrics – it was clearly time for another deep dive into SPB.
It took me a while to meet again with Roger Lapuh, but finally we started exploring the intricacies of SPB, and even compared it to MPLS for engineers more familiar with MPLS/VPN. Interested? Listen to Episode 54 of Software Gone Wild.
My latest spanning tree protocol (STP) posts generated numerous comments, some of them so relevant that I decided to summarize them into another blog post.
Weird Things Happen
The unidirectional link scenario mentioned by Antonio is pretty well known:
Theoretically STP should avoid bridging loops, and yet you claim they cause data center meltdowns. What am I missing?
In theory, STP avoids bridging loops. In practice, there are numerous reasons STP got a bad name.
My friend Christoph Jaggi, the author of fantastic Metro Ethernet and Carrier Ethernet Encryptors documents, sent me this question when we were discussing the Data Center Fabrics Overview workshop I’ll run in Zurich in a few weeks:
When you are talking about large-scale VLAN-based fabrics I assume that you are pointing towards highly populated VLANs, such as VLANs containing 1000+ Ethernet addresses. Could you provide a tipping point between reasonably-sized VLANs and large-scale VLANs?
It's not the number of hosts in the VLAN but the span of a bridging domain (VLAN or otherwise).
Another week, another ExpertExpress session, as is often the case focusing on two data centers with stretched VLANs spanning both of them. However, this one was particularly irksome, as the customer ran a firewall cluster stretched across two locations.
I gave the customer engineers my usual recommendations:
One of the responses I got on my “What is Layer-2” post was
Ivan, are you saying to use L3 switches everywhere with /31 on the switch ports and the servers/workstation?
While that solution would work (and I know a few people who are using it with reasonable success), it’s nothing more than creative use of existing routing paradigms; we need something better.
Update 2015-04-22 14:30Z - Added a link to Cumulus Linux Redistribute Neighbor feature.