Category: bridging
STP and Expert Beginners
Maxim and I continued our STP discussion and eventually agreed that while STP might not be the best protocol out there (remember: it had to run on a Z80 CPU), it’s the only standardized thing that prevents nasty forwarding loops, prompting Maxim to ask another seemingly simple question:
What's so wrong with STP, that there are STP haters out there turning it off wherever they see it?
Welcome to the wonderful world of Expert Beginners.
Is STP Really Evil?
Maxim Gelin sent me an interesting question:
Can you please explain to me, why is STP supposed to be evil? What's wrong with STP?
STP’s fundamental problem is that it’s a fail-close, not a fail-open protocol.
Layer-3 Switching over VXLAN Revisited
My Trident 2 Chipset and Nexus 9500 blog post must have hit a raw nerve or two – Bruce Davie dedicated a whole paragraph in his Physical Networks in Virtualized Networking World blog post to telling everyone how the whole thing is a non-issue and how everything’s good in NSX land.
It’s always fun digging into more details to figure out what’s really going on behind the scenes; let’s do it.
STP in Brocade VCS Fabric – an Interesting Solution after a Long Wait
A few years ago I lambasted the lack of STP support in Brocade’s VCS fabric. It took Brocade over two years to solve the problem, but they finally came up with an interesting end-to-end solution.
Here are a few highlights; for more details, read the Configuring STP-type Protocols section in the Network OS Administrator Guide.
Whose Failure Domain Is It?
Draco made a valid comment to my Keep Your Failure Domain Small post:
What could a small ISP do to limit failure domains? Metro Ethernet and MPLS Virtual Private LAN service are all the rage, and offers customers the promise of being able to connect all their branch offices together, and use the same set of VLANs with free Layer 2 connectivity between their sites. It's either: extend the failure domains, or lose out in selling the service, b/c the customer will buy from another ISP.
Well, your customer’s failure domain doesn’t have to be yours.
Keep Your Failure Domains Small
A week after the disastrous sleet that kicked whole regions of Slovenia off the power grid, the servicemen of the local power distribution company (working literally day and night) managed to restore electricity to the closest town … but it still might take days or even weeks before everyone gets it. One of the reasons: huge failure domains.
TTL in Overlay Virtual Networks
After we get rid of the QoS FUD, the next question I usually get when discussing overlay networks is “how should these networks treat IP TTL?”
As (almost) always, the answer is “It depends.”
Layer-2 Extension (OTV) Use Cases
I was listening to the fantastic OTV Deep Dive PQ Packet Pushers podcast while biking around the wonderful Slovenian forests. They started the podcast by discussing OTV use cases, Ethan throwing in long-distance vMotion (the usual long-distance L2 extension selling point), but refreshingly some of the engineers said “well, that’s not really the use case we see in real life.”
So what were the use cases they were mentioning?
Layer-2 DCI with Enterasys Switches
The second half of the Enterasys DCI Solutions webinar focused on real-life case studies. First the less interesting one: long-distance live VM migration (you know my feelings about the whole concept, but sometimes you just have to do it) and the role of fabric routing and host routing in the process.
Sooner or Later, Someone Will Pay for the Complexity of the Kludges You Use
I loved listening to the OTV/FabricPath/LISP Packet Pushers podcast. Ron Fuller and Russ White did a great job explaining the role of OTV, FabricPath and LISP in a stretched (inter-DC) subnet deployment scenario and how the three pieces fit together … but I couldn't stop wondering whether there is a better way to solve the underlying business need than throwing three new, pretty complex technologies and the associated equipment (or VDC contexts or line cards) into the mix.
Extending Layer-2 Connection into a Cloud
Carlos Asensio was facing an “interesting” challenge: someone had sold a layer-2 extension into their public cloud to one of the customers. Being a good engineer, he wanted to limit the damage the customer could do to the cloud infrastructure and thus immediately rejected the idea of connecting the customer straight into the layer-2 network core ... but what could he do?
All it takes is a single misdirected STP packet ...
... and the rest is history ;)
VM BPDU spoofing attack works quite nicely in HA clusters
When I wrote the Virtual switches need BPDU guard blog post, I speculated that you could shut down a whole HA cluster with a single BPDU-generating VM ... and got a nice confirmation during the Troopers 13 conference – ERNW specialists successfully demonstrated the attack while testing the security aspects of a public cloud implementation for a major service provider.
For more information, read their blog post (they also have a nice presentation explaining how a VM can read the ESXi hard drive with a properly constructed VMDK file).
NEC ProgrammableFlow Scalability Features
Once you get rid of spanning tree and associated kludges (not too hard in OpenFlow-based networks), BUM flooding becomes your biggest enemy. NEC’s engineers implemented some interesting features in the ProgrammableFlow switches and controllers: rate-limiting of unknown unicast frames, flooding control, and ARP snooping (if only they’d go for ARP proxy).
VXLAN Gateway Design Guidelines
Mark Berly spent plenty of time explaining the in-depth intricacies of VXLAN-to-VLAN gateways during our VXLAN Technical Deep Dive webinar. He’s obviously heavily immersed in this challenge and hits 9+ on the Nerd Meter, so you might have to watch the video a few times to get all the nuances. What can I say – we’ll have fun times in the coming years ;)