Is STP Really Evil?
Maxim Gelin sent me an interesting question:
Can you please explain to me, why is STP supposed to be evil? What's wrong with STP?
STP’s fundamental problem is that it’s a fail-close, not a fail-open protocol.
Ethernet bridges (later renamed to layer-2 switches) were designed to be transparent plug-and-pray devices that you could drop anywhere into the network and hope they’ll work. They could not rely on having a control-plane protocol between adjacent nodes (like most modern routing protocols do) – lack of control-plane communication indicated lack of adjacent bridges.
That’s all nice and dandy until a bridge loses its mind, and stops sending BPDUs (control plane activity) while still forwarding traffic (data plane activity). Adjacent bridges think they have hosts plugged into the affected ports (this is the fail close part), and start forwarding traffic through those ports, resulting in a nice forwarding loop (been there, seen that).
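The failure scenario above can be sketched in a few lines of Python. This is a deliberately simplified model, not an STP implementation: the triangle topology, the Port class, and the helper names are all assumptions made for illustration. The only behavior it models is that a blocked port stays blocked only while it keeps hearing BPDUs.

```python
from dataclasses import dataclass


@dataclass
class Port:
    blocked_by_stp: bool  # STP put this port into the blocking state
    hears_bpdus: bool     # BPDUs keep arriving from the neighbor

    def forwards(self) -> bool:
        # Simplified classic STP: silence on a port is read as
        # "no bridge here", so the port ends up forwarding.
        return not (self.blocked_by_stp and self.hears_bpdus)


def has_loop(links):
    """Return True if the forwarding links contain a cycle (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        ra, rb = find(a), find(b)
        if ra == rb:
            return True
        parent[ra] = rb
    return False


def forwarding_links(b_sends_bpdus: bool):
    # Hypothetical triangle A-B, A-C, B-C. STP blocked C's port toward B.
    # B keeps forwarding data even when its control plane goes silent.
    c_port_to_b = Port(blocked_by_stp=True, hears_bpdus=b_sends_bpdus)
    links = [("A", "B"), ("A", "C")]
    if c_port_to_b.forwards():
        links.append(("B", "C"))
    return links


print(has_loop(forwarding_links(b_sends_bpdus=True)))
print(has_loop(forwarding_links(b_sends_bpdus=False)))
```

The first call models the healthy network (the blocked port breaks the triangle), the second models B going silent while still forwarding.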
Fail-Open or Fail-Close?
As Chris Marget mentioned in his comment, “fail-open” and “fail-close” are clunky terms bound to be misunderstood.
Being an oldtimer, I always see computer networks as part of generic electrical circuits and switching landscape – for me, “fail-close” = “pass current or traffic on failure” and “fail-open” = “stop passing current or traffic”.
Other people think about computer networks in valve or door analogies. For them “fail close” means “the door or valve is closed on failure – there’s no traffic” and “fail open” obviously means “the door or valve is opened on failure, and the traffic passes”.
In the context of this blog post “fail close” means “a failed/confused bridge continues to forward the traffic, and the bridged network will send the traffic across such bridge.”
You might have a different opinion on what “open” or “close” means, and it’s as valid as any… but quoting Cisco’s documentation won’t make your point any more valid (it just proves that the writer of that document agrees with your view of what opens or closes on failure). I would however appreciate a pointer to a more authoritative source (although I doubt it exists).
Back to Bridging and STP
The solution to the confused bridge traffic forwarding problem is quite simple: Cisco IOS has bridge assurance – you configure a port to expect an adjacent bridge, and the port doesn’t forward traffic if it doesn’t receive BPDUs from the other end.
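As a rough illustration of that rule, here is a minimal Python sketch of the forwarding decision with and without bridge assurance. The function and parameter names are hypothetical, and the logic is a simplification of the behavior described above rather than any vendor's actual state machine.

```python
def port_forwards(stp_allows_forwarding: bool,
                  hears_bpdus: bool,
                  bridge_assurance: bool) -> bool:
    """Simplified forwarding decision for a single switch port."""
    if hears_bpdus:
        # A neighbor bridge is talking to us: follow the STP decision.
        return stp_allows_forwarding
    # No BPDUs received on this port.
    if bridge_assurance:
        # The port was configured to expect a bridge; silence means trouble,
        # so the port stops forwarding.
        return False
    # Plain STP assumes a host is attached and forwards.
    return True


# A port that STP would block, facing a neighbor that went silent:
print(port_forwards(False, hears_bpdus=False, bridge_assurance=False))
print(port_forwards(False, hears_bpdus=False, bridge_assurance=True))
```

Without bridge assurance the silent neighbor is treated as a host and the port forwards; with it, the port stays out of the forwarding path.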
The generic solution to this particular problem (and a few others, including hosts turning into bridges) seems to be extremely simple: allow a switch port to be a host-facing port (implicitly configuring BPDU guard and a few other things) or a fabric port (implicitly configuring bridge assurance and VLAN trunking). Why hasn’t any vendor implemented such a simple concept? I can’t figure it out – your comments are most welcome!
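To make the idea concrete, here is a small Python sketch of what such a two-role port model could look like. Everything in it (the PortRole enum, the action strings, the port_action function) is hypothetical and only illustrates the concept; it does not describe an existing switch feature.

```python
from enum import Enum


class PortRole(Enum):
    HOST = "host"      # implies BPDU guard
    FABRIC = "fabric"  # implies bridge assurance (and VLAN trunking)


def port_action(role: PortRole, bpdu_seen: bool) -> str:
    """Return what the port should do, given its role and BPDU reception."""
    if role is PortRole.HOST:
        # BPDU guard: a host port must never hear a bridge.
        return "shutdown" if bpdu_seen else "forward"
    # Fabric port: bridge assurance expects BPDUs from the other end.
    return "follow-stp" if bpdu_seen else "block"


for role in PortRole:
    for bpdu_seen in (False, True):
        print(role.value, bpdu_seen, port_action(role, bpdu_seen))
```

The point of the design is that the operator states one intent per port, and both unexpected situations (a bridge on a host port, silence on a fabric port) end with traffic being stopped instead of forwarded.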
It Gets Worse
The fail-close nature of STP isn’t its only drawback. The original STP had numerous other challenges, from slow convergence to lack of VLAN awareness. Unfortunately the IEEE decided to keep heaping kludges on top of STP until the whole thing nearly toppled over – it’s like trying to build the global Internet by tinkering with RIP ad nauseam instead of designing BGP.
- Inserted fail open or fail close section to reduce the terminology confusion.
So Radia Perlman invented STP constrained by several requirements. For a station it should not matter if it communicates to a station on the same LAN segment or via a bridge to a station on another LAN segment. Since Ethernet has no TTL value, however, frames could loop forever on a ring. Thus, STP needs to maintain the illusion of a single shared medium. This is why STP builds a tree.
Later on, with the introduction of twisted pair as the wired medium, Ethernet’s physical bus topology turned into a physical star topology requiring a central hub that would create the illusion of a shared medium. After all, a hub forwards any incoming frame to all ports but the port the frame was received on. With the introduction of learning bridges (switches) Ethernet could have transformed into something even more powerful. However, with a huge installed base, compatibility becomes a key issue and revolution turns into evolution.
We all know that STP and large L2 domains have their (severe) limitations. However, without understanding the past it’s very easy to complain about how things have developed.
One of the disadvantages of STP that you didn’t mention is the waste of capacity. Because of the tree structure redundant paths cannot be used. Thus, frames might travel much farther than the physical topology would require. Shortest Path Bridging (SPB, 802.1aq) solves all of (R/M)STP’s problems boosting Ethernet into another dimension. At its core SPB runs IS-IS – another protocol invented by Radia Perlman …
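A toy example of the capacity and path-length penalty, sketched in Python under stated assumptions: six bridges in a ring, with the link between bridges 5 and 0 assumed to be the one STP blocks. The hops helper is a plain breadth-first search written for this illustration.

```python
from collections import deque


def hops(links, src, dst):
    """Return the hop count of the shortest path, or None if unreachable."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# Physical topology: ring 0-1-2-3-4-5-0.
ring = [(i, (i + 1) % 6) for i in range(6)]
# Spanning tree: same ring with the 5-0 link blocked (assumption).
tree = [link for link in ring if link != (5, 0)]

print(hops(ring, 0, 5))  # shortest physical path
print(hops(tree, 0, 5))  # path forced along the spanning tree
```

Bridges 0 and 5 are physically adjacent, yet frames between them have to travel the long way around the ring once the direct link is blocked.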
... and a few others.
You write: STP’s fundamental problem is that it’s a fail-close, not a fail-open protocol.
Should this read, STP’s fundamental problem is that it’s a fail-open, not a fail-close protocol?
"Fail-open nature of STP isn't its only drawback..." should instead read "Fail-closed nature of STP isn't its only drawback...". It leads to confusion.
Bridge Assurance works nicely in a pure STP topology (802.1w and 802.1s only; 802.1d does not support BA) since it has blocked ports by definition and these ports prevent L2 loop creation. That's fine for a campus environment (if your hardware/software supports it as well :) )
But in a pure DC environment where you deployed Nexus 5k/2k with the vPC feature, Cisco definitely does not recommend enabling Bridge Assurance other than on the vPC peer link.
The biggest problem with STP in my opinion is that it's a dangerous combination of "just works", "difficult to diagnose" and "somewhat fragile", making it easy to screw up in ways that are hard to understand while it's broken.
There have been some wonderfully documented cases of folks who should know better pushing STP beyond its breaking point. The STP process didn't stop responding, but it couldn't do its job anymore.
No mention of loopguard? It's a bit like bridge assurance (in one direction, anyway), and quite a bit more commonly available.
Fail open/closed is clunky language that begets confusion. It's worse in the case of bump-in-the-wire devices with bypass relays, but the confusion has already appeared here in the comments.
I tend to view the close/open question from the electrical (switch) perspective, but a quick google of "ethernet tap fail open" demonstrates that folks marketing and using network devices sometimes mean exactly the opposite.
If we are network engineers or networking trainees, we must remember that the basic activity in a digital device is 0 and 5 volts; because of that, Ivan's view of open and close is correct, as in an elementary electronic circuit.
I think you mean:
STP’s fundamental problem is that it’s a fail-open, not a fail-closed protocol.
“Transparent bridging is the result of a long technological evolution that was guided by the desire to keep the property of the thick coaxial cable that was the base for the original Ethernet networks. Transparent means that the stations using the service are not aware that the traffic they are sending is bridged; they are not participating in the bridging effort. The technology is similarly transparent to the user, and a high end Ethernet switch running STP is still supposed to be plug-and-play, just like a coaxial cable or a repeater were. As a result, unlike routers, bridges have to discover whether their ports are connected to peer bridges or plain hosts. In particular, in the absence of control message reception on a port, a bridge will assume that it is connected to a host and will provide connectivity. Therefore, the most significant differences between routing and bridging with STP (spanning tree protocol) are as follows:
• A routing protocol identifies where to send packets.
• STP identifies where not to send frames.
The obvious consequence is that if a router fails to receive the appropriate updates, the parts of the network that were relying on this router for communication will not be able to reach each other. This failure tends to be local, as the communication within those distant network parts is not affected. If a bridge misses control information, it will instead open a loop. As it has been observed, this will most likely impact the whole bridging domain.”
I summarize this using language that aligns with network security discussions, in which "fail closed" is a safe failure condition vs "fail open":
"To summarize the two major points being made in the quote from Cisco:
1. a routed interface will “fail closed” with no impact on any other routed interfaces, while a set of transparently bridged ports defined as a single virtual LAN will “fail open” with an impact on all network components of the VLAN
2. the scope of a routed network failure is limited to the ports connected to the routed interface, while the scope of a bridged VLAN failure can impact all networking equipment in the entire bridging domain (entire data center, or multiple data centers in the case of stretched VLANs)"
If you want STP to fail close you need bridge assurance.
It's a shame Ivan's posts get mixed up.
As for implicitly enabling BPDU guard or BA: they are opposite things. BPDU guard says "if I see a BPDU on this port, I will not allow it and will take action", while BA expects bidirectional hello messages. On switches where the uplink logic applies, it could be enabled, IMO. But in the data center it may not be easy, since servers and switches may swap places: if BPDU guard is implicitly enabled on a server port and a switch is later connected to that port, the port's STP expectation should automatically change to BA. Is this possible? Maybe yes. Once a switch port sees the Ethernet source MAC address, the switch could act based on the vendor-assigned part of that address. You might object that the connected device could be either a Cisco switch or a Cisco server, since many vendors have switches, servers, firewalls, and so on in the data center. In that case the vendor could assign MAC addresses hierarchically by product type, for example 00-00-01 for its switches, 00-00-02 for its servers, and so on.
OTOH, if you don't know whether another switch or a server is connected to a port of a DC switch, you might have bigger problems than STP on your hands ;)
If a bridge loses its mind??
You are in essence saying you need a configuration variable (such as "bridge assurance") to make a switch stop forwarding traffic in case it, or an adjacent switch, has a buggy control plane implementation!
A better rule might be, just buy well-tested switches.
A correctly implemented bridge/switch must guarantee that it processes spanning tree with the highest level of priority, and treats send/receive BPDUs on the network with the highest level of priority.
Sorry, but once you implement a control plane protocol in a buggy manner, all bets are off. It's your fault for buying a switch which does not function as specified.
The switch that lost its mind (in my case) came from one of the large vendors, and the loss of control plane was caused by a slow memory leak.
You have mentioned the comment "because the forwarding entry for the STP multicast address still punts packets to the CPU".
Can you elucidate a bit please?
If I am not mistaken, a BPDU punted to the CPU is still processed (though any PDU will be punted to the CPU irrespective of the platform).
What's the correlation you are trying to make here?
By the way, your blog is like a new Network Engineer born out of silos.