Bridging and Routing, Part II
Based on the readers’ comments on my “Bridging and Routing: is there a difference?” post (thank you!), here are a few more differences between bridging and routing:
Cost. Layer-2 switches are almost always cheaper than layer-3 (usually combined layer-2/3) switches. There are numerous reasons for the cost difference, including:
- Mass-market low-end switches are usually simple bridges. Low-cost high-speed bridging silicon is thus readily available.
- MAC address lookup is simpler than IP table lookup and easier to implement in silicon. MAC address lookup needs only a simple CAM (Content Addressable Memory), whereas longest-prefix IP matching needs TCAM (Ternary CAM) with additional output logic.
- Layer-3 switches are expected to perform IP packet filtering. Implementing access lists in hardware (usually with even larger TCAM) is expensive.
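The difference between the two lookups can be sketched in software. This is only an illustration of the logic, not of how switching silicon works: the dictionary stands in for a CAM (exact match on the full key), and the loop stands in for what a TCAM does in parallel (match on a prefix, prefer the longest one). All addresses, prefixes, port names and next-hop names are made up for the example.

```python
import ipaddress

# Exact-match MAC table: one key, one answer (the CAM analogy).
# Entries are hypothetical.
mac_table = {
    "00:11:22:33:44:55": "port1",
    "66:77:88:99:aa:bb": "port2",
}

# IP routing table with overlapping prefixes of different lengths.
# Entries are hypothetical.
routes = [
    (ipaddress.ip_network("0.0.0.0/0"), "upstream"),
    (ipaddress.ip_network("10.0.0.0/8"), "core"),
    (ipaddress.ip_network("10.1.0.0/16"), "site-a"),
    (ipaddress.ip_network("10.1.2.0/24"), "rack-7"),
]


def mac_lookup(mac):
    """Exact match on the whole address; None means 'unknown, flood'."""
    return mac_table.get(mac.lower())


def lpm_lookup(dst):
    """Longest-prefix match: several prefixes may match, the longest wins."""
    addr = ipaddress.ip_address(dst)
    best = None
    for net, next_hop in routes:
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None
```

A destination such as 10.1.2.3 matches all four prefixes above, and the lookup has to pick the /24; a MAC address either matches one entry or none.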
Zero configuration. In their simplest incarnation, the bridges are plug-and-play devices (magically transforming themselves into plug-and-pray devices as the network grows); it’s quite easy to find a perfectly working switch named Switch with no non-default configuration in a badly managed network. Routers always require configuration (at the very minimum, you have to configure IP subnets and IP routing protocols).
However, as soon as VLANs are introduced into the network or you need to fine-tune STP, the zero-configuration benefits are gone.
Equal-cost multipath. Routers can load-balance traffic between equal-cost paths across the network. Bridges can load-balance traffic between parallel bonded links (port channel). Redundant paths in bridged networks are disabled to prevent forwarding loops.
Enhancements to port channel technology (VSS and vPC) allow links connected to multiple switches to be bonded. TRILL (and similar technologies) solves the problem, allowing unrestricted equal-cost multipath.
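Load balancing across equal-cost paths (or across the member links of a port channel) is commonly done per flow, so that packets of one session stay on one path and are not reordered. A minimal sketch of that idea follows; the choice of header fields and of SHA-256 as the hash is an assumption made for the example, and real devices use their own, usually much cheaper, hash functions.

```python
import hashlib


def pick_next_hop(src_ip, dst_ip, src_port, dst_port, proto, next_hops):
    """Map a flow (5-tuple) onto one of the equal-cost next hops.

    The same flow always hashes to the same next hop; different flows
    are spread across all of them.
    """
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


# Hypothetical equal-cost next hops toward the same destination.
paths = ["via-R1", "via-R2"]
first = pick_next_hop("10.0.0.1", "10.9.9.9", 40000, 443, "tcp", paths)
again = pick_next_hop("10.0.0.1", "10.9.9.9", 40000, 443, "tcp", paths)
```

Here `first` and `again` are always equal, while flows with other port numbers may land on the other path.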
Security. Packet filters between IP subnets are a standard feature of every decent router, allowing the network designer to segment the network into security zones.
Some layer-2 switches have similar functionality (port ACL), which turns an L2 switch into a layer-3-aware L2 device, increasing configuration and troubleshooting complexity.
Predictability. L3 forwarding tables are modified only by the control plane (routing) protocols based on messages exchanged by the routers, not by the data traffic flow. L2 forwarding tables are modified on-the-fly by the data plane snooping functionality based on source MAC addresses in the frames forwarded by the switch.
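The data-plane learning described above can be sketched as a toy learning bridge. This is a simplified model (no aging, no VLANs, no STP), and the class name, port numbers and shortened MAC strings are invented for the illustration.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningBridge:
    """Toy transparent bridge: the forwarding table is built from traffic."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.mac_table = {}  # source MAC -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Learn from the source address, then return the output ports."""
        # Every frame can rewrite the table: this is the snooping step.
        self.mac_table[src_mac] = in_port

        out_port = self.mac_table.get(dst_mac)
        if dst_mac == BROADCAST or out_port is None:
            # Broadcast or unknown destination: flood everywhere else.
            return sorted(self.ports - {in_port})
        if out_port == in_port:
            return []  # destination is on the segment the frame came from
        return [out_port]


bridge = LearningBridge(ports=[1, 2, 3])
flooded = bridge.receive(1, "aa", "bb")   # "bb" unknown, so flood
learned = bridge.receive(2, "bb", "aa")   # "aa" was learned on port 1
```

Nothing but ordinary frames changed the table in this example, which is exactly why its contents depend on who happened to send traffic, and when.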
Troubleshooting. It’s impossible to troubleshoot a bridged network from an end-host; the network is designed to be invisible. The error reporting mechanisms built into most L3 protocols allow an end-host to trace a path across the network, giving the network operator at least an initial snapshot of the network conditions and a troubleshooting starting point.
End-host mobility. The source MAC address snooping (which makes the bridged networks less predictable) allows instant host mobility – as soon as the host is attached to another network segment and sends a broadcast (a gratuitous ARP is a perfect candidate), all bridges readjust their L2 forwarding tables.
You can implement seamless host mobility in a routed network, but the delay is much higher, as the dissemination of changed information is done by the routing protocol.
Impact of link failure. Link failure in a routed network results in temporary loss of traffic forwarded over that link (until routing protocol convergence). Link failure in a bridged network running STP can impact unrelated parts of the network.
TRILL uses a routing protocol (IS-IS); a network built with TRILL RBridges behaves like a routed network.
Impact of physical errors. Most layer-3 routing protocols detect unidirectional links and wiring errors (which usually result in subnet mismatch errors). The same conditions can easily result in a forwarding loop in a bridged network, unless you use UDLD and bridge assurance.
TRILL and other similar technologies no longer have this problem, as they use a routing protocol inside the network.
Impact of network overload. When an L2 switch is overloaded to the point where it stops sending STP packets (for example, due to data plane overload impacting control plane functionality), remote switches might unblock their ports, resulting in a forwarding loop and a total network meltdown.
When a router stops sending routing protocol hello packets, other routers detect a dead neighbor and recompute the network topology (not necessarily resulting in a working network, but at least they’re not aggravating the problem).
Bridge assurance solves this issue, as does TRILL.
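The contrast in the last three paragraphs is one of fail-closed versus fail-open behavior, and can be written down as two tiny decision functions. This is a deliberately simplified sketch: the function names and return strings are invented, the default timer values (40 seconds, 20 seconds) are only illustrative, and the bridge assurance branch models nothing more than "a port that stops hearing BPDUs stays blocked".

```python
def router_on_silence(silence, dead_interval=40):
    """Router: a silent neighbor is removed from the topology."""
    if silence >= dead_interval:
        return "neighbor-down"   # stop using paths through it, recompute
    return "neighbor-up"


def blocked_port_on_silence(silence, max_age=20, bridge_assurance=False):
    """STP blocked port: silence is read as 'the redundant path is gone'."""
    if silence < max_age:
        return "blocking"
    if bridge_assurance:
        return "blocking"        # missing BPDUs keep the port blocked
    return "unblocking"          # port heads toward forwarding: loop risk
```

With the same input (an overloaded but still forwarding neighbor that has gone quiet), the router removes a path while the plain STP port adds one.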
Size of fault domain. The whole bridged network is a single fault domain (a fault anywhere in the network can impact the rest of it). A fault domain in a routed network is a single subnet.
The fault domain issue is usually related to the behavior of STP, but extends to the forwarding plane as well. A single misbehaving host attached to a bridged network can affect the whole network.
Anything else? Have I still missed something? Leave a comment!
You’ll find an even more comprehensive discussion of this topic in the Switching, Routing and Bridging part of the How Networks Really Work webinar.
I've tried to pay some attention to the TRILL technology (and went through Radia Perlman's web video you kindly pointed to a few days ago), and I do think that in the urge to please everyone on the RFC definition, there might still be a door open regarding the impact of physical errors.
In fact, from what I've understood, the IS-IS adjacency between RBridges can cross non-TRILL-enabled segments, making it somewhat different from Cisco's L2MP proposal, but more universal.
As such, I believe that if you happen to have a problem in the cloud between RBridges, knowing that the cloud will "fail open" (its switches are L2 devices running some sort of STP), hello packets might still get through a storm caused, for instance, by unidirectional links, but the network will show the erratic behaviour we've all seen during STP meltdowns.
I'm perhaps pointing to a real corner case (I don't really think that IS-IS hellos would be able to get through the data plane of the "cloud" switches and keep the IS-IS adjacency up for long), but I'd like, if you will, to have your comment/clarification on that.
thank you very much
Gustavo Novais
I think Ethernet OAM&PM is worth mentioning in the troubleshooting section. 802.3ah, 802.1ag and Y.1731.
If someone decides (in his infinite wisdom) to deploy STP-based switched network between TRILL RBridges, he'll suffer exactly the same consequences as someone deploying STP-based switched network between IS-IS routers. It works today and it will work with TRILL, but you'll experience interesting nightmares in both cases.
Obviously just my €0.002, I have no hands-on TRILL experience (but neither has anyone else).
But what is really compared is the CURRENT bridging concept using spanning tree versus routing. Things (and minds) are changing (slowly, as the dominant paradigm is that link state routing at layer two is almost perfect and some people think that no further evolution of bridging is possible beyond link state routing).
But the concept of bridging is evolving (see the ARP Path, aka Fastpath, proposals in the IEEE 802.1 repository, IEEE Communications Letters July 2011, the HPSR 2011 conference, and the demos at SIGCOMM 2011 and LCN 2010), so that while some characteristics of bridging persist (guessing versus calculating, no predictability of the path), other disadvantages disappear: shortest paths are obtained, all links can be used, and link failure does not affect working links.
Simpler, reliable and powerful bridging is possible.