bridging « ipSpace.net blog

Tuesday, September 6, 2011 16:48 +0200

Large-Scale Bridging = Nuked Earth

If you’re not working for a data center fabric vendor, you’ll probably enjoy the excellent analogy Ethan Banks made after reading my TRILL-over-WAN post:

Think of a network topology like a road map. There's boulevards, major junction points, highways, dead ends, etc. Now imagine what that map looks like after it's been nuked from orbit: flat. Sure, we blew up the world, but you can go in a straight line anywhere you want.

... and don’t forget to be nice to the people asking for inter-DC VM mobility ;)

add comment

Thursday, September 1, 2011 06:06 +0200

VXLAN, OTV and LISP

Immediately after VXLAN was announced @ VMworld, the twittersphere erupted in speculations and questions, many of them focusing on how VXLAN relates to OTV and LISP, and why we might need a new encapsulation method.

VXLAN, OTV and LISP are point solutions targeting different markets. VXLAN is an IaaS infrastructure solution, OTV is an enterprise L2 DCI solution and LISP is ... whatever you want it to be.

Imagine the Ruckus When the Hypervisor Vendors Wake Up

It seems that most networking vendors consider the Flat Earth architectures the new bonanza. Everyone is running to join the gold rush, from Cisco’s FabricPath and Brocade’s VCS to HP’s IRF and Juniper’s upcoming QFabric. As always, the standardization bodies are following the industry with a large buffet of standards to choose from: TRILL, 802.1ag (SPB), 802.1Qbg (EVB) and 802.1bh (Port extenders).

EVB (802.1Qbg) – the S component

Update 2021-01-03: IBM implemented EVB in Linux bridge, and Juniper added EVB support to Junos, but I haven't seen (or heard of) a single EVB implementation since I wrote this blog post almost 9 years ago.

The Edge Virtual Bridging (EVB; 802.1Qbg) standard solves two important layer-2-based virtualization issues:

Automatic provisioning of access switches based on hypervisor-signaled information (discussed in the EVB eases VLAN configuration pains article)
Multiplexing of multiple logical 802.1Q links over a single physical link.

Logical link multiplexing might seem a solution in search of a problem until you discover that VMware-related design documents usually recommend using 6 to 10 NICs per server – an approach that either wastes switch ports or is hard to implement with blade servers’ mezzanine cards (due to limited number of backplane connections).

TRILL/Fabric Path – STP Integration

Every Data Center fabric technology has to integrate seamlessly with legacy equipment running the venerable Spanning Tree Protocol (STP) or one of its facelifted incarnations (for example, RSTP or MST). The alternative, called rip-and-replace when talking about other vendors’ boxes or synchronized upgrade when promoting your wares (no, I haven’t heard it yet, but I’m sure it’s coming), is simply indigestible to most data center architects.

TRILL and Cisco’s proprietary Fabric Path take a very definitive stance: the new fabric is the backbone of the network routing TRILL-encapsulated layer-2 frames across bridged segments (TRILL) or contiguous backbone (Fabric Path). Both architectures segment the original STP domain into small chunks at the edges of the network as shown in the following figure:

Don’t Try to Fake Multi-chassis Link Aggregation (MLAG)

Martin sent me an interesting challenge: he needs to connect an HP switch in a blade enclosure to a pair of Catalyst 3500G switches. His Catalysts are not stackable and he needs the full bandwidth between the switches, so he decided to fake the multi-chassis link aggregation functionality by configuring static LAG on the HP switch and disabling STP on it (the Catalysts have no idea they’re talking to the same switch):

Does Bridge Assurance Make UDLD Obsolete?

I got an interesting question from Andrew:

Would you say that bridge assurance makes UDLD unnecessary? It doesn't seem clear from any resource I've found so far (either on Cisco's docs or on Google)."

It’s important to remember that UDLD works on physical links whereas bridge assurance works on top of STP (which also implies it works above link aggregation/port channel mechanisms). UDLD can detect individual link failure (even when the link is part a LAG); bridge assurance can detect unaggregated link failures, total LAG failure, misconfigured remote port or a malfunctioning switch.

Traffic Trombone (what it is and how you get them)

Every so often I get a question “what exactly is a traffic trombone/tromboning”. Here’s my attempt at a semi-formal definition.

Traffic trombone is a term (probably invented by Greg Ferro) that colorfully describes inter-VLAN traffic flows in a network with stretched (usually overlapping) L2 domains.

In a traditional L2/L3 data center architecture with small L2 domains in the access layer and L3 forwarding across the core network, the inter-subnet traffic flows were close to optimal: a host would send a packet toward the first-hop (ingress) router (across a bridged L2 subnet), the ingress router would forward the packet across an optimal path toward the egress router, and the egress router would deliver the packet (yet again, across a bridged L2 subnet) to the destination host.

Local Area Mobility (LAM) – the true story

Every time I mention that Cisco IOS had Local Area Mobility (LAM) (the feature that would come quite handy in today’s virtualized data centers) more than a decade ago, someone inevitably asks “why don’t we use it?” LAM looks like a forgotten step-child, abandoned almost as soon as it was created (supposedly it never got VRF support). The reason is simple (and has nothing to do with the size of L3 forwarding tables): LAM was always meant to be a short-term kludge and L3 gurus never appreciated its potentials.

Layer-3 gurus: asleep at the wheel

I just read a great article by Kurt (the Network Janitor) Bales eloquently describing how a series of stupid decisions led to the current situation where everyone (but the people who actually work with the networking infrastructure) think stretched layer-2 domains are the mandatory stepping stone toward the cloudy nirvana.

It’s easy to shift the blame to everyone else, including storage vendors (for their love of FC and FCoE) and VMware (for the broken vSwitch design), but let’s face the reality: the rigid mindset of layer-3 gurus probably has as much to do with the whole mess as anything else.

How Did We Ever Get Into This Switching Mess?

If you’re confused about the numerous meanings of a switch, you’re not the only one. If you wonder how the whole mess started, here’s the full story (from the biased perspective of a grumpy GONER):

In the early 1980s, there were no bridges or routers. Hosts communicated directly with each other or used intermediate nodes (usually hosts, sometimes dedicated devices called gateways) to pass traffic. Networking engineers’ lives would have remained simple were it not for a few overly bright engineers at DEC who decided their application (LAT) would run directly on layer 2 to make it faster.

Their company imploded (actually, it was sold in pieces) in the previous millennium, but their eagerness to cut corners still haunts every one of us.

PFC/ETS and storage traffic: the real story

Data Center Ethernet (or DCB or CEE, depending on who you are) is a hot story these days and it’s no wonder that misconceptions galore. However, when I hear several CCIEs I highly respect talk about “Priority Flow Control can be used to stop all the other traffic when storage needs more bandwidth”, I get worried. Exactly the opposite is true: you use PFC to stop the overzealous storage traffic (primarily FCoE, but also iSCSI) to make sure you don’t drop it.

Introduction to 802.1Qbb (Priority-based Flow Control — PFC)

Yesterday I wrote that you don’t need DCB technologies to implement FCoE in your network. The FC-BB-5 standard is quite explicit (it also says that 802.1Qbb is the other option):

Lossless Ethernet may be implemented through the use of some Ethernet extensions. A possible Ethernet extension to implement Lossless Ethernet is the PAUSE mechanism defined in IEEE 802.3-2008.

The PAUSE mechanism (802.3x) gives you lossless behavior, but results in undesired side effects when you run LAN and SAN traffic across a converged Ethernet infrastructure.

FCoE and DCB standards

The debate whether the DCB standards are complete or not and thus whether FCoE is a standard-based technology are entering the metaphysical space (just a few more blog posts and they will join the eternal angels-on-a-hairpin problem), but somehow the vendors are not yet talking about the real issues: when will we see the standards implemented in shipping products and will there be a need to upgrade the hardware.

Read more ... (yet again @ etherealmind.com)

see 1 comments

Friday, August 6, 2010 13:32 +0200

How many large-scale bridging standards do we need?

Someone had a “borderless data center mobility” dream a few years ago and managed to infect a few other people, resulting in a networking industry pandemic that is usually exhibited by the following “facts”:

Unhindered Virtual Machine mobility across the globe is the absolute prerequisite for any business agility. Wrong. There are other field-proven solutions and although inter-site VM mobility has been demonstrated, it’s still a half-baked idea and has many caveats.
You can only reach that Holy Grail by extending your layer-2 domains across vast distances. Totally wrong. It would be easier to fix L3 routing and signaling protocols than to invent completely new technologies trying to fix L2 problems. Users of Microsoft NLB are might disagree ... in which case I wish them luck in scaling their architecture.
Large-scale bridging is absolutely mandatory if you want to build cloud solutions with tens of thousands of servers. Not sure about that. Google is there, Facebook, Twitter and Amazon are (at least) close, large web hosting providers have been around for years ... and yet they somehow managed to survive with existing technologies and good network designs.

Just today XKCD published a very relevant comic, so I can skip my usually sarcastic comments and focus on the plethora of emerging large-scale bridging standards and implementations. Let’s walk through them:

Category: bridging