Where would you need bridging in the Data Center
In recent months, there’s been a lot of buzz about next-generation Data Center bridging, including the Earth Is Flat rediscovery from Brocade (I thought that one was settled in the Middle Ages) and a TRILL article in SearchNetworking (which quoted both Greg and me as being on opposite sides of the TRILL debate).
The more I think about this problem, the more I wonder whether we really need large-scale bridging in data centers (it looks like Google can live quite happily without it). We definitely need some bridging, but a generic large-scale inter-site monstrosity? I doubt it.
Please try to help me: forget all the “this is how we do it” presumptions, figure out a scenario where you absolutely need bridging and describe it in the comments.
There are four obvious answers:
- FCoE (you should really use iSCSI);
- Distributed firewalls (more about this one in a separate post);
- Microsoft Network Load Balancing (which should have been banned from the DC environment before it got implemented);
- Live virtual machine migration (aka VMotion if you use VMware).
Anything else?
I've held my ground on DC-to-DC L2 interconnect for a few years at my primary employer. VMotion is perhaps the only application requirement that is compelling enough to convince me to be flexible. I went with a flat "No" on Microsoft NLB.
My employer is toying with the idea of relocating a data center to a new location about 100 miles from the existing one. I'm planning to investigate DCB as a potential migration strategy. Depending on what I learn, we could go this route or take a more traditional dedicated L1 circuit pair. Either way, it's an 18-month effort, and I fully intend to get back to Layer 3 as soon as possible.
It's VMotion and the benefits it can provide for DR (and secondarily, load-balancing) that I feel will drive DCB.
Jeremy
Now we have VMotion, which gives us the ability to move running VMs around, provided we can figure out how to connect two data centers at layer two. But to me, this seems like a solution in search of a problem. I can understand moving VMs around within a data center, but what is the motivation for moving them to a different data center entirely? Why allow such a huge gap between the application and its storage?
Again, I'm not challenging this strategy, just hoping someone could explain it for us non-DC folks. Ivan? :-D
I've worked on quite a few projects in various enterprises over the last 10 years, and these are some of the things the business pushed us to provide L2 bridging for:
- A middleware application used throughout the company that could only work in "same subnet" mode or in multicast mode. It turned out that not a single specialist for that middleware application could be found who knew how to convert it to multicast. There was some Cisco feature developed for it, but it was abandoned years ago. Multicast reconvergence time was also an issue. (2004)
- A huge billing application based on X.25 over Ethernet LLC2 that nobody knew about. (2005)
- A non-routable PLC protocol (a proprietary CLNP derivative) for PLCs performing production-process tasks in a huge plant. No L3 header was provided in the packet structure. (2005)
- Older Windows clusters are not geoclusters; the same applies to some older Unix clusters.
- One data center hosts a huge backup robot. The SAN environment spans three data centers and relies on IP backups; the same subnet was needed.
Before 2004 we just bridged and used PVST+. We had DCs down several times due to STP loops; that was before the arrival of loop guard. Later, StackWise switches were used with MEC. Now there's VSS, and Nexus also offers MEC. I've also worked with redundant VPLS solutions that don't use STP.
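For reference, the loop-protection features that arrived later amount to only a few lines on a Catalyst switch; a minimal sketch (the interface name and threshold value are just examples):

  ! Enable loop guard globally on all point-to-point STP links
  spanning-tree loopguard default
  !
  interface GigabitEthernet1/0/1
   description inter-DC trunk (example)
   ! Per-port alternative to the global loop guard command
   spanning-tree guard loop
   ! Rate-limit broadcast traffic to contain a bridging loop or storm
   storm-control broadcast level 1.00

None of this removes the fundamental risk of a large bridged domain; it just limits the blast radius when something goes wrong.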
As always, we were pushed by the business (OSI layer 8 :-)).
My experience:
- Once you start bridging, it's extremely difficult to go back to a routed environment. In big companies you may never get rid of it.
- Because bridging is so easy, you will find that other departments have put all kinds of other applications and infrastructure in place behind your back, and after a few years you come to the conclusion that 70% of your applications reside on the "bridged infrastructure".
Nowadays every vendor is fighting for its market share. As a result, firewalls offer more and more router functionality, routers offer more and more firewall features, and SAN vendors are probably looking into other markets as well.
Dirk
PS: Even in 2010, I still get involved almost every year in dealing with the effects of an L2 loop in a big company.
L2 is vulnerable, as most of us know: storms, flooding, DDoS... it's one of nature's laws.
I would only bridge if I had my back against the wall or if there was a VERY GOOD reason to do so.
Sometimes it's DR-related; in one recent case it was the idea that a master scheduler would migrate tasks between DCs (moving the task-related VM via VMotion) based on some criteria (load, yadda yadda yadda).