VPLS is a technology, not just a service provider offering

The Internet Exchange and Peering Points Packet Pushers Podcast is as good as the rest of them (listen to it first and then continue reading), but it's also strangely relevant to data center engineers. When you look beyond the peering policies, route servers and BGP tidbits, an internet exchange is the high-performance large-scale layer-2 network some data center switching vendors are dreaming about ... the only difference being that the internet exchanges have to perform extremely well using existing products and technologies, not the shortest-path-bridging futures promised by the vendors.

It was therefore extremely interesting to hear Stephen Wilcox mention VPLS as one of the technology options (more so as it relies on MPLS transport ;). The four-letter word predictably triggered Greg’s service-providers-are-idiots rant (and I agree with him up to a point – it’s hard to trust your data to someone who can’t produce a readable bill at the end of the month), so the more important message unfortunately got lost: VPLS is a technology you can use to build large-scale data center layer-2 domains.

As I explained in the “The big picture with a VPLS example” post (and in the Choose the Optimal VPN Service webinar), VPLS is the technology service providers use to build an any-to-any layer-2 domain across an MPLS backbone (you can even build partially meshed pseudo-LAN networks, but let’s not go there at the moment).

Now imagine that you build your data center core as an IP+MPLS backbone:

  • It uses IP routing, so it’s stable, offers optimal equal-cost load balancing, and responds to outages in seconds (or faster, depending on how well you tune your routing protocol).
  • If you deploy MPLS Traffic Engineering (MPLS TE) on top of the IP core (preferably with automatic fully meshed tunnels and auto-bandwidth), point-to-point traffic loads will be shifted around the core as needed. With proper design, you’ll get far better results than with equal-cost load balancing alone.
  • MPLS TE fast reroute (FRR) gives you the ability to temporarily shift traffic around failure points in milliseconds (two orders of magnitude faster than non-optimized IP routing protocols). A configuration sketch follows this list.
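
Here's a minimal IOS-style sketch of what such a core could look like. It is only an illustration: the interface names, addresses, IS-IS as the IGP, the auto-tunnel template and the access list are all made up, the exact syntax differs between platforms and software releases, and the backup tunnels that would actually protect each link with FRR are not shown.

    ! Enable MPLS TE globally and on every core-facing interface
    mpls traffic-eng tunnels
    !
    interface TenGigabitEthernet1/1
     ip address 10.0.1.1 255.255.255.252
     mpls traffic-eng tunnels
     ip rsvp bandwidth
    !
    ! The IGP (IS-IS in this sketch) has to carry the TE information
    router isis
     net 49.0001.0000.0000.0001.00
     metric-style wide
     mpls traffic-eng router-id Loopback0
     mpls traffic-eng level-2
    !
    ! Automatic full mesh of TE tunnels with auto-bandwidth and FRR
    mpls traffic-eng auto-tunnel mesh
    !
    interface Auto-Template1
     ip unnumbered Loopback0
     tunnel mode mpls traffic-eng
     tunnel destination access-list 1
     tunnel mpls traffic-eng autoroute announce
     tunnel mpls traffic-eng auto-bw
     tunnel mpls traffic-eng fast-reroute
     tunnel mpls traffic-eng path-option 1 dynamic
    !
    ! TE router IDs of the other core routers (made-up range)
    access-list 1 permit 10.0.0.0 0.0.0.255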

Once you have a stable MPLS backbone, you can deploy VPLS over it. Point-to-point pseudowires are established automatically (you should use BGP-based autodiscovery), and the MPLS core takes care of traffic engineering and rerouting around failures.
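
To give you an idea of how little configuration that involves, here's a rough IOS-style sketch of a VPLS instance with BGP-based autodiscovery. The VFI name, VPN ID, AS number and neighbor address are made up, and the details vary with platform and software release:

    ! VPLS forwarding instance; BGP autodiscovery finds the other PE routers
    l2 vfi GREEN autodiscovery
     vpn id 100
    !
    ! Bind the VPLS instance to the local bridging domain (VLAN 100)
    interface Vlan100
     xconnect vfi GREEN
    !
    ! BGP distributes the VPLS endpoint (autodiscovery) information
    router bgp 65000
     neighbor 10.0.0.2 remote-as 65000
     neighbor 10.0.0.2 update-source Loopback0
     address-family l2vpn vpls
      neighbor 10.0.0.2 activate
      neighbor 10.0.0.2 send-community extended

In a real design you'd typically peer with BGP route reflectors instead of configuring every PE-to-PE session by hand.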

Last but definitely not least, the venerable (and lovingly hated) Spanning Tree Protocol stops at the VPLS edge. You might want to use it in the access network, or you could build an STP-free network and connect your hosts directly to the VPLS cloud.

By now you should be wondering what’s stopping VPLS adoption in data center architectures. The reasons are mostly non-technical:

  • VPLS is perceived to be a service provider technology, so it must be complex and unreliable (see above).
  • A VPLS designer/implementer must have thorough knowledge of numerous technologies, including MPLS, IGP and BGP (preferably also MPLS TE and FRR).
  • Good VPLS design is a serious piece of engineering. It’s easier to believe someone will sprinkle a future wonder-technology (TRILL, for example) like fairy dust across your data center core to make it work better with zero design/configuration effort (my kids are Winx Club fans, so I’m very familiar with the Believix charms ;).
  • Some vendors have gaps in their data center portfolio, so they are not keen to talk about VPLS. For example, the only data center switch from Cisco supporting VPLS is the venerable Catalyst 6500 (which is no longer listed as a data center switch).

As you know, I usually suffer from a particularly bad case of vendor-blindness. Tell us in the comments which vendors you would use to build a VPLS network like the one described above ... and don't forget, we're looking for high-end switches with a decent number of GE/10GE ports.

26 comments:

  1. > service-providers-are-idiots

    I feel this statement is too broad. SPs typically have large teams of people working there, and the ones you get to talk to as an outsider are almost always less-technical-but-with-good-people-skills pre-sales and marketing types.

    > you build your data center core as an IP+MPLS backbone

    I think it may turn out to be quite an expensive proposition. You typically need a fair amount of bandwidth in a data centre, and those 40G router line cards are not cheap. I won't even mention the 100G ones.

    > It uses IP routing

    Isn't it possible to get away with CLNS+IS-IS? ;)

    > use BGP-based autodiscovery

    Uh-oh. This is where it gets seriously religious - BGP vs LDP. ;)

    > Spanning Tree Protocol stops at the VPLS edge

    Doesn't have to. You could transparently pass STP BPDUs, block them, or even participate in an attached STP instance. Sometimes that can prove very useful (one case is when someone takes the RX and TX fibres on an access port connected to a VPLS and plugs them together "to check port and cable integrity"). Depending on your vendor's implementation, the port may come up and create an instant line-rate L2 forwarding loop. This has happened - plugging RX and TX together is apparently common practice for telco field staff, who are used to installing and commissioning "old school" transmission gear (think ATM/SONET/SDH). Switching on STP inside the VPLS helped stop it from happening (yes, there are better methods now, but that was "then", not "now").

    Anyway, back to the point - my opinion is that you may be hard-pressed to find equipment that supports VPLS well and has port configurations suitable for DC deployment, at a decent price. I would be glad to be proven wrong.
  2. "Packet Pushers" podcast should have been renamed into "Cisco datacenter with an optional F5 load balancer for dummies" podcast. Actually this is what they have proudly committed to in their Q&A section.
  3. The SP (and IX) I used to work for a few years back used Brocade MLX gear for their MPLS (VLL/VPLS) networks. The port costs were quite reasonable for 1 GigE and 10 GigE line cards. The SONET cards cost about the same as the rest of the populated chassis combined :(

    You could even do 32 x 10 gig LAG if you really needed it....
  4. Excellent feedback, Dmitri ... as always. Thank you!

    Now to the individual points (and we agree on most of them):

    ==> Service Providers are ****** (insert whatever you wish here)

    I completely agree with your explanation. To extend it a bit, there are always good and bad examples (and as I wrote, I was quoting Greg; I've met a few really good service providers), but the general truth is that even the best engineers cannot prove their value to the customer in the wrong environment ... and if an organization cannot get the basics right (their accounting and revenue-collecting machinery), you have to wonder about the quality of all its other internal processes.

    It's not just the skills or commitment of individual engineers that matter here (although they are prerequisites); it's also the capability of the organization to deliver a consistently high-quality experience to its customers.

    ==> IP+MPLS backbone in the DC

    Some of the startups have reasonably priced L3 ports. Even the lowly Cat 6500 could provide you with an IP+MPLS backbone (although not exactly at the price/performance point some people are looking for).

    ==> Need for IP routing

    I'm positive you need IP addressing and IP routing for BGP/LDP, MPLS TE and PW endpoints. If I'm wrong, please let me know.

    ==> BGP-based autodiscovery

    I don't care what protocol you use (as I wrote, I'm vendor-blind and the religion I'm familiar with is based on BGP) as long as endpoint autodiscovery works.

    ==> STP

    Thanks for the tip about TX/RX craziness. BPDU guard should solve that, I guess.
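
    Something along these lines on the access-facing ports, for example (just a sketch; the interface name is arbitrary). The port goes into err-disabled state as soon as a BPDU is received, which also catches the looped-back TX/RX case as long as the switch is sending BPDUs on that port:

        interface GigabitEthernet1/10
         switchport
         switchport mode access
         ! edge port: forward immediately, but err-disable on any received BPDU
         spanning-tree portfast
         spanning-tree bpduguard enable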

    ==> Equipment

    As I wrote, we would both like to be proven wrong. I know you can do what I propose with the Cat 6500, but that's too little these days.
  5. How exactly is your comment contributing to the "can we build it with VPLS" discussion?

    I'm positive the Packet Pushers team will appreciate your questions or your offer to appear as a guest on their podcast to bring us closer to another aspect of networking ... you just need to get in touch with them.
  6. Thank you for the feedback! So the equipment is there to make it work 8-)
  7. Indeed, AMS-IX use the MLX-32 etc. and have been pushing more than 1 Tb/s through their VPLS.

    http://www.ams-ix.net/statistics/

    and the VPLS design is here

    http://www.ams-ix.net/infrastructure-detail/

    My network only peaked in the tens of gigabits/s
  8. So, man up pretty boy and come on the show and tell us how it's done. Use the contact form at http://packetpushers.net to send me your details and you can tell the world what your point of view is.
  9. You should also note that the Foundry gear is quite cheap. But that wouldn't have anything to do with it.

    It does work though and has the necessary throughput. But they are a PITA to configure.
  10. From an MPLS point of view I found them just as easy to configure as a Cisco (easier in some respects).

    BGP is identical. Vlans were handled differently, but certainly not difficult.

    The only thing that annoyed me was that certain configuration directives couldn't be entered until a preceding directive, which required a reboot to take effect, had been applied. So we had to go through about 3 reboots to get to our SOE config. Once the SOE config was fully installed we didn't have to repeat that process, thankfully.


    And yes, they were quite cheap, and that did have a lot to do with my company going with them. I saw the initial quotes from Cisco and Foundry; 90% off comes to mind, especially with what Cisco was trying to get away with price-wise for 10 gig long-range optics.
  11. The problem with VPLS is that you are eventually limited by several things:

    - # of Vlans
    - # of MACs
    - duplicate MAC addresses
    - limitations of the tagging schemes used to overcome the limitations above, such as H-VPLS with QinQ tagging (which can cause duplicate MACs) or with MAC-in-MAC encapsulation

    VPLS/H-VPLS could cause a fair bit of havoc in an environment where HSRP/VRRP is used, as the virtual MAC address is determined by the HSRP group number / virtual router number used.
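
    To illustrate (a made-up example): the HSRP version 1 virtual MAC address is 0000.0c07.acXX, where XX is the group number in hex, so two unrelated HSRP groups that happen to use the same group number share the same virtual MAC. That's harmless while they sit in separate VLANs, and a duplicate-MAC problem the moment QinQ collapses those VLANs into a single bridging domain:

        ! First-hop router in VLAN 10 (site A)
        interface Vlan10
         ip address 192.168.10.2 255.255.255.0
         standby 5 ip 192.168.10.1     ! virtual MAC 0000.0c07.ac05
        !
        ! First-hop router in VLAN 20 (site B)
        interface Vlan20
         ip address 192.168.20.2 255.255.255.0
         standby 5 ip 192.168.20.1     ! same group number = same virtual MAC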
  12. Aren't those the limitations of every L2 solution?

    Just to set the record straight: I'm a quite vocal opponent of the "bridge everywhere" craze that's spreading like wildfire, but if you want to build a large L2 domain (for whatever reason), is VPLS a viable solution or not?
  13. Sam, quite the opposite:

    and Ivan is right about the large L2 domain - a classic example is RTP VPLS for the VoIP traffic from MSAGs - VPLS is what every SP (that I deal with, at least) does for this service

    won't you agree that VPLS scales much better than any L2-switch network?

    - # of Vlans - you can create any number of VPLS instances, and you have the flexibility of not having to drag the VLAN tag into the VPLS.

    - # of MACs - you can use H-VPLS and PBB with the VPLS to scale it

    - you can create any topology you like - full mesh/partial mesh
    - you can use split-horizon groups (a rough sketch follows at the end of this comment)
    - use of active/standby PWs and signal states over the PW
    - basically it's much easier (with the two features above) to create a loop-free topology and avoid STP - which is what most SPs are looking for

    and much more...
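
    For example (a rough IOS-style sketch with made-up names and addresses): in an H-VPLS setup the core pseudowires stay in the default split-horizon group, so frames are never forwarded between them, while the spoke pseudowire towards an access PE is taken out of that group so traffic can flow between the spoke and the core:

        l2 vfi GREEN manual
         vpn id 100
         ! core pseudowires - default split horizon, no forwarding between them
         neighbor 10.0.0.2 encapsulation mpls
         neighbor 10.0.0.3 encapsulation mpls
         ! spoke pseudowire towards the access PE - excluded from split horizon
         neighbor 10.0.1.5 encapsulation mpls no-split-horizon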
  14. > the capability of an organization

    Agree with this. Sadly, too many companies are too concerned with cramming whatever warez they peddle down customers' throats, instead of working out what those customers *really* want and would be happy to pay for.

    > need IP

    I think you're right. My bad.

    > endpoint autodiscovery

    Things may have developed a fair bit since I had a close look last time (which was longer ago than I would like to admit), but LDP-based VPLS didn't use to support endpoint autodiscovery, leaving BGP-based VPLS as the only option if you wanted to have it.

    > Equipment

    Yes, I thought of mentioning Foundry (Brocade) / Extreme, but I haven't looked at them for a long while and I'm not sure how well they fare. Most of the places I know well seem to favour Alcatel-Lucent ESS/SR, and these tend to be on the expensive side. They are very nice and both the speed and the port density are there, but they could be a bit of a capability overkill for a fairly simple DC scenario (what with being purpose-made for the service edge and whatnot).
  15. @Anon: can you help me figure out how to configure a standby (backup) PW for a VPLS neighbor with Cisco IOS? Doing it for a standalone PW is easy; I can't make it work within a VPLS definition.
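
    For reference, this is roughly what the (easy) standalone pseudowire case looks like - a sketch with made-up addresses and VC IDs:

        interface GigabitEthernet1/1
         xconnect 10.0.0.2 100 encapsulation mpls
          ! secondary pseudowire used if the primary peer goes away
          backup peer 10.0.0.3 100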
  16. Last time I checked this was not supported ... but maybe that has changed by now

    Cisco is your friend in the L3VPN world,
    but they lack features with PW/VPLS - I think this is one of them.

    while ALU/JNPR (and maybe others as well) already support terminating a PW into an L3 interface (GRT or VRF, and can even enhance it with active/standby functionality), Cisco is quite behind I'm afraid and requires an extra box
  17. oh yeah, even Huawei is able to terminate PW into VRF :)
  18. I operate a large Brocade NetIron XMR network and I find them significantly easier to configure than our old Cisco 7600s. We offer all services on a single platform with predictable results. At this point, I'd select Brocade over Cisco boxes even if the price was higher.
  19. my issue with VPLS is that:

    when someone says "bridge it using VPLS", you don't really consider duplicate MAC addresses. With H-VPLS using QinQ encapsulation (IEEE 802.1ad), what used to be in different VLANs gets placed into one VLAN, which means you now have duplicate MACs in the same VLAN when that MAC is an HSRP/VRRP MAC, or potentially a randomly chosen virtual machine MAC address.

    What's the behavior of VPLS when there are two or more devices at different locations with the same MAC, which end up in the same VLAN because you used 802.1ad/QinQ to get H-VPLS to scale? The devices with the conflicting MACs have their communication broken... So H-VPLS with QinQ/802.1ad is *more* failure-prone, with respect to duplicate MACs, than just building a full/partial mesh of port channels.

    What's the alternative? H-VPLS with PBB/MAC-in-MAC/802.1ah-2008. Great! What IOS supports it? Only the cutting-edge 12.2SRE train (bugs, anyone?). What cards? Only the most expensive and latest cards, and only in a 7600 router...

    Faced with this, and with duplicate MACs you cannot control, do you implement 7600 PE pairs at $400k/pair and use H-VPLS with PBB, or do you sit down and drill into "do you really need any-to-any connectivity that can support duplicate MACs?"

    Do you go try to control the duplicate MACs? (good luck!)

    Do you rip out VPLS and put in EoMPLS port-mode pseudowires instead (because maybe you really need bridging between 2, not more than 2, locations)?

    Having been on the sharp end of having to fix someone else's hosed H-VPLS-with-QinQ implementation, built BEFORE MAC-in-MAC was available (circa 2008), in an environment where the service had to deal with duplicate MACs used on things that should not have been geographically split over a bridged WAN connection but were (i.e. HSRP router pairs, VRRP firewall pairs), given that the requirement was really for only 2 locations, given the futility of trying to control the server admins' choice of MAC addresses, the standby group numbering of external customers, etc., and given the cost of retrofitting H-VPLS with PBB on expensive new cards running the latest SRE-train IOS, I ripped out a complicated H-VPLS-with-QinQ implementation and replaced it with EoMPLS port-mode pseudowires on routed ports. And I even had the full stamp of approval of the TMEs at Cisco whom Advanced Services consulted on it.
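
    For comparison, a port-mode EoMPLS pseudowire is almost a one-liner per site (a sketch with made-up addresses and VC ID) - everything received on the physical port, VLAN tags and all, is carried transparently to the other end:

        ! PE router at data center 1
        interface GigabitEthernet2/1
         no ip address
         xconnect 10.0.0.2 200 encapsulation mpls
        !
        ! PE router at data center 2 (mirror image)
        interface GigabitEthernet2/1
         no ip address
         xconnect 10.0.0.1 200 encapsulation mpls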
  20. So I guess... given unlimited money and a requirement to support duplicate MACs in different VLANs, given a great engineering team that all understand MPLS and VPLS, and given a requirement to support any-to-any bridging to 3 or more locations... yes, H-VPLS with PBB/802.1ah... or possibly A-VPLS, but I need to read up on that more...


    Given other constraints (money is not unlimited, there are really only 2 locations, the engineering team are not service provider engineers well versed in MPLS/VPLS but enterprise engineers who say "VFI-what?"), maybe I look for a tried-and-true, simpler solution for the bridging I need, or (my personal favorite) go attack the IMHO specious need for bridging in the first place. In one instance it was EoMPLS instead, and it's been solid as a rock with no issues, simple to understand for the enterprise engineers who support it, and it ultimately only connects 2 locations, so the any-to-any VPLS benefit is not needed.
  21. Thanks for the extensive feedback. Now I understand where you got burnt ... so did I, and I didn't even need VPLS to hit the Q-in-Q problem; plain bridging between a few switches was more than enough.

    So it seems you've been faced with stretched L2 domains AND duplicate MAC addresses. Stretched L2 domains are (in my book) the utmost stupidity someone could commit, running FHRP across them is going one step further, and throwing VPLS into the mix is obviously the grand finale.
  22. Sam,

    Well, I thought it was a technology discussion... :)

    Sounds like you had a bad experience with CSCO, and that's OK - I can relate

    but you get much more for less money if you use other vendors (IMO at least)

    Will the world stop being so Cisco-centric already?! ;)
  23. a plain PW and VPLS are two completely distinct services - the first is P2P, hence no MAC learning is needed and it's basically more of an L1 service than an L2 one...

    the second is a multipoint service - so compare the two and you're back to the apples-to-pears story

    I don't understand why you're stuck on the QinQ story - you can work around it with a VPLS instance per VLAN, qualified learning, and some vendors even support (or plan to support) VLAN ranges mapped into a VPLS instance

    anyhow, the thing is it's not the VPLS's "fault" - you can find bad design everywhere...

    I do agree VPLS should not replace a well-designed L3-segmented network. But I do think VPLS is your best alternative for a large L2 domain
  24. as a customer, do you really want to contact the service provider every time you want to bridge one more VLAN over your VPLS service?

    Obviously not. Do you want to have the VPLS service break your network because of duplicate MACs?

    When you have had dumb designs in the past that split FHRPs or, worse, split HA clusters of firewalls, load balancers, servers, etc., duplicate MACs are a fact of life. No, these things should not be split over a WAN ... but people ignore good advice all the time. If you need to BRIDGE between data centers, the technology needs to be able to deal with whatever bad behavior has been implemented in the past.

    H-VPLS and VPLS are not it, not until you can do H-VPLS with MAC-in-MAC encapsulation, because only H-VPLS with MAC-in-MAC encapsulation gives you the actual isolation: the service can deal with dumb things like duplicate MACs, scales to a fairly arbitrary number of customer MACs, and lets the customer make whatever changes (good or bad) in their network without the VPLS service causing a problem and without having to contact the service provider every time they bridge an additional VLAN.

    MAC-in-MAC is a fairly recent development (2008) and is only now being implemented in products and deployed ... still buggy.
  25. for a large L2 domain, yes, VPLS is probably your best choice IF you really need multipoint any-to-any connectivity. If you need point-to-point connectivity, VPLS is probably overkill complexity-wise and cost-wise, and only recently (in the form of H-VPLS with MAC-in-MAC) has VPLS been able to fully isolate the customer network from the service provider network in terms of VLANs used, MAC addresses carried, etc.

    I think that in most instances where data center interconnect for bridging is the driver for VPLS services, the number of data centers == 2, so how valid is the argument for any-to-any connectivity across all data center interconnect designs?

    If you are doing data center consolidation, do you attack the consolidation of multiple data centers into fewer all at once? Not in any of the efforts I have been involved in. You typically *divide* a problem to conquer it, which means that while you may need temporary bridging between data centers for consolidation, it is limited, and you typically consolidate one DC into another at a time, so you typically need temporary point-to-point bridging, not permanent bridging.
  26. the issues of VPLS and H-VPLS scalability, dealing with duplicate MACs, and full encapsulation of MACs to completely isolate customer and SP networks are technology issues, not vendor-centric ones.