MPLS/VPN-over-GRE-over-IPSec: Does It Really Work?

Short answer: yes, it does.

During the geeky chat we had just after we’d finished recording the Data Center Fabric Packet Pushers podcast, Kurt (@networkjanitor) Bales asked me whether the MPLS/VPN-over-DMVPN scenarios I’m describing in the Enterprise MPLS/VPN Deployment webinar really work (they do seem a bit complex).

I always test the router configurations I use in my webinars and I usually share them with the attendees. The Enterprise MPLS/VPN Deployment webinar includes a complete set of router configurations covering 10 scenarios, including five different MPLS/VPN-over-DMVPN designs, so you can easily test them in your lab and verify that they do work. But what about a live deployment?

To be honest, we don’t have a large-scale MPLS/VPN-over-DMVPN live deployment yet (if you do, please share as much as you can in the comments), but Phase 1 DMVPN is not much different from MPLS/VPN-over-(P2P)GRE-over-IPSec ... and we’ve built a 1500+ site network using that solution for one of our customers.

It all started pretty innocently: the customer wanted to reduce costs by replacing their Frame Relay/ATM core with MPLS/VPN WAN services offered by the local service providers. They have to keep different departments using their network strictly separate and MPLS/VPN was the only scalable solution (you don’t want to hear about the design I did for them 15+ years ago).

None of the service providers that they could use was able to provide Carrier’s Carrier services; some of them were severely limited in their routing options (BGP? What BGP? How do you spell that?). As the customer wanted to be totally provider-independent, GRE tunnels were the only option – if you use connected interfaces as tunnel sources, you don’t have to exchange any routing information with the WAN connectivity provider. By building multiple parallel GRE infrastructures (one over each SP network), our customer got total WAN independence and is able to mix-and-match service providers on an as-needed basis (they only have to make sure critical sites are connected to at least two providers for redundancy reasons). It’s amazing how that ability helps you in the negotiation process.

Transporting sensitive data across IP infrastructure operated by the service providers was never an option, so we had to add IPsec to the mix, resulting in the stack mentioned in the article title.
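To get a feel for what that stack does to packet sizes, here is a rough back-of-the-envelope sketch in Python. It is not taken from the customer's design: it assumes IPv4, a plain 4-byte GRE header (no key or checksum), ESP with AES-CBC and HMAC-SHA1-96, no NAT traversal, and two MPLS labels as the worst case. The function names are made up for this illustration.

```python
# Rough per-packet size estimate for MPLS/VPN-over-GRE-over-IPsec.
# All defaults below are assumptions, not values from the original deployment.
IP_HEADER = 20          # IPv4 header without options
GRE_HEADER = 4          # basic GRE header, no key/checksum/sequence
MPLS_LABEL = 4          # one MPLS label stack entry
ESP_HEADER = 8          # SPI + sequence number
ESP_TRAILER_FIXED = 2   # pad-length + next-header bytes


def pad_to_block(length, block):
    """Round length up to the next multiple of the cipher block size."""
    return -(-length // block) * block


def encrypted_size(customer_packet, labels=2, tunnel_mode=True,
                   iv=16, block=16, icv=12):
    """Size on the wire of one customer IP packet after label imposition,
    GRE encapsulation and ESP encryption.

    iv/block default to AES-CBC, icv to HMAC-SHA1-96 (assumed transforms).
    """
    gre_payload = GRE_HEADER + labels * MPLS_LABEL + customer_packet
    # Tunnel mode encrypts the GRE packet's own IP header as well;
    # transport mode reuses that header as the outer header.
    plaintext = gre_payload + (IP_HEADER if tunnel_mode else 0)
    ciphertext = pad_to_block(plaintext + ESP_TRAILER_FIXED, block)
    return IP_HEADER + ESP_HEADER + iv + ciphertext + icv


def max_customer_packet(link_mtu=1500, **kwargs):
    """Largest customer packet that still fits into link_mtu unfragmented."""
    size = link_mtu
    while size > 0 and encrypted_size(size, **kwargs) > link_mtu:
        size -= 1
    return size


print(encrypted_size(1400))                      # IPsec tunnel mode
print(encrypted_size(1400, tunnel_mode=False))   # IPsec transport mode
print(max_customer_packet())
```

The exact numbers depend on the transforms and the label stack you actually use, so treat the result as a starting point for choosing tunnel MTU values and verify it in the lab.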

Was it easy? Definitely not. Most of the problems were caused by the scale of the project: if you want to run IPsec at gigabit speeds, you need hardware encryption. When we were building the network, Catalyst 6500 was the only reasonable option ... but while it can easily handle MPLS/VPN or GRE or IPsec, it hiccups when you try to do all three things on a single packet. In the end, we had to deploy a dual-tier architecture similar to this design.

Device configuration was also a challenge: when adding a new site, you have to add bits-and-pieces of configuration to multiple boxes (including the firewalls I haven’t even mentioned yet) and relying on a manual configuration process would quickly result in a total mess. The solution: a configuration builder, a custom-developed tool that accepts a few parameters describing a new site (or modified parameters of an already deployed site) and generates the configuration snippets that are then downloaded to the network devices.
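The actual tool is not shown here, but the idea can be sketched in a few lines of Python. Everything in this sketch is hypothetical: the `Site` and `Uplink` classes, the `GRE-PROTECT` IPsec profile name, the interface names and the documentation-range addresses are assumptions made for illustration, and a real builder would also have to generate the hub-side, firewall and MPLS/VPN (VRF, BGP) snippets.

```python
from dataclasses import dataclass, field

IPSEC_PROFILE = "GRE-PROTECT"  # hypothetical IPsec profile name


@dataclass
class Uplink:
    """One GRE-over-IPsec tunnel from a site to a hub across one SP."""
    tunnel_id: int
    provider: str
    source_interface: str   # connected WAN interface used as tunnel source
    hub_address: str        # tunnel destination on the hub
    tunnel_address: str     # IP address of the tunnel interface
    tunnel_mask: str = "255.255.255.252"


@dataclass
class Site:
    name: str
    uplinks: list = field(default_factory=list)


def build_tunnel_config(site, uplink):
    """Return the spoke-side tunnel interface snippet for one uplink."""
    lines = [
        f"interface Tunnel{uplink.tunnel_id}",
        f" description {site.name} to hub via {uplink.provider}",
        f" ip address {uplink.tunnel_address} {uplink.tunnel_mask}",
        f" tunnel source {uplink.source_interface}",
        f" tunnel destination {uplink.hub_address}",
        " mpls ip",
        f" tunnel protection ipsec profile {IPSEC_PROFILE}",
    ]
    return "\n".join(lines)


def build_site_config(site):
    """Concatenate the snippets for all uplinks of a site."""
    return "\n!\n".join(build_tunnel_config(site, u) for u in site.uplinks)


example_site = Site(
    name="Branch-042",
    uplinks=[
        Uplink(101, "SP-A", "GigabitEthernet0/0", "192.0.2.1", "10.255.1.2"),
        Uplink(201, "SP-B", "GigabitEthernet0/1", "198.51.100.1", "10.255.2.2"),
    ],
)
print(build_site_config(example_site))
```

Keeping the site parameters in one place and generating every snippet from them is what makes it practical to touch several devices per site without the configurations drifting apart.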

15 comments:

  1. One of the "Critical Things" that people tend to forget while doing anything (anything means all domains in life, including developing apps, configuring networks, civil engineering etc.) is that the simplest way to find the solution is THE best solution.

    But unfortunately, the general perception is that the more complex you get, the more expert you are. :( I agree with the expert part but not an Intelligent Expert. 8-)
  2. Absolutely agree with all you say. Do you see a simpler solution satisfying the following business requirements:

    (A) Using IP connectivity (MPLS/VPN services or otherwise) from multiple SPs and being completely independent from them and their (in)capabilities of supporting customer's routing and convergence requirements;
    (B) Encrypting sensitive traffic;
    (C) Maintaining strict isolation between departments.
  3. I don't mean to get into a conflict here or anything, and please don't take this personally, as I just expressed a thought that came to my mind. O:-)

    These conditions "completely independent from ISP" and "(in)capabilities ....". It's like someone thinks of themselves as the greatest expert and just doesn't believe in anyone else's capabilities. I understand that there are incapabilities when dealing with an ISP but, as I said, we have to work with people as well (to get things resolved) rather than going around people.

    For instance, if I think that my organization's network team is not capable enough to handle STP/L3 routing issues, then I can just configure some workaround through Flex Links or EEM to do the job. I am sure somebody can come up with a working solution, but the real solution is to get the right people or train the existing ones.
  4. Now is my time to say "I didn't want this to become confrontational". Obviously I was having a harried day yesterday O:-)

    Unfortunately, the reality of MPLS/VPN services (as offered by some SPs) is that they simply cannot satisfy the customers' needs. If the only routing option a SP offers is "OSPF or static routes" and you want to use two SPs, you're (almost) stuck.

    It's not that the engineers working for that particular SP would be bad. They are usually pretty good engineers and some of them are great people. However, they have to live with the business reality (read: service definition) of their organization and can't help you even when they would know how to.

    Anyhow, thanks for the nudges - you gave me food for at least 3 additional blog posts on this topic.
  5. There is a very large deployment at a company called First Data. They use 2547overDMVPN and Cisco is well aware of this deployment. ASR platforms are being used to deliver high-performance routing in this environment.
  6. One of the biggest problems with MPLSoDMVPN is around the establishment of spoke-to-spoke tunnels, i.e. it is not really supported. So voice, for example, has some issues; going back via a hub is not always ideal.

    Now there is this:
    http://www.cisco.com/en/US/docs/ios/interface/configuration/guide/ir_mplsvpnomgre.html#wp1074480

    NHRP and an IGP are no longer needed and the NBMA address is gleaned from BGP. This combined with GET VPN really gives me hope for MPLSomGREoIPSEC.

    I have labbed it up and it seems to work fantastically. The only thing that is hard to deal with is MTU and ensuring that the encryption is always done in the fast path by not fragmenting after GRE encapsulation or after encryption.
  7. There are at least three ways (documented in my webinar, together with tested lab configs) to get spoke-to-spoke traffic flowing directly in MPLSoDMVPN environment ... but admittedly you might stumble across someone claiming it's not supported (that's always a great excuse when trying not to focus on the problem).

    MPLS over mGRE is another great solution (also covered in my webinar 8-) ) which works best when you have only MPLS traffic. If you have to add a few VPNs on top of existing DMVPN network, it's hard to justify re-engineering the whole network. Also, GETVPN is not working on Cat 6500 (at least it did not when I last checked), which many people use as the hub encryption platform.
  8. "Not supported" was the official stance from our Cisco account rep about 6 months ago.

    But we did get it going in the lab.

    We would be looking to deploy with ASR 1000 hubs and 3900/ASR 1000 series spokes.
    That said, I haven't checked the latest XE to see if it has support.

    What do you think is the best way to tackle MTU? We may be fortunate in that we have a core layer behind the spoke PE where we can reduce the MTU, as we cannot fragment at the same time as label imposition with the above feature.
  9. I would usually set the MTU (+ mpls mtu if needed) on the GRE tunnel interface, which would cause the ingress PE-router to fragment original packets (or send back an ICMP reply) before labeling and encryption. Combined with "ip tcp mss" it solves almost all problems (oversized UDP packets might still be a problem, but usually minor).

    Are you saying this does not work for you? If so, what's the problem?
  10. Ivan,

    This is a very impressive combination of technologies used to create a cool solution, but would this, for example, be the way to go for a Tier 2 or 3 ISP that didn't have deep enough pockets to run its own Layer 1 connectivity (DWDM/SDH etc.) and instead chose to get IP services from T1 providers in the form of MPLS pseudowires?

    Thanks,
  11. Pseudowires are probably a better choice in your scenario. MPLS-over-GRE is really useful only when you have no other transport option but IP connectivity.
  12. How many VRFs were we talking about here? Would another solution be a DMVPN per VRF, if there are fewer than, say, 3? Is that possible?
    Replies
    1. Per-VRF DMVPN tunnel is perfectly doable and works well if the number of VRFs is small (but remember: you'll also have to run a routing protocol per VRF).
  13. I am having issues with MPLS over GRE on the PFC3B that the SUP32 is equipped with. Does it actually work, and is it supported? The SUP32 datasheet indicates it does, with somewhat lower performance compared to the SUP720.

    Any comments/suggestions will be really helpful. Many thanks in advance.