Scalability of Common Services MPLS/VPN topology

Nosx added a very valid point-of-view to the MPLS/VPN Common Services Design that uses a shared common service Route Target across numerous client VRFs:

This is an overly complex and unsupportable approach to shared services. Having to touch thousands of VRFs to create a shared services VPN is unacceptable. The correct approach is to touch only the "services" vrf, and import/export to each RT that you wish to insert the services into.

As always, the right answer is “it depends.” If you have few large customers, it makes way more sense to add their RTs to the common services VRF. If you have many small customers, adding RTs to the common services VRF does not scale.
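To make the trade-off concrete, here is a minimal Python sketch (not part of the original design) that counts the route targets attached to every route exported from the common services VRF under each approach. The RT values, the design labels and the function names are hypothetical; only the counts matter.

```python
# Hypothetical route-target values, chosen only for illustration.
SERVER_RT = "65000:100"   # single RT exported by the common services VRF


def customer_rt(customer_id: int) -> str:
    """Made-up per-customer route target."""
    return f"65000:{1000 + customer_id}"


def services_route_targets(design: str, customers: int) -> list[str]:
    """Route targets attached to each route exported from the services VRF.

    "shared-rt":       every customer VRF imports one common RT, so the
                       services routes carry a single RT, but every
                       customer VRF has to be touched.
    "per-customer-rt": the services VRF exports to each customer RT, so
                       only one VRF is touched, but the RT list grows
                       with the number of customers.
    """
    if design == "shared-rt":
        return [SERVER_RT]
    if design == "per-customer-rt":
        return [customer_rt(c) for c in range(1, customers + 1)]
    raise ValueError(f"unknown design: {design}")


if __name__ == "__main__":
    for design in ("shared-rt", "per-customer-rt"):
        rts = services_route_targets(design, 1000)
        print(design, len(rts))
```

With a handful of customers the second list stays short; with a thousand customers every services route would carry a thousand route targets.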

Ignoring for a moment the great fun you’d have trying to troubleshoot an MPLS/VPN network where some BGP routes have hundreds of route targets (so much for supportable), you’re bound to hit some hard limits as you increase the number of customers.

Route targets are propagated around the network as extended community attributes attached to BGP prefixes. You get the first hiccup when you have around 100 route targets attached to a route: the extended community attribute gets too large and no longer fits into the small buffers. Every router receiving the BGP update will generate a syslog message like this:

%BGP-6-BIGCHUNK: Big chunk pool request (832) for extcommunity. Replenishing with malloc

At around 500 route targets, the BGP path attributes (primarily the extended community attribute) get larger than the maximum size of a BGP UPDATE message (4096 bytes, as specified in RFC 4271). At that point, the originating router refuses to send the update and every other PE-router loses the routes to the common services VRF:

%BGP-5-BGP_MAX_MSG_LIMIT: BGP failed to send update message because the message size reached bgp maximum message size 4096.

I’ve tested the limits with IOS release 15.0M. I’m not sure whether older releases would degrade as gracefully as 15.0M does.

Conclusion: the theoretical limit of the “add customer RT to CS VRF” design is 500 customers.
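The figure of roughly 500 can be sanity-checked with some back-of-the-envelope arithmetic. The sketch below uses the fixed sizes from the BGP specifications (19-byte message header, 8-byte extended communities); the `other_attributes` value is a hypothetical allowance for everything else in the update (AS path, next hop, the VPNv4 prefix itself), not something measured in the test.

```python
# Rough estimate only; the overhead passed to max_route_targets() is a guess.
BGP_MAX_MESSAGE = 4096     # maximum BGP message size (RFC 4271)
BGP_HEADER = 19            # marker (16) + length (2) + type (1)
UPDATE_FIXED = 4           # withdrawn routes length (2) + total path attribute length (2)
ATTR_HEADER_EXT = 4        # flags (1) + type code (1) + extended length (2)
EXT_COMMUNITY_SIZE = 8     # each extended community, including a route target (RFC 4360)


def max_route_targets(other_attributes: int = 0) -> int:
    """Upper bound on the route targets that fit into one UPDATE message.

    other_attributes is a hypothetical byte count for all other path
    attributes and the NLRI carried in the same update.
    """
    available = (BGP_MAX_MESSAGE - BGP_HEADER - UPDATE_FIXED
                 - ATTR_HEADER_EXT - other_attributes)
    return max(available // EXT_COMMUNITY_SIZE, 0)


def ext_community_bytes(route_targets: int) -> int:
    """Size of the extended community attribute value for a given RT count."""
    return route_targets * EXT_COMMUNITY_SIZE


if __name__ == "__main__":
    print(max_route_targets())       # absolute ceiling with nothing else in the update
    print(max_route_targets(100))    # assuming 100 bytes of other attributes
    print(ext_community_bytes(104))  # 832, the size reported in the BIGCHUNK message
```

The absolute ceiling works out to 508 route targets, and any realistic amount of additional attributes pushes it just below 500. Likewise, 832 bytes would correspond to 104 route targets if the requested chunk held nothing but extended communities, which agrees with the "around 100" observation.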

Does it matter?

The Common Services VPN topology is hard to use in typical service provider networks as it requires non-overlapping customer address space (as Nosx said: “It created far more problems than it solved, and better solutions (more secure, more scalable, more manageable) are available now”).

It still has some utility in environments with coordinated address space (large enterprise/governmental networks) as described in my Enterprise MPLS/VPN Deployment webinar.

7 comments:

  1. “better solutions (more secure, more scalable, more manageable) are available now”

    What are some examples of solutions that would match the above?
  2. It's probably best to read his comments.

    They were using Common Services topology for managed VoIP and replaced it with VRF-aware SBC.

    Another common use of Common Services is managed network service (outsourced network management, now known as Management as a Service or Cloud-Based-Management :-P ). Yet again, it's simpler to start a per-customer VM with two NICs (one in the customer network, one facing the central NMS) that does all the polling.
  3. This goes beyond just SP networks though. I don't see an issue with adding Common RTs to the "Cust" VRF when used in the enterprise, for services separation and whatever policies are in place. So again, the "it depends" comes into play a lot in this stuff :)
  4. Makes sense. The first service that popped in my mind was VoIP. Previous employer had Verizon MPLS with SIP trunking and my current employer does as well. As best as I can remember at my previous job we were using the same /24 network to reach Verizon's SBC as we are at my current job over the MPLS.
  5. BTW, on the second limit (max msg size), wait and see if http://tools.ietf.org/html/draft-ymbk-bgp-extended-messages-02 goes through...
  6. Ivan: You may have surmised by now I work for a large application service provider and we have thousands of customers. Shared services and customized delivery of those services is critical to our bottom line. There is one other concern with modifying the shared-services VRF: risk and change-management. Why touch the one shared thing that could potentially jack-up service for thousands? Better to reduce the scope of potential impact by creating individual VRFs for customers and modifying those as necessary...
  7. My employer also runs a VoIP hosted application service. Originally our small customers were content connecting to us via the open internet. Our newer and larger customers are demanding private connections.

    We were going down the road of Shared Services / aka Extranet VRF. Since we are application layer people we didn't really know the pros and cons. Our main network provider is not that helpful either since they only sell a standard service with no customization.

    It seems from the above that going with a VRF- or VLAN-aware SBC may be a better option for a VoIP-only scenario.