Load sharing in MPLS/VPN networks with route reflectors

Some of the e-mails and comments I received after writing the “Changing VPNv4 route attributes” post illustrated common MPLS/VPN misconceptions, so it’s worth addressing them in a series of posts. Let’s start with the simplest scenario: load sharing toward a multi-homed customer site. We’ll use a very simple MPLS/VPN network with three customer sites, four CE-routers, four PE-routers and a route reflector:

Let’s assume that we use the default MPLS/VPN RT/RD design rules: one RD and one import/export RT per simple VPN. The IPv6 (or IPv4) default routes received by PE-A and PE-B are transformed into VPNv6 (or VPNv4) routes ([RD]::/0 or RD:0.0.0.0/0) and sent to RR.

RR receives two identical VPNv6 (or VPNv4) routes from two sources (PE-A and PE-B), installs both of them in its BGP table, selects the best one and sends the best one to the other BGP neighbors. PE-C and PE-D thus receive only a single default route and forward all traffic toward PE-A or PE-B (based on the decision BGP made on RR). There is absolutely no way to change the RR behavior – it’s one of those BGP rules that nobody wanted to touch (yet): only the best routes in the BGP table are propagated to BGP neighbors.

The above statement is not entirely correct: the BGP Best External feature violates that rule and advertises the best external route even when a better internal route exists.
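Setting Best External aside, the route reflector behavior described above can be sketched in a few lines of Python. This is a toy model, not a BGP implementation: the `VpnRoute` class, the `reflect` function, the RD value `65000:1` and the "lowest router ID wins" tie-breaker are all assumptions made for illustration (the real BGP decision process compares many attributes before it gets that far).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VpnRoute:
    rd: str         # route distinguisher (made-up value)
    prefix: str     # IP prefix inside the VPN
    next_hop: str   # PE-router that advertised the route
    router_id: int  # stand-in for the whole BGP tie-breaking process


def reflect(received):
    """Return what the RR advertises: one best path per (RD, prefix)."""
    best = {}
    for route in received:
        key = (route.rd, route.prefix)
        current = best.get(key)
        # Simplification: the lowest router ID wins.
        if current is None or route.router_id < current.router_id:
            best[key] = route
    return list(best.values())


# PE-A and PE-B use the same RD, so the RR sees two paths to ONE route.
same_rd = [
    VpnRoute("65000:1", "0.0.0.0/0", "PE-A", 1),
    VpnRoute("65000:1", "0.0.0.0/0", "PE-B", 2),
]
print([r.next_hop for r in reflect(same_rd)])  # ['PE-A']
```

Both paths stay in the RR's table, but only one of them leaves it, which is why PE-C and PE-D never learn about the second exit point.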

To enable PE-C and PE-D to forward traffic toward PE-A and PE-B, you have to make the two default routes somehow different. The only trick that works is changing the RD on one of them:

  • PE-A advertises the default route received from CE-A as [RD1]::/0 (or RD1:0.0.0.0/0)
  • PE-B advertises the default route received from CE-B as [RD2]::/0 (or RD2:0.0.0.0/0)
  • RR receives two different routes (within the VPNv6 address family, [RD1]::/0 and [RD2]::/0 are different routes) and propagates both of them to PE-C and PE-D.
  • PE-C and PE-D receive both routes and import both of them into the same VRF (remember: imports are based on RT, not RD), enabling true load sharing toward PE-A and PE-B.
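The import step in the last bullet is the part people tend to get wrong, so here is a minimal Python sketch of it. Everything in it is hypothetical: the `VpnRoute` class, the `import_into_vrf` function, the RD and RT values, and the extra route from "PE-X" (a route from an unrelated VPN, added only to show the RT filter at work).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VpnRoute:
    rd: str                   # makes the VPN route unique, nothing more
    prefix: str               # IP prefix inside the VPN
    route_targets: frozenset  # export RTs attached by the advertising PE
    next_hop: str             # advertising PE-router


def import_into_vrf(vpn_routes, import_rts):
    """Collect paths per plain IP prefix; the RD is ignored on import."""
    vrf_bgp_table = {}
    for route in vpn_routes:
        if route.route_targets & import_rts:  # at least one RT matches
            vrf_bgp_table.setdefault(route.prefix, []).append(route.next_hop)
    return vrf_bgp_table


# Routes PE-C receives from the RR (all values are made up).
from_rr = [
    VpnRoute("65000:1", "0.0.0.0/0", frozenset({"65000:100"}), "PE-A"),
    VpnRoute("65000:2", "0.0.0.0/0", frozenset({"65000:100"}), "PE-B"),
    VpnRoute("65000:9", "0.0.0.0/0", frozenset({"65000:999"}), "PE-X"),
]
vrf = import_into_vrf(from_rr, import_rts=frozenset({"65000:100"}))
print(vrf)  # {'0.0.0.0/0': ['PE-A', 'PE-B']}
```

Once the RD is stripped, the two VPN routes collapse into two paths toward the same default route within the VRF, while the route carrying a different RT never gets in.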

You have to configure BGP load sharing with the maximum-paths ibgp number router configuration command within the IPv4 VRF address family on PE-C and PE-D; otherwise they will not insert more than one BGP route into the VRF IP routing table (even though two routes are present in the BGP table).
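The difference between "present in the BGP table" and "installed in the VRF routing table" can be illustrated with another toy Python model. The `install_paths` function and its `maximum_paths` parameter are hypothetical names that merely mimic the effect of the configuration command. The sketch also assumes the paths are sorted best-first and are all eligible for multipath, whereas a real router additionally checks that the candidate paths are sufficiently similar.

```python
def install_paths(bgp_paths, maximum_paths=1):
    """Return the next hops copied from the BGP table into the VRF routing table."""
    if maximum_paths < 1:
        raise ValueError("maximum_paths must be at least 1")
    # Simplification: paths are already ordered best-first and multipath-eligible.
    return bgp_paths[:maximum_paths]


# BGP table of the VRF on PE-C after both routes have been imported.
bgp_table = {"0.0.0.0/0": ["PE-A", "PE-B"]}

default_behavior = install_paths(bgp_table["0.0.0.0/0"])
with_multipath = install_paths(bgp_table["0.0.0.0/0"], maximum_paths=2)

print(default_behavior)  # ['PE-A']
print(with_multipath)    # ['PE-A', 'PE-B']
```

Without multipath the second path is still there as a backup, but it carries no traffic until the first one disappears.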

More information

If you’re considering MPLS/VPN deployment in your enterprise network, register for my Enterprise MPLS/VPN Deployment webinar.

If you were not familiar with this trick and plan to implement MPLS/VPN networks, I would strongly recommend reading my MPLS and VPN Architectures book (based on the technologies you want to implement, you might want to read Volume 2 as well). Definitive MPLS Network Designs is also a good choice if you’re involved in MPLS network design.

10 comments:

  1. Hopefully we'll see some BGP Add-Path implemented in more "real production" deployments along with PIC :) It's been a while since they have been kicking it down the IETF hallways...

  2. in case you need to load share using bgp multipath, see
    http://www.cisco.com/en/US/docs/ios/12_2t/12_2t11/feature/guide/ft11bmpl.html
    or, Ivan, is it load balancing? :-E
    http://puck.nether.net/pipermail/cisco-nsp/2006-November/036134.html
    please, enlighten us, I hate to speak with impreciseness :)

  3. Yes using different RDs to advertise unique VPNv4 prefixes through RR
    http://blog.shafagh.com/2010/09/05/bgp-multipath-part-two/

  4. I meant actually this one! ;)
    http://blog.shafagh.com/2010/09/07/bgp-multipath-part-three/

  5. Who am I to go against the community-agreed terminology ;) Changed the heading and the text (thank you!).

    Also, you HAVE TO enable IBGP multipath on PE-C and PE-D, otherwise they will not install additional BGP routes into the VRF routing table. Somehow I had the feeling it wasn't needed, but a quick lab test proved otherwise.

  6. Thanks for the links!

  7. FYI, it still says "load balancing" in the RSS title

  8. That takes a while ... Eventually it gets fixed 8-) Ah, the beauties of mashups :-P

  9. Precisely. You read my thoughts 8-)
    I hope I get green light to implement it in a production network (2 8M IMA bundles)...



Ivan Pepelnjak, CCIE#1354, is the chief technology advisor for NIL Data Communications. He has been designing and implementing large-scale data communications networks as well as teaching and writing books about advanced technologies since 1990. See his full profile, contact him or follow @ioshints on Twitter.