LDP-IGP synchronization in MPLS networks

A reader of my blog planning to migrate his network from a traditional BGP-everywhere design to a BGP-over-MPLS one wondered about potential unexpected consequences. The MTU implications of introducing MPLS in a running network are usually well understood (even though you could get some very interesting behavior); if you can, increase the MTU size by at least 16 bytes (4 labels) and check whether the configured MTU includes the L2 header. Another, somewhat more mysterious, beast is the interaction between IGP and LDP, which can cause traffic disruptions after the physical connectivity has been reestablished.
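As a quick aside on the MTU arithmetic: each MPLS label stack entry is 4 bytes long, which is where the 16 bytes for 4 labels comes from. The helper below is only an illustrative sketch; the function name and the 1500-byte IP MTU are assumptions, not something your platform provides.

```python
MPLS_LABEL_BYTES = 4  # one MPLS label stack entry is 4 bytes


def required_mtu(ip_mtu: int, labels: int) -> int:
    """Return the MTU needed to carry an ip_mtu-byte IP packet under `labels` MPLS labels.

    L2 headers are not included; whether your platform counts them in the
    configured MTU is exactly what you have to check.
    """
    return ip_mtu + labels * MPLS_LABEL_BYTES


print(required_mtu(1500, 4))  # 1516
```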

Here’s a typical BGP-over-MPLS design (applies equally well to MPLS/VPN, 6PE, 6VPE, VPLS or pseudowires):

  • Edge routers (PE-routers) run BGP between themselves to exchange external (customer) prefixes;
  • Edge and core (P) routers run IGP (usually OSPF or IS-IS) to find the optimum path toward BGP next hops;
  • P- and PE-routers use LDP to exchange labels for known IP prefixes (including BGP next hops). LDP indirectly builds end-to-end LSPs across the network core.

IP packets can be forwarded across a BGP-free core even though the core routers don’t know how to forward them. The ingress PE-router labels incoming IP packets with the MPLS label for the BGP next hop, and the labeled packets are sent across the core (core routers don’t perform an IP lookup). The last P-router pops the top label (penultimate hop popping), and the egress PE-router performs an IP lookup and sends the datagram toward the external destination. The process is slightly different when you use technologies like MPLS/VPN that need a two-label stack.
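The label operations along such an LSP can be sketched in a few lines of Python. This is a toy model, not a router implementation: the router names (PE1, P1, P2, PE2), the label values and the table layout are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    dst_ip: str
    labels: list = field(default_factory=list)  # top of stack is the last element


# Hypothetical label forwarding tables of the core routers:
# incoming label -> (action, outgoing label, next hop)
LFIB = {
    "P1": {100: ("swap", 200, "P2")},
    "P2": {200: ("pop", None, "PE2")},  # penultimate hop popping
}


def ingress(packet, next_hop_label, next_hop):
    """Ingress PE: push the label assigned to the BGP next hop."""
    packet.labels.append(next_hop_label)
    return next_hop


def forward(router, packet):
    """Core router: act on the top label only, never look at the IP header."""
    action, out_label, next_hop = LFIB[router][packet.labels[-1]]
    packet.labels.pop()
    if action == "swap":
        packet.labels.append(out_label)
    return next_hop


def walk(packet):
    """Follow the packet from PE1 until it reaches a router with no LFIB entry."""
    hops = ["PE1"]
    router = ingress(packet, 100, "P1")
    while router in LFIB:
        hops.append(router)
        router = forward(router, packet)
    hops.append(router)  # egress PE receives an unlabeled packet and does the IP lookup
    return hops


packet = Packet(dst_ip="192.0.2.1")
print(walk(packet), packet.labels)  # ['PE1', 'P1', 'P2', 'PE2'] []
```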

If a core link fails, IGP quickly finds an alternate path. LDP does not need to converge; when using independent label distribution and liberal label retention mode (default settings on most modern routers), every LSR saves labels advertised by all its neighbors. In our network, when A discovers that it can use B to reach the egress router, it already has the label that B assigned to the EG prefix in its Label Information Base (LIB). LDP thus causes no interruption in traffic flow.
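A minimal sketch of why liberal retention helps, assuming a LIB modeled as a plain dictionary. The label values 30 and 41 and the function names are made up for the example.

```python
# Router A's Label Information Base (LIB): prefix -> {neighbor: label}.
# With liberal retention, A keeps the labels from ALL neighbors, not just
# the one that is currently the IGP next hop. Label values are invented.
lib_on_a = {"EG": {"D": 30, "B": 41}}


def outgoing_label(lib, prefix, igp_next_hop):
    """Return the label advertised by the current IGP next hop, or None."""
    return lib.get(prefix, {}).get(igp_next_hop)


def ldp_session_lost(lib, neighbor):
    """Drop all bindings learned from a neighbor whose LDP session went down."""
    for bindings in lib.values():
        bindings.pop(neighbor, None)


# Before the failure, IGP points at D.
print(outgoing_label(lib_on_a, "EG", "D"))  # 30

# The A-D link fails; IGP reconverges onto B, whose label is already stored.
ldp_session_lost(lib_on_a, "D")
print(outgoing_label(lib_on_a, "EG", "B"))  # 41
```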

The situation is completely different after the physical link is restored. IGP quickly discovers new neighbors and reconverges; LDP is slower. In the short interval between IGP convergence and LDP synchronization, router A sends IP packets through the new shortest path toward the egress router with no label (it hasn’t received a label from D yet and thus has no outgoing label to use). Router D, not running BGP, has no idea what to do with those packets and thus drops them.
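The black hole window can be illustrated with the same kind of toy model. Everything here (function names, label values, the "forwarded"/"dropped" strings) is an assumption made for the sketch.

```python
def send_from_a(lib, prefix, igp_next_hop):
    """Return (next_hop, label); label is None when no binding was received."""
    return igp_next_hop, lib.get(prefix, {}).get(igp_next_hop)


def core_router_handles(label, has_bgp_routes=False):
    """A BGP-free core router can forward labeled packets only."""
    if label is not None:
        return "forwarded"
    return "forwarded" if has_bgp_routes else "dropped"


# Link A-D is back: IGP already prefers D, but D's label has not arrived yet.
lib_on_a = {"EG": {"B": 41}}
_, label = send_from_a(lib_on_a, "EG", "D")
before_ldp = core_router_handles(label)

# The LDP session with D comes up and D advertises its label for EG.
lib_on_a["EG"]["D"] = 30
_, label = send_from_a(lib_on_a, "EG", "D")
after_ldp = core_router_handles(label)

print(before_ldp, after_ldp)  # dropped forwarded
```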

The same problem can occur if you clear an LDP session or disable MPLS on an interface with no mpls ip.

There are two solutions to this problem:

LDP-IGP synchronization: Link cost of a newly established adjacency is set to the maximum value until LDP tells IGP it’s OK to use the link. You can configure the LDP-IGP synchronization in OSPF or IS-IS, either with the mpls ldp sync command in the routing protocol configuration or with the mpls ldp igp sync interface configuration command where you can also specify an additional delay (to ensure both IGP and LDP are completely stable before you start using the new link for packet forwarding).

The details of LDP-IGP synchronization are a bit tricky. Read the corresponding documentation before enabling this feature in your network.
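The basic idea (not the vendor-specific details) can be sketched with a small shortest-path calculation. The topology, the link costs and the helper names are hypothetical; 65535 is used as the maximum OSPF interface cost. Real implementations have additional timers and corner cases that this model ignores.

```python
import heapq

MAX_METRIC = 65535  # maximum OSPF interface cost

# Hypothetical topology: A-D-EG is the preferred path, A-B-EG the backup.
BASE_LINKS = {("A", "D"): 10, ("D", "EG"): 10, ("A", "B"): 10, ("B", "EG"): 20}


def igp_links(base_links, ldp_synced):
    """Advertise MAX_METRIC on every link whose LDP session is not ready yet."""
    return {
        link: cost if ldp_synced.get(link, True) else MAX_METRIC
        for link, cost in base_links.items()
    }


def shortest_path(links, src, dst):
    """Plain Dijkstra over bidirectional links; returns (cost, path) or None."""
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, link_cost in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (cost + link_cost, neighbor, path + [neighbor]))
    return None


# Link A-D just came back, LDP session not established yet: stay on the backup path.
print(shortest_path(igp_links(BASE_LINKS, {("A", "D"): False}), "A", "EG"))
# LDP reports the session is up: the link gets its real cost and traffic moves back.
print(shortest_path(igp_links(BASE_LINKS, {("A", "D"): True}), "A", "EG"))
```

In this model the first call returns the path through B with cost 30 and the second returns the path through D with cost 20, so traffic never uses the restored link before labels are available.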

LDP session protection: A router tries to retain LDP sessions with all its neighbors even if they are currently not directly connected (A and D in our link failure scenario). Since the router no longer receives multicast LDP hellos from such neighbors, it uses targeted LDP hellos (unicast UDP packets sent to neighbor’s LDP transport address) to prevent session timeouts.

Targeted LDP sessions allow the routers to retain LIB information that can be used immediately after IGP convergence. There’s also no need to reestablish LDP sessions with the newly-adjacent neighbors as they were never disconnected.

To configure this feature, use the mpls ldp session protection global configuration command. You can use an ACL to specify the neighbors you’re interested in (it makes more sense to use this feature on core links than on non-redundant access links) and the duration of the session protection (default: 24 hours).
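A rough model of what session protection buys you, assuming a session object that holds the label bindings learned from one neighbor. The class, its fields and the label value are invented; hello and hold timers are not modeled at all.

```python
from dataclasses import dataclass, field


@dataclass
class LdpSession:
    neighbor: str
    bindings: dict = field(default_factory=dict)  # prefix -> label
    protected: bool = False
    up: bool = True

    def link_down(self, neighbor_still_reachable: bool) -> None:
        """The direct link is lost, so link (multicast) hellos stop arriving."""
        if self.protected and neighbor_still_reachable:
            return  # targeted hellos over the alternate path keep the session alive
        self.up = False
        self.bindings.clear()  # session torn down, learned labels are discarded

    def label_for(self, prefix):
        return self.bindings.get(prefix)


unprotected = LdpSession("D", {"EG": 30})
protected = LdpSession("D", {"EG": 30}, protected=True)

# The A-D link fails while D stays reachable through B.
unprotected.link_down(neighbor_still_reachable=True)
protected.link_down(neighbor_still_reachable=True)

# When the link returns, only the protected session still has D's label.
print(unprotected.label_for("EG"), protected.label_for("EG"))  # None 30
```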

More background information

The Enterprise MPLS/VPN Deployment webinar (recording) describes the basics of LDP, LSP establishment and packet forwarding across MPLS networks (including the label stack used by MPLS/VPN and penultimate hop popping). If you need more in-depth details, buy one of the MPLS books (unfortunately none of my books covers the new features like LDP session protection).

21 comments:

  1. MPLS vs. traditional design during an earthquake:

    http://ripe63.ripe.net/archives/video/184/

    Worth watching, real data from Randy (IIJ)

  2. slides: http://ripe63.ripe.net/presentations/128-111102.ripe-quake.pdf

  3. How do you make these funny graphs?

  4. Not exactly (at least not the slides). Randy explained how a traditional well-designed routing-only IP network survived a major link outage. No surprises there, he knows what he's doing ;)

  5. See the last paragraph in this blog post: http://blog.ioshints.info/2011/10/l2-or-l3-switching-in-campus-networks.html

  6. Another interesting case where IGP-LDP sync is useful is when the (OSPF or IS-IS) adjacency is "up" but the LDP session is down. Without IGP-LDP sync, a router does not change the IGP path (since it does not see any topology change), so traffic between the two routers is black-holed (the label from the other router is missing!). With IGP-LDP sync enabled, the IGP changes the path even when the LDP session is down and the (OSPF or IS-IS) adjacency is "up" (tested with IOS, works fine!). The principle is obviously the same: advertise the link where the LDP session fails with the highest metric possible.

  7. I think if LDP runs in ordered mode, there is no need for this. Whenever the ingress router gets a label, it has to have been distributed by the egress router, so there is no black hole by design.

  8. I started from different assumptions.
    As Ivan wrote "when using independent label distribution and liberal label retention mode (default settings on most modern routers)", I assumed independent label distribution and liberal retention mode. In this case black-holes are surely possible, you get a black-hole as soon as LDP session on a link drops down (in VPN service, this happens even if LDP session drops down on the last link of the path).

  9. Juniper supports ordered mode and IGP-LDP sync; not sure what the deployment scenario might be. Independent mode is the default for IOS, not Juniper, right?
    http://www.juniper.net/techpubs/en_US/junos11.3/topics/reference/standards/ldp.html

  10. Not sure ordered mode makes much difference (but have to check what Junos really does). Anyhow, if an LDP session breaks (or is not established yet), LSP along IGP best path either breaks or is not available at all (which might be the case in ordered mode). No difference from the packet forwarding perspective.

  11. After heavy reading, I think that in ordered LDP mode router A will retain the label and the ingress router won't have an LSP to forward traffic. LDP-IGP sync does make sense in ordered mode.

  12. Another valuable IGP command to prevent blackholing is:
    max-metric router-lsa on-startup 120

    Also, for the edge devices BGP PIC can again accelerate repair for backup paths. I would like to see an article on BGP PIC, especially the "no bgp recursion host" piece.

  13. JUNOS also defaults to independent mode and liberal retention. One difference from IOS is that JUNOS advertises label bindings only for the loopback interface instead of advertising label bindings for each network in the routing table (as IOS does). You may change the default behaviour in both cases. My opinion is that the JUNOS view is neater, since in a well-designed IP/MPLS backbone, what you really need is a full mesh of MPLS LSPs between loopbacks of PE routers (used for the full logical mesh of iBGP sessions). Advertising label bindings for networks used to number point-to-point links or broadcast segments inside the backbone infrastructure is a quite useless exercise.

  14. Right, if you break LSP traffic is black-holed (as a matter of fact there is an exception, if the LSP breaks on the last link and traffic is forwarded through a single MPLS label, because of PHP, traffic is not black-holed). As I pointed out in a previous post, this problem may be solved also by IGP-LDP sync.

  15. If the last hop in LSP breaks, BGP-free core continues to work, but MPLS/VPN breaks. Unless the egress PE-router signals implicit-null, the penultimate hop's outbound label is "NO LABEL", not "POP", so all the labels are removed from the stack.

  16. Actually, Junos uses ordered, not independent mode. Just tested it yesterday - unless you have LDP session on the last hop, the routers a few hops away don't see the label for egress PE loopback.

  17. I apologize, you are right.
    JUNOS defaults:
    Downstream Unsolicited label distribution (as opposed to Downstream on Demand),
    Ordered label distribution control (as opposed to Independent),
    Liberal label retention (as opposed to Conservative)

    Anyway, my claim on traffic black-hole remains valid, independently of label allocation mode.

  18. Excellent post again Ivan. Not an MPLS expert but read your books and others. I can understand the Targeted option and to another extent fast reroute to provide a "feasible successor" like quick change.
    What about the impact of NSF and BFD features in the IGP? If used, they have an effect on the process as well. Any thoughts?

    Also, what about "downstream on demand" vs. "downstream unsolicited" label distribution methods as part of the equation? Any thoughts?

  19. also, it seems we are getting to a point in the industry that the MTU is becoming an item to be constantly considered and not overlooked or taken for granted across many network technologies, such as wireless, wan, lan, ipv6, mpls, vpns(all flavors), vpls, fabrics, security etc.

  20. Curious, if it were an L2VPN between In/E-gress in your diagram above, would router A still send it as native IP packet to D (after the physical link between A and D is restored) ? Would there not be a VPN label on top of the IP packet ? Saw this in the 2-label VPN link you had mentioned (that there'll be a vpn label)

    Thanks.

  21. Before A receives a label mapping for Eg from D, the outbound action for Eg would be "untagged". Any labeled packets trying to cross that link would probably be dropped. Just guessing though ...



Ivan Pepelnjak, CCIE#1354, is the chief technology advisor for NIL Data Communications. He has been designing and implementing large-scale data communications networks as well as teaching and writing books about advanced technologies since 1990. See his full profile, contact him or follow @ioshints on Twitter.