BGP and route maps

This is a nice email I got from an engineer struggling with a multi-homed BGP setup:

We faced a problem with our internet routers a few days back. The engineer who configured them earlier used the syntax: network x.x.x.x mask y.y.y.y route-map PREPEND to influence the incoming traffic over two service-providers.

... and of course it didn’t work.

We weren't getting the desired results, so I changed the configuration from the one above to: neighbor x.x.x.x route-map PREPEND and to my surprise everything started working fine as before. Is there a difference between the two sets of configuration?

The network route-map command uses a route map to change the attributes of an IP prefix as it’s inserted in the BGP table. You can set BGP communities and change MED, local preference and next hop (I wouldn’t change the next hop), but not AS-path (Quagga can do that as well). If the route-map doesn’t match the IP prefix, the attributes are not changed (but I wouldn’t use a route-map with a match statement in the network command).
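
A minimal sketch of the network route-map form, assuming IOS-style syntax. The AS number 65000, the prefix 192.0.2.0/24, the community 65000:100, the MED value and the route-map name ORIGINATE are all made-up placeholders, not values from the email:

```
router bgp 65000
 ! Attributes are set once, when the prefix is inserted in the local BGP table
 network 192.0.2.0 mask 255.255.255.0 route-map ORIGINATE
!
route-map ORIGINATE permit 10
 ! No match statement: the clause applies to the prefix being originated
 set community 65000:100
 set metric 50
```

Because the attributes are attached to the BGP table entry itself, every neighbor receiving this prefix sees the same values. That is why this form cannot express a per-provider policy.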

The neighbor route-map in|out command applies a route map to BGP updates received from or sent to a neighbor. It’s not just an attribute-modifying tool, it’s also a filter – if the route map doesn’t match a BGP prefix, the inbound update gets dropped before it’s inserted in the BGP table, or the BGP prefix never gets sent to the BGP neighbor.
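
The filtering side effect is easy to overlook, so here is a hedged sketch of an inbound route map, assuming IOS-style syntax. The neighbor address 198.51.100.1, the AS numbers, the prefix-list name PREFERRED and the route-map name FROM-ISP are hypothetical:

```
ip prefix-list PREFERRED seq 5 permit 203.0.113.0/24
!
route-map FROM-ISP permit 10
 match ip address prefix-list PREFERRED
 set local-preference 200
!
! Empty catch-all clause: accept everything else unchanged
route-map FROM-ISP permit 20
!
router bgp 65000
 neighbor 198.51.100.1 remote-as 65001
 neighbor 198.51.100.1 route-map FROM-ISP in
```

Without the empty permit 20 clause, every update not matched by clause 10 would hit the implicit deny at the end of the route map and never make it into the BGP table.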

A route map applied to a BGP neighbor can change most BGP attributes that make sense (for example, you cannot change local preference if the neighbor is not in the same AS) and prepend AS numbers to the AS path (you’re not allowed to modify the AS path directly, as that might bypass BGP loop prevention mechanisms).
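
The working configuration from the email probably looked roughly like the sketch below. Only the route-map name PREPEND comes from the email; the AS numbers, prefix and neighbor addresses are placeholders, and prepending the local AS number twice is just one common choice:

```
router bgp 65000
 network 192.0.2.0 mask 255.255.255.0
 ! Backup provider: receives the prefix with a longer AS path
 neighbor 198.51.100.1 remote-as 65001
 neighbor 198.51.100.1 route-map PREPEND out
 ! Primary provider: no outbound route map, AS path is left alone
 neighbor 203.0.113.1 remote-as 65002
!
route-map PREPEND permit 10
 ! No match statement, so all prefixes sent to this neighbor are prepended
 set as-path prepend 65000 65000
```

The route map is attached to a single neighbor, so the two providers can see different AS-path lengths for the same prefix, which is what influencing incoming traffic requires.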

On top of obvious sanity checks, the router applies the usual BGP route reflector safeguards: if the neighbor is a route reflector client, you cannot change any attributes of reflected routes with an outbound route-map (but you can change attributes of routes you’ve received from EBGP neighbors). If you want to change attributes of IBGP prefixes sent to route reflector clients, you have to modify them as you receive them from other neighbors with inbound route-maps.
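
A hedged sketch of the inbound approach on a route reflector, assuming IOS-style syntax. The addresses, the AS number, the local preference value and the route-map name FROM-PEER are hypothetical:

```
router bgp 65000
 ! Route reflector client: outbound route maps cannot modify reflected routes
 neighbor 10.0.0.2 remote-as 65000
 neighbor 10.0.0.2 route-reflector-client
 ! Other IBGP neighbor: modify its routes as they are received
 neighbor 10.0.0.3 remote-as 65000
 neighbor 10.0.0.3 route-map FROM-PEER in
!
route-map FROM-PEER permit 10
 set local-preference 200
```

The attributes are changed before the routes enter the route reflector's BGP table, so the clients receive the modified values through normal reflection.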

19 comments:

  1. I think mistakes like the one described in this blog post happen because IOS misleadingly lets you use a route-map to change some (not all!) BGP attributes when you “create” a new entry in the BGP table. For instance, you can change MED, ORIGIN, LOCAL_PREF (even weight, which is not a BGP attribute), but not AS_PATH. Curiously, when you use the “network… mask …” command with a route-map setting “as-path prepend …” you do not get any error message, but if you do the same with “aggregate-address …” or “redistribute …” you get the following error message: “% "ROUTE-MAP-name" used as BGP attribute route-map, set as-path prepend not supported”. I think the guys from Cisco should fix it; otherwise people will keep making mistakes such as the one you reported in your post.

  2. (FOLLOW-UP FROM MY PREVIOUS COMMENT)
    Moreover, as a general personal observation about the way IOS injects prefixes into a BGP process, I think it is clumsy. To summarize, in IOS you may use three different methods:
    - Manual, through the "network ... mask ..." command. You may attach a route-map to this command to change some attributes (e.g. MED) of the route inserted in the BGP table.
    - Aggregation, through the "aggregate-address ..." command.
    - Redistribution.
    This gives you different methods to inject the same prefix into a BGP process, sometimes leading to different BGP attributes. Let me give an example. Suppose you have a directly connected network that you want to inject into a BGP process. You may do it using the "network ... mask ..." command or the “redistribute connected” command (with a route-map if you want to inject only the directly connected network we are examining). Both ways produce the same result with a small difference: if you use the "network ... mask ..." command the ORIGIN attribute is set to IGP, while with the “redistribute connected” command the ORIGIN attribute is set to INCOMPLETE. Does this make any sense? In my opinion, surely not! To tell the truth, I do not even understand why IOS sets the ORIGIN to different values when you use the "network ... mask ..." and the "aggregate-address ..." commands (ORIGIN = IGP) and redistribution (ORIGIN = INCOMPLETE). There are probably historical reasons behind that, but I do not think it makes any sense today, and it can lead you to errors since ORIGIN is part of the BGP selection process (to be honest, I have the crazy idea that the ORIGIN attribute is useless and should be taken out of the BGP selection process!)
    If you compare IOS to JUNOS, JUNOS has only one (unified) way to inject prefixes into a BGP process, which is essentially similar to the redistribution process used by IOS. You get the prefix you want to inject into the routing table in some way (be it a directly connected route, an aggregate route or a route advertised by a dynamic routing protocol), create an “export routing policy” and then apply it to a BGP session (or a group of BGP sessions). Going back to AS_PATH prepending, you cannot make any mistake: you have only one way to do it, which is to use the “as-path prepend …” command as one of the possible actions in the “export routing policy”. Finally, JUNOS always sets the ORIGIN attribute to IGP (giving you a knob if you want to change it).

  3. If I had to guess, I would say that one of these mechanisms was implemented way before the others (and before they realized error checking is not a bad idea). Should they fix it? Of course. Will they? I doubt it ...

  4. There's a simple reason for so many methods - history ;)

    You have to remember that BGP existed in Cisco IOS long before there were route maps (and probably before there were redistribution filters). Also, BGP was used in an environment where they cared about control and stability (so listing the networks manually made perfect sense).

    Aggregation was an add-on feature (as BGP moved from classful BGP-3 to classless BGP-4) and was initially designed as a proxy functionality - a router would aggregate prefixes from other routers running BGP-3 and thus being unable to aggregate. The BGP-3 to BGP-4 migration is also where the "summary-only" keyword came from. Nobody was willing to just aggregate and hope that it would work - the classful prefixes were left in the BGP table as a safety measure.

    And we all know that once you implement a feature, it's impossible to get rid of it, because there's always at least one huge network out there that relies on that feature and would break if you change the way the feature behaves.

    Comparing that to Junos is unfair. It's like comparing OS/370 to Linux :-P

  5. And finally, the ORIGIN attribute. It's a leftover from the days when the Internet had to migrate from EGP to BGP and it was important to know whether a route is a native BGP route (originated in BGP and transported to the current router only through BGP) or an EGP route redistributed into BGP. Native BGP routes would obviously be preferred over EGP routes. See http://tools.ietf.org/html/rfc1268#section-5

    BTW, the "Incomplete" ORIGIN is a misnomer. It should be "Unknown".

    Also, it might actually make sense to set ORIGIN to "Incomplete" on route redistribution (unless you use a route-map to set it to IGP). If one router advertises a prefix with a NETWORK statement and another one through blind redistribution, it's better to listen to the one that (supposedly) knows what it's doing.

    Do we still need the ORIGIN attribute? Probably not, but as I said before, there's probably a huge network out there that relies on ORIGIN for proper route selection :-P

  6. "Also, might actually make sense to set ORIGIN to "Incomplete" on route redistribution"

    I fear I may be missing something really obvious here, but isn't this set by the very act of redistribution into BGP, without any manual route-map action on top?

  7. You're right - I was just explaining to Tiziano that setting origin to "Incomplete" (which should really be "unknown") when doing redistribution isn't necessarily a bad idea.

  8. If eBGP is used as the PE-CE protocol (the CE is a multi-VRF CE) and a customer VRF on the PE is configured to send both BGP communities towards the CE neighbor:
    Will the PE send the RT values associated with its BGP route entries in the BGP updates sent towards the CE?
    Will these BGP updates be accepted by PE VRF?
    If accepted, will those RT values be retained with the routing entries in the PE BGP table?

    What happens if the BGP process in the CE VRF is configured to send both communities to the PE VRF?
    Will the BGP updates be sent with RT values?
    Will these BGP updates be accepted by the PE VRF?
    If accepted, will those RT values be retained with the routing entries in the PE BGP table?

  9. Have you considered setting up a two router lab to test all these questions?

  10. This kind of configuration is implemented in our production network. The routes advertised by the PE are learnt by the CE router, but no RT values are associated with those routing entries in the CE routing table.

    However, I found that the routes the PE learns from the CE do have RT values associated with them.

    I am not sure what actually happens in the background. I just wanted to know the default IOS behaviour when BGP is configured to send the community / extended community attributes from within the VRF context.

    Thank you.

  11. Very interesting. I was banging my head against the wall last week with a route-map applied to the network or neighbor command. The task was simple: apply the no-export community to a prefix. For whatever reason, my first instinct was to apply the route-map using the network command. Well, the prefix just did not get sent to the neighbor. I ran a few tests and repeated the command, this time just setting a test community nn:nn, and the prefix got passed along with the correct community. I tried again with no-export: nothing, no prefixes on the neighbor. When I apply the route-map to the neighbor command, the prefix gets passed on with the correct no-export community.
    I found this even more confusing: you can pass some communities but not others. There must be a fundamental programming/design reason for this, and it would be very good to know it so it can be understood and retained by my poor brain. Fulvio
    fallegretti@hotmail.com

    Replies
    1. Maybe you need "neighbor send-community" as well?

    2. I have it, maybe I have not explained the issue correctly:

      network x.y.z.t mask a.b.c.d route-map TEST
      If TEST sets the community to no-export, the prefix does not get sent to the neighbor at all, never mind the community.
      If TEST sets community 11:11, the neighbor gets the prefix with the community set in TEST (this implies I am sending communities to neighbors).

      If I use the same route-map TEST with set community no-export, but apply it via the neighbor command, the neighbor gets the prefix with the correct community (no-export).
      In other words, it seems I can't pass the no-export community with the network command.
      Hope my issue is better explained now.
      I think your opening statement
      The network route-map command uses a route map to change the attributes of an IP prefix as it’s inserted in the BGP table
      has something to do with it, I am confused by the inconsistency of the network route-map command, it looks like it passes some but not all communities.
      Fulvio

    3. Let me guess - the neighbor is in a different AS?

    4. yes (something tells me this is more than an educated guess)
      Fulvio

    5. "No-Export" community means "do not send this prefix over EBGP". What should happen if you set it with network command?

  12. Well, the reason I am setting the no-export community is to tell my neighbor in the other AS, do not send this prefix over EBGP, but I suppose if I apply it via the network command, bearing in mind
    "The network route-map command uses a route map to change the attributes of an IP prefix as it’s inserted in the BGP table"
    I am actually telling the local router not to send this prefix over EBGP....is that right?

    Replies
    1. More details here: http://blog.ioshints.info/2012/10/setting-no-export-bgp-community.html

  13. I always get confused about when to use IN or OUT in a BGP route-map. I have already done some searching on Google, but I can't find a clear explanation. Would anyone here like to share their ideas on this? TIA



Ivan Pepelnjak, CCIE#1354, is the chief technology advisor for NIL Data Communications. He has been designing and implementing large-scale data communications networks as well as teaching and writing books about advanced technologies since 1990. See his full profile, contact him or follow @ioshints on Twitter.