Responsible Generation of BGP Default Route

Chris sent me the following question a while ago:

I've got a full Internet BGP table, and want to responsibly send a default route to a downstream AS. It's the "responsibly" part that's got me frustrated: How can I judge whether the internet is working and make the origination of the default conditional on that?

He’d already figured out the neighbor default-originate route-map command, but wanted to check for more generic conditions than the presence of one or more prefixes in the IP routing table.

Let’s start with the easy part: conditional origination of a BGP default route. If you attach a route map to the neighbor default-originate router configuration command then the default route will be sent to the specified neighbor only when the configured route map matches at least one prefix in the IP routing table. The catch is in the “match in the IP routing table” part – you cannot use any of the BGP attributes as matching criteria in the route map.

Here’s a simple example: if the IP prefix 10.255.255.7/32 is in the IP routing table, the BGP default route will be sent to BGP neighbor 10.0.7.9. I’m using a loopback interface to generate the host route; you could easily create a track-object-based static route to null 0 and use EEM (or any other mechanism) to change the state of the track object.

interface Loopback202
 ip address 10.255.255.7 255.255.255.255
!
router bgp 65000
 neighbor 10.0.7.9 remote-as 65100
 neighbor 10.0.7.9 default-originate route-map CheckDefault
!
ip prefix-list CheckDefault seq 5 permit 10.255.255.7/32
!
route-map CheckDefault permit 10
 match ip address prefix-list CheckDefault
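
The track-object alternative mentioned above could look roughly like the following sketch. The track object number (100) is an assumption made for illustration, and the static route replaces the loopback interface as the source of the host route.

```
! Hypothetical track object; its state is set by EEM or another mechanism
track 100 stub-object
 default-state down
!
! The host route is in the IP routing table only while track 100 is up
ip route 10.255.255.7 255.255.255.255 Null0 track 100
```

The route map shown above stays unchanged: it matches 10.255.255.7/32 only while the track object is up.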

Chris already had a great idea for solving the problem:

I'd really like to use that sort of construct to match AS-paths: If I see ASes belonging to Google, Facebook, etc, then that's probably a good sign.

There’s a way to check whether paths from a certain AS are in the BGP table without listing the whole table – the show ip bgp paths command. The AS paths are stored in a hash table (to save memory) and this show command dumps the AS paths table without walking through all 350,000 routes (or whatever the BGP table size might be when you read this article).

You can use the show ip bgp paths command in an EEM applet combined with an output filter matching individual AS numbers you’re interested in to change the state of a track object (and thus influence the IP routing table).

For example, you could use show ip bgp paths | include _(32934|13413)_ to display paths containing Facebook’s or Twitter’s AS (the two most important parts of the Internet in some people’s opinion) in your BGP table and check for the presence of the ‘0x’ string (which is always present in a non-empty show ip bgp paths printout).
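
An EEM applet built around that command might look like the sketch below. It assumes an EEM version that supports conditional actions (if/else/end), a stub track object numbered 100 that controls the host route matched by the route map, and a 60-second polling interval; the applet name, the track number, and the interval are all illustrative.

```
event manager applet CheckInternet
 ! Run the check once a minute (interval is an assumption)
 event timer watchdog time 60
 action 1.0 cli command "enable"
 ! Filter the AS path table for the AS numbers quoted in the text
 action 2.0 cli command "show ip bgp paths | include _(32934|13413)_"
 ! A non-empty printout always contains the string 0x
 action 3.0 string match "*0x*" "$_cli_result"
 action 4.0 if $_string_result eq "1"
 action 4.1  track set 100 state up
 action 5.0 else
 action 5.1  track set 100 state down
 action 6.0 end
```

The match runs against the whole CLI result, which also includes the router prompt, so a hostname containing "0x" would produce a false positive.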

However, even the show ip bgp paths command burns a lot of unnecessary CPU cycles which you might need for more useful things on your PE-routers. It’s best to offload the Internet connectivity test to a central server; you can do it on your route reflector or you could deploy a BGP daemon on a Linux host, check the BGP tables there, and insert (or revoke) a BGP prefix that signals all PE-routers to send (or revoke) the BGP default route. If you want to go down this path, it might be worth watching the free FRRouting Architecture and Features webinar.
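
If the central device is an IOS router, the signaling prefix could be generated along these lines. This is only a sketch: the signal prefix 192.0.2.1/32 (taken from the documentation address range) and the track object number are assumptions, and the connectivity test that drives the track object is not shown.

```
! On the central router (for example, the route reflector)
track 100 stub-object
 default-state down
!
! The signal route exists only while the connectivity test keeps track 100 up
ip route 192.0.2.1 255.255.255.255 Null0 track 100
!
router bgp 65000
 ! Advertised only while the static route is in the IP routing table
 network 192.0.2.1 mask 255.255.255.255
!
! On every PE-router: match the signal prefix instead of the local host route
ip prefix-list CheckDefault seq 5 permit 192.0.2.1/32
```

The PE-routers then need nothing beyond the neighbor default-originate route-map configuration; all the CPU-intensive checking happens on the central device.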

11 comments:

  1. Pretty cool Ivan.
    If your ToR switches are L3, running BGP, you could use a similar technique to gracefully take a Spine switch out of service for maintenance, or other problems. Just log in to the Spine switch and shut down the Loopback interface, which would trigger BGP to remove route advertisements to the ToR switches. The ToR switches move the flows to another Spine switch, at which point you can begin your maintenance. Right?

    Cheers,
    Brad
  2. How about looking at PfR for this?

    Thanks,
    Andy
  3. Pick 4 less than /22 prefixes from 4 different continents and then match them in a route map so that they are ANDed.

    Basically, if all 4 are down then you could responsibly say that your connectivity to Internet has some huge problem and should recall the default route.
  4. You just proved there's always a simpler way ;) Thank you!

    Sometimes I forget to step back and look at the bigger picture.
  5. Last year I worked on an internet service outage problem for a client. Their load balancers had an "internet health" criterion as a part of providing the load-balanced services. The check involved pinging well-known internet services that would "never go down":
    - altavista
    - friendster
    - kozmo.com
    - pets.com

    I betcha can guess what went wrong...

    Relying on specific prefixes that I don't control seems risky for the same reasons. They could get deaggregated, and I'd never know until the last one went offline.

    BGP table size, presence of AS-paths longer than 3 hops, and presence of well-known ASes all seem like safer long-term, zero-maintenance schemes.
    Replies
    1. Something more fundamental and well-known like anycasted root nameservers should work well for this though, shouldn't it? Those are already very nearly hardwired in some software (e.g. local DNS resolvers), and so should rarely change.
    2. That's what I would usually recommend these days.
  6. I used to put 4/8 and 12/8 as anchors for default and filter everything else. Maybe some ASs for traffic engineering. It works as it should, ATT and L3 must see everybody on Internet being Tier-1s.
    I don't like 0/0 walking inside ISP so each iBGP peer had 2 statics 0/0 -> 4/8 and -> 12/8. And for even better safety last resort on borders pointing to connected eBGP peers.
    Never failed.
    If both these T1s go offline you may safely change IT industry for new opportunities.
  7. Hi,

    regarding this comment:

    "The catch is in the “match in the IP routing table” part – you cannot use any of the BGP attributes as matching criteria in the route map."

    Well, you can use BGP attributes as long as you match an IP prefix in the routing table. So, IP prefix in the routing table AND BGP attribute will work. By the way, running some tests I found out something interesting:

    - An IP prefix, 1.0.0.0/8, is generated by two different routers in different ASes (let's say AS 1 and AS 2). The eBGP neighbor of those is advertising the default conditionally as long as it receives the 1.0.0.0/8 from AS 1. So, it generates the default only if it receives the prefix from AS 1, and it works. If we receive it from AS 2, the default will not be generated.

    However, if we create a static route 1.0.0.0/8 pointing to null 0, it generates the default route, even though the route is local and obviously not received from AS 1. But, if you add a clause matching the next hop of the eBGP neighbor (AS 1) in the route-map, it will not generate the default. So, it seems that it checks the routing table first and then, if there is a valid match, it checks the local BGP table.

    Hope this helps,
    Jose.
  8. So the configuration that I have, where I match a prefix, and its next-hop attribute, is just matching the prefix?

    It appears to be working on a foundry I have (knock on wood), but I'm definitely headed back to the lab.
  9. Track 8.8.8.8, 208.67.222.222 and 208.67.220.220. Then use 'bool' in your sla tracking.