How could we filter extraneous BGP prefixes?

Did you know that approximately 40% of BGP prefixes polluting your RIB and FIB are not needed, as they could be either aggregated or suppressed (because an aggregate is already announced)? We definitely need a “driver’s license for the Internet”, but that’s not likely to happen, and in the meantime everyone has to keep buying larger boxes to cope with people who cannot configure their BGP routing correctly.

Before you start writing a comment explaining how multiple prefixes are needed due to the lack of traffic engineering capabilities in BGP, note that the report generated by Geoff Huston takes at least some of that into account.


This is how some ISPs are approaching the fixes to their broken BGP routing

Bill (the reader who wrote me about this issue) is facing a very painful problem: he cannot fit the full BGP routing information he’s receiving from two upstream ISPs into his Sup720-3B and would have to upgrade to Sup720-3BXL.

He noticed that in many cases a BGP prefix and one or more more-specific prefixes share the same next hop, so he could easily drop the more-specific prefixes without changing the forwarding behavior. He simulated this idea on the actual contents of his BGP table and figured out that he could safely drop around 40% of the prefixes he receives. He just needs the inbound filter.
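
Bill's check can be sketched in a few lines of Python. This is only an illustration, not his actual tool: it assumes the BGP table has already been exported into a dictionary mapping each prefix to its best-path next hop, and the function name `redundant_more_specifics` and the sample addresses are made up for the example. A more-specific prefix is flagged only when its closest covering prefix in the table has the same next hop, so dropping it cannot change where packets are sent.

```python
import ipaddress

def redundant_more_specifics(routes):
    """Return the prefixes that can be dropped without changing forwarding.

    routes: dict mapping prefix string -> next-hop string (hypothetical
    export format; one best path per prefix is assumed).
    """
    table = {ipaddress.ip_network(p): nh for p, nh in routes.items()}
    redundant = set()
    for net, next_hop in table.items():
        # Walk from the closest possible covering prefix towards /0.
        for length in range(net.prefixlen - 1, -1, -1):
            parent = net.supernet(new_prefix=length)
            if parent in table:
                # Only the closest covering prefix matters: if it points
                # elsewhere, the more-specific prefix must stay.
                if table[parent] == next_hop:
                    redundant.add(str(net))
                break
    return redundant

sample = {
    "10.0.0.0/8": "192.0.2.1",
    "10.1.0.0/16": "192.0.2.1",    # same next hop as the /8
    "10.1.1.0/24": "192.0.2.1",    # same next hop as the /16
    "10.2.0.0/16": "198.51.100.1", # different next hop, must stay
    "10.2.3.0/24": "192.0.2.1",    # closest cover is 10.2.0.0/16, must stay
}
dropped = redundant_more_specifics(sample)
```

Note that `10.2.3.0/24` is kept even though it shares a next hop with the `/8`: the `/16` between them points elsewhere, so removing the `/24` would redirect its traffic.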

There is no easy way to implement the “drop superfluous more-specific prefixes” filter in Cisco IOS. You could create multiple RIBs (with the neighbor soft-reconfiguration command) and implement scripting kludges that would generate inbound filters, but those kludges wouldn’t react to real-time changes in BGP tables.
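
To make the kludge concrete, here is a rough Python sketch of the filter-generation step. It assumes you already have a list of IPv4 prefixes you decided to drop; the function name and the prefix-list name `DROP-SPECIFICS` are hypothetical, and the generated lines should be checked against the documentation for your IOS release before use. The output is a static snapshot, which is exactly the weakness described above.

```python
import ipaddress

def build_inbound_prefix_list(name, redundant_prefixes):
    """Render IOS-style prefix-list lines denying each listed prefix.

    name and redundant_prefixes are illustrative inputs; IPv4 only.
    """
    nets = sorted(ipaddress.ip_network(p) for p in redundant_prefixes)
    lines = []
    seq = 5
    for net in nets:
        # Exact-match deny for one superfluous more-specific prefix.
        lines.append(f"ip prefix-list {name} seq {seq} deny {net}")
        seq += 5
    # Everything that was not explicitly denied is accepted.
    lines.append(f"ip prefix-list {name} seq {seq} permit 0.0.0.0/0 le 32")
    return lines

config = build_inbound_prefix_list(
    "DROP-SPECIFICS", ["10.1.1.0/24", "10.1.0.0/16"]
)
print("\n".join(config))
```

The resulting list would then be applied with the neighbor's inbound prefix-list and regenerated whenever the BGP table changes enough to matter.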

An alternative might be a host-based BGP daemon (like Quagga) that would connect to the upstream ISPs, collect the BGP prefixes, and pass the minimum subset required to the Catalyst 6500. I was never really interested in BGP daemons and thus have no idea where to start looking for such a beast (OK, I do know how to use Google to find Quagga ;). Could you help Bill, preferably with pointers to solutions that already implemented what he’s looking for? Thank you!

13 comments:

  1. Wasn't something for this just presented recently by Cisco? It's been discussed on the NSP list within the past few days:
    http://www.data.proidea.org.pl/plnog/6edycja/materialy/prezentacje/Robert_Raszuk.pdf
  2. Would be handy to have a Juniper solution as well.
  3. It seems Robert was presenting (pre-?)EFT ideas. Yes, Bill would like to have something like that, but applicable to all prefixes, not just the default route.
  4. ... and interestingly, Graham was asking exactly the same question as Bill (and got no replies).

    https://puck.nether.net/pipermail/cisco-nsp/2012-January/083087.html
  5. First I would consider if a full table really is necessary. Do you use it because of business needs? Many times it's enough to use default routes or just receive peering prefixes and use a default for the rest. If you really need the full table then there should be money to get new hardware.

    I assume that nothing smaller than /24 is being received? You could take it one step further by filtering anything smaller than a /23. Yes, you would lose some minor prefixes, but do you really need them? Maybe you could use a default route for those prefixes. Otherwise you could try to do something a bit more clever. If you know that you will never need to send traffic to Asia or some other part of the world, then you could try to filter ranges allocated to APAC. It will not be 100% accurate but close enough.

    SUP720-3B is very old by now so it might be time for an upgrade anyway but I know it can be difficult getting money allocated.
  6. In addition to the work by Raszuk et al on Virtual Aggregation,

    I would have a look at recent conference papers on the topic:

    "On Route Aggregation"
    http://conferences.sigcomm.org/co-next/2011/papers/1569470145.pdf

    "SMALTA: Practical and Near-Optimal FIB Aggregation"
    http://conferences.sigcomm.org/co-next/2011/papers/1569469057.pdf
    http://conferences.sigcomm.org/co-next/2011/slides/Uzmi-SMALTA.pptx

    "Making Routers Last Longer with ViAggre"
    http://www.usenix.org/event/nsdi09/tech/full_papers/ballani/ballani.pdf
    http://www.cs.cornell.edu/~hitesh/talks/talk-nsdi09-va.pdf
  7. The question is: Where should the too-specific prefixes be dropped / blocked? Should they already be dropped on the edge device or should the edge device receive them and simply not forward these prefixes to the internal routers?
    For the second case, you could probably mess around with some conditional advertisement. For the first case, some feature like "conditional automatic aggregation" would be a cool thing, but it doesn't exist afaik (at least in IOS). Maybe it's possible to do that in real-time in Junos with custom tools or scripts since it's basically a BSD system, but for IOS...
  8. Not so orthodox, but if he doesn't mind a bit of suboptimal routing, he can filter out all prefixes longer than or equal to /18 at the edge with a neighbor X prefix-list in and use a default route for the rest. Although they won't go into the routing table, the 720s still have to process each one of them, so I still wonder if that will be enough to protect the device.

    If he has full-table-capable routers near the 720s, he can originate a default-route from the router and suboptimal routing should stay near enough.
  9. Few thoughts:

    1. The best 'PC' based BGP implementation is currently OpenBGPD (run on OpenBSD). No serious BGP engineer can afford to ignore it these days. Even has MPLS & VRF support.

    2. Not sure which of the 3B's resources you are specifically discussing, but soft-reconfiguration would double some memory requirements for each peer, so that is unlikely to be feasible anyway.

    3. Standard tricks in this situation are to implement prefix length filtering e.g. drop > /24, if that still doesn't fit, then > /23 etc. Maybe throw in a static default to null, and then judiciously log non-RFC5735 packets that go there to see who's not aggregating.

    4. In the end the 3B can't cut the mustard anymore. Especially if it's holding v6 tables as well. If you have traffic to justify the peerings, then you can presumably afford the 3CXL (or whatever). Otherwise see (1.) for an alternative cheaper option.
  10. Yes, I remember this and it fits perfectly to solve this issue.
  11. Thanks for the links.

    Wrote about Virtual Aggregation two years ago (http://searchtelecom.techtarget.com/tip/Virtual-aggregation-Lifeline-for-exploding-Internet-routing-tables), that one solves the edge FIB problem by using large-scale core routers. Although the idea seemed interesting, not much has been done in the meantime.

    SMALTA seems interesting and they claim they have a Quagga/Zebra implementation. Probably a lot of extra glue would be needed to get it running ...
  12. Implement filtering based upon RIR allocation policies. Loose and strict updated filters can be found here:

    ftp://ftp-eng.cisco.com/cons/isp/security/Ingress-Prefix-Filter-Templates/

    Will drastically reduce the # of prefixes loaded onto the SUPs. Also take defaults along with the full tables from providers just to cover any small cases that don't adhere to RIR policies.
  13. Yes. If he had Juniper devices. But I think without any knowledge of his network configuration, it may be like chasing the wind to give a solution. He can definitely use routing policies to limit the prefixes he accepts, but we do not know what prefixes he is concerned about. I mean, that is what BGP is all about, right? Making a decision on the best path, given a number of paths to the same destination from different sources. Does he have multiple connections to different ASes or to the same AS? If he has a single connection to a provider, then using BGP is...hmmm...don't know. Again, not knowing the setup is a problem. If there is only one connection, then set a default route to the Internet and firewall filters for any specific policy you want to enforce. Then have the provider only advertise a single aggregate to you.