BGP: time to grow up
If you’re in the Service Provider business, this is (hopefully) old news: on Friday, RIPE decided to experiment with the Internet, causing routers running IOS XR to hiccup. They stopped the experiment in less than half an hour and only 2% of the Internet was affected according to the Renesys analysis (a nice side effect: Tassos had great fun decoding the offending BGP attribute from hex dumps).
My first gut reaction was “something doesn’t feel right”. A BGP bug in IOS XR affects only 2% of the Internet? Here are some possible conclusions:
- Most other intermediate routers (IOS and JunOS based, one would assume) decided to silently drop the offending attribute and thus only those IOS XR routers directly peering with RIPE were exposed. Not likely, that would be a direct violation of current BGP standards.
- IOS XR is not widely used (read: not many people have CRS routers). Not likely, at least some very big providers have them.
- Most people don’t run BGP on IOS XR and use the high-end boxes only in their IP+MPLS core.
- IOS XR is typically not in the BGP update propagation path. If your core routers are receiving BGP updates solely from the BGP route reflector, there’s nobody behind them and nobody would notice the malformed updates (yet another reason to have good network design).
On a more serious note: the experiment has unintentionally exposed another long-term problem we’re facing: anyone can obviously attach any garbage to a BGP prefix and cause global memory consumption. The only thing you’d notice is increased BGP memory utilization that would be extremely hard to troubleshoot manually. Cisco IOS and IOS XR have no relevant filtering or scrubbing mechanisms (like they have for BGP communities) that you could use to protect yourself (and JunOS is probably no better).
The first line of defense could be BGP monitoring services like bgpmon.net. They could detect unknown transitive BGP attributes and report all memory-consuming attributes.
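Here's a minimal Python sketch of what such a check could look like. It is not how bgpmon.net (or any other service) works; it only assumes the standard BGP-4 path attribute header (flags octet, type code octet, one- or two-octet length). The `KNOWN_TYPES` set, the `max_len` threshold and the function names are made up for illustration.

```python
import struct

# Attribute flag bits from the BGP-4 path attribute header.
FLAG_OPTIONAL = 0x80
FLAG_TRANSITIVE = 0x40
FLAG_EXTENDED_LENGTH = 0x10

# Type codes this hypothetical monitor understands (illustrative subset:
# ORIGIN through CLUSTER_LIST, MP_REACH/MP_UNREACH, extended communities,
# AS4_PATH, AS4_AGGREGATOR).
KNOWN_TYPES = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 14, 15, 16, 17, 18}


def parse_path_attributes(data):
    """Split a raw Path Attributes field into (flags, type_code, value) tuples."""
    attrs = []
    pos = 0
    while pos < len(data):
        if pos + 3 > len(data):
            raise ValueError("truncated attribute header")
        flags, type_code = data[pos], data[pos + 1]
        pos += 2
        if flags & FLAG_EXTENDED_LENGTH:
            # Extended Length bit set: the length field is two octets.
            if pos + 2 > len(data):
                raise ValueError("truncated extended length")
            (length,) = struct.unpack_from("!H", data, pos)
            pos += 2
        else:
            length = data[pos]
            pos += 1
        if pos + length > len(data):
            raise ValueError("attribute value runs past end of data")
        attrs.append((flags, type_code, data[pos:pos + length]))
        pos += length
    return attrs


def suspicious_attributes(data, max_len=256):
    """Report unknown optional transitive attributes and oversized known ones.

    max_len is an arbitrary threshold chosen for this sketch.
    """
    report = []
    for flags, type_code, value in parse_path_attributes(data):
        unknown_transitive = (
            type_code not in KNOWN_TYPES
            and flags & FLAG_OPTIONAL
            and flags & FLAG_TRANSITIVE
        )
        if unknown_transitive:
            report.append((type_code, len(value), "unknown transitive"))
        elif len(value) > max_len:
            report.append((type_code, len(value), "oversized"))
    return report
```

Fed an update carrying a normal ORIGIN attribute plus a 3000-byte optional transitive attribute with an unrecognized type code, `suspicious_attributes` would flag only the latter.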
However, it’s high time we get away from the “everyone is a trusted good guy” model BGP uses today and (at least) get a knob in BGP implementations that allows us to drop unknown attributes (today, unknown transitive attributes are silently propagated). Ideally, we would have a route-map/route-policy mechanism that would allow us to match BGP attributes based on their ID and accept BGP routes with select unknown attributes based on the attribute ID and its length.
Last but not least (before someone starts yelling at me): I know the “drop unknown attributes” knob will make all the future extensions to BGP harder to deploy, but the alternative is worse.
UPDATE (2010-09-01): Russell Heilling makes a very good point in his Unexpected Consequences post: it would be better to drop IP prefixes with unknown (or oversized) attributes than to silently scrub the attributes. In any case, we need conditions in the route maps/route policies that can match unknown attributes and the size of unknown (or all) attributes of a BGP route; the action to take (drop/permit/scrub) can then be specified in the route map/route policy.
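To make the idea a bit more concrete, here's a rough Python sketch of the matching logic such a route policy could implement. Nothing like this exists in IOS, IOS XR or JunOS as far as this post is concerned; the `Rule` class, the `apply_policy` function, the action names and the default of dropping the route are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    """One hypothetical policy entry matching an unknown attribute."""
    action: str                      # "permit", "drop" or "scrub"
    type_code: Optional[int] = None  # None matches any attribute ID
    max_len: Optional[int] = None    # None means no length limit

    def matches(self, type_code, length):
        if self.type_code is not None and self.type_code != type_code:
            return False
        if self.max_len is not None and length > self.max_len:
            return False
        return True


def apply_policy(attrs, rules, known_types, default="drop"):
    """Evaluate a route's attributes against an ordered rule list.

    attrs is a list of (type_code, value) tuples. Known attributes are
    always kept. For each unknown attribute the first matching rule wins:
    "permit" keeps it, "scrub" removes it, "drop" rejects the whole route.
    Returns the resulting attribute list, or None if the route is dropped.
    """
    kept = []
    for type_code, value in attrs:
        if type_code in known_types:
            kept.append((type_code, value))
            continue
        action = default
        for rule in rules:
            if rule.matches(type_code, len(value)):
                action = rule.action
                break
        if action == "drop":
            return None
        if action == "permit":
            kept.append((type_code, value))
        elif action != "scrub":
            raise ValueError(f"unknown action: {action}")
    return kept


# Example policy: accept attribute 200 up to 64 bytes, scrub any other
# unknown attribute up to 255 bytes, drop the route for everything else.
RULES = [
    Rule("permit", type_code=200, max_len=64),
    Rule("scrub", max_len=255),
]
```

With these rules, a route carrying a 10-byte attribute 200 is accepted unchanged, a route carrying a 100-byte attribute 200 is accepted with that attribute scrubbed, and a route carrying a 3000-byte unknown attribute is dropped. Making drop the default follows Russell's argument; switching `default` to `"scrub"` or `"permit"` gives the other two behaviors.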
Finding ways to protect those transitive attributes is just like a FW with a "permit any" entry at the end, with the explicit deny entries before it - that's just wrong.
Non IOS-XR intermediate routers passed the unknown transitive attribute unaltered (as per the RFC) and didn't cause any problems for their peers.
I think this, coupled with your fourth conclusion, is the most likely explanation for the limited impact.
I am not sure I agree with enabling the dropping of unknown transitive attributes. The default handling of unknown transitive attributes is the reason that BGP has been able to operate on the same major version number for 15 years and without this behaviour there is no way that 4 byte ASNs could have been deployed so quickly. Memory consumption due to RIB growth is much more worrying to me than growth due to large path attributes.
The ability to inject prefixes with large attributes attached in order to cause memory consumption is an interesting idea, but as a vector it seems to have limited value. It is something that would appeal to the script kiddie end of the market, who generally don't have access to unfiltered BGP sessions. The real bad guys are much more interested in attacks such as injecting bogus prefixes in order to hijack traffic or cause a DoS which they can monetise. The sorts of random effects seen on Friday are unlikely to appeal here.
In my mind the real crime in BGP development is the amount of time it is taking to get a working implementation of S-BGP / SO-BGP. Lack of basic route update security is a far greater risk to operations than optional transitive attributes...