BGP Optimal Route Reflection 101

Thursday, October 14, 2021 06:23 UTC

Almost a decade ago I described a scenario in which a perfectly valid IBGP topology could result in a permanent routing loop. While one wouldn’t expect to see such a scenario in a well-designed network, it’s been known for ages¹ that using BGP route reflectors could result in suboptimal forwarding.

Here’s a simple description of how that could happen:

Multiple edge routers advertise the same prefix (IPv4, IPv6, or VPNv4).
A BGP route reflector (RR) receives all alternate BGP paths to that prefix and selects one of them as the best path. When the paths are equal in all earlier steps of the BGP best-path selection process, it uses the IGP cost to the BGP next hop² as the tie breaker.
The best path selected by the BGP RR is advertised to its clients.

The challenge: RR clients might be better served using a different path to the same prefix (due to a different position in the IGP topology), or could use multiple paths with identical IGP cost for IBGP multipathing.
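A minimal sketch of that mismatch, assuming two otherwise-identical paths where IGP cost to the next hop is the only tie breaker. The router names (`RR`, `C1`, `PE1`, `PE2`) and the cost values are hypothetical and chosen purely for illustration:

```python
# Hypothetical IGP costs from each router (row) to each BGP next hop (column).
IGP_COST = {
    "RR": {"PE1": 10, "PE2": 30},  # the route reflector sits close to PE1
    "C1": {"PE1": 40, "PE2": 5},   # the client sits close to PE2
}


def best_next_hop(viewpoint, next_hops):
    """Return the next hop with the lowest IGP cost as seen from `viewpoint`."""
    return min(next_hops, key=lambda nh: IGP_COST[viewpoint][nh])


# Two otherwise-identical paths to the same prefix, identified by next hop.
paths = ["PE1", "PE2"]

reflected = best_next_hop("RR", paths)  # the single path the RR advertises
preferred = best_next_hop("C1", paths)  # what C1 would pick if it saw both

print(f"RR reflects the path via {reflected}; C1 would prefer {preferred}")
```

With these numbers the RR reflects the path via `PE1`, while `C1` would have preferred `PE2` (cost 5 instead of 40).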

We had a solution to that challenge for years: Advertisement of Multiple Paths in BGP (RFC 7911) aka BGP AddPath³, and it’s available in most modern BGP implementations… but the remaining flies in the ointment still bother some people:

BGP RR clients receive more information than needed, resulting in memory and CPU overhead.
With most BGP AddPath implementations the operator can limit the number of alternate routes sent to the BGP RR clients… but what is the minimum number of alternate paths you need to get optimal end-to-end packet forwarding?
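One way to reason about that question is sketched below. It assumes (and this is only an assumption, not a description of any particular AddPath implementation) that the RR ranks the alternate paths by its own IGP cost and sends the same top N paths to every client. The function name `min_addpath_count` and all cost values are hypothetical:

```python
def min_addpath_count(rr_cost, client_costs):
    """Smallest N such that the RR's top-N paths include every client's best path.

    rr_cost:      {next_hop: IGP cost as seen from the RR}
    client_costs: {client: {next_hop: IGP cost as seen from that client}}
    """
    # Assumed behavior: the RR orders paths by its own IGP cost to the next hop.
    rr_order = sorted(rr_cost, key=rr_cost.get)
    needed = 1
    for costs in client_costs.values():
        client_best = min(costs, key=costs.get)
        # 1-based position of the client's best path in the RR's ranking.
        needed = max(needed, rr_order.index(client_best) + 1)
    return needed


# Hypothetical costs: three egress routers, two clients.
rr_cost = {"PE1": 10, "PE2": 20, "PE3": 30}
client_costs = {
    "C1": {"PE1": 5, "PE2": 15, "PE3": 25},   # agrees with the RR
    "C2": {"PE1": 40, "PE2": 30, "PE3": 10},  # prefers the RR's worst path
}

n_paths = min_addpath_count(rr_cost, client_costs)
print(n_paths)
```

In this example the answer is 3, that is, every path: a single client that prefers the path the RR ranks last forces the RR to send all of them. The minimum depends on the topology, which is why there is no universally safe small value.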

As you might have expected: whenever there’s a niche challenge to be solved, there’s an IETF draft or RFC solving it (sometimes in five different ways). This time, it’s BGP Optimal Route Reflection (BGP ORR) (RFC 9107).

Here’s the CliffsNotes version of that idea: the BGP route reflector imagines how it must feel to be its client, selects the best BGP paths from its client’s perspective and sends them to the client.

I hope you had two questions while reading the previous sentence:

Are the best BGP paths calculated for every client (and how much overhead would that generate)? Fortunately, the BGP ORR implementations are smarter than that, and allow you to configure groups of clients. Also, you need to run the client-specific calculations only for otherwise-identical paths where IGP cost is the tie breaker (unless you want to support client-specific route selection policies, a morass into which we won’t look).
How does the BGP RR know what it feels like to be a client? BGP RR and its clients could be part of the same link-state IGP area, or the RR clients could send their topology information to the reflector via BGP-LS⁴.
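The two answers above can be combined into a rough sketch: the RR holds a copy of the link-state topology, runs a shortest-path calculation rooted at a client (or at a location chosen to represent a group of clients), and uses those costs as the tie breaker. This is an illustration under stated assumptions, not how any vendor implements ORR; the topology, router names, and function names (`spf`, `orr_best_path`) are hypothetical:

```python
import heapq


def spf(graph, root):
    """Dijkstra shortest-path costs from `root` to every reachable node."""
    dist = {root: 0}
    queue = [(0, root)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > dist.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for neighbor, metric in graph[node].items():
            new_cost = cost + metric
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return dist


def orr_best_path(graph, root, next_hops):
    """Pick the next hop closest to `root`, the assumed IGP location of a client group."""
    dist = spf(graph, root)
    return min(next_hops, key=lambda nh: dist.get(nh, float("inf")))


# Hypothetical link-state topology: {node: {neighbor: link metric}}.
TOPOLOGY = {
    "RR":  {"PE1": 10},
    "PE1": {"RR": 10, "P": 10},
    "P":   {"PE1": 10, "C1": 10, "PE2": 20},
    "C1":  {"P": 10, "PE2": 5},
    "PE2": {"C1": 5, "P": 20},
}

next_hops = ["PE1", "PE2"]  # otherwise-identical paths to the same prefix

rr_view = orr_best_path(TOPOLOGY, "RR", next_hops)   # classic RR decision
orr_view = orr_best_path(TOPOLOGY, "C1", next_hops)  # decision rooted at the client
print(rr_view, orr_view)
```

Rooted at `RR`, the calculation selects `PE1` (cost 10 versus 35); rooted at `C1`, it selects `PE2` (cost 5 versus 20). Running one calculation per group root rather than per client is what keeps the overhead bounded.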

Has anyone implemented BGP ORR? I found IOS XR and Junos implementations, and someone has been promising to implement it in FRR for a year, so it might happen in the not-too-distant future.

Is it useful? In theory, you could use it whenever a BGP RR is far enough outside of the optimal ingress-to-egress forwarding path⁵. In practice, I prefer structured network designs that can work without extra magic.
