Alternatives to IBGP within Multihomed Sites
Two weeks ago I explained why you might want to run IBGP between CE-routers on a multihomed site. One of the blog readers didn’t like my ideas:
In such a small deployment I assume that both ISPs offer transit, so that both CEs would get a default route from their upstream.
In this case I would not iBGP the CEs together but have HSRP running on the two CEs and track the uplink (interface and/or BGP session) to determine the active gateway.
Let’s see what could possibly go wrong with that design.
To IBGP Or Not to IBGP
Assuming both PE-routers advertise only the default route, a CE-router knows where to forward a packet it receives through the LAN interface if:
- The PE-CE link is up
- The PE-CE BGP session is operational
- The PE-router advertises a default route over the PE-CE BGP session.
It’s easy to adjust HSRP/VRRP priority based on uplink status. I never tried to do it based on the state of a BGP session, and it would be interesting to try to do it based on the presence of a specific prefix in the RIB.
Some network operating systems can adjust HSRP/VRRP priority based on a complex tracked object, and on some network operating systems it’s possible (with enough effort) to have the BGP default route as that tracked object[1]. However, it might be simpler to have that IBGP session in place.
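To make the tracking logic concrete, here is a minimal Python sketch of the decision a composite tracked object would have to make. It is a model of the idea, not router configuration: the `UplinkState` class, the function names, and the priority and decrement values are all hypothetical, and real FHRP behavior (preemption, tie-breaking) is left out.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class UplinkState:
    """Hypothetical snapshot of what a CE-router knows about its uplink."""
    link_up: bool
    bgp_established: bool
    default_route_received: bool


def uplink_usable(state: UplinkState) -> bool:
    # All three conditions from the list above must hold before the
    # CE-router can forward LAN traffic toward its ISP.
    return state.link_up and state.bgp_established and state.default_route_received


def fhrp_priority(state: UplinkState, base: int = 110, decrement: int = 20) -> int:
    # Model of a tracked object lowering the HSRP/VRRP priority when the
    # uplink is not usable (base and decrement are illustrative values).
    return base if uplink_usable(state) else base - decrement


def active_gateway(priorities: dict[str, int]) -> str:
    # The router with the highest priority becomes the active gateway.
    # Tie-breaking and preemption are omitted from this sketch.
    return max(priorities, key=lambda name: priorities[name])


# CE1 has a working link and BGP session but lost the default route.
ce1 = UplinkState(link_up=True, bgp_established=True, default_route_received=False)
ce2 = UplinkState(link_up=True, bgp_established=True, default_route_received=True)

priorities = {
    "CE1": fhrp_priority(ce1, base=110),
    "CE2": fhrp_priority(ce2, base=100),
}
winner = active_gateway(priorities)
```

Tracking only the interface or only the BGP session would leave CE1 as the active gateway in this scenario, which is exactly the case where the LAN traffic would be dropped.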
I also received an interesting comment on LinkedIn saying:
You need a static default route pointing towards the second CE with a metric [sic] inferior to the route installed by EBGP for failover purpose.
That would also work. I still think an IBGP session is simpler, and it helps ensure that all (BGP) routers in an autonomous system have the same view of the network.
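The floating static route idea can be illustrated with a small Python model of route selection. This is a sketch under stated assumptions: the `Route` class and `best_route` function are hypothetical, the administrative distance of 20 for EBGP is the Cisco IOS default, and 250 for the static route is an arbitrary value chosen only because it is higher than 20.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    source: str
    admin_distance: int
    next_hop: str


def best_route(candidates: list[Route]) -> Route:
    # Among routes for the same prefix, the one with the lowest
    # administrative distance is installed in the routing table.
    return min(candidates, key=lambda route: route.admin_distance)


# Assumed values: EBGP distance 20 (Cisco IOS default), static route
# configured with distance 250; next-hop names are placeholders.
ebgp_default = Route("0.0.0.0/0", "ebgp", 20, "PE1")
floating_static = Route("0.0.0.0/0", "static", 250, "CE2")

# While the EBGP default route is present, traffic goes to the local PE.
normal = best_route([ebgp_default, floating_static]).next_hop

# When the EBGP route disappears, the static route toward the other CE takes over.
failover = best_route([floating_static]).next_hop
```

The static route has to lose the comparison while the EBGP route exists, which is why it needs the higher (less preferred) administrative distance.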
Another commenter on LinkedIn wanted to demonstrate his BGP prowess and wrote a lengthy treatise on BGP next hop processing (spoiler alert: here’s a better version) including the recommendation to set the next hop on IBGP session to the loopback interface. Interestingly, although that’s the recommended best practice, you don’t need the loopback interface or IGP if you have only two directly-connected routers in an autonomous system – the road to hell is often paved with best practices.
- I would still use an IBGP session between the CE-routers
- I would establish that IBGP session between IP addresses assigned to LAN interfaces – assuming the CE-routers have a single LAN interface (or a port channel) and the site does not have any intermediate routers.
Default Route or More Specifics?
The original comment continued along the lines of “we don’t need more than the default route”:
And if you wanted to IBGP them anyway, I would put a route-map on it to only exchange the default route from the upstreams, so that both CEs have a 0/0 route with different distance. The only thing I don’t understand is in which failure scenario traffic would end up on a CE without an active BGP uplink.
Using just the default route makes sense if:
- You’re using the uplinks in a pure active/backup setup, or
- You want to do ECMP load balancing between two uplinks connected to the same ISP[2].
In any case, if you decide to go with the default route, it might be better to filter BGP updates on the PE-CE EBGP session, not on the CE-CE IBGP session. Why would you accept a default route and the full DFZ table, spend CPU cycles to process all the updates (all of them having the same BGP next hop) and pass just the default route to the IBGP peer?
While two default routes might work well for a content consumer (because it’s hard to influence incoming traffic anyway), if you happen to be a content provider (there’s more traffic going out than coming in), you might want to optimize WAN link utilization. For example, you might want to use the direct uplink for prefixes belonging to ISPs and their customers, or you could do a traffic flow analysis combining NetFlow with BGP data, and accept prefixes that represent a large percentage of your traffic (even more details).
We discussed whether to use just the default route, a subset of prefixes, or a locally-generated default route in the September 2022 session of the ipSpace.net Design Clinic. You might also want to watch the Surviving the Internet Default Free Zone webinar.
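As a rough illustration of the traffic-analysis approach, the following Python sketch picks the smallest set of destination prefixes that covers a target share of outbound traffic. Everything in it is an assumption made for the example: the function name, the 80% target, and the byte counts, which stand in for NetFlow data already aggregated per BGP prefix.

```python
from __future__ import annotations


def prefixes_to_accept(bytes_per_prefix: dict[str, int], target_share: float = 0.8) -> list[str]:
    """Return the heaviest prefixes that together reach target_share of traffic."""
    total = sum(bytes_per_prefix.values())
    if total == 0:
        return []

    accepted: list[str] = []
    covered = 0
    # Walk the prefixes from the heaviest to the lightest and stop as soon
    # as the accepted ones cover the requested share of total traffic.
    ranked = sorted(bytes_per_prefix.items(), key=lambda item: item[1], reverse=True)
    for prefix, volume in ranked:
        if covered / total >= target_share:
            break
        accepted.append(prefix)
        covered += volume
    return accepted


# Made-up per-prefix byte counts (documentation address ranges).
sample = {
    "192.0.2.0/24": 600,
    "198.51.100.0/24": 250,
    "203.0.113.0/24": 100,
    "100.64.0.0/10": 50,
}
selected = prefixes_to_accept(sample, target_share=0.8)
```

The resulting list would feed an inbound prefix filter on the PE-CE EBGP session, with the default route catching everything else.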
- Added a “you might need FHRP on LAN interfaces” note based on a comment from Mr. Anonymous.
[1] If nothing else, you could develop some crazy EEM magic on Cisco IOS – read some ancient blog posts on this site if you’re interested in that particular strain of job security.
[2] Trying to do ECMP load balancing across links connected to different ISPs is usually a bad idea. The proof is left as an exercise for the reader; if you decide to go down that path, you might find some of the older blog posts useful.
It depends on whether your southbound devices (often firewalls) support routing protocols (preferably BGP). Most enterprises use static routing on the WAN edge, so you'll have to use FHRP on your CE routers.
Also, you can't assume that the default route from your service providers is reliable. There's no default route in the Internet (hence the name default-free zone). You'll have to do your own tracking, e.g. by pinging multiple external destinations.
Normally you don't need the full BGP table. Just accept prefixes from AS numbers "directly" connected to your service provider (one AS away) and run an IBGP session (direct links or BFD) between CE routers to get some more "direct" external routing. All other traffic you catch with your tracked default route. This also has the benefit of letting you compare the peerings of your service providers. Your routing convergence is also much faster.
Thank you, you're absolutely right. Added a note.
I have isps that act as bgp peer with transit, but the default route they feed me is static... Had to setup path monitoring for fail over (python cron) after a CE blackholed my traffic on a beautiful Tuesday morning...
It doesn't sound correct:
"You need a static default route pointing towards the second CE with a metric inferior to the route installed by EBGP for failover purpose."
Wouldn't it be more like this? :
"You need a static default route pointing towards the second CE with A LESS PRIVILEGED ADMINISTRATIVE DISTANCE AND a LESS PRIVILEGED metric THAN the route installed by EBGP for failover purpose."
You're obviously correct. Added a note. Thank you!