The Road to Complex Designs Is Paved with Great Recipes
A while ago someone asked me to help him troubleshoot his Internet connectivity. He was experiencing totally weird symptoms that turned out to be a mix of MTU problems, asymmetric routing (probably combined with RPF checks on the ISP side) and non-routable PE-CE subnets. While trying to figure out from the router configurations what might be wrong, I was surprised by the amount of complexity he’d managed to introduce into his DMZ design by following the recipes and best practices we all dole out in blog posts, textbooks and training materials.
His DMZ was a typical redundant DMZ design: two routers connected to two ISPs and running BGP with them, and a redundant pair of firewalls, as illustrated in the following RFC-ready diagram:
PE-ISP-A         PE-ISP-B
   |                |
  CE-A             CE-B
   |                |
=======PUB-SUB========
   |                |
  FW-A             FW-B
EBGP sessions were established between CE-A and PE-ISP-A and between CE-B and PE-ISP-B (perfect). There was an IBGP session between CE-A and CE-B (perfect), but it was running between loopback interfaces.
OSPF was running in the DMZ to propagate loopback interface addresses between the CE-routers (otherwise the IBGP session would not start) and a default route to the firewalls. He was also redistributing PE-CE subnets into OSPF to fix BGP next hop issues.
Both CE-routers had network statements to advertise the public IP subnet (PUB-SUB) to the Internet (perfect) and a static route to null 0 to ensure the PUB-SUB would always be advertised.
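To make the description concrete, the CE-A side of such a design might look roughly like the sketch below. It is not taken from his routers: the interface names, the IP addresses (documentation ranges), the AS numbers and the route-map name PE-CE-ONLY are all assumptions made for illustration.
===============================
! CE-A, recipe-driven design (all names, addresses and AS numbers are made up)
interface Loopback0
 ip address 192.0.2.1 255.255.255.255
!
! OSPF carries the loopback and the PE-CE subnet, and offers a default route
! to the firewalls (originated only while a default route is in the routing table)
router ospf 1
 network 192.0.2.1 0.0.0.0 area 0
 network 198.51.100.0 0.0.0.255 area 0
 redistribute connected subnets route-map PE-CE-ONLY
 default-information originate
!
! Select only the uplink toward PE-ISP-A for redistribution
route-map PE-CE-ONLY permit 10
 match interface GigabitEthernet0/0
!
router bgp 64500
 network 198.51.100.0 mask 255.255.255.0
 ! IBGP to CE-B between loopbacks
 neighbor 192.0.2.2 remote-as 64500
 neighbor 192.0.2.2 update-source Loopback0
 ! EBGP to PE-ISP-A
 neighbor 203.0.113.1 remote-as 64501
!
! Static route backing the network statement for PUB-SUB
ip route 198.51.100.0 255.255.255.0 Null0
===============================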
I could easily recognize each and every design choice he made; the whole DMZ was a perfect implementation of various BGP recipes that I can trace back to (at least) the BGP course I developed for Cisco Europe in the mid-1990s. Note: I don’t think I could claim to be the author of any one of them; they were always considered (at least by some) best practices.
While most of the recipes made his design more complex than necessary, the last one (static route to null 0) was actually harmful (as the academics say: the proof is left as an exercise for the reader – post it in the comments).
There are numerous changes one could make to simplify this design, for example:
Run the IBGP session over the directly connected interface (PUB-SUB). If you have two routers connected with a single link, it makes no sense to run the IBGP session between loopback interfaces; loopbacks are useful if you have multiple alternate paths between the IBGP neighbors or if the IBGP neighbors are not directly connected.
Use a static default route on the firewalls and HSRP on the CE-routers. This design is almost equivalent to the OSPF-in-the-DMZ design from the firewall perspective; track objects and HSRP priorities can get pretty close to whatever OSPF default route manipulation you can do on the CE-routers.
Use next-hop-self on the IBGP session. When IBGP routers advertise themselves as the BGP next-hop, the redistribution of PE-CE subnets into OSPF is no longer needed.
Remove the static route to null0. The IP subnet the CE routers have to advertise to the Internet is directly connected, so there’s no need to create an artificial IP prefix in the IP routing table to support the BGP network statement.
Last but definitely not least, remove OSPF from the DMZ, as all the reasons for using it are gone.
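Put together, the changes listed above could leave CE-A with something like the following sketch. The interface names, the addresses (documentation ranges), the AS numbers, the HSRP group and priority values, and the choice to track the uplink's line protocol are assumptions made for illustration, not the configuration of the network discussed here.
===============================
! CE-A, simplified design (all names, addresses and AS numbers are made up)
! Track object 1: line protocol of the uplink toward PE-ISP-A
track 1 interface GigabitEthernet0/0 line-protocol
!
interface GigabitEthernet0/1
 description PUB-SUB
 ip address 198.51.100.2 255.255.255.0
 ! HSRP virtual address; the firewalls point a static default route at it
 standby 1 ip 198.51.100.1
 standby 1 priority 110
 standby 1 preempt
 ! Drop to 90 when the uplink fails, so CE-B (default priority 100,
 ! with preempt configured) takes over the virtual address
 standby 1 track 1 decrement 20
!
router bgp 64500
 ! PUB-SUB is directly connected, so no static route to null0 is needed
 network 198.51.100.0 mask 255.255.255.0
 ! IBGP to CE-B over PUB-SUB, no loopbacks and no OSPF
 neighbor 198.51.100.3 remote-as 64500
 neighbor 198.51.100.3 next-hop-self
 ! EBGP to PE-ISP-A
 neighbor 203.0.113.1 remote-as 64501
===============================
CE-B would mirror this with its own addresses, the default HSRP priority and a track object on its own uplink.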
Anything else? Please write a comment! And while speaking of misapplied recipes, the Knowledge or recipes blog post comes to mind.
But traffic probably won't even make it that far. If the connection to PUB-SUB is down on CE-A, then traffic will be blackholed (assuming only one interface to PUB-SUB).
One thing I will throw out there (in case it may help anyone) is that we are also running two types of tracking...
1) If the CE router cannot see the firewalls, shut down the BGP neighborship to the ISP
2) If the BGP neighborship fails, decrement the HSRP priority
An exercise for the reader (to continue the academic lingo): do #1 in a way that does not require changes to router configuration.
To decrement the HSRP priority, I would use a track object with an IP SLA probe pinging the PE, or... introduce a dummy route on the PE (local, not redistributed into the ISP backbone, marked with an agreed custom BGP community like 1111:0 so you can always find it), and use a track object to check whether the route is in the routing table. With BFD it works fine, and you can easily attach a track object to HSRP.
If you are really tricky, you can use EEM to do the LAN BGP monitoring, but HSRP is much faster and easier.
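The route-tracking variant could be sketched as follows. The marker prefix 192.0.2.255/32, the track object number, the interface name and the HSRP group and decrement values are assumptions made for illustration; the marker route itself would have to be advertised by the PE as described above.
===============================
! Track object 20 is up while the marker route received from the PE
! is present in the routing table (the prefix is a made-up example)
track 20 ip route 192.0.2.255/32 reachability
!
interface GigabitEthernet0/1
 description PUB-SUB
 ! Lower the HSRP priority by 20 when the marker route disappears
 standby 1 track 20 decrement 20
===============================
The other CE-router needs preempt configured on the same HSRP group to take over once its priority becomes the higher one.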
BTW: That was an easy one: http://www.nil.com/ipcorner/DesigningBGPNetworks ;)
As for external routing, we're splitting our netblock up between the two sites, advertising half at each, plus the whole netblock to ensure reachability to our entire address space if one of the sites goes down. The ASRs advertise those netblocks to the edge routers that peer with our upstream ISPs.
===============================
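! IP SLA probe: ping the firewall every 10 seconds
ip sla monitor 10
 type echo protocol ipIcmpEcho FW-IP
 timeout 1000
 frequency 10
ip sla monitor schedule 10 life forever start-time now
! Track object 10 follows the state of the probe
track 10 rtr 10
router bgp ASN
 neighbor PE fall-over route-map TRACK-FW
! A route-map cannot match a bare prefix, so use a prefix list
! (FW-MARKER is a made-up name)
ip prefix-list FW-MARKER permit 1.1.1.1/32
route-map TRACK-FW permit 10
 match ip address prefix-list FW-MARKER
! Marker route that exists only while the firewall answers the probe
ip route 1.1.1.1 255.255.255.255 Null0 track 10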
===============================
But as already said: I've never used the feature, and even if it worked, I would not be very happy to use a configuration like the one above...
It's a little complicated, but if you don't trust your firewalls and have diverse locations, this is an option.
In the event the CE router loses connectivity to the firewall, would we really need to shut down the neighbor to the ISP if we were to deploy a separate, dedicated IBGP link between the two CE routers? If CE-A lost connectivity to the firewall, wouldn't the inbound traffic learn an alternate path via the IBGP link and reach the firewall via CE-B?