I encountered an ancient problem during one of my ExpertExpress engagements:
- Customer network is split into two autonomous systems (core and access);
- Links within access network are way slower than links within core network;
- Customer would like to have optimal core-to-access traffic flow.
Challenge: what’s the simplest possible configuration to get it done?
The big picture
Here’s the simplified network diagram:
Links within the access network (A1 to A2, for example) are way slower than parallel links within the core network (C1 to C2) if they exist at all. Traffic sent from core network into the access network should therefore use the optimal egress point. For example, traffic sent to E1 should be sent through A1 and not A2 or A3.
As we have parallel EBGP sessions between the autonomous systems, MED is the best tool for the job (assuming the core network doesn’t use local preference), but how do you set MED on the access network routers?
In a simple IP network, you could redistribute access network IGP prefixes into BGP on the routers connecting access AS to core AS (A1…A3). IGP cost would be automatically copied into MED. Mission accomplished.
Unfortunately, this customer uses L3VPN, and the user prefixes are advertised into BGP as VPNv4 prefixes on the PE routers (E1 … E3 in our diagram). Setting MED at that point makes no sense.
Getting It Done
What we need is a way of indicating the cost of transit from A1…A3 to E1…E3 with the MED attribute set on A1…A3. Ideally, we’d have a mechanism to copy IGP cost toward the BGP next hop into MED attribute, only I haven’t found one in Cisco IOS (that the customer is using). If I missed something, please write a comment.
Alternatively, one could tag routes originated on E1…E3 with BGP communities and use route maps on A1…A3 to set MED values based on those communities, but that seems clunky.
Finally, one could use an SDN controller as a route reflector and set desired MED values on the SDN controller… only I’m not aware of an SDN controller that would be doing that.
Anything else I’ve missed apart from using a single IGP and next-hop-unchanged which is a nonstarter? Feedback highly appreciated!