Optimal Inter-AS Routing Challenge

I encountered an ancient problem during one of my ExpertExpress engagements:

  • Customer network is split into two autonomous systems (core and access);
  • Links within access network are way slower than links within core network;
  • Customer would like to have optimal core-to-access traffic flow.

Challenge: what’s the simplest possible configuration to get it done?

The big picture

Here’s the simplified network diagram:

Links within the access network (A1 to A2, for example), if they exist at all, are way slower than the parallel links within the core network (C1 to C2). Traffic sent from the core network into the access network should therefore use the optimal egress point. For example, traffic sent to E1 should go through A1 and not through A2 or A3.

The Problem

As we have parallel EBGP sessions between the autonomous systems, MED is the best tool for the job (assuming the core network doesn’t use local preference), but how do you set MED on the access network routers?

In a simple IP network, you could redistribute access network IGP prefixes into BGP on the routers connecting access AS to core AS (A1…A3). IGP cost would be automatically copied into MED. Mission accomplished.
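
For the simple IP case, a minimal sketch on A1 could look like the following. It assumes OSPF is the access-network IGP; the AS numbers, OSPF process ID and neighbor address are made up for illustration.

```
! A1 (hypothetical values: AS 65001 = access, AS 65000 = core)
router bgp 65001
 ! C1, the EBGP neighbor in the core AS
 neighbor 192.0.2.1 remote-as 65000
 ! Access-network IGP prefixes enter BGP here; the OSPF cost
 ! from A1 to each prefix is carried as the MED
 redistribute ospf 1
```

With the same redistribution configured on A2 and A3, the core routers would see the same prefix with three different MED values and prefer the access router closest to it.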

Unfortunately, this customer uses L3VPN, and the user prefixes are advertised into BGP as VPNv4 prefixes on the PE routers (E1 … E3 in our diagram). Setting MED at that point makes no sense.

Getting It Done

What we need is a way of indicating the cost of transit from A1…A3 to E1…E3 with the MED attribute set on A1…A3. Ideally, we’d have a mechanism to copy the IGP cost toward the BGP next hop into the MED attribute, but I haven’t found one in Cisco IOS (which the customer is using). If I missed something, please write a comment.

Alternatively, one could tag routes originated on E1…E3 with BGP communities and use route maps on A1…A3 to set MED values based on those communities, but that seems clunky.
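
Clunky or not, here is roughly what the community-based approach might look like. Everything in it is an assumption for illustration: the AS number, community values, MED values, route-map names and addresses are all hypothetical. The MED values are static and would have to be kept in sync with the IGP costs by hand. The route maps are attached under the VPNv4 address family here; with Inter-AS Option A the outbound route map on A1 would go on the per-VRF EBGP sessions instead.

```
! E1: tag everything it advertises (E2 and E3 would use :102 and :103)
route-map TAG-E1 permit 10
 set community 65001:101 additive
!
router bgp 65001
 address-family vpnv4
  ! 10.0.0.11 = A1 (hypothetical); standard communities must be sent explicitly
  neighbor 10.0.0.11 send-community both
  neighbor 10.0.0.11 route-map TAG-E1 out
!
! A1: translate the originating PE into a MED value toward the core
ip community-list standard VIA-E1 permit 65001:101
ip community-list standard VIA-E2 permit 65001:102
ip community-list standard VIA-E3 permit 65001:103
!
route-map SET-MED permit 10
 match community VIA-E1
 set metric 10
route-map SET-MED permit 20
 match community VIA-E2
 set metric 20
route-map SET-MED permit 30
 match community VIA-E3
 set metric 30
! Anything untagged passes through unchanged
route-map SET-MED permit 100
!
router bgp 65001
 address-family vpnv4
  ! 192.0.2.1 = C1 (hypothetical)
  neighbor 192.0.2.1 route-map SET-MED out
```

A2 and A3 would carry the same community lists with different metric values, so each PE needs one route-map entry on every border router. That per-PE bookkeeping is what makes the approach clunky.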

Finally, one could use an SDN controller as a route reflector and set desired MED values on the SDN controller… only I’m not aware of an SDN controller that would be doing that.

Anything else I’ve missed, apart from using a single IGP and next-hop-unchanged, which is a nonstarter? Feedback highly appreciated!

11 comments:

  1. Some blue sky thinking... BGP-LU AF between border routers, redistribute IGP into BGP-LU (loopbacks only) and set AIGP attribute. Would need to advertise labelled VPN routes also (assume it is option A now).
  2. Not sure if Accumulated IGP could help you here? Might only be supported in global table though.

    http://www.cisco.com/c/en/us/td/docs/ios-xml/ios/iproute_bgp/configuration/15-s/irg-15-s-book/bgp-accumulated-igp.html
  3. You could use ExaBGP as SDN controller, but you will need some coding to extend to this functionality :)
  4. Dear,

    If you are using Inter-AS Option A between ASes, you can try to change the route type for all prefixes advertised from A1 to C1 via eBGP from EGP to IGP (Origin = IGP). I had a similar task on Huawei routers and it works fine. Create a route-map and apply it to the eBGP neighbor in the export direction.
    Replies
    1. It might work for Inter-AS Option B as well.
  5. It is my understanding that in Cisco IOS, the MED sent to an eBGP peer is derived from the IGP metric by default.
  6. I've stumbled across a similar issue / demand some time ago:
    As already pointed out, MED or BGP Accumulated IGP might be an option, but that greatly depends on the boxes and their operating systems that are in place right now.
    With Junos, I believe you have the option to copy the IGP costs into MED through a route-policy but they don’t support BGP A-IGP for VPN (yet), I think.
    In the current IOS-XR and IOS XE releases, BGP A-IGP should be supported, also for VPN, whereas there is still no way to copy the IGP metric into MED.

    Some kind of an overlay through the core network might also be an option but that might not scale, depending on the size of the network.
  7. Without knowing the protocol running on the core interlinks, I would suggest using summarization. For example, A1 would advertise specific prefixes for E1, but A2 and A3 would only advertise a larger summary covering those prefixes. This technique should be protocol agnostic: the core chooses the path advertising the most specific prefix while retaining the failover properties.
  8. I'm making some assumptions here based on the diagram, requirements and limitations. One assumption is that there is an iBGP full mesh with the IGP as an underlay transporting the iBGP sessions, and that the vpnv4 AF is used between E and A devices in the access network? The A devices therefore have peerings to all other A devices and E devices? If so, this should be simple: create session templates on all A devices, one for peerings to E devices and one for peerings to other A devices. For the A peer group set a higher MED on everything, and for the E peers set a lower one. All NLRI will be in adj-rib-in, but only those with the lower MED will be in loc-rib, so only the paths direct to E devices will go into adj-rib-out to be advertised to the C devices. The C devices then receive NLRI only for direct A-to-E paths (unless there is a failure), which removes the need for the C devices to support MED, which is optional non-transitive anyway. That's the simplest way I can think of without more detail and without having labbed this up!
  9. There is a feature to convert AIGP to MED in recent versions of IOS XR and XE. Maybe it could help solve this problem.
    neighbor ip-address aigp send med