Changing Cisco IOS BGP Policies Based on IP SLA Measurements

This is a guest blog post by Philippe Jounin, Senior Network Architect at Orange Business Services.


You have long been able to use track objects in Cisco IOS to track route reachability or metric, the status of an interface, or IP SLA compliance. Initially you could use them to implement reliable static routing (or even shut down a BGP session) or to trigger EEM scripts. With a bit more work (and a few more EEM scripts) you could use object tracking to create time-dependent static routes.

Cisco IOS 15 introduced Enhanced Object Tracking, which allows first-hop redundancy protocols like VRRP or HSRP to use tracking state to modify their behavior.

Although it is not documented, I was curious to test whether object tracking can be used inside a BGP route-map. This would provide a nice alternative to BGP knobs like inject-maps or advertise-maps, and would also make BGP react to IP SLA measurements, just like a brand new SD-WAN network!

My lab consists of 2 routers connected with 2 links (replace direct links with IPSec tunnels if you are thinking about SD-WAN). The traffic between the two endpoints should go through Link1 if the link’s delay is below 10 ms, and go through Link2 otherwise (obviously only if Link2 is available).

Collecting the delay between the routers is quite easy with the IP SLA Monitoring feature:

ip sla 65000
 udp-echo 10.0.1.2 3456
 threshold 10
 timeout 100
 frequency 3
ip sla schedule 65000 life forever start-time now
ip sla responder

The IP SLA measurement is linked to a tracking object in order to monitor SLA compliance. The tracking object is up if the delay is below 10 ms and down otherwise. I’m also using dampening timers (15 seconds on delay increase, 30 seconds on decrease).

track 65 ip sla 65000
 delay down 15 up 30
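
A quick way to check that the probe and the tracking object behave as intended is to look at them with the standard show commands. This is only a sketch that reuses the probe and track numbers from the configuration above; the output is omitted because it was not captured in this lab.

```
! Exec-mode commands; 65000 and 65 are the probe and track numbers used above
show ip sla statistics 65000
show track 65
```

The first command reports the latest round-trip time and return code of the probe; the second shows the current state of the tracking object and its configured up/down delays.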

Using the tracking object inside a route-map allows BGP to change some attributes after the tracking object changes state. In this lab, we change the Link1 local preference to reflect the IP SLA status.

router bgp 65000
 bgp log-neighbor-changes
 network 192.168.7.0
 neighbor 10.0.1.2 remote-as 65000
 neighbor 10.0.1.2 description -- Link 1 --
 neighbor 10.0.1.2 route-map DynamicLP out
 neighbor 10.0.2.2 remote-as 65000
 neighbor 10.0.2.2 description -- Link 2 --

route-map DynamicLP permit 10
 description -- SLA in profile: primary path --
 match track 65
 set local-preference 200
!
route-map DynamicLP permit 20
 description -- SLA out of profile: backup path --
 set local-preference 50

As expected, the local preference is correctly set by the route-map depending on the link SLA.

At t=10s, we introduce 50 ms of jitter, randomly increasing the delay on Link1. The tracking object status changes after about 20 seconds, and BGP advertises the new local preference after another 30 seconds, moving the traffic to Link2.

Of course, if we remove the jitter, the traffic returns to Link1.
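
As pointed out in the comments below, attaching the route-map outbound changes route selection on the other end of the link. A minimal sketch of the inbound variant, which would influence the local router's own route selection, could look like this. It assumes that match track behaves the same way in an inbound route-map, which was not tested in this lab.

```
router bgp 65000
 ! Assumption: move the DynamicLP policy from outbound to inbound on Link 1
 no neighbor 10.0.1.2 route-map DynamicLP out
 neighbor 10.0.1.2 route-map DynamicLP in
```

With this variant the local preference is set on routes received over Link 1, so the tracked SLA steers the traffic this router sends rather than the traffic it receives.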

10 comments:

  1. Along with setting a higher LP I would add a community string to signal the other end it has to switch to the main route.

    and on the other end I'd implement something like this:
    policy-statement main_bgp2rib {
        term 1 {
            from community self-asn;
            then {
                local-preference 200;
            }
        }
    }
    community self-asn members 65000:65XXX;
  2. A nice try. BTW, with EEM you can build more embedded triggers that reconfigure, scale or optimize QoS, routing, or the whole configuration, and remotely log in to other routers and execute commands. :)

    On the other side, good SD-WAN solutions have inline packet performance and measurement. This means that stats are measured with every packet. IP SLA packets can give a different hash than the production traffic. Therefore IP SLA can check a different path in the underlay network.
    Replies
    1. Ah, you're opening the Heisenbergian can of worms ;)) You have to do IP SLA measurements from the outside interfaces (and address space not belonging to the customer) or you could get dubious results.
    2. Still, the source IP address of IP SLA probes can differ from the source IP address of a tunnel, so the load-balancing hash can differ as well. This can lead to disjoint paths in the underlay network where multipath is enabled.
    3. While you're absolutely correct from the academic perspective, it's quite rare to see ECMP (at least as "non-augmented flow-by-flow ECMP") in WAN environments.
    4. ECMP in WAN networks is rare from an edge's perspective, but ECMP inside the MPLS network is more frequent, at the stage when packets arrive at a PE of the MPLS cloud.

      Another case may be packets on the same path being treated differently. IP SLA packets are small and production traffic packets are bigger, which introduces a discrepancy in characteristics. Inline performance monitoring in, let's say, "real" SD-WAN can detect it. To imitate it with IP SLA we would have to generate a sweep range of probe sizes.

      Agree, with some tuning we could have a similar feature to this in SD-WAN. :)
    5. If you ever think about using IP SLA to influence ECMP on PE-router you did something badly wrong in your design. Just saying ;)

      Just because we have nerd knobs doesn't mean that we have to use them, or that we should fix every broken thing out there with a potpourri of nerd knobs. Long-term a good design always beats a heap of kludges.
    6. Not to influence ECMP, but to switch over to a backup path via a different service provider. Unlike inline monitoring, IP SLA gives no guarantee that degraded network characteristics will be detected. That's why the inline monitoring in modern SD-WAN solutions is a good design and not a heap of kludges. :)
    7. Inline monitoring is not the biggest benefit of SD-WAN products. They can send several copies of the same packet across several links and choose on the receiving side the packet that arrives first. That guarantees the lowest latency and packet loss before any decision is made based on monitoring. It is also possible to send one control packet for every several data packets using a different path; if one of these packets is lost, it can be restored from the remaining packets in the group plus the control packet. That has much less overhead than the first case, which duplicates packets across several links. SD-WAN is a new technology that provides new features that we cannot achieve by using just routing.
  3. @Vladimir: SD-WAN often bundles different technologies together that also are available separately. Neither those technologies nor the bundling are exclusive to SD-WAN products. For backbone networks it is simpler and less expensive to go straight to Carrier Ethernet. Most SD-WAN products do have a lot of room to improve in terms of security.
    https://github.com/sdnewhop/sdwannewhope
    http://www.scada.sl/2019/09/silverpeak-sd-wan-7-cve.html
    Replies
    1. Hello @Christoph
      It is true that they combine some technologies, but the same can be said about Cisco routers. Also, could you please remind me which technology on routers/switches allows what I mentioned before?
      About security, could you please tell me the version of some IOS-XE which is secure in your opinion?
  4. @Vladimir: All SD-WAN edge products I know so far have an embedded router. Sometimes it helps to take a step back: What is the difference between a sophisticated hybrid WAN solution and an SD-WAN? Could one build a sophisticated hybrid network without using SD-WAN? No separation of data plane and control plane, no central orchestrator? The answer is a simple yes. So it is not SD-WAN that provides those features, it is the edge device. And if there is some "intelligence" built into the edge device and the edge device supports hybrid WAN, then it is entirely possible to achieve the same thing without any need for SD-WAN. Network optimization using a set of different technologies is not anything new, FEC is nothing new, hybrid network is nothing new, load balancing is nothing new. Inline monitoring is not anything new.
    Sending the same packets over different links to figure out which one is the best only gives you reliable information for that instant. Sending the same packets over multiple lines consistently is an obvious waste of bandwidth.

    In terms of security, I pointed to two publicly available resources. If you specifically want to know more about the security of IOS-XE I suggest that you (1) have a look at the CVEs issued so far, (2) have a look at the overall architecture, and (3) do an extensive pentest.
    Replies
    1. Well, I wouldn't separate edge devices and SD-WAN. SD-WAN consists of edge appliances and some sort of orchestrator, not just an orchestrator.
      So, what is the answer? Can you recall a technology we could use in traditional networks based on routers and switches that provides what I described in the first answer? Btw, you don't need to send two copies of packets for all traffic; you can do it only for the most critical traffic, so there is no waste of bandwidth. And the benefit is zero outage time for this kind of traffic in case of a problem on some link.

      About security - the question was about what is secure in your opinion compared to SD-WAN products. I read those articles from your link; OK, some vulnerabilities are mentioned. What is the alternative? Is there any secure product without such vulnerabilities? Oh, they found a vulnerability in the API, but stop, didn't we have the same API vulnerability on Cisco recently? Default SNMP community Public for RO - seriously? I saw the same on the equipment from most other vendors I worked with.
      As I understand the security issues related to SD-WAN, the main reason is that it is positioned as such a nice tool that unqualified people can easily manage it. But if SD-WAN admins only know the web interface of the SD-WAN orchestrator and don't understand the underlying technologies, they will end up with SNMP and the API open to the public internet, for example.
  5. @Vladimir
    You keep referring to Cisco vulnerabilities, but fail to mention which SD-WAN solution you are referring to. Security is part of a product and delivered to a large extent by products. For starters: It is quite risky to run all processes with root access. If you don't separate encryption from routing you have an inherent problem. Encryption requires secure keys, i.e. a proper key management system. The SD-WAN products I saw and looked at in more detail did for sure not excel in terms of key management. Some have a level that would have been acceptable in 2010, but that has not been OK since 2012. I respect NDAs (contrary to some vendors), so I will not go public with details concerning those vendors. I am restricted to pointing at other sources.
    Decent network encryption equipment is not available from mainstream vendors. It also seems to be news to most people that network encryption includes firewall functionalities at the native layer (process, bypass, drop). There is equipment out there that is secure and provides a high level of security and doesn't need constant patches or unplanned maintenance windows. You might just not be familiar with it.
    Replies
    1. No, I am not familiar with such equipment. Which vendors do you mean?
  6. E.g. (in alphabetical order) Atmedia, Genua, Rohde & Schwarz (partially), Secunet, Securosys, Senetas and Thales, to name a few. There are different classes of network encryption solutions that address different requirements. There are vendors that put their priority on time-to-market, performance and cost, while there are vendors that put their priority on security, longevity and performance. Currently, the latter are predominantly used for classified government data and for critical infrastructures. However, an increasing number of enterprise customers realize the cost (TCO) and operational benefits of solutions designed for security, longevity and performance.
    For the current state of SD-WAN security you might also want to have a look at this: https://github.com/sdnewhop/sdwannewhope/blob/master/slides/securityfest-2019.pdf
    Replies
    1. Thanks for the recommendations. Still, I don't believe they are absolutely secure. Maybe they are in a different class, but since they forgot to renew the SSL certificate on one of their public sites, who knows what they might forget inside their product:
      https://www.ssllabs.com/ssltest/analyze.html?d=www.thales-esecurity.com
      Common names *.thales-esecurity.com
      Alternative names *.thales-esecurity.com thales-esecurity.com
      Serial Number 14024b6c9a33c98c107ec87d9dd35696
      Valid from Wed, 14 Jun 2017 00:00:00 UTC
      Valid until Thu, 12 Sep 2019 23:59:59 UTC (expired 22 days, 15 hours ago) EXPIRED

      Personally, I don't believe that it is possible to create any absolutely secure product, so there should be many layers of security, or, let's say, security on all layers, not just encryption.
  7. @Vladimir
    Thales e-Security is in a transitional phase as Thales has acquired Gemalto and is restructuring the e-Security portfolio as part of an organizational restructuring.
    I agree that there is no 100% security. In terms of Datacryptor 5000 Thales cannot forget anything in their product, especially not certificates. The Datacryptor 5000 does not use certificates and is built on the Atmedia platform.
    Network security is one security layer and for those transits covered by it, it also covers for shortcomings in application security. The more code and the more dependencies, the lower the probability of a high level of security. When looking at network encryption the objective is to have a secure device using secure algorithms and interacting with secure devices. That is much more of a challenge than one would assume. One has to really dig deep to understand network encryption systems and how they are implemented in order to make a reasonable security assessment. Some vendors are cooperative, others less so, and mostly with good reason.
  8. G'day guys,

    A couple of questions on this configuration example:

    • Shouldn't the DynamicLP route-map be applied inbound and not outbound as shown here?

    • Also, will this work between Cisco and Juniper?

    Not an expert by any stretch of imagination, so just basic questions from me to begin with.

    Thank you.

  9. @colossus: You're right (but interestingly, it did work ;) - if you want to change local route selection, you should apply route map inbound. Applying it in the outbound direction changes the route selection on the other end of the link (but it still works because it's just two routers, two links, and IBGP, so LocPref is propagated).

    As for "working between Cisco and Juniper": if you influence local route selection, then it doesn't matter what else you're using (by definition), if you're using Local Preference to influence remote route selection then it will always work within the same autonomous system (because that's how BGP works).

  10. This is an awesome explanation and example! Thank you!
