Use ThousandEyes to Implement IP SLA on Steroids

You did read my blog post on ThousandEyes, didn’t you? What I forgot to mention was that they have this cool API that allows you to extract measurement data (including BGP topology) from their system. Can we do something cool with that?

Enter the SDN Kool-Aid Factory

Here’s a typical Internet connectivity setup used by reasonably-sized entities: two edge routers running EBGP with upstream providers and IBGP between them.

Now let’s sprinkle a pinch of SDN pixie dust on this design.

First, we need ThousandEyes probes (or something equivalent) that monitor the availability of the external service (SaaS provider) through each upstream ISP. That part is pretty easy: a bit of PBR or a few VRFs on the edge routers does the trick.

Next, we need a controller that extracts measurement data (and path topology) from the ThousandEyes API and decides how the traffic to SaaS providers should flow. You won’t be able to buy that controller any time soon, but it shouldn’t take long to program it (you did start polishing your programming skills, didn’t you?)

The communication between the ThousandEyes probes and the controller would actually go through the ThousandEyes cloud service, but drawing arrows that way would really destroy the beauty of the picture ;)
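The decision step inside such a controller could be as small as the sketch below. It assumes the measurements have already been pulled from the API and flattened into simple records; the `ProbeResult` fields, the `pick_upstream` helper and the 2% loss threshold are all made up for illustration and are not the ThousandEyes API schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProbeResult:
    """One measurement toward a SaaS target through one upstream ISP.

    Field names are illustrative assumptions, not the ThousandEyes schema.
    """
    upstream: str      # label of the upstream ISP the probe was forced through
    loss_pct: float    # packet loss in percent
    latency_ms: float  # average round-trip time in milliseconds


def pick_upstream(results: list[ProbeResult],
                  max_loss_pct: float = 2.0) -> Optional[str]:
    """Pick the lowest-latency upstream among those under the loss threshold.

    Returns None when no upstream qualifies, meaning "don't override BGP".
    """
    usable = [r for r in results if r.loss_pct <= max_loss_pct]
    if not usable:
        return None
    # Latency is the primary key; loss breaks ties between equal latencies.
    return min(usable, key=lambda r: (r.latency_ms, r.loss_pct)).upstream


measurements = [
    ProbeResult("ISP-A", loss_pct=0.0, latency_ms=48.0),
    ProbeResult("ISP-B", loss_pct=5.0, latency_ms=20.0),
]
print(pick_upstream(measurements))  # ISP-B is faster but too lossy
```

Returning `None` instead of a forced choice keeps the controller out of the way when every path looks bad, so the routers fall back to whatever BGP would have picked anyway.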

Finally, we need a mechanism to propagate the forwarding decisions to the edge routers. This is where the SDN madness could start – you could use OpenFlow or I2RS or whatever fancy new protocol is being invented to solve old problems … or you could use the venerable BGP that the edge routers already use anyway.

Using BGP for SDN Forwarding Policy Propagation

If you want to use BGP as the protocol that propagates your forwarding policy from the SDN controller to the edge routers, you need a BGP daemon within the controller. You can use Net::BGP if you want low-level control, or something like BGP Inject, Quagga or even Cisco’s CSR (where you’d have to settle for a NETCONF-like interface until it gets a OnePK API). Just make sure you use an abstraction layer (see, even old grunts learn from high-level diagrams) so your policy code isn’t tied too tightly to the delivery mechanism.
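One way to picture that abstraction layer is a tiny interface the policy code talks to, with the actual delivery mechanism plugged in behind it. Everything in this sketch (`RouteSink`, `InMemorySink`, `apply_policy`, the local preference value of 200) is a hypothetical name or value chosen for the example; a real backend would wrap whichever BGP daemon you picked.

```python
from abc import ABC, abstractmethod


class RouteSink(ABC):
    """Everything the policy code is allowed to know about route delivery."""

    @abstractmethod
    def announce(self, prefix: str, next_hop: str, local_pref: int) -> None:
        ...

    @abstractmethod
    def withdraw(self, prefix: str) -> None:
        ...


class InMemorySink(RouteSink):
    """Stand-in backend for experiments; it only records what was announced."""

    def __init__(self) -> None:
        self.rib: dict[str, tuple[str, int]] = {}

    def announce(self, prefix: str, next_hop: str, local_pref: int) -> None:
        self.rib[prefix] = (next_hop, local_pref)

    def withdraw(self, prefix: str) -> None:
        # Withdrawing an unknown prefix is a no-op.
        self.rib.pop(prefix, None)


def apply_policy(sink: RouteSink, desired: dict[str, str],
                 local_pref: int = 200) -> None:
    """Push a prefix -> next-hop map through whatever backend is plugged in."""
    for prefix, next_hop in desired.items():
        sink.announce(prefix, next_hop, local_pref)


sink = InMemorySink()
apply_policy(sink, {"198.51.100.0/24": "192.0.2.1"})
print(sink.rib)
```

Swapping the delivery mechanism later then means writing another `RouteSink` subclass, not touching the policy code.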

Speaking of abstraction layers, you just might consider using OpenDaylight for the policy distribution task, and being a good netizen, contribute your code to the project.

Your BGP daemon should establish IBGP connectivity with all edge routers – using IBGP allows you to use local preference to enforce your custom-picked routes over any routes the routers might choose themselves based on the usual BGP route selection rules.
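To see why local preference does the job, here is a deliberately simplified model of the first steps of BGP route selection: highest local preference wins, and AS path length only matters between paths with equal local preference. The `Path` record and `best_path` function are illustrative assumptions; real routers evaluate more criteria (and some vendors check a locally significant weight first).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    source: str       # free-form label, used only for this illustration
    local_pref: int   # many implementations default to 100
    as_path_len: int  # number of ASes in the AS path


def best_path(paths: list[Path]) -> Path:
    """Simplified selection: highest local preference first, then the
    shortest AS path. Later tie-breakers are left out on purpose."""
    return max(paths, key=lambda p: (p.local_pref, -p.as_path_len))


candidates = [
    Path("edge router, via ISP-A", local_pref=100, as_path_len=2),
    Path("controller, via ISP-B", local_pref=200, as_path_len=4),
]
print(best_path(candidates).source)
```

The controller's path wins despite the longer AS path, which is exactly the lever you need to override the routers' own choice.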

From there on, it’s mostly the usual BGP stuff – making sure you get the paths from the edge routers, select the best ones based on BGP attributes and results of SLA probes, and send whatever you feel your routers should be using to them (your controller doesn’t have to advertise the whole BGP table, just the exceptions).
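Computing "just the exceptions" is a simple comparison, sketched below. It assumes the controller knows, per prefix, which exit the routers would choose from their EBGP paths alone (`router_best`) and which exit it wants based on the probe results (`desired`); both dictionaries and the `exceptions` helper are hypothetical. Comparing against the EBGP-only choice matters: comparing against the routers' current best path would make an override disappear the moment it took effect.

```python
def exceptions(router_best: dict[str, str],
               desired: dict[str, str]) -> dict[str, str]:
    """Return prefix -> exit only where the controller disagrees with the
    exit the routers would pick on their own."""
    return {
        prefix: exit_name
        for prefix, exit_name in desired.items()
        if router_best.get(prefix) != exit_name
    }


router_best = {"198.51.100.0/24": "ISP-A", "203.0.113.0/24": "ISP-B"}
desired = {"198.51.100.0/24": "ISP-B", "203.0.113.0/24": "ISP-B"}

# Only the first prefix needs an override; the second already exits via ISP-B.
print(exceptions(router_best, desired))
```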

For more details, check out Petr Lapukhov’s BGP SDN work … and you can always engage me to help you figure out the details.

1 comment:

  1. I mentioned it in the BGP/SDN datacenter post, but this is how Internap's performance routing juju works, down to using things like PBR to make sure probes go through each provider connection. They generally have enough connectivity points and connections to enough Tier-1 providers to get a valid view of most of the Internet.