Reliability of SD-WAN and Hybrid WAN Solutions

My Business Case for SD-WAN blog post received numerous comments pointing out the potential pitfalls of hybrid WAN, including reduced security, unreliable Internet services and denial-of-service attacks.

While all those comments are perfectly valid, I still think hybrid WAN (whether implemented with traditional technologies or SD-WAN products) makes perfect sense.

However, as with any new technology, you have to understand the fundamentals of SD-WAN (or hybrid WAN) solutions and use them correctly.

If your CIO decides (in his infinite wisdom gained by reading vendor whitepapers and listening to product pitches) to replace MPLS/VPN circuits with an SD-WAN-over-Internet solution, he’ll eventually get the disaster he deserves. The same might happen to anyone believing a VPN-over-Internet solution can be made as reliable as a more traditional WAN solution.

Fortunately, we’ve been using solutions similar to SD-WAN for at least a decade, so we’ve already learned a few useful lessons.

We all know a zillion things can go wrong with Internet uplinks (and eventually they will):

  • If a link that costs you $100 a month is down, you have zero leverage with your ISP. It will be fixed… eventually;
  • If you’re experiencing packet drops on that same uplink, sometimes the only thing you can do is change the ISP;
  • If someone decides to blast you with a DDoS attack, you’re toast… unless you have a high-end router sitting at a large Internet exchange, or you’re paying for a DoS scrubbing service (which you should consider doing for your hub site).

On the other hand, it’s amazing how well the Internet usually works, so it would be a shame not to use it. Also, most traffic transported across the enterprise WAN is not really mission-critical, and it’s a waste of money to transport it across high-quality infrastructure.

Long story short: don’t ever count on the reliability or availability of your Internet uplinks (particularly at remote sites).

Redundancy is King

The usual way of dealing with unreliable components is to use redundancy. Apply the same thinking to your hybrid WAN design.

Use a combination of MPLS/VPN and Internet VPN, or Internet VPN with 3G backup. Use multiple access methods, so the cable-seeking backhoe doesn’t bring down all uplinks.
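To put rough numbers on the redundancy argument, here is a minimal sketch of the usual parallel-availability arithmetic. The availability figures are hypothetical, not measurements, and the formula assumes the links fail independently, which is exactly the assumption a shared cable duct breaks.

```python
from math import prod

HOURS_PER_YEAR = 24 * 365


def combined_availability(link_availabilities):
    """Probability that at least one link is up.

    Assumes link failures are independent (no shared duct, ISP or power feed).
    """
    return 1.0 - prod(1.0 - a for a in link_availabilities)


def expected_downtime_hours(availability):
    """Expected hours per year with no working uplink."""
    return (1.0 - availability) * HOURS_PER_YEAR


# Hypothetical figures for illustration only
internet_vpn = 0.99    # cheap Internet uplink
mpls_vpn = 0.999       # MPLS/VPN circuit

both = combined_availability([internet_vpn, mpls_vpn])
print(f"Internet only: {expected_downtime_hours(internet_vpn):.1f} h/year down")
print(f"Both links:    {expected_downtime_hours(both):.2f} h/year down")
```

If both uplinks run through the same trench, the independence assumption no longer holds and the real-world result will be much closer to that of a single link.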

Keep Calm and Be Prepared

I guess we all agree the Internet uplinks will eventually fail. At that moment it’s important to:

  • Have a working backup solution that has been properly tested. The last thing you need when your high-capacity links fail is a set of routing loops and traffic blackholes;
  • Have enough bandwidth available on the backup path to carry mission-critical traffic, together with a mechanism that will block non-critical traffic (otherwise the non-critical traffic would hose the backup links).
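The second requirement is easy to sanity-check before the failure happens. The sketch below is an illustration only: the `TrafficClass` type, the class names and the bandwidth figures are all hypothetical. It allows only the critical classes on the backup path and reports whether their peak demand fits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrafficClass:
    name: str
    peak_mbps: float
    critical: bool


def plan_failover(classes, backup_mbps):
    """Decide what may use the backup path and whether it fits.

    Non-critical classes are blocked outright; the critical ones must
    fit into the backup bandwidth at their peak rate.
    """
    critical = [c for c in classes if c.critical]
    needed = sum(c.peak_mbps for c in critical)
    return {
        "allowed": [c.name for c in critical],
        "blocked": [c.name for c in classes if not c.critical],
        "critical_mbps": needed,
        "fits": needed <= backup_mbps,
    }


# Hypothetical site profile
site = [
    TrafficClass("erp", 4.0, True),
    TrafficClass("voice", 1.5, True),
    TrafficClass("web", 20.0, False),
]
print(plan_failover(site, backup_mbps=6.0))
```

A real deployment would enforce the "blocked" list with QoS or filtering on the routers; the point of the check is to find out that the critical traffic does not fit before the primary link goes down.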

There are numerous tricks you can use to be prepared. Some organizations send mission-critical traffic over the MPLS/VPN WAN all the time to ensure the MPLS/VPN links have enough bandwidth to carry that traffic when the Internet uplinks fail; others monitor the state of backup links (which should be a standard procedure anyway).

Cisco IOS has a 3G MIB, so we were able to write a monitoring solution for one of our customers that would alert them (and their mobile operator) when the 3G signal strength deteriorated below an acceptable level.
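The alerting logic behind that kind of monitoring can be very simple. The sketch below is not the solution mentioned above; it assumes the signal-strength readings (in dBm) have already been collected by some SNMP poller, and both the -100 dBm threshold and the three-sample rule are hypothetical values chosen for illustration.

```python
def should_alert(rssi_dbm_samples, threshold_dbm=-100, consecutive=3):
    """Return True when the most recent readings are all below the threshold.

    Requiring several consecutive bad samples avoids alerting on a
    single noisy reading. Threshold and sample count are assumptions.
    """
    if len(rssi_dbm_samples) < consecutive:
        return False
    recent = rssi_dbm_samples[-consecutive:]
    return all(sample < threshold_dbm for sample in recent)


# Readings ordered oldest to newest (hypothetical values)
history = [-85, -101, -103, -105]
if should_alert(history):
    print("3G backup signal below acceptable level, notify operator")
```

Polling the backup link while the primary is still healthy is what makes this useful: the alert should arrive long before anyone needs the 3G path.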

5 comments:

  1. Hi Ivan, have you had any exposure to Barracuda's TINA protocol? Apparently, it runs on the NextGen FW, creates an overlay tunnel similar to some SD-WAN solutions
  2. and operates over TCP or UDP...
  3. As one of the people making comments on the original post, I'd like to thank you for providing this update.

    Internet VPNs can't be ignored, and will prove to be at least part of the right solution for some Enterprises. The cost difference is irresistible.

    But that also means it will attract a lot of folks looking to put "I saved $$$" on their resume and then move on before the disaster strikes.

    It would be really great to see a realistic business case analysis taking into account things like the additional cost of securing and monitoring Internet connections at hundreds or thousands of sites vs. maybe less than 10. Or the cost of the Plan B for weathering enterprise-wide or near enterprise-wide Internet outages measured in days, where you're running on your backup solution (assuming it really was tested and works).

  4. What do you think is the value of being able to set *Application Performance*, in terms of Business Level Objectives, and let a solution such as SALSA by Ipanema Technologies ensure that the best path, priority, queue and other rules are all applied to guarantee your business runs smoothly in a hybrid networking environment?
  5. Gartner report on the future of SD-WAN
    http://www.gartner.com/technology/reprints.do?id=1-2JRZ2US&ct=150722&st=sb