A reader of my blog was “blessed” with hands-on experience with SD-WAN offered by large service providers. Based on that experience he sent me his views on whether that makes sense. Enjoy ;)
We all have less-than-stellar opinion on service providers and their offerings. Its well known that those services are expensive and usually lacking quality, experience, or simply, knowledge. This applies to regular MPLS/BGP techniques as to - currently, the new challenge - SD-WAN.
In January 2020 Doug Heckaman documented his experience with VeloCloud SD-WAN. He tried to be positive, but for whatever reason this particular bit caught my interest:
Edge Gateways have a limited number of tunnels they can support […]
WTF? Wasn’t x86-based software packet forwarding supposed to bring infinite resources and nirvana? How badly written must your solution be to have a limited number of IPsec tunnels on a decent x86 CPU?
It’s amazing how quickly you get “must have feature Y or it should not be called X” comments coming from vendor engineers the moment you mention something vaguely-defined like SD-WAN.
Here are just two of the claims I got as a response to “BGP with IP-SLA is SD-WAN” trolling I started on LinkedIn based on this blog post:
Key missing features [of your solution]:
- real time circuit failover (100ms is not real-time)
- traffic steering (again, 100ms is not real-time)
Let’s get the facts straight: it seems Cisco IOS evaluates route-map statements using track objects in periodic BGP table scan process, so the failover time is on order of 30 seconds plus however long it takes IP SLA to detect the decreased link quality.
It’s clear that we have two types of SD-WAN vendors:
This is a guest blog post by Philippe Jounin, Senior Network Architect at Orange Business Services.
You could use track objects in Cisco IOS to track route reachability or metric, the status of an interface, or IP SLA compliance for a long time. Initially you could use them to implement reliable static routing (or even shut down a BGP session) or trigger EEM scripts. With a bit more work (and a few more EEM scripts) you could use object tracking to create time-dependent static routes.
Cisco IOS 15 has introduced Enhanced Object Tracking that allows first-hop router protocols like VRRP or HSRP to use tracking state to modify their behavior.
Christoph Jaggi sent me this observation during one of our SD-WAN discussions:
The centralized controller is another shortcoming of SD-WAN that hasn’t been really addressed yet. In a global WAN it can and does happen that a region might be cut off due to a cut cable or an attack. Without connection to the central SD-WAN controller the part that is cut off cannot even communicate within itself as there is no control plane…
A controller (or management/provisioning) system is obviously the central point of failure in any network, but we have to go beyond that and ask a simple question: “What happens when the controller cluster fails and/or when nodes lose connectivity to the controller?”
SD-WAN is the best thing that could have happened to networking according to some industry “thought leaders” and $vendor marketers… but it seems there might be a tiny little gap between their rosy picture and reality.
This is what I got from someone blessed with hands-on SD-WAN experience:
This blog post was initially sent to subscribers of my SDN and Network Automation mailing list. Subscribe here.
I made a statement along these lines in an SD-WAN blog post and related email sent to our SDN and Network Automation mailing list:
The architecture of most SD-WAN products is thus much cleaner and easier to configure than traditional hybrid networks. However, do keep in mind that most of them use proprietary protocols, resulting in a perfect lock-in.
While reading that one of my readers sent me a nice email with an interesting question:
A while ago we published a guest blog post by Christoph Jaggi explaining the high-level security challenges of most SD-WAN solutions… but what about the low-level details?
TL&DW: some of the SD-WAN boxes are as secure as $19.99 Chinese webcam you bought on eBay.
Here’s some feedback I got from a subscriber who got pulled into an SD-WAN project:
I realized (thanks to you) that it’s really important to understand the basics of how things work. It helped me for example at my work when my boss came with the idea “we’ll start selling SD-WAN and this is the customer wish list”. Looked like business-as-usual until I realized I’ve never seen so big a difference between reality, customer wishes and what was promised to customer by sales guys I never met. And the networking engineers are supposed to save the day afterwards…
How did your first SD-WAN deployment go? Please write a comment!
Christoph Jaggi, the author of Transport and Network Security Primer and Ethernet Encryption webinars published a high-level introductory article in Inside-IT online magazine describing security deficiencies of SD-WAN solutions based on the work he did analyzing them for a large multinational corporation.
As the topic might be interesting to a wider audience, I asked him to translate the article into English. Here it is…
One of my readers sent me this question:
I'm in the process of researching SD-WAN solutions and have hit upon what I believe is a consistent deficiency across most of the current SD-WAN/SDx offerings. The standard "best practice" seems to be 60/180 BGP timers between the SD-WAN hub and the network core or WAN edge.
Needless to say, he wasn’t able to find BFD in these products either.
Does that matter? My reader thinks it does: