On November 7th SDx Central published an article saying “OpenFlow is virtually dead.” There’s a first time for everything, and it’s real fun reading a marketing blurb on a site sponsored by SDN vendors claiming the shiny SDN parade unicorn is dead.
On a more serious note, Tom Hollingsworth wrote a blog post in which he effectively said “OpenFlow is just a tool. Can we please find the right problem for it?”
The Easy Part: What’s Wrong
It’s immediately obvious to anyone who survived the scrutiny of RFC 1925 Rule 4 that the idea of a centralized control plane has no merits on planet Earth (please note that centralized control is a totally different beast that makes perfect sense).
In particular, it’s really hard to:
- Detect non-trivial link failures in milliseconds (that’s why we have BFD);
- Respond to real-time events in a reasonable timeframe;
- Respond to control-plane requests (ARP/ND) from a very large number of hosts;
- Run chatty edge protocols (LLDP, LACP, STP …) on a large number of ports.
There’s a reason the number of STP instances any large modular switch can support is limited. I also had an interesting discussion recently with someone who was actually involved in building a switch control plane, and it’s amazing how many more hurdles and showstoppers are hidden behind the scenes. The things I listed above are just the tip of the iceberg.
Of course there were people who tried to prove grumpy old farts wrong and/or tried to change the laws of physics.
Some of them woke up from their hype-induced stupor, added the necessary extensions to OpenFlow, got a working product, and lost interoperability and purist control/data plane separation while doing that. Reality is hard.
What Can I Solve with OpenFlow?
OK, so what problems could I solve with OpenFlow? There are quite a few things that don’t require control-plane protocols, are not time-sensitive (as in “this has to be done in 2 msec”), and need no real-time response to failures. A few examples:
- Programmable traffic tapping
- Flexible endpoint (host) authentication
- Per-user packet filters installed into edge devices
- Interesting load balancing scenarios of long-lived elephant flows
You might have realized that most problems listed above fall into the “programmable ACL/PBR” category. You can use OpenFlow to solve them, but you could also use BGP FlowSpec or a number of vendor-specific tools.
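To make the “programmable ACL/PBR” idea a bit more concrete, here is a minimal sketch of a flow table as a priority-ordered list of match/action rules. It is plain Python rather than any real controller API; `FlowRule`, `FlowTable`, the field names, the port numbers and the addresses are all hypothetical, and dropping unmatched packets is an assumption made for this sketch.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FlowRule:
    priority: int
    match: Dict[str, object]  # field -> required value; absent field = wildcard
    actions: List[str]


class FlowTable:
    def __init__(self) -> None:
        self.rules: List[FlowRule] = []

    def install(self, rule: FlowRule) -> None:
        self.rules.append(rule)
        # Keep the highest priority first; the sort is stable, so among
        # equal priorities the rule installed earlier is checked first.
        self.rules.sort(key=lambda r: -r.priority)

    def lookup(self, packet: Dict[str, object]) -> List[str]:
        for rule in self.rules:
            if all(packet.get(k) == v for k, v in rule.match.items()):
                return rule.actions
        return ["drop"]  # table-miss behaviour assumed for this sketch


table = FlowTable()
# Per-user packet filter: stop one host from sending to TCP port 25.
table.install(FlowRule(200, {"in_port": 1, "ipv4_src": "10.0.0.5", "tcp_dst": 25}, ["drop"]))
# Programmable tap: forward web traffic and copy it to a monitoring port.
table.install(FlowRule(100, {"in_port": 1, "tcp_dst": 80}, ["output:2", "output:9"]))
# Everything else arriving on port 1 is simply forwarded.
table.install(FlowRule(10, {"in_port": 1}, ["output:2"]))
```

Nothing in this table needs a control-plane protocol or a sub-second reaction from the controller: rules are installed ahead of time and the switch does the matching on its own.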
Coho Data, which Tom mentioned in his blog post, is using OpenFlow to program a switch in front of its scale-out storage farm – a perfect example of load balancing of long-lived elephant flows. More details in the SDN Use Cases webinar.
You might think that DDoS mitigation falls into the same category. Well, it might, but the real challenge is the number of filtering rules you’d need, which usually precludes the use of a hardware solution.
In any case, people who know what they’re doing try to implement extremely fast packet drops as close to the server NIC as possible to solve the DDoS mitigation challenge. Others still try to solve the same problem with OpenFlow. I wish them luck.
What about Fabrics?
Network fabrics were a particularly alluring OpenFlow use case. My cynical take on that: it was easy to figure out the Total Addressable Market and get VC funding that way.
However, let’s assume you want to build your network fabric with pure OpenFlow 1.3 (no extensions to make your life easier, so you can use switches from almost any vendor). What kind of fabric could you build? These would be the prerequisites:
- No control-plane protocols;
- No real-time response to topology change events;
- No real-time response to link failures. You’d either use a single uplink or a pre-computed backup path.
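The “pre-computed backup path” option can be sketched as follows: the controller decides on an ordered list of uplinks ahead of time, and the switch picks the first one that is still up without asking anyone. This is roughly the idea behind the optional fast-failover group type in OpenFlow, but the code below is only an illustration; `select_uplink`, the port numbers and the link-state dictionary are hypothetical.

```python
from typing import Dict, List, Optional


def select_uplink(buckets: List[int], port_up: Dict[int, bool]) -> Optional[int]:
    """Return the first port in the pre-computed list that is currently up."""
    for port in buckets:
        if port_up.get(port, False):
            return port
    # No live path left: nothing can be forwarded until the controller
    # computes and installs something new.
    return None


# Hypothetical pre-computed choice: port 1 is primary, port 2 is backup.
uplinks = [1, 2]
link_state = {1: True, 2: True}
```

The decision is purely local and depends only on the port state the switch already knows, which is why it fits the “no real-time response from the controller” constraint.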
So what did you just build? A fancy programmable patch panel (here’s another one), and that’s exactly what some service providers need in their access networks. No wonder they still talk about using OpenFlow in their deployments: it’s a perfect tool for their particular problem. Does that imply that it will solve all the problems you have? Probably not.