Is Controller-Based Networking More Reliable than Traditional Networking?

Listening to some SDN pundits, one gets the impression that SDN brings peace to Earth, solves all networking problems and makes networking engineers obsolete.

Cynical jokes aside, and ignoring inevitable bugs, is controller-based networking really more reliable than what we do today?

An SDN solution that abstracts the dirty details of whatever networking functionality (example: Tail-F NCS) is definitely an improvement over today’s CLI-driven box-by-box mentality. If you can disable manual bypasses (it’s always easier to tweak a device configuration than to modify the service definition), you’ll get consistent behavior across all devices managed by the SDN controller.

Will that make a network more reliable? It might, but do keep in mind that most network problems arise from operator errors. Making a configuration error on an SDN controller just increases the blast radius (listen to the Software Gone Wild podcast with Jeremy Schulman for more details) – instead of misconfiguring a single device, the SDN controller helps you misconfigure the whole network. Role-based access controls and other checks are thus even more important in the SDN world than they are in the traditional world of networking.
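One way to keep a controller from misconfiguring the whole network at once is to push changes in small batches and stop at the first sign of trouble. The following is a minimal sketch of that idea, not the behavior of any particular controller: `staged_rollout` and the `apply_change`, `is_healthy` and `rollback` callbacks are hypothetical names standing in for whatever your controller or automation tool provides.

```python
from typing import Callable, Dict, List


def staged_rollout(
    devices: List[str],
    apply_change: Callable[[str], None],   # hypothetical: pushes the change to one device
    is_healthy: Callable[[str], bool],     # hypothetical: post-change check for one device
    rollback: Callable[[str], None],       # hypothetical: restores the previous config
    batch_size: int = 1,
) -> Dict[str, object]:
    """Apply a change batch by batch; roll back everything touched on first failure."""
    touched: List[str] = []
    for start in range(0, len(devices), batch_size):
        batch = devices[start:start + batch_size]
        for device in batch:
            apply_change(device)
            touched.append(device)
        if not all(is_healthy(device) for device in batch):
            # Undo in reverse order so the most recent change is reverted first.
            for device in reversed(touched):
                rollback(device)
            return {
                "ok": False,
                "touched": touched,
                "untouched": devices[start + batch_size:],
            }
    return {"ok": True, "touched": touched, "untouched": []}
```

With a batch size of one, an error that breaks the second device stops the rollout there: two devices are touched and rolled back, and the rest of the network never sees the change. The sketch only limits the blast radius; it does nothing to prevent the operator error itself.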

Finally, there’s the (in)famous separation of control and data plane. To illustrate what might be lurking in that part of the SDN world, consider these questions: did you ever have to upgrade a stack of stackable switches, or suffer a bad case of a brain-dead supervisor module that looked OK to the backup supervisor module (which thus refused to take over)?

Fabrics implemented with a controller-based centralized control plane (example: an OpenFlow controller like OpenDaylight) are functionally identical to stackable switches (or Cisco’s VSS, Juniper’s Virtual Chassis or HP’s IRF). Why should they work better than other centralized control plane implementations?
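The brain-dead supervisor scenario applies to any active/standby control plane, controllers included. Here is a minimal sketch of why it happens, assuming a standby that can see only two signals from the active node. The `ActiveStatus` type and both decision functions are hypothetical and do not describe any vendor's actual failover logic.

```python
from dataclasses import dataclass


@dataclass
class ActiveStatus:
    heartbeat_ok: bool  # the active node still answers keepalives
    probe_ok: bool      # the active node still does useful work (hypothetical functional probe)


def take_over_heartbeat_only(status: ActiveStatus) -> bool:
    """Standby takes over only when keepalives stop."""
    return not status.heartbeat_ok


def take_over_with_probe(status: ActiveStatus) -> bool:
    """Standby takes over when keepalives stop OR the functional probe fails."""
    return not (status.heartbeat_ok and status.probe_ok)


# A brain-dead active node: alive enough to send keepalives, not doing its job.
brain_dead = ActiveStatus(heartbeat_ok=True, probe_ok=False)
print(take_over_heartbeat_only(brain_dead))  # False: the standby refuses to take over
print(take_over_with_probe(brain_dead))      # True
```

The second check is not a free fix. Taking over while the old active node is still partially alive risks two nodes both believing they are in charge, so a real implementation would also need some way to fence off the failed node.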

From Theory to Practice

Want to know more about data center fabric architectures? Check out the Data Center Fabrics webinar.

Interested in SDN? Find what you need to get started on the SDN Resources page and explore the SDN webinars.


  1. It will be interesting to see where the debate goes, not only for traffic flow policy but for network management as well. I try to look at it in three main functional categories.

    Control/data plane FSMs controlled by a centralized source and distributed to all devices (true SDN/NFV devices, whiteboxes).

    Control/data plane separated but still controlled individually by a per-device FSM (fat/smart boxes).

    Control/data plane separated, but as a hybrid: some control centralized and some distributed to a per-device FSM, some data plane centralized and some distributed to a per-device FSM (think of split-MAC wireless applications; light boxes).

    For the classic distributed type, with a per-device control/data plane FSM, behavior and timers are fixed and known from each device's protocol FSMs. Think PHB.

    The centralized-to-distributed version adds a time delta: another protocol has to inform the per-device FSMs of a change.

    You gain ease of management and provisioning, but you risk, as you mentioned, a larger blast radius, and you also risk the performance delta of getting every flow-related change communicated to all devices in time and in order. Compare that with the classical approach, where PHB device provisioning at any layer of the stack is time-consuming but gives rigid consistency with only micro impact.
