After I commented on a LinkedIn discussion in the Carrier Ethernet group (more details here), Vishal Sharma wrote an interesting response that goes into more detail on the distinction between centralized control and a centralized control plane.
He started with a nice summary of my view:
What I understood from what you said is that it is ok to have a centralized entity to have (to use a much overused phrase) a "single pane of glass" view of the network. And, presumably, the central controller may have obtained this view by amalgamating inputs from various sources.
Couldn’t agree more. Numerous SDN architectures use this approach.
Could it get the control plane details, for example, by acting as a peer of the CP running on the existing devices (switches/routers) in the network, so it has the same view of the network as they do, even if the control plane itself is not centralized in the controller per se?
That’s exactly what many SDN solutions are doing.
Most of them use plain BGP, for example Microsoft’s data center solution (see Centralized Routing Control in BGP Networks Using Link-State Abstraction for more details), Netflix’s traffic analysis solution, or Border6 Non-Stop Internet.
Some other solutions use BGP-LS (North-Bound Distribution of Link-State and TE Information Using BGP), for example Juniper’s NorthStar controller.
A centralized control plane, on the other hand, is the notion that all of the control computations be centralized in a single entity, which then programs elements in the (distributed) forwarding/data plane. And, your thought is that this latter entity does not make sense in the real world.
It’s not the notion of centralized computation that’s problematic. After all, tools like Cariden MATE or Juniper’s NorthStar controller use centralized computation, and you could argue that every BGP route reflector or route server (used by numerous Internet Exchange Points) does the same.
The real problem is in the other tasks that the control plane has to do, like detecting byzantine link failures, sending periodic messages to external devices, or running host-to-network protocols like ARP/ND. Those tasks don’t scale when a single central entity has to perform them for every device in the network.
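To get a feeling for the numbers, here is a back-of-the-envelope sketch of the periodic-message load. Every figure in it (500 switches, 48 ports per switch, a 50 ms hello interval) is an assumption picked for illustration, not a measurement from any of the products mentioned in this post, and `hello_rate` is a hypothetical helper.

```python
def hello_rate(ports: int, interval_ms: float) -> float:
    """Keepalive packets per second needed to probe `ports` links."""
    return ports * (1000.0 / interval_ms)


# Hypothetical fabric: all three values are assumptions for illustration.
SWITCHES = 500
PORTS_PER_SWITCH = 48
HELLO_INTERVAL_MS = 50  # aggressive timer for sub-second failure detection

# Distributed: each switch probes only its own links.
per_switch = hello_rate(PORTS_PER_SWITCH, HELLO_INTERVAL_MS)

# Centralized: one controller would have to probe every link in the fabric.
central = hello_rate(SWITCHES * PORTS_PER_SWITCH, HELLO_INTERVAL_MS)

print(f"distributed: {per_switch:.0f} pps generated by each switch")
print(f"centralized: {central:.0f} pps generated by one controller")
```

The per-switch load stays constant as the fabric grows, while the load on the central entity grows with the total number of links. The sketch counts only transmitted hellos; a real controller would also have to receive and process the replies.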
I have to admit that (if I understood what you said above correctly) this is certainly a contrarian viewpoint, since, for most people, SDN is about centralizing the control plane itself. Now, we do have the notion of a "logically centralized" control plane, but centralized none-the-less. So, some light on this would be much appreciated!
You might call my viewpoint contrarian; I call it realistic, and almost everyone who has had to build and ship a production-grade product agrees with me.
For more details, go to the product-specific part of the previous blog post on this topic.
The real problem (as I see it) is that people who talk about a centralized control plane don’t really understand all the implications of this concept. You either have a centralized control plane (including all the complications I mentioned above and in the previous blog post) or you don’t. You can’t have it both ways.
You could, of course, offload the periodic control plane functionality to edge nodes, and still run central path computation. Juniper’s QFabric is doing exactly that, as did most Frame Relay, SONET/SDH and ATM networks. The SDN Architecture document from ONF mentions this approach (and the real-life scalability concerns) very explicitly in sections 4.2 and 4.3. Let me quote straight from section 4.3.4 of that document (which more-or-less says the same things I’ve been saying for years):
Although a key principle of SDN is stated as the decoupling of control and data planes, it is clear that an agent in the data plane is itself exercising control, albeit on behalf of the SDN controller. Further, a number of functions with control aspects are widely considered as candidates to execute on network elements, for example OAM, ICMP processing, MAC learning, neighbor discovery, defect recognition and integration, protection switching.
A more nuanced reading of the decoupling principle allows an SDN controller to delegate control functions to the data plane, subject to a requirement that these functions behave in ways acceptable to the controller; that is, the controller should never be surprised. This interpretation is vital as a way to apply SDN principles to the real world.
However, do keep in mind that the current set of tools you could use (primarily OpenFlow) doesn’t include a standard way of delegating control (at least not in OpenFlow 1.5), so anyone who solved this problem did it using proprietary extensions.
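The split between delegated periodic functions and central path computation can be illustrated with a minimal sketch. It is a toy model under stated assumptions, not a description of how QFabric or any other product works: `EdgeNode`, `Controller`, `local_probe_result` and `link_event` are hypothetical names, the probing itself is not modeled, and the controller uses a plain breadth-first search in place of a real path computation algorithm.

```python
from collections import deque


class EdgeNode:
    """Hypothetical network element: runs liveness detection locally and
    reports only state changes to the controller."""

    def __init__(self, name, controller):
        self.name = name
        self.controller = controller
        self.link_up = {}  # neighbor -> last known link state

    def local_probe_result(self, neighbor, alive):
        # Would be called by the node's own periodic probing (not modeled).
        if self.link_up.get(neighbor) != alive:
            self.link_up[neighbor] = alive
            self.controller.link_event(self.name, neighbor, alive)


class Controller:
    """Central path computation over the topology reported by the nodes."""

    def __init__(self):
        self.adj = {}    # node -> set of neighbors reachable over live links
        self.events = 0  # number of messages the controller had to process

    def link_event(self, a, b, alive):
        self.events += 1
        for x, y in ((a, b), (b, a)):
            peers = self.adj.setdefault(x, set())
            if alive:
                peers.add(y)
            else:
                peers.discard(y)

    def path(self, src, dst):
        # Breadth-first search: fewest-hop path, or None if unreachable.
        prev, queue = {src: None}, deque([src])
        while queue:
            node = queue.popleft()
            if node == dst:
                hops = []
                while node is not None:
                    hops.append(node)
                    node = prev[node]
                return hops[::-1]
            for nxt in sorted(self.adj.get(node, ())):
                if nxt not in prev:
                    prev[nxt] = node
                    queue.append(nxt)
        return None


ctl = Controller()
nodes = {n: EdgeNode(n, ctl) for n in "ABCD"}
for a, b in [("A", "B"), ("B", "D"), ("A", "C"), ("C", "D")]:
    nodes[a].local_probe_result(b, True)

# A thousand successful probes stay local; the controller never sees them.
for _ in range(1000):
    nodes["A"].local_probe_result("B", True)

before = ctl.path("A", "D")
nodes["A"].local_probe_result("B", False)  # link A-B fails
after = ctl.path("A", "D")
print(before, after, ctl.events)
```

The controller processes one message per link state change, regardless of how often the nodes probe their links, which is the point of delegating the periodic work. The controller is also "never surprised" in the sense of the ONF quote: its view changes only through events the nodes report.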
More to Explore
For even more details, explore my SDN webinars and other SDN resources: