OpenStack Quantum (Neutron) Plug-In: There Can Only Be One

OpenStack seems to have a great architecture: all device-specific code is abstracted into plugins that have a well-defined API, allowing numerous (more or less innovative) implementations under the same umbrella orchestration system.

Looks great in PowerPoint, but to an uninitiated outsider looking at the network (Quantum, now Neutron) plugin through the lens of the OpenStack Neutron documentation, it looks like it was designed by either a vendor or a server-focused engineer using NIC device driver concepts.

2013-10-03: Slightly changed the wording of the previous paragraph to explain my observation bias. For another perspective on Quantum beginnings, please read the comments.

You see, the major problem the Quantum plug-in architecture has is that there can only be one Quantum plugin in a given OpenStack deployment, and that plugin has to implement all the networking functionality: layer-2 subnets are mandatory, and there are extensions for layer-3 forwarding, security groups (firewalls) and load balancing.
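
To make the constraint concrete, here's a minimal Python sketch of a deployment with a single plugin slot. It is not the Neutron API: `CorePlugin`, `VendorAPlugin`, `Deployment` and the extension names are all made up for illustration, and the only thing the sketch tries to show is that one object has to answer for layer 2 and for every extension.

```python
class CorePlugin:
    """Hypothetical stand-in for a monolithic network plugin (not the real Neutron API)."""

    # Layer-2 networks are mandatory; everything else is an optional extension.
    supported_extensions = frozenset()

    def create_network(self, name):
        raise NotImplementedError


class VendorAPlugin(CorePlugin):
    """Hypothetical vendor plugin: it alone decides which extensions exist."""

    supported_extensions = frozenset({"router", "security-group"})

    def create_network(self, name):
        return {"name": name, "backend": "vendor-a"}


class Deployment:
    """Holds exactly one core plugin, mirroring the single-plugin constraint."""

    def __init__(self):
        self._plugin = None

    def load_plugin(self, plugin):
        # A second plugin is rejected: there can only be one.
        if self._plugin is not None:
            raise RuntimeError("a core plugin is already loaded")
        self._plugin = plugin

    def supports(self, extension):
        return extension in self._plugin.supported_extensions

    def create_network(self, name):
        return self._plugin.create_network(name)
```

In this toy model, a deployment that loaded `VendorAPlugin` has no way to get load balancing from somebody else: `supports("lbaas")` is simply false, and a second `load_plugin()` call raises an error.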

This approach worked well in early OpenStack days when the Quantum plugin configured virtual switches (similar to what VMware’s vCenter does) and ignored the physical world. You could choose to work with Linux bridge or Open vSwitch and use VLANs or GRE tunnels (OVS only).

However, once the networking vendors started tying their own awesomesauce into OpenStack, they had to replace the original Quantum plugin with their own. No problem there if the vendor controls the end-to-end forwarding path like NEC does with its ProgrammableFlow controller, or if the vendor implements end-to-end virtual networks like VMware does with NSX or Midokura does with MidoNet … but what most hardware vendors want to do is to control their physical switches, not the hypervisor virtual switches.

You can probably guess what happened next: there’s no problem that cannot be solved by another layer of indirection, in this case a layered approach where a networking vendor provides a top-level Quantum plugin that relies on a sub-plugin (usually OVS) controlling the hypervisor soft switches.
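
The layering can be sketched in a few lines of Python. Every name here (`OvsSubPlugin`, `PhysicalFabric`, `VendorTopLevelPlugin`, `provision_vlan`) is hypothetical and the real vendor plugins are considerably more involved; the sketch only shows the delegation pattern, with the vendor code handling the physical switches and handing the hypervisor side to the sub-plugin.

```python
class OvsSubPlugin:
    """Hypothetical sub-plugin that programs the hypervisor soft switches."""

    def __init__(self):
        self.calls = []

    def create_network(self, name, vlan_id):
        # A real sub-plugin would configure the virtual switch here.
        self.calls.append((name, vlan_id))


class PhysicalFabric:
    """Hypothetical vendor-specific client for the physical switches."""

    def __init__(self):
        self.vlans = []

    def provision_vlan(self, vlan_id):
        # A real client would push configuration to the ToR switches here.
        self.vlans.append(vlan_id)


class VendorTopLevelPlugin:
    """Top-level plugin: owns the physical side, delegates the virtual side."""

    def __init__(self, sub_plugin, fabric):
        self.sub_plugin = sub_plugin
        self.fabric = fabric

    def create_network(self, name, vlan_id):
        self.fabric.provision_vlan(vlan_id)             # vendor's own code
        self.sub_plugin.create_network(name, vlan_id)   # another layer of indirection
        return {"name": name, "vlan_id": vlan_id}
```

Note that the sub-plugin is replaceable in this sketch, but the top-level plugin is still the only thing the orchestration system talks to.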

Remember that OpenStack supports a single plugin. Yeah, you got it right – if you want to use the above architecture, you’re usually locked into a single networking vendor. Perfect vendor lock-in within an open-source architecture. Brilliant. Also, do remember that your vendor has to update the plugin to reflect potential changes to the Quantum/Neutron API.

2013-10-03: Florian Otel pointed out the Meta plugin. It might improve matters, but the documentation claims it's still experimental.

I never claimed it was a good idea to mix multiple switching vendors in the same data center (it’s not, regardless of what HP is telling you), but imagine you’d like to have switches from vendor A and load balancers from vendor B, all managed through a single plugin. Good luck with that.

Alas, wherever there’s a problem, there’s a solution – in this case a Quantum plugin that ties OpenStack to a network services orchestration platform (Tail-f NCS or Anuta nCloudX). These platforms can definitely configure multi-vendor network environments, but if you’re willing to go this far down the vendor lock-in path, you just might drop the whole OpenStack idea and use VMware or Hyper-V.

The next OpenStack release might give you a different option: a generic plugin that would implement the high-level functionality, work with virtual switches, and provide a hardware abstraction layer (Modular Layer 2 – ML2) where the vendors could plug in their own “device driver”.
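
Here's a rough Python sketch of that idea, assuming nothing about the actual ML2 code: `MechanismDriver`, `RecordingDriver` and `GenericL2Plugin` are hypothetical names, and the real driver interface is richer than a single method. The generic plugin keeps the high-level state and fans each request out to whatever vendor drivers are registered.

```python
class MechanismDriver:
    """Hypothetical per-vendor "device driver" interface (not the real ML2 driver API)."""

    def create_network(self, network):
        raise NotImplementedError


class RecordingDriver(MechanismDriver):
    """Toy driver that only records what it was asked to configure."""

    def __init__(self, name):
        self.name = name
        self.configured = []

    def create_network(self, network):
        self.configured.append(network["name"])


class GenericL2Plugin:
    """Generic plugin: owns the network model, delegates device work to drivers."""

    def __init__(self, drivers):
        self.drivers = list(drivers)
        self.networks = {}

    def create_network(self, name, vlan_id):
        network = {"name": name, "vlan_id": vlan_id}
        self.networks[name] = network
        # Every registered driver gets to configure its own devices.
        for driver in self.drivers:
            driver.create_network(network)
        return network
```

With two `RecordingDriver` instances registered, one `create_network()` call reaches both, which is the point: the plugin stays the same and the vendors only supply drivers.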

This approach removes the vendor lock-in of the monolithic vendor-supplied Quantum plugins, but limits you to the lowest common denominator – VLANs (or equivalent). Not necessarily something I’d want to have in my greenfield revolutionary forward-looking OpenStack-based data center, even though Arista’s engineers are quick to point out you can implement a VXLAN gateway on ToR switches and use VLANs in the hypervisors and IP forwarding in the data center fabric. No thanks, I prefer Skype over a fancier PBX.
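
The lowest-common-denominator argument boils down to a set intersection. The sketch below illustrates the argument only; it does not describe how ML2 actually selects network types, and the driver names and capability sets are invented for the example.

```python
def common_network_types(driver_capabilities):
    """Return the network types that every driver in the mapping supports."""
    capability_sets = [set(types) for types in driver_capabilities.values()]
    if not capability_sets:
        return set()
    return set.intersection(*capability_sets)


# Invented capability data, for illustration only.
example_capabilities = {
    "hypothetical_vswitch": {"vlan", "gre", "vxlan"},
    "hypothetical_tor_a": {"vlan"},
    "hypothetical_tor_b": {"vlan", "vxlan"},
}
```

With the invented data above, `common_network_types(example_capabilities)` leaves you with VLANs and nothing else, no matter how capable the virtual switch is.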

Finally, there’s an OpenDaylight Quantum plugin, giving you total vendor independence (assuming you love riding OpenFlow unicorns). It seems OpenDaylight already supports layer-3 OpenFlow-based forwarding, so this approach might be an interesting option a year from now when OpenDaylight gets some traction and bug fixes.

Cynical summary: Reinventing the wheel while ensuring a comfortable level of lock-in seems to be a popular pastime of the networking industry. Let’s see how this particular saga evolves … and do keep in mind that some people remain deeply skeptical of OpenStack’s future.

13 comments:

  1. In my opinion this blog post is inaccurate in its assumption and incorrect in several factual areas.

    The original design discussions around Quantum/Neutron covered this topic extensively, and as someone who was there, pretty much all of your assumptions about intent are wrong.

    The root of the problem is confusing a Quantum/Neutron "plugin" (a strategy for implementing the neutron API) with a "driver" (a piece of code that talks to a particular back-end technology). Your post makes this mistake by saying:

    "Remember that OpenStack supports a single plugin. Yeah, you got it right – if you want to use the above architecture, you’re locked into a single networking vendor."

    A single plugin does not mean you can only use a single technology. Plugins can support drivers, as your examples above point out. In fact, in my view, this post argues against itself, as by highlighting the value of different models like the ML2 and tail-f designs, it drives home the point that no single "driver model" is sufficient, hence you need pluggability at a higher layer (i.e., the plugin). This was the exact motivation for the original design. A user can choose a plugin (i.e., a strategy) that ties them to a particular vendor technology, or a strategy that gives them flexibility to use technologies from different vendors, often with a "lowest-common denominator" result. We explained this to people so much in the early days of quantum that we even had standard back-up slides for it (see slides 36-38: http://www.slideshare.net/danwent/openstack-quantum-intro-os-meetup-32612 ).

    The notion of a "meta" plugin that enables the use of different vendor-specific technologies at once was also discussed at the original design summit for OpenStack Quantum. It has been implemented and has been in the code base for a long time now. Again, all of this stuff is publicly available information: https://blueprints.launchpad.net/neutron/+spec/metaplugin

    You should also correct your statement about services like load balancers being tied to the plugin, as from the start you were able to load LB plugins as "service plugins", which are independent of the "core plugin" that is loaded.

    1. Hi Dan!

      Thanks for the comment - reworded the intro paragraph a bit to explain my observation bias ;)

      Although I agree with you in principle, the sad fact remains: at the moment you can't mix networking solutions from multiple vendors, and even though Tail-f can manage devices from multiple vendors, you're just replacing hardware lock-in with controller lock-in.

      Need to investigate LB aspect further - would appreciate if you could point me to a reasonable starting point.

      Kind regards,
      Ivan

  2. Nice write-up Ivan (and updates by Dan).

    Would love to also see you explore what happens when Quantum services are combined with controller-based services. Where should various policies be configured? How will they interact if someone wants to deploy NFV services that run on VMs, which have to be provisioned via the "server" services (e.g. Nova, vCenter, etc.)?

  3. I have been involved in some of the Neutron architecture design decisions from the very beginning (April 2011), and believe me when I say that its architecture has been reviewed by many developers, not just one developer as you suggest in this blog. Two major factors have driven Neutron design and development: the first was the race with nova-network functionality, and the second was the need to provide complex network topologies using only open-source software while also letting vendors get involved and introduce their own secret sauce.
    I do agree with most of Dan's suggestions for your blog, but I also find your point of view about multi-vendor and multi-plugin very interesting. I believe that we are targeting more than one domain in Neutron with only one plugin, and that makes it very hard. By domain I mean PNI (Physical Networking Infrastructure) versus VNI (Virtual Networking Infrastructure). ML2, by means of drivers, puts together the configuration for these two domains, but I find it odd, precisely because it looks messy and is very difficult for OpenStack cloud users to debug. Services are like plugins: you can deploy many instances of the same service, but only one kind of each. There are still some limitations, but again, sometimes those limitations are the result of providing all that functionality with open-source tools.

    1. Thanks for the feedback. I totally agree with you that the problem is exceedingly hard (and there aren't many successful commercial solutions out there, let alone open-source ones).

  4. Excellent article Ivan, congratulations. OpenStack networking is definitely facing a transformation as releases go by and new scenarios are dealt with. In particular, there is a proposed solution to this issue that will be presented at the next OpenStack summit in Hong Kong, and I believe it's worth checking. https://wiki.openstack.org/w/images/7/71/Dnrm-blueprint-001.pdf

  5. Nice article Ivan. Along the same lines I (and I am sure many others) would like to see a follow-on article from you that analyses the potential for vendor lock-in in other similar solutions. If running the Tail-f NCS as a Neutron plug-in implies lock-in to a single controller, then wouldn't that also apply to all single-vendor controllers/orchestrators, including VMware NSX for example? Would love to see an article from you that analyses the lock-in inherent in all such controller/orchestration system based approaches. Would be interesting to know of successful deployments that have managed to avoid lock-in at the controller/orchestration level. Perhaps the answer to avoiding lock-in is to run two or more separate administrative domains (if feasible), each with its own controller/orchestration, within disjoint portions of the same enterprise/carrier network?

  6. Thualsiram Valleru, 30 June, 2014 16:09

    Suppose I have a Ryu controller in my environment, the switches are OpenFlow-enabled, and I installed the Ryu plug-in in OpenStack. Whenever a VM is created or moved to another hypervisor, the Neutron Ryu plug-in can configure OVSDB using OVSD. But what about the hardware switches? Will the Neutron plug-in configure the physical switches, or will the Ryu controller?

    1. Have you considered asking the developers of that plugin?

    2. Thulasiram valleru, 01 July, 2014 09:34

      No. Most of the SDN controller plugin documentation describes how the Neutron server requests configuration changes on a hypervisor using the OVS sub-plugin in the controller plugin, but it does not explain how Neutron actually controls physical switches. To my understanding, Neutron uses the sub-plugin to request changes from Open vSwitch in the hypervisor, and the information is passed to the controller. Once the controller detects the changes, it sends flow table entries to OVS. But how does the controller know which physical switch the hypervisor NICs are connected to? I assume the controller only creates and updates flow tables, with some additional options.

    3. Hmm ... and what made you believe I might know more about Ryu controller than its authors/developers?

  7. Thulasiram valleru, 01 July, 2014 12:04

    Just asking what the role of SDN controllers is in the physical switch world. I know how they work for virtual switches; I want to know how they work with physical switches.

    1. Ah, all of a sudden you're moving from Neutron plugin to SDN on physical gear - a totally different topic. Try ipSpace.net/SDN or sdncentral.com.



Ivan Pepelnjak, CCIE#1354, is the chief technology advisor for NIL Data Communications. He has been designing and implementing large-scale data communications networks as well as teaching and writing books about advanced technologies since 1990. See his full profile, contact him or follow @ioshints on Twitter.