OpenStack/Quantum SDN-based virtual networks with Floodlight

A few years before MPLS/VPN was invented, I’d worked with a service provider who wanted to offer L3-based (peer-to-peer) VPN service to their clients. Having a single forwarding table in the PE-routers, they had to be very creative and used ACLs to provide customer isolation (you’ll find more details in the Shared-router Approach to Peer-to-peer VPN Model section of my MPLS/VPN Architectures book).

Now, what does that have to do with OpenFlow, SDN, Floodlight and Quantum?

The Big Picture

Big Switch has recently released a plug-in for Quantum that provides OpenFlow-based virtual network support with their open-source Floodlight controller, and they use layer-2 ACLs to implement virtual networks, confirming the infinite wisdom of RFC 1925:

Every old idea will be proposed again with a different name and a different presentation, regardless of whether it works.

How does it work?

The 30K foot perspective first:

And a quick look behind the scenes:

  • Big Switch decided to implement virtual networks with dynamic OpenFlow-based L2 ACLs instead of using VLAN tags.
  • The REST API offered by Floodlight’s VirtualNetworkFilter module offers simple methods that create virtual networks and assign MAC addresses to them.
  • The VirtualNetworkFilter intercepts all new flow setup requests (PacketIn messages to the Floodlight controller), checks that the source and destination MAC addresses belong to the same virtual network, and permits or drops the packet.
  • If the VirtualNetworkFilter accepts the flow, Floodlight’s Forwarding module installs the flow entries for the newly-created flow throughout the network.
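
The permit-or-drop decision described above can be sketched in a few lines of Python. This is a toy model, not Floodlight's Java code: the class and method names (`VirtualNetworkFilter`, `create_network`, `add_host`, `permit`) are hypothetical, and the sketch ignores broadcast, multicast and gateway handling that a real filter would have to deal with.

```python
class VirtualNetworkFilter:
    """Toy MAC-based L2 ACL; hypothetical names, not Floodlight's API."""

    def __init__(self):
        self.networks = set()       # known virtual network IDs
        self.mac_to_network = {}    # MAC address -> virtual network ID

    def create_network(self, network_id):
        """Register a virtual network."""
        self.networks.add(network_id)

    def add_host(self, network_id, mac):
        """Assign a MAC address to an existing virtual network."""
        if network_id not in self.networks:
            raise KeyError("unknown virtual network: %s" % network_id)
        self.mac_to_network[mac.lower()] = network_id

    def permit(self, src_mac, dst_mac):
        """Return True only if both MACs are known and share a network."""
        src_net = self.mac_to_network.get(src_mac.lower())
        dst_net = self.mac_to_network.get(dst_mac.lower())
        return src_net is not None and src_net == dst_net


# Usage: two tenants, one check per new flow (per PacketIn in the real thing)
acl = VirtualNetworkFilter()
acl.create_network("red")
acl.create_network("blue")
acl.add_host("red", "00:00:00:00:00:01")
acl.add_host("red", "00:00:00:00:00:02")
acl.add_host("blue", "00:00:00:00:00:03")

print(acl.permit("00:00:00:00:00:01", "00:00:00:00:00:02"))  # same network
print(acl.permit("00:00:00:00:00:01", "00:00:00:00:00:03"))  # different networks
```

Note that all state lives in two in-memory structures, so nothing in this sketch survives a restart of the process that holds it.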

The current release of Floodlight installs per-flow entries throughout the network. I’m not particularly impressed with the scalability of this approach (and I’m not the only one).
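
A back-of-the-envelope estimate shows why per-flow entries add up quickly. The sketch below assumes one flow entry per direction for every communicating pair of MAC addresses, installed on every switch along the path; the function name and the sample numbers are made up for illustration, and a finer-grained flow match would only make the totals larger.

```python
def estimated_flow_entries(hosts, avg_path_switches):
    """Total flow entries across the network in the worst case.

    Assumes every host talks to every other host, one entry per
    direction per MAC pair, on each switch along the path.
    """
    unidirectional_flows = hosts * (hosts - 1)
    return unidirectional_flows * avg_path_switches


# 100 hosts, average path of 3 switches: 100 * 99 * 3 entries
print(estimated_flow_entries(100, 3))  # 29700
```

The count grows with the square of the number of hosts, which is the root of the scalability concern.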

Does it make sense?

The Floodlight controller and its Quantum plug-in have a very long way to go before I’d use them in a production environment:

  • The Floodlight controller is a single point of failure (there’s no provision for a redundant controller);
  • Unless I can’t read Java code (which wouldn’t surprise me at all), the VirtualNetworkFilter stores all mappings (including MAC membership information) in in-memory structures that are lost if the controller or the server on which it runs crashes;
  • As mentioned above, the per-flow entries used by the Floodlight controller don’t scale at all (more about that in an upcoming post).

The whole thing is thus a nice proof-of-concept tool that will require significant effort (probably including a major rewrite of the forwarding module) before it becomes production-ready.

However, we should not use Floodlight to judge the quality of the yet-to-be-released commercial OpenFlow controller from Big Switch Networks. This is how Mike Cohen explained the differences:

I want to highlight that all of the points you raised around production deployability and flow scalability (and some you didn't around how isolation is managed / enforced) are indeed addressed in significant ways in our commercial products. There’s a separation between what's in Floodlight and the code folks will eventually see from Big Switch.

As always, I might become a believer once I see the product and its documentation.

5 comments:

  1. Talking about ideas being rehashed, would SNMP be considered an "API"? Other than some syntax differences, what do these new HTTP-based APIs add that SNMP couldn't already do?
  2. In scale, HTTP is quite a bit better than SNMP. RESTful APIs use HTTP GETs to retrieve data. SNMP is used extensively to retrieve stats data from devices. When the number of tenants retrieving stats from an infrastructure has to scale to the public, SNMP agents on various devices will indeed stress their control planes. Most embedded system control planes don't have an excess of CPU cycles to burn as it is. Nor do they have the intelligence for rate limiting or caching of management requests in scale.

    With HTTP we have a rich and well proven delivery and caching mechanism which can be used to impose appropriate limits simply by serving requests out of CHEAP and available application level RAM caches. HTTP delivery and rewrite proxies are available at a MUCH lower cost and point of entry than similar mechanism which use SNMP.

    Even for validating provisioning requests, using HTTP POST or PUT, versus SNMP sets, opens up a world of scripting and coding that SNMP doesn't support readily. Everyone can set up an HTTP rewrite and cache engine in half a dozen scripting languages. How many of us can say the same for SNMP?

    While SNMP is wonderful, it's not as accessible or as cheap to get working well as HTTP in our day and age. That's why we are all moving towards HTTP as an application level protocol and web based data structures, like JSON. Ask yourself how many of us thought HTTP would be the world's most popular transport for video 5 years ago? It is. It may not be as optimal for the job as RTP, but it has won the day. Viva La Web..

    Frankly, as long as we are flushing the past away, isn't it better to acknowledge the 'sins of Ethernet' and forget transports which demand dynamic MAC address learning? The MAC ACL based solution scaling problems have their root in bad network access control to begin with, don't they? As long as we are throwing the baby out with the bath water, at least don't fill the tub again with dirty water. We can let Ethernet live on between well-defined switching nodes and do something else where security (inspection and isolation) as well as churn (service insertion) are a problem. As long as SDN is questioning the faith of the true Ethernet believer, let's talk some problem with doctrine.

    Let's come up with HA and fail-over that works well at L3 and leave L2 un-routed, dumb, fast, and cheap. Adding cost and complexity to L2 seems like going backwards..way backwards.
    Replies
    1. HTTP/REST is certainly easier in some respects, but SNMP does have quite a few advantages. It supports types, MIBs are an excellent source of documentation for an API and those APIs tend to be considerably more stable than their RESTful cousins (the RabbitMQ management plugin API has changed in every single version I've deployed and broken nagios checks every time). Many of these APIs are even standardised (and some vendors even stick to them sometimes). There are some missing things, like properly standardised floats and doubles.

      Not to mention traps, a decent (if loathed) security model in v3, no TCP 3 way handshake (unless you want it). Stuff like EVENT-MIB and EXPRESSION-MIB give you standard server-side functionality. net-snmp does a pretty good job of doing most of the really hard stuff for you.

      The real problem for SNMP is that it is something else to learn and doesn't work well if you stick to the bare minimum. Doing tables properly is hard (that's trivial in an XML or JSON based REST API). People don't even think about using things like contexts. MIBs are also scarily close to actually writing documentation, which is never going to go down well with the majority of developers. It hasn't helped that most examples of SNMP usage try to bypass MIBs completely and use numeric OIDs for everything, which makes examples practically impossible to read and gives SNMP a scary, super complicated air. You can do without MIBs, but that doesn't mean you should; you can always replace the textual values with the OIDs once you've worked everything out.

      I think SNMP's time may have come, but it has merits that we are losing with HTTP REST APIs, just not ones that people care about that much. How much they should care is open for debate.
  3. Thanks, I really enjoy the dialog. I know in reality HTTP/XML is the direction things are moving in since developers are more attuned to it. However, when people tout it as some revolutionary, completely new thing, I am reminded of 'there is nothing new under the sun'.
  4. Ivan, my question is about OpenFlow and SDN in general and is not related to this topic. If OpenFlow is about controlling the dumb network devices from a central server, then we need a 'network' for the traffic between the controller and the devices. I am confused as to how the controller can communicate with the devices when it's still 'trying' to set up the network.