Control-plane policing in OpenFlow networks

The Controller-Based Packet Forwarding in OpenFlow Networks post generated the obvious question: “does that mean we need some kind of Control-Plane Policing (CoPP) in the OpenFlow controller?” Of course it does, but things aren’t as simple as that.

The weakest link in today’s OpenFlow implementations (like NEC’s ProgrammableFlow) is not the controller but the dismal CPU used in the hardware switches. The controller could handle millions of packets per second (that’s the flow setup rate claimed by Floodlight developers), whereas the switches usually burn out at thousands of flow setups per second.

The CoPP function thus has to be implemented in the OpenFlow switches (just as it’s implemented in linecard hardware in traditional switches), and that’s where the problems start: OpenFlow had no usable rate-limiting functionality until version 1.3, which added meters.
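To make the idea of switch-side policing concrete, here is a minimal sketch of a token-bucket policer of the kind a switch could apply to packets punted to the controller. It is an illustration only: the class name `PuntPolicer`, its parameters, and the helper `count_punted` are hypothetical and do not come from any OpenFlow switch or controller implementation.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class PuntPolicer:
    """Hypothetical token-bucket policer for controller-bound packets."""
    rate_pps: float       # sustained packets per second allowed to the controller
    burst: float          # bucket depth in packets
    tokens: float = 0.0   # current token count (set to `burst` on creation)
    last_ts: float = 0.0  # arrival time of the previous packet, in seconds

    def __post_init__(self) -> None:
        self.tokens = self.burst  # start with a full bucket

    def allow(self, now: float) -> bool:
        """Return True if a packet arriving at time `now` may be punted."""
        elapsed = max(0.0, now - self.last_ts)
        self.last_ts = now
        # Refill in proportion to elapsed time, capped at the bucket depth.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate_pps)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # over the limit: drop instead of loading the CPU


def count_punted(policer: PuntPolicer, arrivals: Iterable[float]) -> int:
    """Count how many packets (given by arrival times) pass the policer."""
    return sum(1 for t in arrivals if policer.allow(t))
```

With `rate_pps=1000` and `burst=100`, a one-second flood of 10,000 evenly spaced packets lets through roughly 1,100 of them; the rest never reach the switch CPU or the controller.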

OpenFlow meters are a really cool concept: they have multiple bands, and you can apply either DSCP remarking or packet dropping at each band. That would allow an OpenFlow controller to closely mimic the CoPP functionality and apply different rate limits to different types of control-plane or punted traffic. Unfortunately, no hardware switch available on the market supports OpenFlow 1.3 yet, and even when the first OpenFlow 1.3 switches start appearing, they might not support meters (or meters on flows sent to the controller).
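The band mechanism can be sketched in a few lines. In OpenFlow 1.3 a meter applies the band with the highest configured rate that is still lower than the measured rate; the model below mimics that selection rule in plain Python. The names `Band` and `select_band`, and the example rates, are assumptions made for illustration and are not part of any controller API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Band:
    """Simplified model of one meter band (hypothetical, not a real API)."""
    rate: int            # threshold in the meter's unit (e.g. packets/s)
    action: str          # "dscp_remark" or "drop"
    prec_level: int = 0  # drop-precedence increase, used by dscp_remark only


def select_band(bands: List[Band], measured_rate: float) -> Optional[Band]:
    """Return the band with the highest rate below the measured rate, if any."""
    exceeded = [b for b in bands if measured_rate > b.rate]
    if not exceeded:
        return None  # traffic is under every threshold: no band applies
    return max(exceeded, key=lambda b: b.rate)


# Example policy for one class of punted traffic (rates are made up):
# remark above 500 packets/s, drop above 1000 packets/s.
punt_meter = [Band(500, "dscp_remark", prec_level=1), Band(1000, "drop")]
```

Here `select_band(punt_meter, 300)` returns `None`, a measured rate of 700 selects the remarking band, and 1500 selects the drop band. A controller mimicking CoPP would attach one such meter, with its own thresholds, to each class of traffic sent to the controller.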

In the meantime, proprietary extensions abound: NEC had to use one to limit unicast flooding in its ProgrammableFlow switches.

3 comments:

  1. Speaking as a vendor, control plane overload has long been a painful source of many outages and odd behavior, and in many cases very hard to debug and fix. While many newer generation switch ASICs have mechanisms to control the flow of traffic to the (almost always underpowered) CPU, little consistency exists in the control and definition of that limiting functionality.

    OpenFlow has the potential to add to the control plane work a switch has to perform. We really should use larger CPUs in our switches (always a cost/margin choice for a vendor) and I fully agree with you that a consistent mechanism to control control plane traffic is a must. OpenFlow or otherwise.

  2. Glad to see an article addressing some of the fantasy around OF

  3. Great post. Nice to see NEC thinking about solving these real problems. Also nice to hear Marten echo the sentiment on the limitations of the nickel and dime field processors for FMPS.

    Ordered lists of TCAM requiring table re-writes/re-ordering depending on the spacing from the agent is comical to watch. L2 reactive forwarding as much as it pains me to think of it, is probably one of the few options. NPUs are putting up some pretty good numbers but I will believe it when I see it.

    I've been trying to find time to proof a hashtable capturing high volume pps into the CP to trigger something. I dunno tho, Ops at the day job finds BUM traffic only when a problem has gone on long enough to trigger trouble and the pcap uncovers a crippling unicast flooding result. Maybe client agents are the right idea lol :0?

    Cheers, great videos with NEC.
    -Brent

