Is Flow-Based Forwarding Just Marketing Fluff?

When writing the Packet- and Flow-Based Forwarding blog post, I tried to find a good definition of flow-based forwarding (and I was not the only one confused). The one from Junos SRX documentation is as good as anything else I found, so let’s use it.

TL;DR: Flow-based forwarding is a valid technical concept. However, when mentioned together with OpenFlow, it’s mostly marketing fluff.


According to Junos SRX documentation (here’s a later, more detailed version), packet-by-packet (stateless) forwarding works like this for every single packet:

  • Receive a packet;
  • Pass packet through ingress ACL (and whatever other input policy you have);
  • Perform a forwarding lookup (L2, L3, PBR, whatever);
  • Pass packet through egress ACL (and other output policies).
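The four steps above can be sketched in a few lines of Python. This is only an illustration, not how Junos (or any other network OS) implements it: the `Packet` fields, ACL rules, prefixes and interface names are all made up.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    dport: int


# Hypothetical policy and routing tables, invented for this sketch.
INGRESS_DENY_PORTS = {23}                          # drop inbound telnet
EGRESS_DENY_NETS = [ip_network("10.66.0.0/16")]    # never send traffic here
ROUTES = [                                         # (prefix, output interface)
    (ip_network("10.0.0.0/8"), "eth1"),
    (ip_network("0.0.0.0/0"), "eth0"),
]


def ingress_acl(pkt: Packet) -> bool:
    return pkt.dport not in INGRESS_DENY_PORTS


def forwarding_lookup(pkt: Packet) -> Optional[str]:
    """Longest-prefix match on the destination address."""
    dst = ip_address(pkt.dst)
    matches = [(net, ifc) for net, ifc in ROUTES if dst in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def egress_acl(pkt: Packet) -> bool:
    dst = ip_address(pkt.dst)
    return not any(dst in net for net in EGRESS_DENY_NETS)


def forward(pkt: Packet) -> Optional[str]:
    """Run the full pipeline for one packet; None means 'dropped'."""
    if not ingress_acl(pkt):
        return None
    out_if = forwarding_lookup(pkt)
    if out_if is None or not egress_acl(pkt):
        return None
    return out_if
```

Every packet walks through all three stages, and nothing is remembered between packets. That is what makes the process stateless.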

Junos documentation calls this process packet-based forwarding or packet-based processing, conveniently ignoring the fact that all statistical multiplexing technologies in use today work on packets, particularly when they’re based on Ethernet or IP.

Flow-based forwarding, on the other hand, is a cache-based forwarding mechanism that:

  • Performs packet forwarding for the first packet in a flow;
  • Caches the results (output interface, rewriting, logging, counting and QoS actions);
  • Performs cache lookup on subsequent packets of the same flow and applies cached results without evaluating the input and output path.

I over-simplified the process a bit. Cache lookup is performed on every input packet, and the cache misses are punted to the slower forwarding path.
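Here is a minimal sketch of that idea, assuming a flow is identified by the usual 5-tuple and that the slow path returns a single cached action. The `FlowCache` class and the `slow_path` function are hypothetical names invented for this example.

```python
from typing import Callable, Dict, Tuple

# (src, dst, protocol, source port, destination port)
FlowKey = Tuple[str, str, int, int, int]


class FlowCache:
    """Cache lookup on every packet; misses are punted to the slow path."""

    def __init__(self, slow_path: Callable[[FlowKey], str]) -> None:
        self.slow_path = slow_path
        self.cache: Dict[FlowKey, str] = {}
        self.hits = 0
        self.misses = 0

    def forward(self, key: FlowKey) -> str:
        action = self.cache.get(key)
        if action is not None:
            # Fast path: apply the cached result, skip ACLs and lookups.
            self.hits += 1
            return action
        # Cache miss: run the full pipeline once and remember the result.
        self.misses += 1
        action = self.slow_path(key)
        self.cache[key] = action
        return action


def slow_path(key: FlowKey) -> str:
    """Stand-in for ingress policy + forwarding lookup + egress policy."""
    dport = key[4]
    return "drop" if dport == 23 else "forward:eth1"
```

Only the first packet of a flow pays for the full pipeline. Note that this sketch never expires or evicts entries, which is exactly where the trouble described later in this post starts.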

Is OpenFlow Flow-Based Forwarding?

At this point it’s worth mentioning that hardware OpenFlow devices from major vendors don’t use flow-based forwarding as described above. They use the exact same forwarding hardware as they’d use with a standalone network OS.

The only OpenFlow switch I’ve seen so far that actually used the orthodox (micro)flow-based forwarding was the early implementation of Open vSwitch, and they quickly dropped that idea and implemented megaflows due to dismal performance. Technically, megaflows-based forwarding is still doing flow-based forwarding, but with way coarser flows. I haven’t looked into what they’re doing these days with Bloom filters.
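To see why coarser flows help, consider this rough sketch. It is not Open vSwitch code. It assumes a microflow cache is keyed on the full 5-tuple, while a megaflow-style cache is keyed only on the header fields the forwarding decision actually depends on (here, hypothetically, just the destination address). All function names are made up.

```python
from typing import Callable, Dict, Iterable, Sequence, Tuple

Packet = Dict[str, object]


def microflow_key(pkt: Packet) -> Tuple:
    """Exact-match key: every field of the 5-tuple."""
    return (pkt["src"], pkt["dst"], pkt["proto"], pkt["sport"], pkt["dport"])


def megaflow_key(pkt: Packet, relevant_fields: Sequence[str]) -> Tuple:
    """Wildcarded key: only the fields the decision depended on."""
    return tuple(pkt[f] for f in relevant_fields)


def count_cache_entries(packets: Iterable[Packet],
                        key_fn: Callable[[Packet], Tuple]) -> int:
    """Number of distinct cache entries needed to cover the traffic."""
    return len({key_fn(p) for p in packets})


# 1000 client connections to a single web server (made-up addresses).
packets = [
    {"src": f"192.0.2.{i % 250 + 1}", "dst": "10.0.0.10",
     "proto": 6, "sport": 1024 + i, "dport": 80}
    for i in range(1000)
]

micro = count_cache_entries(packets, microflow_key)
mega = count_cache_entries(packets, lambda p: megaflow_key(p, ("dst",)))
```

With this traffic, the microflow cache needs one entry per connection, while the destination-only megaflow cache needs a single entry, so far fewer packets have to be punted to the slow path.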

There might be other vendors out there doing true flow-based forwarding with hardware OpenFlow switches (please write a comment), and I’d dare to guess that the price of their hardware might be a bit higher than what you can get with traditional 10GE switches today.

Back to Flow-Based Forwarding

Does the description of flow-based forwarding remind you of fast switching or NetFlow-based switching (aka MLS)? It’s exactly the same concept, and the old grunts know how well those mechanisms work. Every single cache-based scheme ever invented faces interesting challenges like:

  • Cache thrashing when faced with packet sprays (or DoS attacks). Widespread port scans performed on the early Internet quickly brought down fast switching caches in core routers (forcing Cisco to roll out CEF in a hurry);
  • Edge cases that are not handled correctly in the caching code. Sometimes the cached results don’t exactly match the results (and side effects) produced by the slow forwarding path. Just think about all the times you had to turn off fast switching in Cisco IOS to make a feature work;
  • Limited cache size in hardware-based solutions (be it x86 CPU or 10GE switch), which results in either cache thrashing or excessive punting to the slow path, which also kills performance.

The past experience with cache-based switching makes me really skeptical about reinvented cache-based wheels like LISP.
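Cache thrashing is easy to reproduce in a toy simulation. The sketch below assumes a fixed-size cache with LRU eviction (real devices may use other replacement policies) and compares steady traffic with the same traffic interleaved with a scan that creates a new flow on every packet. The traffic mix and all names are invented for illustration.

```python
from collections import OrderedDict
from typing import Hashable, Iterable, Iterator, Tuple


def hit_ratio(packets: Iterable[Hashable], cache_size: int) -> float:
    """Fraction of packets served from a fixed-size LRU flow cache."""
    cache: "OrderedDict[Hashable, bool]" = OrderedDict()
    hits = total = 0
    for key in packets:
        total += 1
        if key in cache:
            hits += 1
            cache.move_to_end(key)          # mark as most recently used
        else:
            cache[key] = True               # slow path installs the flow
            if len(cache) > cache_size:
                cache.popitem(last=False)   # evict least recently used
    return hits / total if total else 0.0


def normal_traffic(flows: int = 100, rounds: int = 50) -> Iterator[Tuple[str, int]]:
    """A stable set of flows, one packet per flow per round."""
    for _round in range(rounds):
        for flow in range(flows):
            yield ("flow", flow)


def traffic_with_scan(flows: int = 100, rounds: int = 50,
                      scan_per_packet: int = 20) -> Iterator[Tuple[str, int]]:
    """Same traffic, but every packet is followed by a burst of scan
    packets, each of which looks like a brand-new flow."""
    scan_id = 0
    for _round in range(rounds):
        for flow in range(flows):
            yield ("flow", flow)
            for _probe in range(scan_per_packet):
                yield ("scan", scan_id)
                scan_id += 1


normal = hit_ratio(normal_traffic(), cache_size=1000)
under_scan = hit_ratio(traffic_with_scan(), cache_size=1000)
```

In the first run everything but the first packet of each flow is a cache hit. In the second run the scan pushes every legitimate flow out of the cache before its next packet arrives, so every single packet ends up on the slow path.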

Does It Make Sense?

Flow-based forwarding makes perfect sense when:

  • It’s cheap enough to implement large flow caches that can easily cope with the maximum number of flows the device could reasonably have to handle;
  • The cost (in terms of hardware utilization or latency) of full pipeline processing significantly exceeds the cost of doing a cache lookup and applying the cached actions;
  • It’s possible to protect the device using flow-based forwarding against cache exhaustion, either by using a very large cache size or by flow-setup policing (example: TCP SYN cookies).
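As an illustration of flow-setup policing, here is a minimal token-bucket sketch that limits how fast new flows can be installed while leaving established flows alone. This is a different technique from TCP SYN cookies and is not taken from any particular product. The class names, rates and the explicit `now` timestamp (in seconds) are assumptions made for the example.

```python
from typing import Hashable, Optional, Set


class FlowSetupPolicer:
    """Token bucket: at most `rate` new flows per second, bursts up to `burst`."""

    def __init__(self, rate: float, burst: int) -> None:
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)
        self.last: Optional[float] = None

    def allow(self, now: float) -> bool:
        if self.last is not None:
            # Refill tokens for the time elapsed since the previous call.
            elapsed = max(0.0, now - self.last)
            self.tokens = min(float(self.burst), self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PolicedFlowTable:
    """Flow table that polices flow setup but never touches existing flows."""

    def __init__(self, policer: FlowSetupPolicer, max_entries: int) -> None:
        self.policer = policer
        self.max_entries = max_entries
        self.flows: Set[Hashable] = set()

    def admit(self, key: Hashable, now: float) -> str:
        if key in self.flows:
            return "cached"        # established flow: fast path
        if len(self.flows) >= self.max_entries or not self.policer.allow(now):
            return "rejected"      # table full or setup rate exceeded
        self.flows.add(key)
        return "installed"
```

A scan that opens hundreds of flows in the same instant gets only the first `burst` of them installed. The rest are rejected without consuming cache entries, and packets of already-installed flows are unaffected.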

Firewalls and load balancers are thus a perfect use case for flow-based forwarding. Using the same concepts in high-speed L2/L3 forwarding devices is asking for trouble and reassuringly-expensive hardware.

Want to know more?

Start with my free Introduction to SDN series, continue with the collection of SDN blog posts and my software-defined podcast, and finally move on to advanced SDN topics… or visit ipSpace.net SDN page to get started.

16 comments:

  1. All this sounds like the old switching methods you mention. And Cisco's CEF was created as an improvement of fast switching. But this isn't new. So no, this can't be what is meant by "flow-based forwarding" in this discussion?!

    Maybe what is meant is a "flow/cache" that defines the full forwarding path, so only "one" router/control plane has to create a forwarding path for the flow to the destination (the forwarding decision only has to be made once regardless of the number of hops), like a virtual circuit. Perhaps I'm talking gibberish.

    I'm confused and intrigued at the same time.

  2. It sounds like we have a 'solution' (loosely defined 'flow-based forwarding') and we are looking for a problem ;)

    I would start with defining the problem and then talk about the solutions. Maybe following this path we could understand the originally cited 'flow-based forwarding' superiority over regular forwarding;)

    Again, we are still talking about definition of the 'flow-based forwarding' which actually may depend on marketing trends ('coolness' factor)

  3. Flow-based forwarding sounds like what many firewalls already do with session tables. If a session is already established, it skips everything except the session lookup and sends the packet. This only works because firewalls keep state, whereas routers and switches do not, and arguably should not.

  4. Flow-based forwarding at first sight looks like a kind of approximation of channel forwarding as defined in G.800 Section 8.1. It is contrasted with destination forwarding as described in Section 8.2. If you look into G.809, which defines what a flow is and its related terminology, you can see that flow-based forwarding is a rather odd term, because it has nothing to do with how the forwarding is done from the outside perspective. It just emphasizes a special way of internal optimization. But actually, it is still destination forwarding. Since a flow is defined as an aggregation of one or more traffic units with an element of common routing, the proper term could be "flow-based routing". But not forwarding...
    In fact, this confusion between forwarding and routing is reflected in all of the caching problems you have mentioned...

  5. The definition I use is that a "flow" is unidirectional and stateless series of packets from a source to a destination. This is in keeping with Cisco NetFlow's definition which dates back to 1996 (see US patent 6,243,667).

    If you want bidirectional or stateful, then you've got a "session" or "connection." i.e., something that hates asymmetry. Juniper's "flow-based forwarding" is a misnomer. It really is "session-based forwarding."

    OpenFlow distorts it in a different direction by letting you customize which packet headers constitute a "flow" (i.e., a TCAM entry). Source-only, destination-only, source+dest L2/L3/L4, etc. OpenFlow would be more appropriately named "OpenTCAM".

    Replies
    1. Haha classic! (wishing I had a 'thumbs up' button)

  6. MLS? Did you mean MPLS?

    The vulnerability of cache-fast-path forwarding is not limited to routers/switches. My company handles several hundred Mbps of traffic coming from hundreds of millions of unique mobile devices. Our platform of choice is Illumos (Solaris), and its networking stack would melt down the kernel with excess memory utilization in the "DCE cache" (next-hop cache) because of the sheer number of entries in the cache.

    Replies
    1. MLS stands for Multi Layer Switching:

      Basic explanation can be found here in "Understanding Traditional MLS" section of:

      http://www.ciscopress.com/articles/article.asp?p=700137

    2. MLS = Multi-Layer Switching (NetFlow-based switching on early Catalyst 5000s).

  7. The side comment about LISP being cache-based confuses me (assuming LISP = Location/Identity separation). As I understood it, LISP is packet-by-packet forwarding, but with core routers using 64 of the 128 bits of an IPv6 address as the destination address and ignoring the rest, and edge routers doing the reverse. This does not need any caching that I can see (though caching CAN be used, if anyone sees value in caching lookup results, same as in any packet-based lookup-and-forward router).
    What am I missing here?

    Replies
    1. I don't think LISP is what you think it is ;) What you've described is another proposal (probably Identifier-Locator Addressing; HIP might also be along these lines). LISP is tunneling with a DNS-like tunnel endpoint discovery mechanism.

  8. Hey Ivan, what do you mean by "without evaluating the input and output path" in this sentence: "Performs cache lookup on subsequent packets of the same flow and applies cached results without evaluating the input and output path"? Thanks

    Replies
    1. For example, once you know that input/output ACLs permitted the first packet of a flow, there's no need to check the same ACLs for subsequent packets (modulo weird stuff like TCP flag checks or fragmentation checks etc.)

  9. Thanks Ivan; you've articulated what a lot of us have been thinking for a while. What's old is new again, & it doesn't work any better than it did 15 years ago.

    Funny thing is, I recently found myself wanting a feature from the crusty old Cat6500 PFC3: microflow policing

    http://www.cisco.com/c/en/us/products/collateral/switches/catalyst-6500-series-switches/prod_white_paper0900aecd803e5017.html

    Of course it was based on netflow TCAM, which was usually exhausted within a few seconds of activating any feature that depends on it on a box with any decent amount of traffic, but for low-bandwidth stuff it did work.

    I don't see any reason why modern NPs like Juniper Trio couldn't implement this...

  10. Clearly this is an implementation of cache based forwarding, and optimising on the pipeline lookup stages & follow on processing. This idea was also deployed by Tasman ISR, acquired by Nortel later (FYI a non-Cisco/Juniper product).
    Nevertheless, an OpenFlow-based switching pipeline can also benefit from a cache-based early lookup and a reduced inline service pipeline (although IMO nobody does it).
    So technically the Juniper feature is not a flow-based forwarding idea, but a cache/netflow/sflow/tuple-based forwarding pipeline optimisation.
    Technically again, this 'cache' can be made out of TCAM space, with match parameters and actions, so *internally* the OpenFlow semantics can be used to program the TCAM (if it is being used as a cache). Do recall that once the lookup fails in the TCAM space, the OF rules allow you to jump to the legacy routing/switching pipeline.
    A related concept is described here:
    http://spectrum.ieee.org/computing/networks/a-radical-new-router/0

  11. On SRX, this is an internal traffic-processing solution for the box. If you have time, please read the rest of the document you have already found: http://www.juniper.net/techpubs/en_US/junos12.3x48/topics/concept/forwarding-processing-srx5000-line-overview.html . It is a bit oversimplified compared with how it is really done, but hopefully you will get a clearer view of Juniper's ideas behind the naming of flow- and packet-based forwarding and the internal organization of the box.

