OpenFlow @ Google: Brilliant, but not revolutionary

Google unveiled some details of its new internal network at the Open Networking Summit in April, and predictably the industry press and OpenFlow pundits exploded with “this is the end of networking as we know it” glee. Unfortunately, I haven’t seen a single serious technical analysis of what they’re actually doing and how their new network differs from what we have today.

This is a work of fiction, based solely on the publicly-available information presented by Google’s engineers at Open Networking Summit (plus an interview or two published by the industry press). Read and use it at your own risk.

What is Google doing?

After supposedly building their own switches, Google decided to build their own routers. They use a distributed multi-chassis architecture with a redundant central control plane (not unlike Juniper’s XRE/EX8200 combo). Let’s call their combo a G-router.

A G-router is used as a WAN edge device in their data centers and runs traditional routing protocols: EBGP with the data center routers and IBGP+IS-IS across the WAN with other G-routers (or traditional gear during the transition phase).

On top of that, every G-router has a (proprietary, I would assume) northbound API used by Google’s Traffic Engineering (G-TE): a centralized application that analyzes application requirements, computes the optimal paths across the network, and creates those paths through the network of G-routers using that API.

I wouldn’t be surprised if G-TE used MPLS forwarding instead of installing 5-tuples into mid-path switches. Doing Forwarding Equivalence Class (FEC) classification once at the head-end device instead of at every hop is way simpler and less loop-prone.
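To make the head-end versus every-hop distinction concrete, here is a minimal Python sketch. It is not Google's implementation (nothing public describes it at this level); the prefixes, labels, and router names R2 to R4 are made up for illustration. The head-end does one longest-prefix match to pick a FEC and a label, and every mid-path router does nothing more than an exact-match label swap.

```python
import ipaddress

# Hypothetical head-end FEC table: destination prefix -> (label to push, next hop).
HEAD_END_FEC = {
    ipaddress.ip_network("10.1.0.0/16"): (100, "R2"),
    ipaddress.ip_network("10.1.2.0/24"): (200, "R3"),
}

# Hypothetical per-router label tables: in-label -> (out-label, next hop).
# An out-label of None means "pop the label and deliver the packet".
LFIB = {
    "R2": {100: (101, "R4")},
    "R3": {200: (201, "R4")},
    "R4": {101: (None, "egress"), 201: (None, "egress")},
}


def classify(dst):
    """Head-end only: longest-prefix match of the destination into a FEC."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in HEAD_END_FEC if addr in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return HEAD_END_FEC[best]


def forward(dst):
    """Return the (router, incoming label) pairs a packet sees on its way."""
    entry = classify(dst)
    if entry is None:
        return []
    label, router = entry
    hops = []
    while label is not None:
        hops.append((router, label))
        # Mid-path routers never look at the IP header, only at the label.
        label, router = LFIB[router][label]
    return hops
```

With these tables, forward("10.1.2.5") follows the more specific /24 entry through R3 and R4, while forward("10.1.9.9") falls into the /16 FEC and goes through R2 and R4. The mid-path state is one small exact-match table per router instead of a 5-tuple classifier at every hop.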

Like MPLS-TE, G-TE runs in parallel with the traditional routing protocols. If it fails (or an end-to-end path is broken), G-routers can always fall back to traditional BGP+IGP-based forwarding, and like with MPLS-TE+IGP, you’ll still have a loop-free (although potentially suboptimal) forwarding topology.
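The fallback logic can be sketched in a few lines of Python. This is an illustration under my own assumptions, not a description of the G-router code: the function name next_hop, the directed-link set links_up, and the sample topology are all hypothetical.

```python
def next_hop(dst, te_paths, igp_next_hop, links_up):
    """Prefer the TE path while every link on it is up, else use the IGP route.

    te_paths maps a destination to a full node list (at least two nodes),
    igp_next_hop maps a destination to the next hop computed by BGP+IGP,
    links_up is the set of directed (from, to) links currently working.
    """
    path = te_paths.get(dst)
    if path is not None:
        links = list(zip(path, path[1:]))
        if all(link in links_up for link in links):
            return path[1], "te"
    # No TE path, or it is broken somewhere: the routing protocols still
    # provide a loop-free (possibly suboptimal) next hop.
    return igp_next_hop[dst], "igp"


# Hypothetical view from router A toward data center "dc2".
te_paths = {"dc2": ["A", "C", "D"]}
igp_next_hop = {"dc2": "B"}
all_links = {("A", "B"), ("B", "D"), ("A", "C"), ("C", "D")}
```

With all links up, next_hop("dc2", te_paths, igp_next_hop, all_links) picks C over the TE path; remove ("C", "D") from the set and the same call returns B via the IGP. The same thing happens if the central application disappears and te_paths is simply empty.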

Is it so different?

Not really. Similar concepts (central path computation) were used in ATM and Frame Relay networks, as well as in early MPLS-TE implementations (before Cisco implemented OSPF/IS-IS traffic engineering extensions and RSVP, that was all you had).

Some networks are supposedly still running offline TE computations and static MPLS TE tunnels because they give you way better results than the distributed MPLS-TE/autobandwidth/automesh kludges.

MPLS-TP is also going in the same direction – paths are computed by NMS, which then installs in/out label mappings (and fast failover alternatives if desired) to the Label Switch Routers (LSRs).

Then what is different?

Google is (as far as I know) the first to implement the end-to-end system: gathering application needs, computing paths, and installing them in the routers in real time.
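Google has not published the algorithms behind G-TE, so the following Python sketch only illustrates the general shape of such a pipeline: take a list of demands, find a path for each one over links that still have enough spare capacity, and reserve the bandwidth. The greedy one-demand-at-a-time placement, the hop-count metric, and all names (constrained_path, place_demands, the sample topology) are my assumptions; the final "install the path through the northbound API" step is left out entirely.

```python
from collections import deque


def constrained_path(capacity, src, dst, demand):
    """Shortest hop-count path using only links with enough spare capacity."""
    previous = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for (a, b), spare in capacity.items():
            if a == node and b not in previous and spare >= demand:
                previous[b] = node
                queue.append(b)
    if dst not in previous:
        return None
    # Walk back from the destination to rebuild the path.
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = previous[node]
    return path[::-1]


def place_demands(capacity, demands):
    """Place demands in order, reserving bandwidth along each chosen path."""
    spare = dict(capacity)
    placed = {}
    for name, (src, dst, mbps) in demands.items():
        path = constrained_path(spare, src, dst, mbps)
        placed[name] = path  # None means the demand could not be placed
        if path is not None:
            for link in zip(path, path[1:]):
                spare[link] -= mbps
    return placed, spare


# Hypothetical topology: directed links with capacity in arbitrary units.
capacity = {("A", "B"): 10, ("B", "D"): 10, ("A", "C"): 10,
            ("C", "D"): 10, ("A", "D"): 5}
demands = {"copy": ("A", "D", 4), "backup": ("A", "D", 8),
           "index": ("A", "D", 8), "video": ("A", "D", 3)}
```

Running place_demands(capacity, demands) puts "copy" on the direct A-D link, pushes "backup" and "index" onto the two longer paths, and fails to place "video" because no link out of A has three units left. A real system would have to do better than this first-come, first-served packing, which is exactly why central computation with a global view is attractive.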

You could do the same thing (should you wish to do it) with the traditional gear using NETCONF with a bit of MPLS-TP sprinkled on top (or your own API if you have switches that can be easily programmed in a decent programming language – Arista immediately comes to mind), but it would be a “slight” nightmare and would still suffer the drawbacks of distributed signaling protocols (even static MPLS-TE tunnels use RSVP these days).

The true difference between their implementation and everything else on the market is thus that they did it the right way, learning from all the failures and mistakes we made in the last two decades.

Why did they do it?

Wouldn’t you do the same if you had the necessary intellectual potential and resources? Google’s engineers built themselves a high-end router with a modern scale-out software architecture that runs only the features they need (with no code bloat and no bugs from unrelated features), and they can extend the network functionality in any way they wish with the northbound API.

Even though they had to make a hefty investment in the G-router platform, they claim their network already converges almost 10x faster than before (on the other hand, it’s not hard to converge faster than IS-IS with default timers) and has average link utilization above 90% (which in itself is a huge money-saver).

Hype galore

Based on the information from Open Networking Summit (which is all the information I have at the moment), you might wonder what all the hype is about. In one word: OpenFlow. Let’s try to debunk those claims a bit.

Google is running an OpenFlow network. Get lost. Google is using OpenFlow between the controller and adjacent chassis switches because (like everyone else) they need a protocol between the control plane and the forwarding planes, and they decided to use an already-documented one instead of inventing their own (the extra OpenFlow hype could also persuade hardware vendors and chipset manufacturers to implement more OpenFlow capabilities in their next-generation products).

Google built their own routers ... and so can you. Really? Based on the scarce information from the ONS talks and an interview in Wired, Google probably threw more money and resources at the problem than a typical successful startup. They effectively decided to become a router manufacturer, and they did. Can you repeat their feat? Maybe, if you have comparable resources.

Google used open-source software ... so the monopolistic Ciscos of the world are doomed. Just in case you believe the fairy-tale conclusion, let me point out that many Internet exchanges use open-source software for BGP route servers, and almost all networking appliances and most switches built today run on open-source software (namely Linux or FreeBSD). It’s the added value that matters; in Google’s case, that’s their traffic engineering solution.

Google built an open network – really? They use standard protocols (BGP and IS-IS) like everyone else and their traffic engineering implementation (and probably the northbound API) is proprietary. How is that different (from the openness perspective) from networks built from Juniper’s or Cisco’s gear?

Conclusions

Google’s engineers did a great job – it seems they built a modern routing platform that everyone would love to have, and an awesome traffic engineering application. Does it matter to me and you? Probably not; I don’t expect them to give their crown jewels away. Does it matter that they used OpenFlow? Not really, it’s a small piece of their whole puzzle. Will someone else repeat their feat and bring a low-cost high-end router to the market? I doubt it, but I hope to be wrong.

2012-05-18 19:17 UTC - reworded 'vendor' to 'manufacturer' after Brad Casemore rightfully pointed out that Google probably has no intention to become a router vendor.

21 comments:

  1. http://www.nanog.org/meetings/nanog49/abstracts.php?pt=MTU5NSZuYW5vZzQ5&nm=nanog49 is a NANOG talk from June 2010 describing how to build just such an MPLS transport core

  2. Google switches revealed?

    http://www.networking-forum.com/viewtopic.php?f=46&t=29803

    Replies
    1. The Pluto switch is the smallest and most insignificant piece of hardware that we make; don't get too excited about this one.

  3. These slides may provide more hints on how to build an open LSR that uses Quagga as the routing engine and NetFPGA as the forwarding plane:
    http://www.nanog.org/meetings/nanog50/presentations/Monday/NANOG50.Talk17.swhyte_Opensource_LSR_Presentation.pdf

    While the pdf slides from ONS are unfortunately not available online, here is a 10 min lightning talk that pretty much summarizes the OpenFlow-based G-Scale backbone using open-source control plane (Quagga BGP, IS-IS) running in centralized servers:
    https://ripe64.ripe.net/archives/video/884/

    For those interested in such split router designs based on Quagga (or any Linux-based routing engine) running in commodity servers and downloading the FIBs to OpenFlow hardware, here is the open-source RouteFlow project (shameless plug):
    https://sites.google.com/site/routeflow/

  4. NotSoAnonymous 19 May, 2012 05:21

    Whatever they did, it's not exposed to the world yet. You only find the true results once millions of people start using it.

    People just have the habit of bashing Cisco and Microsoft.

    Mac used to glorify themselves in many such conferences and now when they got the market, 500000+ macs just got infected.

    You can certainly bash others to get into the market. Unfortunately, that's how the world works.

    Replies
    1. It's being used by millions of people, second after second - you just can't interact with it and you never will be able to. OpenFlow is not a customer-facing technology, it's a network engineering technology that is intended to improve how we manage and operate networks, and how they function. What it is not is something that your customers should ever interact with.

    2. Well..then happy using it....just don't advertise it in summits all over the world...

  5. Thanks Ivan for commenting on the PR - I agree with the views from this article.

    When Google becomes as transparent as others, that's when I will consider a new era...

    They have one of the least challenging environments in the world, i.e. almost unlimited resources, access to the most talented engineers (along with their ego ;-)) but share almost nothing of their own dev.

    Now one would wonder if it is because the applicability of what they develop is just unique to Google, especially this massive use of TE... application is what drives Google dev, I think this use case of OF is just a confirmation to me.

    Replies
    1. You are seriously off-base here on several points:

      - Google is very transparent, and I actually challenge you to show me where Cisco or Juniper or Brocade or HP or anyone else will ever tell you even one small piece of how their internal networks are built. There are countless links and videos out there, from Google directly, that give you insight into how our servers, datacenters, & networks operate. Obviously, we can't give you every single detail, and no one else will either.

      - We have one of the most challenging environments, because when you operate at the scale we do, no one out there can provide services to meet the needs, so we end up having to be the solution ourselves. We have to come up with ways to do things that no one else can, and to do that, you need resources, so yes, we have a lot of talent in-house, but it's not that these engineers are that much better than anyone else - it's that they are highly motivated and want to push their limits and do things that no one else is doing. You sound like someone that didn't make it through the interviews and are disgruntled. Get over it and try again.

      - Of course we're not going to share our own intellectual property if that is what is giving us a competitive advantage to the Bings and Baidus and Yahoos and Facebooks. To expect Google, or anyone for that matter, to give up all of our trade secrets is just plain stupid and demonstrative that you have no idea how this industry really works or how businesses actually make money.

    2. Zog[quagga]% git pull
      Already up-to-date.
      Zog[quagga]% git log | grep -i author: | grep -i google.com | wc
      21 84 855

      that looks like 21 commits by google and if you look at the logs more closely you'll see a lot of isisd and bgpd enhancement added by the maintainers which are clear google code-base merges. the multi-path enhancements to bgpd are solid and material contributions here. isisd is still kind of sparse, but has benefited considerably from these contributions.

      i wouldn't expect everything to be opened up, there's a lot of glue that appears to have been developed but this is good stuff. don't look a gift horse in the mouth.

    3. Precisely... It's because of the secrecy of Google and similar companies that SDN will become VDN (Vendor Defined Network)... and that will hurt SDN on the whole... differing and closed implementations will not serve the cause and will give traditional companies like Cisco/Juniper a head start in getting market segments of their own.

      Me thinks SDN will be used more by CLOUD providers and ISP rather than Enterprises.....and will be operated in silos....what if these Google SDN stars decide to move on?

  6. Ivan, Good post, many smart points. Still, I differ slightly on a few. You hint that perhaps Google's actions will "persuade hardware vendors and chipset manufacturers to implement more OpenFlow capabilities". Hmm. Maybe, but chipsets are not the barrier, as demonstrated by Google's use of off-the-shelf chips. The question is, will ("hardware") vendors add software support for some open control manipulation protocol. (Probably we should say "system" vendors, but nobody does.) And does OpenFlow matter? As you say, they need to manipulate control state somehow. Google could have used a proprietary protocol, but they wanted to demonstrate that there was a viable open protocol, so seems like choice of OpenFlow offers an existence proof that some problems can be solved using an open control state manipulation protocol (it's not like there are dozens). As for "Will someone else repeat their feat and bring a low-cost high-end router to the market?", how can you bring that box to market if there are no controller vendors highlighting that application? One might imagine that Google itself is the market, but how many will buy that argument?

    Replies
    1. As always, you're spot-on. The problem OpenFlow is facing today is lack of commercial-grade controllers (apart from Nicira's NVP), and Google solved it for their internal use. Will someone else productize something similar? That would be interesting ...

  7. The revolution is that it went beyond all the OpenFlow "hype" and actually put it into some kind of action. The market is flooded with people talking about OpenFlow and SDN, but the practicality of it is beyond the reach of most enterprises and networks out there. That could be because of hardware limitations, but I feel the more realistic limitation is in the engineer's ability to actually come up with a useful scenario for trying to fit OpenFlow into existing environments.

    Replies
    1. So you take a technology that's pretty limited by design (control -> data-plane protocol) but hyped to extreme, use it in one of the few sensible ways it can be used ... and that's revolutionary? ;) What has the IT world come to ...

    2. The use of the word "revolution" is obviously a stretch :) Maybe it's more of a "revelation" that someone actually did something with OpenFlow...

  8. Average link utilisation above 90% would imply that is in working case (i.e. average) conditions, which would also imply >100% link utilisation (unless they load balance over more than 10 parallel paths) in failure cases. It's not that you can't run networks today with link utilisations above 90% but that in practise most providers plan for failure cases hence the average util is <90% by implication.

  9. besides technology hypes(!) there is a "peopleware issue" e.g.:
    http://bhatkoti.com/2012/05/23/why-should-a-network-engineer-move-into-an-app-space/

    only with an assent of the support staff (CLI Monkeys) such "technology hypes" could be sold at the end of the day...

    ----------------------
    6VPE (a CLI monkey)

  10. Kirill Kasavchenko 11 September, 2012 09:53

    Ivan,

    I found this very interesting post reading the latest one about Openflow focus at L2-L4.

    With regards to: "some networks are supposedly still running offline TE computations and static MPLS TE tunnels because they give you way better results than the distributed MPLS-TE/autobandwidth/automesh kludges." - I understand the most frequent reason for that is so called "tunnel packing problem" described here: http://fengnet.com/book/TE_MPLS/ch09lev1sec5.html, isn't it?

    As for "MPLS-TP is also going in the same direction – paths are computed by NMS, which then installs in/out label mappings" - NMS is not a MUST, nor even a SHOULD, just one of the options provided by RFC5921 section 3.14.

    Replies
    1. The MPLS TE problems are best described in the RIPE64 video I've linked to in the note in the article. Tunnel packing (or, more generally, the knapsack problem) is just one of them.

      Thanks for the MPLS-TP note. I know NMS is not mandatory, but it's interesting to see MPLS-TP moving in that same direction (actually, it's moving in all directions at once, but let's not go there ;) )

    2. Kirill Kasavchenko 11 September, 2012 21:28

      Thanks, interesting to learn about knapsack problem.



Ivan Pepelnjak, CCIE#1354, is the chief technology advisor for NIL Data Communications. He has been designing and implementing large-scale data communications networks as well as teaching and writing books about advanced technologies since 1990. See his full profile, contact him or follow @ioshints on Twitter.