OpenFlow @ Google: Brilliant, but not revolutionary
Google unveiled some details of its new internal network at the Open Networking Summit in April, and predictably the industry press and OpenFlow pundits exploded with “this is the end of networking as we know it” glee. Unfortunately I haven’t seen a single serious technical analysis of what they’re actually doing and how different their new network is from what we have today.
This is a work of fiction, based solely on the publicly-available information presented by Google’s engineers at Open Networking Summit (plus an interview or two published by the industry press). Read and use it at your own risk.
What is Google doing?
After supposedly building their own switches, Google decided to build their own routers. They use a distributed multi-chassis architecture with redundant central control plane (not unlike Juniper’s XRE/EX8200 combo). Let’s call their combo a G-router.
A G-router is used as a WAN edge device in their data centers and runs traditional routing protocols: EBGP with the data center routers and IBGP+IS-IS across WAN with other G-routers (or traditional gear during the transition phase).
On top of that, every G-router has a (proprietary, I would assume) northbound API that is used by Google’s Traffic Engineering (G-TE) – a centralized application that’s analyzing the application requirements, computing the optimal paths across the network and creating those paths through the network of G-routers using the above-mentioned API.
I wouldn’t be surprised if G-TE would use MPLS forwarding instead of installing 5-tuples into mid-path switches. Doing Forwarding Equivalence Class (FEC) classification at the head-end device instead of at every hop is way simpler and less loop-prone.
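To make the head-end classification argument concrete, here is a minimal sketch under made-up assumptions: the flow entries, label values, node names and both helper functions are hypothetical and say nothing about what Google actually runs. Only the head-end table grows with the number of flows; the mid-path switches hold one entry per label.

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    proto: int
    src_port: int
    dst_port: int


# Head-end only: every flow is mapped to a Forwarding Equivalence Class,
# represented here by an MPLS-style label (values are invented).
HEAD_END_FEC = {
    FiveTuple("10.0.0.1", "10.1.0.1", 6, 40000, 443): 100,
    FiveTuple("10.0.0.2", "10.1.0.9", 6, 40001, 443): 100,
    FiveTuple("10.0.0.3", "10.2.0.5", 17, 5000, 53): 200,
}

# Mid-path switches: in_label -> (out_label, out_port).
# One entry per label, no matter how many flows share that label.
MID_PATH_LFIB = {
    "P1": {100: (101, "to-P2"), 200: (201, "to-P3")},
    "P2": {101: (102, "to-egress")},
}


def classify(packet: FiveTuple) -> int:
    """5-tuple lookup, done once at the head-end."""
    return HEAD_END_FEC[packet]


def switch(node: str, label: int) -> tuple:
    """Exact-match label lookup, done at every mid-path hop."""
    return MID_PATH_LFIB[node][label]
```

With per-hop 5-tuple forwarding, P1 and P2 would each need a copy of every flow entry, and all copies would have to be updated consistently to avoid transient loops.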
Like MPLS-TE, G-TE runs in parallel with the traditional routing protocols. If it fails (or an end-to-end path is broken), G-routers can always fall back to traditional BGP+IGP-based forwarding, and like with MPLS-TE+IGP, you’ll still have a loop-free (although potentially suboptimal) forwarding topology.
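The fallback behavior can be sketched in a few lines. This is an illustration only: the table layout, the `up` flag and the `next_hop` function are assumptions made for the example, not a description of the G-router forwarding logic.

```python
def next_hop(dst, te_paths, igp_routes):
    """Prefer a healthy TE path; otherwise fall back to the BGP+IGP route."""
    te = te_paths.get(dst)
    if te is not None and te["up"]:
        return te["next_hop"], "te"
    # The traditional routing protocols keep running in parallel,
    # so a loop-free (if suboptimal) route is always available.
    return igp_routes[dst], "igp"


# Hypothetical tables for a single destination.
igp_routes = {"dc-east": "r2"}
te_paths = {"dc-east": {"next_hop": "r5", "up": True}}
```

If the TE path is marked down, or the central application never installed one, the lookup simply returns the IGP next hop.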
Is it so different?
Not really. Similar concepts (central path computation) were used in ATM and Frame Relay networks, as well as in early MPLS-TE implementations (before Cisco implemented OSPF/IS-IS traffic engineering extensions and RSVP, that was all you had).
Some networks are supposedly still running offline TE computations and static MPLS TE tunnels because they give you way better results than the distributed MPLS-TE/autobandwidth/automesh kludges.
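A toy example shows why an offline computation can beat tunnels that are signaled one at a time. It assumes two parallel paths of 10 units each and four tunnels; the numbers and the `place` helper are invented for illustration, and sorting by size is just one simple heuristic, not what any particular TE tool does.

```python
def place(demands, capacities):
    """First-fit placement of tunnels onto parallel paths.

    Returns (placement, rejected), where placement maps tunnel -> path.
    """
    free = dict(capacities)
    placement, rejected = {}, []
    for name, bw in demands:
        for path in free:
            if free[path] >= bw:
                free[path] -= bw
                placement[name] = path
                break
        else:
            # No path had enough free bandwidth left for this tunnel.
            rejected.append(name)
    return placement, rejected


capacities = {"A": 10, "B": 10}
arrival_order = [("t1", 4), ("t2", 4), ("t3", 6), ("t4", 6)]

# Distributed case: each tunnel is placed as it comes up.
_, rejected_online = place(arrival_order, capacities)

# Offline case: all demands are known, so the biggest are placed first.
offline_order = sorted(arrival_order, key=lambda d: d[1], reverse=True)
_, rejected_offline = place(offline_order, capacities)
```

In arrival order the two small tunnels fragment path A and `t4` no longer fits anywhere; with the global view all four tunnels fit, even though the total demand (20 units) is the same in both cases.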
MPLS-TP is also going in the same direction – paths are computed by NMS, which then installs in/out label mappings (and fast failover alternatives if desired) to the Label Switch Routers (LSRs).
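What the NMS would push to each transit LSR can be sketched as follows. The label numbering scheme, the node names and the `build_label_mappings` function are assumptions for the example; a real NMS would allocate labels from whatever space each LSR offers.

```python
def build_label_mappings(path, first_label=1000):
    """Return {lsr: (in_label, out_label, next_hop)} for the transit LSRs.

    path[0] is the ingress and path[-1] the egress; only the nodes in
    between need an in/out label mapping.
    """
    mappings = {}
    label = first_label
    for i in range(1, len(path) - 1):
        mappings[path[i]] = (label, label + 1, path[i + 1])
        label += 1
    return mappings


# Primary path plus a precomputed fast-failover alternative.
primary = build_label_mappings(["PE1", "P1", "P2", "PE2"])
backup = build_label_mappings(["PE1", "P3", "PE2"], first_label=2000)
```

No signaling protocol is involved: the management system computes both sets of mappings centrally and installs them on the individual LSRs.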
Then what is different?
Google is (as far as I know) the first one that implemented the end-to-end system: gathering application needs, computing paths, and installing them in the routers in real time.
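The three steps (gather demands, compute paths, install them) can be sketched as a tiny constrained shortest-path loop. Everything here is an assumption made for illustration: the topology, the demand names, the `cspf` and `engineer` functions, and the dictionary that stands in for the northbound API call. Nothing is known publicly about how G-TE actually computes its paths.

```python
import heapq


def cspf(links, src, dst, bw):
    """Cheapest path using only links with at least `bw` free capacity."""
    graph = {}
    for (a, b), (cost, free) in links.items():
        if free >= bw:  # prune links that cannot carry the demand
            graph.setdefault(a, []).append((b, cost))
    queue = [(0, src, [src])]
    seen = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == dst:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt, cost in graph.get(node, []):
            if nxt not in seen:
                heapq.heappush(queue, (dist + cost, nxt, path + [nxt]))
    return None


def engineer(links, demands):
    """Place demands in order, reserving bandwidth in `links` as we go."""
    installed = {}
    for name, src, dst, bw in demands:
        path = cspf(links, src, dst, bw)
        if path is None:
            continue  # unplaced demands stay on the BGP+IGP path
        for a, b in zip(path, path[1:]):
            cost, free = links[(a, b)]
            links[(a, b)] = (cost, free - bw)
        installed[name] = path  # stand-in for the router API call
    return installed


# Directed links: (from, to) -> (cost, free bandwidth).
links = {
    ("A", "B"): (1, 10), ("B", "D"): (1, 10),
    ("A", "C"): (2, 10), ("C", "D"): (2, 10),
}
demands = [("search", "A", "D", 8), ("backup", "A", "D", 5)]
installed = engineer(links, demands)
```

The first demand takes the cheap A-B-D path; the second no longer fits there and is steered over A-C-D, which is exactly the kind of decision a plain shortest-path IGP cannot make.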
You could do the same thing (should you wish to do it) with the traditional gear using NETCONF with a bit of MPLS-TP sprinkled on top (or your own API if you have switches that can be easily programmed in a decent programming language – Arista immediately comes to mind), but it would be a “slight” nightmare and would still suffer the drawbacks of distributed signaling protocols (even static MPLS-TE tunnels use RSVP these days).
The true difference between their implementation and everything else on the market is thus that they did it the right way, learning from all the failures and mistakes we made in the last two decades.
Why did they do it?
Wouldn’t you do the same if you had the necessary intellectual potential and resources? Google’s engineers built themselves a high-end router with modern scale-out software architecture that runs only the features they need (with no code bloat and no bugs from unrelated features), and they can extend the network functionality in any way they wish with the northbound API.
Even though they had to make a hefty investment in the G-router platform, they claim their network already converges almost 10x faster than before (on the other hand, it’s not hard to converge faster than IS-IS with default timers), and has average link utilization above 90% (which in itself is a huge money-saver).
Based on the information from Open Networking Summit (which is all the information I have at the moment), you might wonder what all the hype is about. In one word: OpenFlow. Let’s try to debunk those claims a bit.
Google is running an OpenFlow network. Get lost. Google is using OpenFlow between controller and adjacent chassis switches because (like everyone else) they need a protocol between the control plane and forwarding planes, and they decided to use an already-documented one instead of inventing their own (the extra OpenFlow hype could also persuade hardware vendors and chipset manufacturers to implement more OpenFlow capabilities in their next-generation products).
Google built their own routers ... and so can you. Really? Based on the scarce information from the ONS talks and an interview in Wired, Google probably threw more money and resources at the problem than a typical successful startup. They effectively decided to become a router manufacturer, and they did. Can you repeat their feat? Maybe, if you have comparable resources.
Google used open-source software ... so the monopolistic Ciscos of the world are doomed. Just in case you believe the fairy-tale conclusion, let me point out that many Internet exchanges use open-source software for BGP route servers, and almost all networking appliances and most switches built today run on open source software (namely Linux or FreeBSD). It’s the added value that matters, in Google’s case their traffic engineering solution.
Google built an open network – really? They use standard protocols (BGP and IS-IS) like everyone else and their traffic engineering implementation (and probably the northbound API) is proprietary. How is that different (from the openness perspective) from networks built from Juniper’s or Cisco’s gear?
Google’s engineers did a great job – it seems they built a modern routing platform that everyone would love to have, and an awesome traffic engineering application. Does it matter to me and you? Probably not; I don’t expect them to give their crown jewels away. Does it matter that they used OpenFlow? Not really, it’s a small piece of their whole puzzle. Will someone else repeat their feat and bring a low-cost high-end router to the market? I doubt it, but I hope to be wrong.
2012-05-18 19:17 UTC - reworded 'vendor' to 'manufacturer' after Brad Casemore rightfully pointed out that Google probably has no intention to become a router vendor.
While the pdf slides from ONS are unfortunately not available online, here is a 10 min lightning talk that pretty much summarizes the OpenFlow-based G-Scale backbone using open-source control plane (Quagga BGP, IS-IS) running in centralized servers:
For those interested in such split router designs based on Quagga (or any Linux-based routing engine) running in commodity servers and downloading the FIBs to OpenFlow hardware, here is the open-source RouteFlow project (shameless plug):
People just have the habit of bashing Cisco and Microsoft.
Mac used to glorify themselves in many such conferences and now when they got the market, 500000+ macs just got infected.
You can certainly bash others to get into the market. Unfortunately, that's how the world works.
When Google becomes as transparent as others, that's when I will consider it a new era...
They have one of the least challenging environments in the world, i.e. almost unlimited resources, access to the most talented engineers (along with their ego ;-)) but share almost nothing of their own dev.
Now one would wonder if it is because the applicability of what they develop is just unique to Google, especially this massive use of TE... application is what drives Google dev, I think this use case of OF is just a confirmation to me.
- Google is very transparent, and I actually challenge you to show me where Cisco or Juniper or Brocade or HP or anyone else will ever tell you even one small piece of how their internal networks are built. There are countless links and videos out there, from Google directly, that give you insight into how our servers, datacenters, & networks operate. Obviously, we can't give you every single detail, and no one else will either.
- We have one of the most challenging environments, because when you operate at the scale we do, no one out there can provide services to meet the needs, so we end up having to be the solution ourselves. We have to come up with ways to do things that no one else can, and to do that, you need resources, so yes, we have a lot of talent in-house, but it's not that these engineers are that much better than anyone else - it's that they are highly motivated and want to push their limits and do things that no one else is doing. You sound like someone that didn't make it through the interviews and are disgruntled. Get over it and try again.
- Of course we're not going to share our own intellectual property if that is what is giving us a competitive advantage to the Bings and Baidus and Yahoos and Facebooks. To expect Google, or anyone for that matter, to give up all of our trade secrets is just plain stupid and demonstrative that you have no idea how this industry really works or how businesses actually make money.
Zog[quagga]% git log | grep -i author: | grep -i google.com | wc
21 84 855
that looks like 21 commits by google and if you look at the logs more closely you'll see a lot of isisd and bgpd enhancement added by the maintainers which are clear google code-base merges. the multi-path enhancements to bgpd are solid and material contributions here. isisd is still kind of sparse, but has benefited considerably from these contributions.
i wouldn't expect everything to be opened up, there's a lot of glue that appears to have been developed but this is good stuff. don't look a gift horse in the mouth.
Methinks SDN will be used more by cloud providers and ISPs than by enterprises... and will be operated in silos... what if these Google SDN stars decide to move on?
Only with the assent of the support staff (CLI monkeys) can such "technology hypes" be sold at the end of the day...
6VPE (a CLI monkey)
I found this very interesting post while reading the latest one about the OpenFlow focus at L2-L4.
With regards to: "some networks are supposedly still running offline TE computations and static MPLS TE tunnels because they give you way better results than the distributed MPLS-TE/autobandwidth/automesh kludges." - I understand the most frequent reason for that is so called "tunnel packing problem" described here: http://fengnet.com/book/TE_MPLS/ch09lev1sec5.html, isn't it?
As for "MPLS-TP is also going in the same direction – paths are computed by NMS, which then installs in/out label mappings" - NMS is not a MUST, nor even a SHOULD, just one of the options provided by RFC5921 section 3.14.
Thanks for the MPLS-TP note. I know NMS is not mandatory, but it's interesting to see MPLS-TP moving in that same direction (actually, it's moving in all directions at once, but let's not go there ;) )