You Want Your Network to Be like Google’s? Really?

This article was initially sent to my SDN mailing list.

During one of my SDN workshops, an attendee working for a mid-sized European ISP asked me this question:

Our management tells us we should build our network like Google does, including building our own switches. Where should we start?

The only answer I could give him was “You don’t have a chance.”

The Problem

Building your own network operating system is still a major undertaking. LinkedIn recently described their journey toward the first Tomahawk-based switch. It took them a year (~6 man-years of work) to build the prototype with the functionality they need in their network and test it in a pilot network… and I’m positive they used standard building blocks (for example, Quagga to run BGP).

Also consider that:

  • LinkedIn applications were probably designed from the ground up to be well-behaved. Your networking gear might have to support all sorts of extra kludges to cope with broken application stacks;
  • Unless you decide to use Broadcom’s OpenNSL API or buy your hardware from a vendor that ships their boxes with Linux device drivers, you’ll have to deal with Broadcom’s NDA procedures, so you’ll have to be big enough to matter to them.

With all this in mind, do the ROI calculations, and don’t forget to include the costs of the ongoing software maintenance (either vendor support or your own team). It might turn out you’re not big enough to make it work.
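
As a rough illustration of that arithmetic, here is a minimal Python sketch. Every figure in it is a made-up placeholder: the per-switch prices, the cost of an engineer, the maintenance team size, and the three-year horizon are assumptions, not data from LinkedIn or any vendor. Only the ~6 man-years of initial development is borrowed from the LinkedIn story above. The `BuildVsBuy` class and its field names are hypothetical, and the model deliberately ignores things like vendor support contracts and the cost of delays.

```python
import math
from dataclasses import dataclass


@dataclass
class BuildVsBuy:
    """Toy build-vs-buy cost model; all inputs are hypothetical placeholders."""
    vendor_cost_per_switch: float    # branded switch including its NOS
    whitebox_cost_per_switch: float  # bare-metal hardware only
    dev_engineer_years: float        # one-off effort to build the NOS
    maintenance_engineers: float     # size of the ongoing software team
    cost_per_engineer_year: float    # fully loaded yearly cost of one engineer
    years: int                       # planning horizon

    def fixed_software_cost(self) -> float:
        """Development plus maintenance; independent of the switch count."""
        development = self.dev_engineer_years * self.cost_per_engineer_year
        maintenance = (self.maintenance_engineers
                       * self.cost_per_engineer_year * self.years)
        return development + maintenance

    def buy_cost(self, switches: int) -> float:
        return switches * self.vendor_cost_per_switch

    def build_cost(self, switches: int) -> float:
        hardware = switches * self.whitebox_cost_per_switch
        return hardware + self.fixed_software_cost()

    def break_even_switches(self):
        """Smallest switch count at which building is no more expensive."""
        saving = self.vendor_cost_per_switch - self.whitebox_cost_per_switch
        if saving <= 0:
            return None  # whitebox hardware is not cheaper: never breaks even
        return math.ceil(self.fixed_software_cost() / saving)


# Placeholder inputs, chosen only to make the arithmetic easy to follow.
model = BuildVsBuy(
    vendor_cost_per_switch=10_000,
    whitebox_cost_per_switch=5_000,
    dev_engineer_years=6,        # the ~6 man-years mentioned above
    maintenance_engineers=3,
    cost_per_engineer_year=150_000,
    years=3,
)
print(model.break_even_switches())
```

With these placeholder numbers the software effort costs 2.25 million over three years, so you would need 450 switches before the hardware savings cover it. Plug in your own figures; the point is that the fixed software cost does not shrink with the size of your network.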

But wait, that’s not all

The ISP I mentioned in the beginning might have been big enough to make the ROI arithmetic work, but I’m positive they would face another problem: lack of talent.

Building a network operating system (even when using standard components) is not trivial, and it’s hard to get engineers with the necessary experience unless you’re a Silicon Valley startup or one of the popular big guys.

Starting from scratch and growing your own talent sounds like a feasible Plan-B, but unfortunately people considering this approach usually underestimate the complexities of the project. I know companies that wasted years trying to build their own OpenStack-based public clouds from the OpenStack sources instead of using a commercially-supported distribution… and likewise you might be better off buying Cumulus Linux licenses (or another commercial alternative) and slowly building your competence while already running a production network.

I'd like to hear from you!

Disagree? Please write a comment! Want to hear what I think about your SDN deployment plans? I’m usually available for short online consulting.

Want to know more about SDN? Watch the SDN and network automation webinars on ipSpace.net – if you're serious about advancing your career I’m positive you already have the subscription that gives you full access to all of them.

12 comments:

  1. Might not have OpenSwitch been a reasonable place to recommend his company start instead? Sounds like they already get you partway there.

    (Software Gone Wild Episode 48 - OpenSwitch Deep Dive)
    Replies
    1. OpenSwitch, although it might be an interesting idea, is years away from being production-ready. They don't have builds available to download, and their hardware compatibility list only has a few ToR switches.
    2. Thanks for the input, David; you're right, it may be quite a way from being production-ready. However, if that's a valid mandate for the gentleman's company, then I think starting with something like OpenSwitch is far better than trying to reinvent the wheel themselves.

      Also, I believe that things may have changed since you last checked: I was able to download and run OpenSwitch in VirtualBox just fine. The build system instructions appear pretty complete as well (http://www.openswitch.net/develop/develophome).

      Last, I have to agree on the current HCL, but I'm sure that won't be much of an issue soon.
    3. I think OpenSwitch and OpenNetworkLinux are both good places to start if someone is interested in building their own network OS: OpenSwitch as a traditional L2/L3 platform and OpenNetworkLinux as an SDN (OpenFlow-enabled) platform.

      On the other hand, standalone network operating systems such as Cumulus and Pica8 are also good commercial options.
  2. You have to wonder if that was general guidance from management or a true mandate. I mean, on the left hand of the slide rule, there are big-brand products (safe, but expensive) and on the right hand there are innovative alternatives (buy whitebox and write your own OS)... but there are plenty of options in between. Branded whitebox comes to mind, like Big Switch, Nuage and others. You can take advantage of the cost savings, but you still have one source for both software and hardware. I suspect that the overhead of paying for the branded services will offset the in-house development costs and technical debt of grow-your-own. Grow-your-own is a gamble that you can do it successfully and that the ROI isn't going to drag out past 3 years. I'm not criticizing the idea - if it fits your business model, then go for it. But if you can't tie that effort directly and clearly to your business strategy, then I'd consider something with training wheels.

    CWB
  3. Google, Facebook, and Amazon created the idea of open networking and disaggregation. They created this philosophy when it didn't exist.

    At the current time, we have many whitebox/BrightBox hardware vendors (Edge-Core/Accton, Quanta, Dell, HP, etc.) and also multiple open-source and commercial network operating systems (Big Switch, Cumulus, Pica8, OpenSwitch, OpenNetworkLinux, OcNOS, Dell OS10, Nuage, Pluribus, ...).

    In fact, if Google had been sleeping and suddenly woke up today, they wouldn't go for building their own Jupiter, Firehose, etc.; they could use an existing technology.


    Replies
    1. Are you sure about your last comment?
      LinkedIn just started to build their own. That's the big question for me related to LinkedIn's announcement: why not use and contribute to an existing NOS project?
  4. It all depends on a number of factors.

    * How much R&D budget does your company have?
    * How much are you spending on your networking gear, support, etc.?
    * Will you be making a saving in the long run if you innovate your own technology?
    * Are you willing to take a possible financial risk if the innovation doesn’t pay off?

    Any mid-sized to large company can do it, but it all depends on how the figures tally on the books. Tasks like this depend more on the business side than on the technical side. You do need technical expertise, but there should be enough funds to support the work. In the long run, you have to justify that the running cost of your own innovation will have major advantages in comparison to off-the-shelf technology.

    I guess, because of these factors, only companies with big R&D budgets can afford it.
  5. Excellent executive hubris and ambition leading a company down the garden path. Be wary of the people who tell you everything is easy and can be done in a day; they usually have no clue what they are talking about. The first problem: the executive gave them a solution (build a Google-like network) but made no mention of the problem they are trying to solve.
  6. Hi Ivan,
    we're also a "mid-sized European ISP" and we do build our own network gear, although not L2 datacenter switches but edge L3 stuff. We install them in thousands of remote POPs with no out-of-band management.

    We decided on an overall SoC architecture and hired an OEM to assemble our custom hardware (at the moment 2*10G+24*1G and 8*10G) in a few thousand units.

    Of course we used "standard building blocks", as you call them, e.g.: Quagga for OSPF, ExaBGP, Open vSwitch, lldpd, a custom PPPoX/L2TP daemon, etc.

    We've opted to do the dataplane in a DPDK-like way over a great userspace stack we licensed from 6WIND (http://www.lightreading.com/carrier-sdn/sdn-technology/italian-sp-deploys-homemade-sdn-appliance/d/d-id/713802).

    Also, we built our own centralised network automation tool, scripting tools, a monitoring GUI and even defined our own "cli" with its own syntax.

    We've got a rather meshed backbone, so we're using OpenFlow rules to load-balance MPLS customer traffic over all the available routes, based on opportunistic traffic class determination and real-time link capacity (most of the links are microwave, i.e., time-varying), through a Mixed Integer Programming algorithm that uses the libs of a general-purpose commercial solver to find the optimal routing strategy. On top of that, we proactively provision backup paths and use BFD for fast reroute.

    For fun, we're rewriting part of that central controller on GPU (on a pair of Nvidia Tesla K2).

    All this with a staff of 3 people, over the last 1.5 years.

    If we were to start this adventure again, we would honestly do a number of things differently, but I think we would follow the same path.

    I completely agree with the "lack of talent" you talked about! IMHO networking is still a software-averse field: we find it extremely hard to find networking people who have a solid software mentality.

    If you'd like more details for a chat, feel free to get in touch.
    Giacomo Bernardi ([email protected])
    Replies
    1. Ivan, I hope you can find time to have Giacomo and/or his team on Software Gone Wild in the near future.

      Giacomo, cool stuff!
    2. Already working on it ;)