Legacy Protocols in OpenFlow-Based Networks

This post is probably a bit premature, but I’m positive your CIO will get a visit from a vendor offering clean-slate OpenFlow/SDN-based data center fabrics in the not-so-distant future. At that moment, one of the first questions you should ask is “how well does your new wonderland integrate with my existing network?” or more specifically “which L2 and L3 protocols do you support?”

At least one of the vendors offering OpenFlow controllers that manage physical switches has a simple answer: use static LAG to connect your existing gear with our OpenFlow-based network (because our controller doesn’t support LACP), use static routes (because we don’t run any routing protocols) and don’t create any L2 loops in your network (because we also don’t have STP). If you wonder how reliable that is, you obviously haven’t implemented a redundant network with static routes before.
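To make the static-route worry concrete, here is a toy Python model. It is my own illustration, not any vendor's behavior: the `NextHop` class, the `fw-a`/`fw-b` names and both selection functions are hypothetical. The only point it tries to show is that a static route reacts to the local link state alone, while a routing protocol stops using a neighbor once the path behind it fails.

```python
from dataclasses import dataclass


@dataclass
class NextHop:
    """Hypothetical next hop between the legacy network and the fabric."""
    name: str
    local_link_up: bool = True    # what the router sees on its own interface
    path_beyond_ok: bool = True   # health of everything behind the next hop


def static_choice(next_hops):
    """Static route: take the first next hop whose local interface is up."""
    for nh in next_hops:
        if nh.local_link_up:
            return nh
    return None


def dynamic_choice(next_hops):
    """Stand-in for a routing protocol: a neighbor is used only while it
    still has a working path (it would stop advertising the prefix)."""
    for nh in next_hops:
        if nh.local_link_up and nh.path_beyond_ok:
            return nh
    return None


def delivers(nh):
    """Traffic arrives only if the whole path through the next hop works."""
    return nh is not None and nh.local_link_up and nh.path_beyond_ok


hops = [NextHop("fw-a"), NextHop("fw-b")]

# Failure one hop away: the local link to fw-a stays up, the path behind it dies.
hops[0].path_beyond_ok = False

static_result = static_choice(hops)
dynamic_result = dynamic_choice(hops)

print("static route uses", static_result.name, "delivers:", delivers(static_result))
print("routing protocol uses", dynamic_result.name, "delivers:", delivers(dynamic_result))
```

In this model the static route keeps sending traffic into a black hole, which is the failure mode you tend to discover in production rather than in the lab.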

However, to be a bit more optimistic, the need for legacy protocol support depends primarily on how the new solution integrates with your network.

Overlay solutions (like Nicira’s NVP) don’t interact with the existing network at all. A hypervisor running Open vSwitch and using STT or GRE appears as an IP host to the network, and uses existing Linux mechanisms (including NIC bonding and LACP) to solve the L2 connectivity issues.

Hybrid OpenFlow solutions that only modify the behavior of the user-facing network edge (example: per-user access control) are also OK. You should closely inspect what the product does and ensure it doesn’t modify the network device behavior you rely upon in your network, but in principle you should be fine. For example, the XenServer vSwitch Controller modifies just the VM-facing behavior, but not the behavior configured on uplink ports.

Rip-and-replace OpenFlow-based network fabrics are the truly interesting problem. You’ll have to connect existing hosts to them, so you’d probably want to have LACP support (unless you’re a VMware-only shop), and they’ll have to integrate with the rest of the network, so you should ask for at least:

  • LACP, if you plan to connect anything but vSphere hosts to the fabric … and you’ll probably need a device to connect the OpenFlow-based part of the network to the outside world.
  • LLDP or CDP. If nothing else, they simplify troubleshooting, and they are implemented on almost everything including vSphere vSwitch.
  • STP, unless the OpenFlow controller implements split-horizon bridging like vSphere’s vSwitch, but even then you need basic things like BPDU guard.
  • A routing protocol if the OpenFlow-based solution supports L3 (OSPF comes to mind).
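Since a controller has to recognize these protocols before it can speak them, here is a minimal Python sketch of how L2 control frames from the list above could be told apart from ordinary traffic. It is an illustration under stated assumptions, not any controller's API: the `classify` function and its return labels are hypothetical, frames are assumed to be untagged raw Ethernet, and proprietary variants (CDP, per-VLAN spanning tree flavors) are not covered. The destination addresses, EtherTypes and LLC header values are the standard IEEE ones.

```python
import struct

# Standard IEEE group addresses and EtherTypes
STP_DST = bytes.fromhex("0180c2000000")    # bridge group address (BPDUs)
LACP_DST = bytes.fromhex("0180c2000002")   # Slow Protocols multicast
LLDP_DST = bytes.fromhex("0180c200000e")   # nearest-bridge LLDP multicast
ETHERTYPE_SLOW = 0x8809                    # Slow Protocols (LACP, Marker)
ETHERTYPE_LLDP = 0x88CC
LACP_SUBTYPE = 0x01
STP_LLC = b"\x42\x42\x03"                  # DSAP, SSAP, control for BPDUs


def classify(frame: bytes) -> str:
    """Return a hypothetical label for an untagged Ethernet frame."""
    if len(frame) < 14:
        return "invalid"
    dst = frame[0:6]
    (type_or_len,) = struct.unpack("!H", frame[12:14])
    payload = frame[14:]

    if type_or_len == ETHERTYPE_LLDP:
        return "lldp"
    if type_or_len == ETHERTYPE_SLOW and dst == LACP_DST:
        # First payload byte is the Slow Protocols subtype
        if payload[:1] == bytes([LACP_SUBTYPE]):
            return "lacp"
        return "slow-protocol-other"
    # BPDUs use an 802.3 length field followed by an LLC header
    if type_or_len <= 1500 and dst == STP_DST and payload[:3] == STP_LLC:
        return "stp"
    return "data"


# Usage: build a few minimal frames and classify them
src = bytes(6)
samples = {
    "lldp": LLDP_DST + src + struct.pack("!H", ETHERTYPE_LLDP) + b"\x02\x07",
    "lacp": LACP_DST + src + struct.pack("!H", ETHERTYPE_SLOW) + b"\x01\x01",
    "stp": STP_DST + src + struct.pack("!H", 38) + STP_LLC + bytes(35),
    "ipv4": b"\xff" * 6 + src + struct.pack("!H", 0x0800) + b"\x45",
}
for name, frame in samples.items():
    print(name, "->", classify(frame))
```

Recognizing the frames is the easy part; the controller (or the switch on its behalf) still has to run the LACP, LLDP and STP state machines behind each label.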

Call me a grumpy old man, but I wouldn’t touch an OpenFlow controller that doesn’t support the above-mentioned protocols. Worst case, if I were forced to implement a network using such a controller, I would make sure it’s totally isolated from the rest of my network. Even then a single point of failure wouldn’t make much sense, so I would need two firewalls or routers, and static routing in redundant scenarios breaks sooner or later. You get the picture.

To summarize: dynamic link status and routing protocols were created for a reason. Don’t allow glitzy new-age solutions to daze you, or you just might experience a major headache down the road.

15 comments:

  1. I would expect an OpenFlow network to appear to other networks as one big bridge and one big router.

    1. Totally agree … and the controller has to run the protocols a bridge or a router would run. BTW, thank you for saying "bridge" and not "switch", it's nice to see there are still people who know the proper terminology.

    2. I just got CCNP certified and I'm worried. Like Google said, the desktop is dead and tablet is king. Wifi has killed the need for Ethernet, and in the near future every residential home is going to be powered by wireless wifi Clear 4G technology so there goes the need for a router, switch, Ethernet, fiber, etc... Everything is going wireless, and without the need for wire, there is no need for routers or switches; networking as we know it is dead. Everybody will be on cellphone, tablet, Wifi, 4G, 5G.... and everything will be cloud-based SaaS.... no need for file servers, print servers, email servers.... all computers become dumb terminals and the real data centers will use OpenFlow.... where does that leave the average networking tech? To find a job being a toll booth operator? Oh wait...

  2. Thanks Ivan. I had been delinquent on digging into the XenServer vSwitch Controller. I think you touch on a very important point that folks seem to be trying to develop answers for: blending the data center hypervisor network pools with the physical substrate into orchestration. The outstanding question, I reckon, is whether we need to further abstract the substrate. I have trouble picturing the network as a utility due to its complexity. Maybe if that complexity is further abstracted into software, then we are left with simple routed networks.
    I have no idea what the hell I just said.
    Your SDN vendor summaries have been really helpful. I used your Brocade SDN post from a few weeks ago just this week. Their protected and unprotected modes are really cool, but I got 10x more clarity from your post than from scouring their documentation. http://goo.gl/2UD7k

    1. You're pretty spot-on. However, I think we should start with "why is the network so complex"; probably most of the complexity comes from our incredible masochistic urge to be MacGyver and fix everyone else's blunders (see also: long-distance vMotion and Network Load Balancing :D )

  3. Great blog post, Ivan. We, at Midokura, have been thinking along these lines since we started building our overlay solution in 2010. That's why our product, MidoNet, supports BGP, including multihoming and ECMP, for interfacing MidoNet virtual routers with external L3 networks. We don't yet support STP or LACP for interfacing MidoNet virtual bridges with external L2 networks, because we haven't seen the customer demand but it's something that we can definitely reconsider if the market requires it.

  4. Well, if you have an OpenFlow network with a controller, the controller will have the whole network topology.

    With that, you don't need STP... The controller will make an OpenFlow path end-to-end. You don't need STP to control loops. And instead of getting 1 link down and 1 up (non-VLAN STP), you'll get full bandwidth for as many uplinks as possible. Your controller is supposed to create unique paths that don't cause loops.

    LACP you could replace the same way: the controller will know the state of each link, and if one path goes down, it will check the network topology and trace another path. For now, active/passive should be possible without putting something on the edge.

    And don't worry, since it's *open*, if you don't make one, somebody will, and eventually it will become open source for everybody to use... Commercial ones will come as well.

    Solutions for everybody's tastes...


    Anyways, this is something coming to the masses this year... Let's see how it goes =)

    1. You're absolutely right ... assuming you believe a single OpenFlow-controlled fabric will be all you'll ever have in your network. I'm too old for such an optimistic view of the world.

      As for "it's open" - it's nice to be a believer, but I've been disappointed by the open parts I've seen so far (not to mention that OpenFlow is actually tightly controlled by an industry consortium).

  5. we can bring more virtual networks over physical networks in a simpler way with openflow...

    1. Is that straight out of The OpenFlow Manifesto? ;) Supporting evidence?

    2. Isn't that what Flowvisor is doing?

    3. No, FlowVisor is slicing existing OpenFlow switches into multiple instances. It's a long way from there to virtual networks.

  6. In my opinion, playing with flow tables is more fun than playing with either MAC tables :D or routing tables. We can define more functions in a flow table, not just matching a MAC address or the most specific IP prefix. We are free to define the matching rule.. cmiiw :D

  7. Hey guys, is there any way for an OpenFlow switch to communicate with a legacy switch other than the LegacyFlow concept?

    1. A pure OpenFlow switch is by definition a totally dumb device. It cannot communicate with anything but the controller. The controller has to implement all legacy protocols.


Ivan Pepelnjak, CCIE#1354, is the chief technology advisor for NIL Data Communications. He has been designing and implementing large-scale data communications networks as well as teaching and writing books about advanced technologies since 1990. See his full profile, contact him or follow @ioshints on Twitter.