This post is probably a bit premature, but I’m positive your CIO will get a visit from a vendor offering clean-slate OpenFlow/SDN-based data center fabrics in the not-so-distant future. At that moment, one of the first questions you should ask is “how well does your new wonderland integrate with my existing network?” or, more specifically, “which L2 and L3 protocols do you support?”
At least one of the vendors offering OpenFlow controllers that manage physical switches has a simple answer: use static LAG to connect your existing gear with our OpenFlow-based network (because our controller doesn’t support LACP), use static routes (because we don’t run any routing protocols), and don’t create any L2 loops in your network (because we also don’t have STP). If you wonder how reliable that is, you obviously haven’t implemented a redundant network with static routes before.
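To see where the static-routes approach falls apart, here’s a minimal sketch (all interface names and addresses are made up for illustration) of a Linux host pointing redundant static default routes at a pair of fabric gateways:

```
# Two static default routes toward redundant gateways (illustrative addresses).
# The kernel keeps using a next hop for as long as the route stays installed;
# it has no way of noticing a failure beyond the directly connected link.
ip route add default \
    nexthop via 192.0.2.1 dev eth0 \
    nexthop via 192.0.2.2 dev eth1
```

If anything fails beyond the first hop (say, somewhere inside the fabric), the host keeps spraying a share of its traffic into a black hole — precisely the failure mode a routing protocol (or at least BFD) would detect and route around.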
However, to be a bit more optimistic, the need for legacy protocol support depends primarily on how the new solution integrates with your network.
Overlay solutions (like Nicira’s NVP) don’t interact with the existing network at all. A hypervisor running Open vSwitch and using STT or GRE appears as an IP host to the network, and uses existing Linux mechanisms (including NIC bonding and LACP) to solve the L2 connectivity issues.
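For example, on a hypothetical hypervisor with two uplinks, a couple of standard Open vSwitch commands build the LACP-negotiated bond and a GRE tunnel toward a peer hypervisor (bridge names, interface names, and the peer IP address are invented for illustration):

```
# Physical bridge: bundle the two uplinks into an LACP-negotiated bond
ovs-vsctl add-br br0
ovs-vsctl add-bond br0 bond0 eth0 eth1 -- set port bond0 lacp=active

# Integration bridge: GRE tunnel toward a peer hypervisor; the encapsulated
# traffic leaves through the host IP stack like any other IP packet
ovs-vsctl add-br br-int
ovs-vsctl add-port br-int gre0 -- set interface gre0 type=gre \
    options:remote_ip=198.51.100.7
```

From the physical network’s perspective, all of this is just IP traffic sourced by a LACP-capable host — nothing the existing gear hasn’t seen before.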
Hybrid OpenFlow solutions that only modify the behavior of the user-facing network edge (example: per-user access control) are also OK. You should closely inspect what the product does and ensure it doesn’t modify the network device behavior you rely upon in your network, but in principle you should be fine. For example, the XenServer vSwitch Controller modifies just the VM-facing behavior, but not the behavior configured on uplink ports.
Rip-and-replace OpenFlow-based network fabrics are the truly interesting problem. You’ll have to connect existing hosts to them, so you’d probably want to have LACP support (unless you’re a VMware-only shop), and they’ll have to integrate with the rest of the network, so you should ask for at least:
- LACP, if you plan to connect anything but vSphere hosts to the fabric … and you’ll probably need a device to connect the OpenFlow-based part of the network to the outside world;
- LLDP or CDP; if nothing else, they simplify troubleshooting, and they’re implemented on almost everything, including the vSphere vSwitch;
- STP, unless the OpenFlow controller implements split-horizon bridging like vSphere’s vSwitch does; even then you need basic safeguards like BPDU guard;
- A routing protocol (OSPF comes to mind), if the OpenFlow-based solution supports L3 forwarding.
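The LACP item on that list, for instance, is trivially satisfied on the host side with standard Linux bonding — assuming the fabric edge can negotiate the other end of the bundle (interface names below are illustrative):

```
# Create an 802.3ad (LACP) bond and enslave both uplinks; the fabric-facing
# switch ports must speak LACP for the bundle to come up
ip link add bond0 type bond mode 802.3ad
ip link set eth0 down
ip link set eth0 master bond0
ip link set eth1 down
ip link set eth1 master bond0
ip link set bond0 up
```

If the controller can’t negotiate its end of that bundle, you’re back to static LAG — and hoping nobody ever miscables a port.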
Call me a grumpy old man, but I wouldn’t touch an OpenFlow controller that doesn’t support the above-mentioned protocols. Worst case, if I were forced to implement a network with such a controller, I’d make sure it’s totally isolated from the rest of my network. Even then, a single point of failure wouldn’t make much sense, so I’d need two firewalls or routers between the two parts … and static routing in redundant scenarios breaks sooner or later. You get the picture.
To summarize: dynamic link-management and routing protocols were created for a reason. Don’t let glitzy new-age solutions dazzle you, or you just might experience a major headache down the road.