Two days ago I described how you can use tunneling or labeling to reduce the forwarding state in the network core (which you have to do if you want reasonably fast convergence with currently available OpenFlow-enabled switches). Now let’s see what you can do in the very limited world of OpenFlow 1.0 (if any shipping physical switch supports OpenFlow 1.1 functionality beyond OpenFlow 1.0, please write a comment).
OpenFlow 1.0 does not support tunneling of any sort
Open vSwitch (an OpenFlow-capable soft switch running on Linux/Xen/KVM) can use GRE tunnels to exchange MAC frames between hypervisor hosts across an IP backbone, but it cannot use OpenFlow to provision those tunnels – it gets its configuration information (including GRE tunnel definitions) through the Open vSwitch Database (OVSDB) management protocol.
After the GRE tunnels have been created, they appear as regular interfaces within the Open vSwitch; an OpenFlow controller can use them in flow entries to push user packets across GRE tunnels to other hypervisor hosts.
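For reference, a GRE tunnel port is created through the OVSDB-backed configuration (for example, with the `ovs-vsctl` CLI), not through OpenFlow. A minimal sketch – the bridge name and remote IP address are made-up examples:

```shell
# Create a bridge and add a GRE tunnel port to it; 192.0.2.10 stands in
# for the remote hypervisor's IP address. Once created, gre0 appears as
# a regular port that an OpenFlow controller can use in flow entries.
ovs-vsctl add-br br0
ovs-vsctl add-port br0 gre0 -- set interface gre0 type=gre options:remote_ip=192.0.2.10
```

After this, a flow entry that outputs to gre0’s OpenFlow port number pushes the matched frames across the tunnel.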
Tunneling support within existing OpenFlow-enabled data center switches is virtually non-existent (Juniper’s MX routers with OpenFlow add-on might be an exception), primarily due to hardware constraints.
We will probably see VXLAN/NVGRE/GRE implementations in data center switches in the next few months, but I expect most of them to be software-based and thus useless for anything but a proof-of-concept.
Cisco already has a VXLAN-capable chipset in the M-series linecards; believers in merchant silicon will have to wait for next-generation chipsets.
OpenFlow 1.0 has limited labeling functionality
MPLS support was added to OpenFlow in release 1.1, and while MPLS-capable hardware devices could use MPLS labeling with OpenFlow, there aren’t many devices that support both MPLS and OpenFlow today (yet again, talk to Juniper). Forget MPLS for the moment.
VLAN stacking was also introduced in OpenFlow 1.1. While it would be a convenient labeling mechanism (similar to SPBV, but with a different control plane), many data center switches don’t support Q-in-Q (802.1ad). No VLAN stacking today.
The only standard labeling mechanism left to OpenFlow-enabled switches is thus VLAN tagging (OpenFlow 1.0 supports VLAN tagging, VLAN translation and tag stripping). You could use VLAN tags to build virtual circuits across the network core (similar to what MPLS labels do) and the source-destination-MAC combination at the egress node to recreate the original VLAN tag, but the solution is messy, hard to troubleshoot, and immense fun to audit. But wait, it gets worse.
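A back-of-the-envelope sketch of the VLAN-as-virtual-circuit idea (plain Python with made-up tables – illustrative only, not a real OpenFlow controller): the ingress edge translates the customer VLAN into a core VLAN tag that acts as a circuit ID, and the egress edge uses the source/destination MAC pair to recreate the original tag.

```python
# Hypothetical edge tables; all node names, MAC addresses and VLAN
# numbers are invented for illustration.
CIRCUIT_FOR_DEST = {          # egress edge node -> core VLAN used as circuit ID
    "edge-B": 3001,
    "edge-C": 3002,
}

ORIG_VLAN_FOR_FLOW = {        # (src MAC, dst MAC) -> original customer VLAN
    ("00:00:00:00:00:01", "00:00:00:00:00:02"): 100,
}

def ingress_rewrite(frame, egress_node):
    """VLAN translation at the ingress edge: replace the customer VLAN
    with the core circuit VLAN for the destination edge node."""
    frame = dict(frame)
    frame["vlan"] = CIRCUIT_FOR_DEST[egress_node]
    return frame

def egress_rewrite(frame):
    """At the egress edge, recreate the original customer VLAN from the
    source/destination MAC combination."""
    frame = dict(frame)
    frame["vlan"] = ORIG_VLAN_FOR_FLOW[(frame["src"], frame["dst"])]
    return frame
```

The sketch also shows why the scheme is miserable to troubleshoot: the VLAN tag seen in the core has nothing to do with the customer VLAN, and the mapping state lives only in the controller’s tables.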
To summarize:
- Forwarding state abstraction is mandatory;
- OpenFlow 1.0 has very limited functionality;
- Standard tagging/tunneling mechanisms are almost useless due to hardware/OpenFlow limitations (see above);
- Everyone uses their own secret awesomesauce to solve the problem ... often with proprietary OpenFlow extensions.
Someone was also kind enough to give me a hint that solved the secret awesomesauce riddle: “We can use any field in the frame header in any way we like.”
Looking at the OpenFlow 1.0 specs (assuming no proprietary extensions are used) you can rewrite source and destination MAC addresses to indicate whatever you wish – you have 96 bits to work with. Assuming the hardware devices support wildcard matches on MAC addresses (either by supporting OpenFlow 1.1 or a proprietary extension to OpenFlow 1.0), you could use the 48 bits of the destination MAC address to indicate egress node, egress port, and egress MAC address.
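To illustrate the bit packing involved (the 16/16/16 field layout below is my own assumption, not any vendor’s scheme), here is how an egress node ID, an egress port and a per-flow index could be packed into the 48 bits of a rewritten destination MAC address:

```python
def encode_dmac(node_id: int, port_id: int, flow_idx: int) -> bytes:
    """Pack a 16-bit egress node ID, 16-bit egress port and 16-bit flow
    index into 48 bits (a hypothetical layout; a real deployment would
    also have to keep the multicast bit of the first octet clear)."""
    assert node_id < 2**16 and port_id < 2**16 and flow_idx < 2**16
    value = (node_id << 32) | (port_id << 16) | flow_idx
    return value.to_bytes(6, "big")

def decode_dmac(mac: bytes):
    """Unpack the (node_id, port_id, flow_idx) triple at the egress node."""
    value = int.from_bytes(mac, "big")
    return value >> 32, (value >> 16) & 0xFFFF, value & 0xFFFF
```

Core switches would then forward on a wildcard match of the node-ID bits alone, which is exactly why this trick needs wildcard MAC matching in hardware.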
I have my doubts about the VLAN translation mechanism described above (and I’m positive many security-focused engineers will share them), but the reuse-the-header-fields approach is even more interesting to support. How can you troubleshoot a network if you never know what the source/destination MAC addresses really mean?
Before buying an OpenFlow-based data center network, figure out what the vendors are doing (they will probably ask you to sign an NDA, which is fine), including:
- What are the mechanisms used to reduce forwarding state in the OpenFlow-based network core?
- What’s the actual packet format used in the network core (or: how are the fields in the packet header really used?)
- Will you be able to use standard network analysis tools to troubleshoot the network?
- Which version of OpenFlow are they using?
- Which proprietary extensions are they using (or not using)?
- Which switch/controller combinations are tested and fully supported?