I didn’t expect we’d see multi-vendor OpenFlow deployments any time soon. NEC and IBM decided to change that, and Tervela, a company specializing in building messaging-based data fabrics, decided to verify their interoperability claims. Janice Roberts, who works with NEC Corporation of America, helped me get in touch with them, and I was pleasantly surprised by their optimistic view of OpenFlow deployment in typical enterprise networks.
A bit of background
Tervela’s data fabric solutions typically run on top of traditional networking infrastructure, and an underperforming network (particularly long outages triggered by suboptimal STP implementations) can severely impact the behavior of the services running on their platform.
They were looking for a solution that would perform way better than what their customers are typically using today (large layer-2 networks), while at the same time being easy to design, provision and operate. It seems that they found a viable alternative to existing networks in a combination of NEC’s ProgrammableFlow Controller and IBM’s BNT 8264 switches.
Easy to deploy?
As long as your network is not too big (NEC claimed their controller can manage up to 50 switches in their Networking Tech Field Day presentation), the design and deployment isn’t too hard according to Tervela’s engineers:
- They decided to use an out-of-band management network and connected the management port of each BNT 8264 to it (they could also have used any other switch port).
- All you have to configure on an individual switch is the management VLAN, a management IP address, and the IP addresses of the OpenFlow controllers.
- The ProgrammableFlow controller automatically discovers the network topology using LLDP packets sent from the controller through individual switch interfaces.
- After those basic steps, you can start configuring virtual networks in the OpenFlow controller (see the demo NEC made during the Networking Tech Field Day).
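The topology discovery mechanism is worth a closer look: the controller injects an LLDP packet out of every port of every switch it manages; when that packet arrives at the far end of a cable, the neighboring switch punts it back to the controller, which now knows the two ports are connected. Here’s a minimal simulation of the idea (the switch names and the wiring map are made up for illustration):

```python
# Sketch of controller-driven LLDP topology discovery.
# The controller sends an LLDP probe out of every known switch port;
# the switch at the far end punts it back, revealing the link.
# All switch/port names below are hypothetical.

# Physical wiring, initially unknown to the controller:
# maps (switch, port) -> (switch, port) at the far end of the cable.
wiring = {
    ("leaf1", 49): ("spine1", 1),
    ("leaf1", 50): ("spine2", 1),
    ("leaf2", 49): ("spine1", 2),
    ("leaf2", 50): ("spine2", 2),
}
wiring.update({v: k for k, v in wiring.items()})  # cables work both ways

def discover_topology(switch_ports):
    """Emulate sending an LLDP probe out of every port and collecting
    the punted copies; returns the discovered set of links."""
    links = set()
    for src in switch_ports:          # controller sends a probe out of src
        dst = wiring.get(src)         # probe arrives here (if cabled)
        if dst:                       # far-end switch punts it back
            links.add(frozenset([src, dst]))
    return links

topology = discover_topology(wiring.keys())
for link in sorted(topology, key=sorted):
    (sw1, p1), (sw2, p2) = sorted(link)
    print(f"{sw1}:{p1} <-> {sw2}:{p2}")
```

The same probe-and-punt loop, repeated periodically, is also how a controller notices that a link has disappeared – which matters for the failure-detection discussion below.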
Obviously, you’d want to follow some basic design rules, for example:
- Make the management network fully redundant (read the QFabric documentation to see how that’s done properly);
- Connect the switches into a structure somewhat resembling a Clos fabric, not in a ring or a random mess of cables.
Test results – Latency
Tervela’s engineers ran a number of tests, focusing primarily on latency and failure recovery.
They found out that (as expected) the first packet exchanged between a pair of VMs experiences 8-9 milliseconds of latency because it’s forwarded through the OpenFlow controller; the latency of subsequent packets was below the 1-millisecond resolution of their measurement tool.
Lesson #1 – If the initial packet latency matters, use proactive programming mode (if available) to pre-populate the forwarding tables in the switches;
Lesson #2 – Don’t do full 12-tuple lookups unless absolutely necessary. You’d want to experience the extra latency only when the inter-VM communication starts, not for every TCP/UDP flow (not to mention that capturing every flow in a data center environment is a sure recipe for disaster).
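To see why lesson #2 matters, compare how many flow entries (and reactive controller round trips) a full 12-tuple match generates versus a simple destination-MAC match for the same traffic. The traffic mix below is invented, but the ratio it produces is the whole point:

```python
# Compare flow-table usage of full 12-tuple-style matching vs. a coarser
# destination-MAC match for the same (hypothetical) traffic mix.
import random

random.seed(1)

# Invented traffic sample: 1000 TCP flows between 20 hosts.
hosts = [f"00:00:00:00:00:{i:02x}" for i in range(20)]
flows = []
for _ in range(1000):
    src, dst = random.sample(hosts, 2)
    flows.append({
        "eth_src": src, "eth_dst": dst,
        "ip_proto": 6,                          # TCP
        "tcp_src": random.randint(1024, 65535), # ephemeral source port
        "tcp_dst": 80,
    })

# Full-tuple match: every distinct flow needs its own entry (and, in
# reactive mode, its own trip through the controller).
fine_entries = {tuple(sorted(f.items())) for f in flows}

# Coarse match on destination MAC only: one entry per destination host,
# programmed once when the inter-VM communication starts.
coarse_entries = {f["eth_dst"] for f in flows}

print(f"12-tuple entries: {len(fine_entries)}")
print(f"dst-MAC entries:  {len(coarse_entries)}")
```

With destination-based matching the table stays bounded by the number of hosts, while per-flow matching grows with every new TCP session – exactly the blow-up you don’t want in a data center.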
Test results – Failure recovery
Very fast failure recovery was another pleasant surprise. They tested just the basic scenario (parallel primary/backup links) and found that in most cases the traffic switches over to the second link in less than a millisecond, indicating that NEC/IBM engineers did a really good job and pre-populated the forwarding tables with backup entries.
If it takes the controller 8-9 milliseconds to program a single flow into the switches (see the latency results above), the same controller clearly cannot massively reprogram the forwarding tables in under a millisecond. The failure response must have been preprogrammed into the forwarding tables.
There were a few outliers (10-15 seconds), probably caused by the lack of failure detection on the physical layer. As I wrote before, detecting link failures via control packets sent by the OpenFlow controller doesn’t scale – you need distributed linecard protocols (LACP, BFD) if you want a scalable solution.
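NEC hasn’t published how their backup entries are implemented, but OpenFlow 1.1 and later define a standard building block for exactly this behavior: the fast-failover group, where each bucket watches a port and the switch forwards through the first bucket whose watched port is still up – a purely local decision with no controller involvement. A minimal model of that mechanism (port numbers are made up):

```python
# Model of an OpenFlow fast-failover group (OFPGT_FF in OpenFlow 1.1+):
# each bucket watches a port; the switch uses the first bucket whose
# watched port is up. Port numbers below are hypothetical.

class FastFailoverGroup:
    def __init__(self, buckets):
        # buckets: list of (watch_port, output_port), in priority order
        self.buckets = buckets

    def select_output(self, port_state):
        """Pick the output port of the first live bucket -- a local
        decision made by the switch, no controller round trip needed."""
        for watch_port, output_port in self.buckets:
            if port_state.get(watch_port, False):
                return output_port
        return None                      # all paths down: drop

# Primary uplink on port 49, preprogrammed backup on port 50.
group = FastFailoverGroup([(49, 49), (50, 50)])

port_state = {49: True, 50: True}
print("before failure:", group.select_output(port_state))   # -> 49

port_state[49] = False                   # primary link fails
print("after failure: ", group.select_output(port_state))   # -> 50
```

Because the backup bucket is already in the switch, the failover time is bounded by failure *detection*, not by table reprogramming – which is consistent with both the sub-millisecond results and the 10-15 second outliers.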
Finally, assuming their test bed allowed the ProgrammableFlow controller to prepopulate the backup entries, it would be interesting to observe the behavior of a four-node square network, where it’s impossible to find a loop-free alternate path unless you use virtual circuits like MPLS Fast Reroute does.
Test results – Bandwidth allocation and traffic engineering
One of the interesting things OpenFlow should enable is bandwidth-aware flow routing. Tervela’s engineers were somewhat disappointed to discover that the software/hardware combination they were testing doesn’t meet those expectations yet.
They were able to reserve a link for high-priority traffic and observe automatic load balancing across alternate paths (which would be impossible in an STP-based layer-2 network), but they were not able to configure statistics-based routing (routing important flows across underutilized links).
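What they were hoping for is something along these lines: poll per-port byte counters, derive link utilization, and steer important flows onto the path with the least-loaded bottleneck. Here’s a sketch of the path-selection part – the topology and utilization figures are invented, and a real controller would derive them from OpenFlow port-statistics replies:

```python
# Sketch of statistics-based routing: pick the path whose busiest link
# is least utilized. Topology and utilization figures are invented; a
# real controller would compute them from OpenFlow port counters.

# Candidate loop-free paths from leaf1 to leaf2, as lists of links.
paths = {
    "via spine1": [("leaf1", "spine1"), ("spine1", "leaf2")],
    "via spine2": [("leaf1", "spine2"), ("spine2", "leaf2")],
}

# Link utilization (fraction of capacity), as derived from port stats.
utilization = {
    ("leaf1", "spine1"): 0.85,           # the spine1 path is busy
    ("spine1", "leaf2"): 0.40,
    ("leaf1", "spine2"): 0.10,
    ("spine2", "leaf2"): 0.15,
}

def best_path(paths, utilization):
    """Route the flow over the path with the lowest bottleneck load."""
    return min(paths, key=lambda p: max(utilization[l] for l in paths[p]))

choice = best_path(paths, utilization)
print("route important flow", choice)    # -> via spine2
```

The hard part in practice isn’t this selection logic but keeping the statistics fresh and avoiding oscillations when flows get moved – which may well be why the feature wasn’t there yet.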
Tervela’s engineers said the test results made them confident in the OpenFlow solution from NEC and IBM. They plan to run more extensive tests, and if those work out, they’ll start recommending OpenFlow-based solutions to their customers as a Proof-of-Concept-level alternative.
A huge thank you!
This blog post would never have happened without Janice Roberts, who organized the exchange of ideas, and Michael Matatia, Jake Ciarlante and Brian Gladstein from Tervela, who were willing to spend their time sharing their experience with me.