Nexus 1000V LACP offload and the dangers of in-band control

2021-03-01: Nexus 1000V turned into abandonware a long time ago, and is now officially a zombie (oops, EOL). However, the challenges its developers were facing with LACP offload are still worth pointing out to anyone advocating a centralized control plane (stupidity formerly known as SDN).

A while ago someone sent me the following comment as part of a lengthy discussion focusing on Nexus 1000V: “My SE tells me that the latest 1000V release has rewritten the LACP code so that it operates entirely within the VEM. VSM will be out of the picture for LACP negotiations. I guess there have been problems.”

If you’re not convinced you should be running LACP between the ESX hosts and the physical switches, read this one (and this one). Ready? Let’s go.

Now imagine you’ve just installed the Nexus 1000V software on a bunch of ESX hosts and decided to enable LACP to get proper link aggregation between the hosts and a redundant pair of switches running multi-chassis link aggregation. All the soft switches (VEMs) are controlled by a VSM through the control VLAN, and all the configuration is centralized on the VSM (where you can use show running to look at it without getting carpal tunnel syndrome navigating the GUI); life is good.
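Just for reference, the uplink portion of the VSM configuration would look something like the following minimal sketch (the profile name and VLAN numbers are made up); channel-group auto mode active is the part that turns on LACP toward the physical switches:

```
feature lacp

port-profile type ethernet SERVER-UPLINK
  switchport mode trunk
  switchport trunk allowed vlan 10-20
  channel-group auto mode active
  no shutdown
  state enabled
```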

Next, an ESX host reloads. The VEM module is loaded into the VMware kernel during the startup process and tries to contact the VSM ... but it can’t. The network interfaces are not operational: the upstream switch is waiting for the LACP negotiation to finish, and the VEM won’t even start the LACP negotiation until it’s told to do so by the VSM it cannot reach. A classic chicken-and-egg problem.
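Why aren’t the interfaces operational? Because the upstream ToR switch is running LACP in active mode and, by default on most Nexus platforms, suspends member ports that don’t return LACPDUs. Here’s a hypothetical upstream NX-OS configuration (interface and port-channel numbers are made up; the vPC peering configuration needed for multi-chassis link aggregation is omitted):

```
feature lacp

interface port-channel10
  description Link aggregation group toward the ESX host
  switchport mode trunk
  vpc 10
  lacp suspend-individual

interface Ethernet1/1
  description ESX host uplink NIC
  switchport mode trunk
  channel-group 10 mode active
```

With mode active the switch waits for the server to start speaking LACP, and a freshly booted VEM that hasn’t received its configuration has nothing to say.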

There are two solutions to this problem:

Build an out-of-band control network. Install additional NICs in each ESX host and use them for control traffic (communication with vCenter and the VSM). Obviously you’d need two additional NICs for redundancy. That’s not a big deal if your hardware supports virtual NICs (like Cisco UCS); in most other cases, installing two extra NICs (and consuming two more switch ports per server) just to get the management traffic going doesn’t sound attractive if you’re on the buying side.

Make the switching element self-sufficient. This is exactly what Cisco did with LACP offload. Once you configure ESX host NICs in a port channel, the VSM stores the LACP configuration in the local VEM settings. As part of the LACP offload functionality, the LACP code was ported to the VEM module, allowing the VEM to complete the LACP negotiation with the upstream physical switch without VSM involvement (without LACP offload, all LACP packets are forwarded to the VSM through the packet VLAN and processed there).
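If I remember the documentation correctly, enabling the offload is a single global configuration command on the VSM; treat the exact syntax as an assumption and check the release notes for your Nexus 1000V version:

```
! Assumed VSM configuration - verify against the release notes
lacp offload
```

Save the configuration, reload the VSM (again, if I remember the procedure correctly), and show lacp offload status should tell you whether the VEMs are handling LACP negotiation on their own.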

Is this a Cisco-specific problem?

Absolutely not. LACP offload is just a simple manifestation of a fundamental problem well known to operators of ATM or SONET/SDH networks: it’s hard to implement a distributed switching architecture with a central controller (like the Nexus 1000V VSM or an OpenFlow controller) without a completely independent out-of-band control network or at least some local intelligence in the switching elements – yet another “trivial” detail that’s usually glossed over in OpenFlow discussions.

Even more information

You’ll find big-picture perspective as well as in-depth discussions of various data center and network virtualization technologies in Data Center 3.0 for Networking Engineers and VMware Networking Deep Dive webinars.

1 comment:

  1. OK, and with version 1.4a you also get the SSU update, which works pretty well with DRS.