VMware NSX Killed My EVPN Fabric

I had an interesting discussion with someone running VMware NSX on top of a VXLAN+EVPN fabric a while ago. That’s a pretty common scenario considering:

  • NSX’s insistence on having all VXLAN uplinks from the same server in the same subnet;
  • Data center switching vendors being on a lemming-like run praising EVPN+VXLAN;
  • The reluctance of non-FAANG environments to connect a server to a single switch.

Apart from the weird times when someone started tons of new VMs, his fabric was running well.

In those moments, the multicast traffic NSX-V generated to propagate ARP requests and gratuitous ARPs from the newly started VMs killed the control plane of the ToR switches, resulting in lost BGP sessions (and a widespread disaster).

We had pretty intricate discussions about (the inadequacy of) Control-Plane Protection implemented by the vendor he was using. Finally, it dawned on me: maybe CoPP doesn’t work because the multicasts are SUPPOSED to get to the switch CPU.

After that, it was pretty easy to identify the culprit: someone configured 224.0.0.0/24 as the multicast range used for NSX-V VXLAN BUM flooding. Looking through the IANA list of addresses in that range, it’s quite easy to see what’s going on: every host should listen to 224.0.0.1, and all routers should listen to 224.0.0.2 (plus a few others).

It’s easy to start pointing fingers: VMware should never accept 224.0.0.0/24 as the multicast range, the NSX administrators should know about reserved multicast ranges etc… but in reality, whoever coded the NSX GUI mindlessly followed the “top nibble must be 0xE” recipe, and the sysadmins probably followed the help screen saying “enter a range between 224.0.0.0 and 239.255.255.255”.
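To make the difference concrete, here is a minimal Python sketch contrasting the “top nibble must be 0xE” check with a slightly more careful one. This is not how NSX validates its input; the function names are hypothetical, and the only facts assumed are the IANA block boundaries: 224.0.0.0/24 is the Local Network Control Block (RFC 5771), and 239.0.0.0/8 is the administratively scoped range (RFC 2365).

```python
import ipaddress

# IANA Local Network Control Block (RFC 5771): 224.0.0.1 = all hosts,
# 224.0.0.2 = all routers, plus routing-protocol groups.
LOCAL_NETWORK_CONTROL = ipaddress.ip_network("224.0.0.0/24")

# Administratively scoped IPv4 multicast (RFC 2365).
ADMIN_SCOPED = ipaddress.ip_network("239.0.0.0/8")


def naive_is_acceptable(cidr: str) -> bool:
    """Hypothetical 'top nibble must be 0xE' check: any multicast range passes."""
    net = ipaddress.ip_network(cidr)
    first = int(net.network_address) >> 28
    last = int(net.broadcast_address) >> 28
    return first == 0xE and last == 0xE


def overlaps_local_control(cidr: str) -> bool:
    """True if any address in the range falls into 224.0.0.0/24."""
    return ipaddress.ip_network(cidr).overlaps(LOCAL_NETWORK_CONTROL)


def is_admin_scoped(cidr: str) -> bool:
    """True if the whole range sits inside 239.0.0.0/8."""
    return ipaddress.ip_network(cidr).subnet_of(ADMIN_SCOPED)


if __name__ == "__main__":
    for candidate in ("224.0.0.0/24", "239.1.0.0/16"):
        print(
            candidate,
            "naive:", naive_is_acceptable(candidate),
            "hits 224.0.0.0/24:", overlaps_local_control(candidate),
            "admin-scoped:", is_admin_scoped(candidate),
        )
```

The naive check happily accepts 224.0.0.0/24, while the overlap check flags it. A range such as 239.1.0.0/16 (an arbitrary example, not a recommendation for any particular deployment) passes both checks and stays within the administratively scoped block.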

A friend working for VMware pointed out that they have been recommending the unicast flooding mode for years. The default flooding mode is now unicast (it was Hybrid mode in earlier releases), so it looks like quite a few things were going wrong in that environment.

The moral of the story: when deploying a new technology, it helps to understand how it really works (and NSX-V design guides are pretty good)… or you could ask someone who does (hint: in this case, the fellow networking engineer ;).

Revision History

2019-10-10
Made the “all uplinks” claim more explicit. I also included information about default settings.

2 comments:

  1. I hope that the lesson learned for the team in question is "just stick to the Unicast Replication mode".

    For those not entirely convinced, here's a link to a blog I wrote a few years back: https://telecomoccasionally.wordpress.com/2015/01/11/nsx-for-vsphere-vxlan-control-plane-modes-explained/

    :)
  2. I'm with Dmitri on this, but when customers do want to pursue the replication modes which use multicast, I point them to https://tools.ietf.org/html/rfc2365, and the need to plan their multicast address scopes / VNID pools carefully.