VMware NSX Killed My EVPN Fabric

A while ago I had an interesting discussion with someone running VMware NSX on top of a VXLAN+EVPN fabric - a pretty common scenario considering:

  • NSX’s insistence that all VXLAN uplinks from the same server sit in the same subnet;
  • Data center switching vendors being on a lemming-like run praising EVPN+VXLAN;
  • Non-FANG environments being somewhat reluctant to connect a server to a single switch.

His fabric was running well… apart from the weird times when someone started tons of new VMs. In those moments, the multicast traffic NSX-V generated to flood the ARPs and gratuitous ARPs sent by the newly-started VMs killed the control plane of the ToR switches, resulting in lost BGP sessions (and widespread disaster).

Update 2019-10-10: Made the all-uplinks claim more explicit, and added the information about default flooding-mode settings.

We had a quite intricate discussion of the (in)adequacy of the Control-Plane Protection implemented by his vendor, and finally it dawned on me: maybe CoPP doesn’t work because the multicast packets are SUPPOSED to get to the switch CPU.

After that it was pretty easy to identify the culprit: someone had configured 224.0.0.0/24 as the multicast range used for NSX-V VXLAN BUM flooding. Looking through the IANA list of addresses in that range makes it quite easy to see what’s going on: every host must listen to 224.0.0.1 (All Hosts), and every router must listen to 224.0.0.2 (All Routers), plus a few others.
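
To make the failure mode concrete, here’s a minimal Python sketch (using only the standard ipaddress module; the function name is mine, not anything from NSX) that flags a BUM flooding range overlapping the link-local control block:

```python
import ipaddress

# 224.0.0.0/24 is the IANA link-local control block: groups like
# 224.0.0.1 (All Hosts) and 224.0.0.2 (All Routers) that host and
# switch CPUs are supposed to process.
LINK_LOCAL_CONTROL = ipaddress.ip_network("224.0.0.0/24")

def overlaps_control_block(bum_range: str) -> bool:
    """Return True if a configured multicast range hits 224.0.0.0/24."""
    return ipaddress.ip_network(bum_range).overlaps(LINK_LOCAL_CONTROL)

print(overlaps_control_block("224.0.0.0/24"))  # True  - the fatal configuration
print(overlaps_control_block("239.1.0.0/16"))  # False - administratively scoped
```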

It’s easy to start pointing fingers: VMware should never have accepted 224.0.0.0/24 as the multicast range, the NSX administrators should have known about reserved multicast ranges, and so on… but in reality, whoever coded the NSX GUI blindly followed the “top nibble must be 0xE” recipe, and the sysadmins probably followed the help screen saying “enter a range between 224.0.0.0 and 239.255.255.255”.
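
For contrast, here’s a hedged sketch of the two validation approaches (both functions are hypothetical illustrations, not actual NSX GUI code): the naive “top nibble is 0xE” check happily accepts the reserved control groups, while restricting input to the administratively scoped block would have avoided the whole mess:

```python
import ipaddress

def naive_is_multicast(addr: str) -> bool:
    # The "top nibble must be 0xE" recipe: anything between
    # 224.0.0.0 and 239.255.255.255 passes, reserved groups included.
    return 224 <= int(addr.split(".")[0]) <= 239

def safer_bum_group(addr: str) -> bool:
    # Restrict BUM groups to administratively scoped space (239.0.0.0/8),
    # well clear of the link-local control block.
    return ipaddress.ip_address(addr) in ipaddress.ip_network("239.0.0.0/8")

print(naive_is_multicast("224.0.0.1"))  # True  - All Hosts slips through
print(safer_bum_group("224.0.0.1"))     # False
print(safer_bum_group("239.1.1.1"))     # True
```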

A friend working for VMware pointed out that they had been recommending unicast mode for years, and that the default flooding mode is now Unicast (it was Hybrid in earlier releases), so it looks like there were a few things going wrong in that environment.

Moral of the story: when deploying a new technology, it helps to understand how it really works (and the NSX-V design guides are pretty good)… or you could ask someone who does (hint: in this case, a fellow networking engineer ;).

Want to know more about VMware NSX? I covered NSX-V details last year, and will start a deep dive into NSX-T in November 2019. All you need to join the live sessions or enjoy the existing NSX content is a Standard ipSpace.net Subscription.

2 comments:

  1. I hope that the lesson learned for the team in question is "just stick to the Unicast Replication mode".

    For those not entirely convinced, here's a link to a blog I wrote a few years back: https://telecomoccasionally.wordpress.com/2015/01/11/nsx-for-vsphere-vxlan-control-plane-modes-explained/

    :)
  2. I'm with Dmitri on this, but when customers do want to pursue the replication modes that use multicast, I point them to RFC 2365 (https://tools.ietf.org/html/rfc2365) and the need to plan their multicast address scopes / VNID pools carefully.
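
For readers who do go down that path, here’s a minimal sketch of the kind of planning the second comment describes, deterministically mapping VNIDs into an RFC 2365 administratively scoped range (the base prefix, pool start, and helper name are illustrative assumptions, not an NSX feature):

```python
import ipaddress

# Illustrative planning sketch: carve a slice of the RFC 2365
# administratively scoped range (239.0.0.0/8) into per-VNI groups
# instead of letting anyone type in 224.0.0.0/24.
ORG_SCOPE = ipaddress.ip_network("239.1.0.0/16")  # assumed allocation
POOL_START = 5000                                 # assumed first VNID

def group_for_vni(vni: int) -> ipaddress.IPv4Address:
    """Deterministically map a VNID to a group inside the planned scope."""
    offset = vni - POOL_START
    if not 0 <= offset < ORG_SCOPE.num_addresses:
        raise ValueError(f"VNI {vni} is outside the planned pool")
    return ORG_SCOPE[offset]

print(group_for_vni(5000))  # 239.1.0.0
print(group_for_vni(5258))  # 239.1.1.2
```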