VXLAN and EVPN on Hypervisor Hosts

One of my readers sent me a series of questions regarding a new cloud deployment where the cloud implementers want to run VXLAN and EVPN on the hypervisor hosts:

I am currently working on a leaf-and-spine VXLAN+EVPN PoC. At the same time, the systems team in my company is working on building a CloudStack platform and is insisting on using VXLAN on the compute nodes, even to the point of using BGP for inter-VXLAN traffic on the nodes.

Using VXLAN (or GRE) encap/decap on the hypervisor hosts is nothing new. That’s how NSX and many OpenStack implementations work.
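
To make that concrete, here’s a minimal sketch of host-based VXLAN encap/decap on a plain Linux hypervisor using iproute2. The interface names, VNI, and underlay addresses are made-up examples, and without a control plane you’re stuck with statically configured flooding:

    # Create a VXLAN interface (VNI 100) sourced from this host's underlay address
    ip link add vxlan100 type vxlan id 100 local 192.0.2.11 dstport 4789

    # Bridge it with the VM-facing tap interface so VM Ethernet frames get
    # flooded/forwarded into the VXLAN tunnel (tap-vm1 is a hypothetical vNIC)
    ip link add br100 type bridge
    ip link set vxlan100 master br100
    ip link set tap-vm1 master br100
    ip link set br100 up
    ip link set vxlan100 up

    # Without a control plane, remote VTEPs have to be listed by hand
    # (head-end replication of BUM traffic)
    bridge fdb append 00:00:00:00:00:00 dev vxlan100 dst 192.0.2.12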

The “BGP for Inter-VXLAN traffic” bit sounds too sketchy – at the very minimum one should know:

  • What they plan to use BGP for;
  • How they plan to propagate MAC-to-VTEP mapping information (that’s one of the things we use EVPN for);
  • How they plan to integrate virtual with physical network;
  • What control-plane stack they plan to use.

If they plan to use Juniper Contrail (or whatever it’s called these days), then the whole thing has a good chance of working. Free Range Routing is another reasonable option, although I’m not sure how well it works with VXLAN tunnels on Linux hosts, and a recent post by Attilla de Groot indicates the whole idea might not be ready for production deployment. Anyone with real-life experience is most welcome to write a comment.
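
For reference, and heavily hedged because this is exactly the part whose production-readiness is being questioned, here’s roughly what the FRR-on-the-host approach might look like; the VNI, AS numbers, and addresses are made-up examples:

    # VXLAN device with data-plane MAC learning disabled -- the whole point of
    # EVPN is to populate MAC-to-VTEP (FDB) entries from BGP instead
    ip link add vxlan100 type vxlan id 100 local 192.0.2.11 dstport 4789 nolearning
    ip link set vxlan100 master br100    # br100 = bridge from the previous sketch
    ip link set vxlan100 up

    # Minimal FRR BGP EVPN configuration pushed through vtysh
    vtysh \
      -c 'configure terminal' \
      -c 'router bgp 65011' \
      -c 'neighbor 192.0.2.254 remote-as 65000' \
      -c 'address-family l2vpn evpn' \
      -c 'neighbor 192.0.2.254 activate' \
      -c 'advertise-all-vni' \
      -c 'exit-address-family' \
      -c 'end'

FRR then advertises locally learned MAC addresses as EVPN type-2 routes and installs the remote MAC-to-VTEP bindings it receives into the kernel FDB.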

I am a little worried about this, as we network folks will have no visibility into how they are going to implement it or what the ramifications will be.

Maybe it’s time you start acting like a service provider and provide them with end-to-end connectivity they can use in any way they wish. You might still have to use VXLAN to implement VLANs stretched across multiple ToR switches (unless you go for a host-routing or routing-on-hosts design), but stop worrying about things you have no influence over.
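
To illustrate the routing-on-hosts alternative: the hypervisor runs a routing daemon (FRR in this made-up sketch) and advertises its VM subnet, or individual host routes, to the directly attached ToR switch over eBGP, so no VLAN ever has to be stretched between ToR switches:

    # Hypothetical example: advertise the local VM subnet 172.16.100.0/24 to the
    # ToR switch at 10.1.1.1 (all prefixes and AS numbers are made up, and the
    # subnet is assumed to be locally connected on the host)
    vtysh \
      -c 'configure terminal' \
      -c 'router bgp 65011' \
      -c 'neighbor 10.1.1.1 remote-as 65001' \
      -c 'address-family ipv4 unicast' \
      -c 'network 172.16.100.0/24' \
      -c 'exit-address-family' \
      -c 'end'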

Just make sure all the decisions made in the design phase, as well as the integration/service points, are well documented and understood at the management level where the two teams meet. “Sign on the dotted line that you accept the risk” sometimes does wonders when there’s a lack of communication between teams.

Do you think it’s a good idea to offload the VXLAN traffic from the hardware switch to the compute nodes?

As I’ve written many times, it’s the right architecture, and all scalable solutions (whether visible to the outside world or hidden in the misty fog of the public clouds) use this approach. In case you missed those blog posts, start with:

ipSpace.net subscribers can find more details in Networking in Private and Public Clouds and Overlay Virtual Networking webinars.

Would this not affect the performance of the hardware NICs on those servers?

Well-implemented VXLAN encapsulation/decapsulation has never had significant performance problems. Unfortunately, the open-source world is full of less-than-optimal solutions.
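
Most of the historical performance pain came from NICs or drivers that could not segment and checksum TCP traffic hidden behind the extra UDP/VXLAN header. On a Linux host you can quickly check what the driver claims to support; eth0 and the sample output are just illustrations:

    # List the tunnel-related offload capabilities reported by the NIC driver
    ethtool -k eth0 | grep udp_tnl

    # On a VXLAN-capable NIC you'd hope to see something like:
    #   tx-udp_tnl-segmentation: on
    #   tx-udp_tnl-csum-segmentation: on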

What other arguments can we, the network engineers, present to challenge this approach if it helps move the company in the right direction?

Without more details, the only thing I can say is that their approach is architecturally correct, even though you might not like it or doubt that the other team can pull it off.

5 comments:

  1. Hi Ivan,
    as this happens to be the topic I was really interested in, here are some other solutions that use VXLAN in the hypervisor:
    1. Contrail - proprietary vRouter managed by XMPP, EVPN from the controller
    2. Nuage - "proprietary" OVS managed by OpenFlow, EVPN from the controller
    3. OpenDaylight - OVS managed by standard OpenFlow, EVPN from the controller (Quagga)
    4. BaGPipe BGP - OVS managed by local agent, EVPN from the host (ExaBGP)
    5. FRR/Quagga - Linux bridge, EVPN from the host
    That, of course, is not including non-overlay solutions (e.g. Calico) and myriad k8s plugins.
  2. Thanks for a really exhaustive list. Reading the Cumulus blog post (see above) I got the impression that #5 isn't exactly ready for production use.

    I also keep wondering how much self-assembly (and tinkering) is required to get #3 to work.

    Would you know of anyone packaging and shipping #3 and/or #5 as a solution?
    Replies
    1. #3 is a part of the NFVi solution from Ericsson (https://www.ericsson.com/ourportfolio/digital-services-solution-areas/cloud-sdn?nav=fgb_101_0363). Last time I checked they used a fork of Quagga with 6WIND's EVPN implementation (earlier than Cumulus).
      It did require quite a bit of tinkering a year ago (https://networkop.co.uk/blog/2017/12/15/os-odl-netvirt/). Its target market is SPs/telcos ($$$), so it's safe to assume they tailor and automate each solution individually.

      #5 - I've only seen it DIY'ed (https://vincent.bernat.ch/en/blog/2017-vxlan-bgp-evpn). And yes, you need to do a lot of tinkering and be as smart as Vincent Bernat to build it.
    2. Thanks a million! If you send me an email, and if we ever manage to meet, the beer is on me ;))

      I particularly love the last bit: "you have to be as smart as Vincent Bernat to build it". Nuff said...
  3. If you'd like even more options to consider, there's also
    6. Ryu.

    Although Ryu is, strictly speaking, an SDN controller with a mostly BGP-based control plane, a while ago I upstreamed some BGP EVPN interop patches that allow the Ryu BGP speaker to participate in a BGP EVPN 'fabric'.

    And if you're into (even more) DIY, there's ExaBGP :)