Loop Avoidance in VXLAN Networks

Antonio Boj sent me this interesting challenge:

Is there any way to avoid, prevent or at least mitigate bridging loops when using VXLAN with EVPN? Spanning-tree is not supported when using VXLAN encapsulation so I was hoping to use EVPN duplicate MAC detection.

MAC move dampening (or anything similar) doesn’t help if you have a forwarding loop. You might be able to use it to detect that there’s a loop, but that’s it… and while you’re doing that your network is melting down.
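To see why move detection only identifies a loop, here is a minimal sketch of a duplicate-MAC detector. The class and method names are hypothetical and not taken from any vendor implementation; the thresholds (5 moves within 180 seconds) follow the defaults suggested in RFC 7432.

```python
from collections import defaultdict, deque


class MacMoveDetector:
    """Flags a MAC address as duplicate after too many moves in a time window."""

    def __init__(self, max_moves=5, window_seconds=180.0):
        self.max_moves = max_moves
        self.window_seconds = window_seconds
        self.location = {}               # MAC -> VTEP where it was last seen
        self.moves = defaultdict(deque)  # MAC -> timestamps of recent moves

    def learn(self, mac, vtep, now):
        """Record that `mac` was seen behind `vtep` at time `now` (seconds).

        Returns True when the MAC has moved `max_moves` times or more
        within the last `window_seconds`.
        """
        previous = self.location.get(mac)
        self.location[mac] = vtep
        history = self.moves[mac]
        if previous is not None and previous != vtep:
            history.append(now)
        # Forget moves that fell out of the detection window.
        while history and now - history[0] > self.window_seconds:
            history.popleft()
        return len(history) >= self.max_moves


detector = MacMoveDetector()
# A MAC flapping between two VTEPs once per second trips the detector.
flags = [detector.learn("00:00:5e:00:53:01", vtep, t)
         for t, vtep in enumerate(["A", "B", "A", "B", "A", "B"])]
print(flags)
```

Note that the detector only returns a flag. The looped frames keep circulating until something else blocks a port.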

As most hardware doesn’t support VXLAN-to-VXLAN bridging, it’s almost impossible to get a bridging loop within a VXLAN domain (though never say never), and if you implement a valley-free routing design you won’t get IP forwarding loops either. The IP transport backbone is thus pretty much loop-free.
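A path is valley-free when it never climbs back toward the spine layer after it has started descending toward the leaves. The check below is a minimal sketch that assumes each hop is labeled with a tier number (0 for leaf, 1 for spine, 2 for superspine); the function name and the tier numbering are illustrative assumptions.

```python
def is_valley_free(tiers):
    """Return True if the path never goes up again after going down.

    `tiers` is the list of tier numbers of the switches along the path,
    for example [0, 1, 0] for leaf -> spine -> leaf.
    """
    descending = False
    for previous, current in zip(tiers, tiers[1:]):
        if current < previous:
            descending = True
        elif current > previous and descending:
            return False  # climbed out of a valley
    return True


print(is_valley_free([0, 1, 0]))        # leaf -> spine -> leaf
print(is_valley_free([0, 1, 0, 1, 0]))  # bounces through a leaf in the middle
```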

Usually VXLAN Tunnel Endpoints (VTEPs) use automatic VXLAN-to-VLAN split horizon: whatever a switch gets from a VXLAN tunnel is forwarded only to the corresponding VLAN. The only exception is Cisco’s multi-site architecture.
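The split-horizon rule can be sketched as a flooding decision. This is a simplified model, not a description of any particular implementation: the function name, the `("vlan", port)` / `("vxlan", vtep)` ingress tuples, and the `vtep:` prefix are all assumptions made for illustration.

```python
def flood_targets(ingress, vlan_ports, remote_vteps):
    """Return where a flooded frame is sent under VXLAN-to-VLAN split horizon.

    ingress      -- ("vlan", port_name) or ("vxlan", vtep_ip)
    vlan_ports   -- local ports in the VLAN mapped to the VXLAN segment
    remote_vteps -- remote VTEPs participating in the same segment
    """
    kind, source = ingress
    # Never send a frame back out of the port it arrived on.
    local = [port for port in vlan_ports if not (kind == "vlan" and port == source)]
    if kind == "vxlan":
        # Frames received from a tunnel go to local VLAN ports only.
        return local
    return local + [f"vtep:{vtep}" for vtep in remote_vteps]


ports = ["eth1", "eth2"]
vteps = ["10.0.0.2", "10.0.0.3"]
print(flood_targets(("vlan", "eth1"), ports, vteps))
print(flood_targets(("vxlan", "10.0.0.2"), ports, vteps))
```

Because a frame arriving from a tunnel is never sent into another tunnel, a loop cannot form purely inside the VXLAN domain in this model.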

Regardless of what you do in the network core, the VLAN edge ports have the real potential to mess up your network, including the infamous “let’s plug the TX fiber into RX to see if the cable is OK” layer-1 troubleshooting and the “I wonder if I can solve the bonding on my Windows server by bridging the interfaces together” approach favored by a CCIE friend of mine.

The only way to protect your network from those stupidities is to use the ancient protection mechanisms available in traditional bridged networks: make the edge switches (VTEPs) STP roots, turn on BPDU guard and root guard, enable storm control…
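As a rough illustration of what two of these edge protections do, the sketch below models a port with BPDU guard and a per-second broadcast storm-control limit. It is a toy model under stated assumptions: the class name, the frame-type strings, and the packets-per-second threshold are hypothetical, and real switches implement these features in hardware with vendor-specific behavior.

```python
class EdgePort:
    """Toy model of an edge port with BPDU guard and broadcast storm control."""

    def __init__(self, name, bpdu_guard=True, broadcast_limit=1000):
        self.name = name
        self.bpdu_guard = bpdu_guard
        self.broadcast_limit = broadcast_limit  # broadcasts allowed per second
        self.err_disabled = False
        self._second = None
        self._broadcasts = 0

    def receive(self, frame_type, second):
        """Return "forward", "drop" or "err-disabled" for an incoming frame."""
        if self.err_disabled:
            return "err-disabled"
        if frame_type == "bpdu" and self.bpdu_guard:
            # An edge port should never see a BPDU: shut it down.
            self.err_disabled = True
            return "err-disabled"
        if frame_type == "broadcast":
            if second != self._second:
                # New one-second interval: reset the counter.
                self._second, self._broadcasts = second, 0
            self._broadcasts += 1
            if self._broadcasts > self.broadcast_limit:
                return "drop"
        return "forward"


port = EdgePort("eth1", broadcast_limit=2)
print([port.receive("broadcast", 0) for _ in range(3)])
print(port.receive("bpdu", 1))
```

BPDU guard stops a loop only if BPDUs actually make it around the loop; storm control limits the damage when they don't.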

However, while the mechanisms to use on the edge ports haven’t changed with the introduction of VXLAN, some of these features might not work for VLANs bridged into VXLAN. Carefully read the documentation and release notes before choosing the hardware infrastructure for your data center fabric… and whenever in doubt, set up a lab and try to break it. You can also engage our VXLAN/EVPN experts through the ExpertExpress service: Nicola and Mitja solve VXLAN/EVPN challenges on a daily basis, and Dinesh knows (almost) everything there is to know about it.


7 comments:

  1. All the points mentioned above are helpful.

    Also, EVPN has the capability to identify situations where a host MAC/IP is moving between different VTEPs. See https://tools.ietf.org/html/draft-malhotra-bess-evpn-irb-extended-mobility-01#page-18

  2. Poorly implemented network devices attached to an EVPN VXLAN fabric can cause what is perceived to be a loop. For example: Meraki MX devices emit VRRP packets to each other using the same source MAC address from 2 different units, which a VXLAN fabric properly sees as a problem and loop avoidance mechanisms kick in so that these packets are dropped.
  3. Something to look at — https://tools.ietf.org/html/draft-snr-bess-evpn-loop-protect-03
  4. ACI supports BPDU tunneling, basically acting as an Ethernet hub and therefore helping to mitigate some of those issues.

    Can we do the same with VXLAN-EVPN ?
    Replies
    1. Fortunately I haven't seen a single vendor doing anything along those lines...
  5. Cisco has just released "Southbound Loop Detection and Mitigation" on the current latest version (9.3.5). What do you think about it? https://blogs.cisco.com/datacenter/detecting-and-mitigating-loops-in-vxlan-networks

  6. @Thomas: The same thing has been implemented numerous times by various vendors including Frame Relay end-to-end probing, Ethernet CFM across Metro Ethernet, VMware with vSphere beacon probing, HP IRF (IIRC)... Then there were people running LLDP on a different MAC address (or OUI, can't remember).

    If you can't trust the underlying transport and/or signaling the only way forward is to implement your own overlay end-to-end signaling or probing mechanism. It took long enough for that to appear in VXLAN :D
