VXLAN Broadcast Domain Size Limitations

One of the attendees of my Building Next-Generation Data Center online course tried to figure out whether you can build larger broadcast domains with VXLAN than you could with VLANs. Here’s what he sent me:

I'm trying to understand differences or similarities between VLAN and VXLAN technologies in a view of (*cast) domain limitation.

There’s no difference between the two on the client-facing side. VXLAN is just an encapsulation technology and doesn’t change how bridging works at all (read also part 2 of that story).

The only difference between a VLAN-based fabric and a VXLAN-based fabric is the core transport. A VLAN-based fabric uses STP/MLAG in the fabric core, TRILL/SPB/… based fabrics use their own routing protocols, and a VXLAN-based fabric uses IP routing. Edge flooding and learning behavior remains the same.

I covered the basics of TRILL and SPB (in case anyone is still interested) in the Data Center Infrastructure for Networking Engineers webinar. Roger Lapuh did a deeper dive into SPB during his presentation in the Leaf-and-Spine Fabrics webinar. His presentation is accessible with a free subscription.

EVPN is a different story as it’s IP aware… but keep in mind that EVPN became the SIP of Networking - every implementation supports a different subset of features.

Several EVPN-based fabrics support ARP proxy at the fabric edge, reducing the number of broadcasts caused by ARPs… assuming someone is not ARPing for a non-existent IP address, in which case you’d probably see those ARPs flooded. Test, test, test… and make sure you also test all possible crazy scenarios.
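To make the ARP proxy idea a bit more tangible, here’s a minimal Python sketch of the decision an edge switch might make. Everything in it (the EdgeArpProxy class, its fields, the addresses) is made up for illustration and doesn’t describe any particular vendor’s implementation; in particular, flooding requests for unknown targets is an assumption that matches the “probably” above, not a guaranteed behavior.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class EdgeArpProxy:
    """Hypothetical model of ARP proxy on a single fabric edge switch."""

    # IP -> MAC bindings known to the edge (assumed to come from EVPN routes)
    known_hosts: Dict[str, str] = field(default_factory=dict)
    answered_locally: int = 0
    flooded: int = 0

    def handle_arp_request(self, target_ip: str) -> Optional[str]:
        """Return the MAC address if the edge can answer, None if it floods."""
        mac = self.known_hosts.get(target_ip)
        if mac is not None:
            # Known target: reply at the edge, no broadcast enters the fabric
            self.answered_locally += 1
            return mac
        # Unknown (possibly non-existent) target: assume the request is flooded
        self.flooded += 1
        return None


proxy = EdgeArpProxy({"10.0.0.10": "00:00:5e:00:53:0a"})
proxy.handle_arp_request("10.0.0.10")   # answered at the edge
proxy.handle_arp_request("10.0.0.99")   # nobody owns this address -> flooded
```

A host scanning a subnet full of unused addresses would hit the second branch every time, which is exactly the scenario worth testing.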

An EVPN-based fabric could implement pure IP transport and turn off flooding altogether, turning what looks like a VLAN into a stable routed IP network (admittedly doing routing on host routes). I don’t think any vendor is brave enough to do that.

If you’re a subscriber and want to learn more about EVPN, watch the EVPN Technical Deep Dive webinar.

Yes, we know and understand why we should keep VLAN size limited (let’s say 1K hosts/guests) but what about VXLAN segment size?

Same limitations apply - although EVPN-based fabrics (whether using VXLAN or MPLS or GRE or …) could reduce the amount of ARP traffic, there’s nothing stopping a single host from blasting the network with a gazillion RARPs per second (because why not) and impacting everyone else in the same segment.

Am I right that from a business risk perspective I should keep the VXLAN domain small as well because someone or something can impact all my 12,000 VMs in one VXLAN? Or is this technology resistant against broken frames/packets, flooding…?

A single flooding domain is a single failure domain. A VXLAN VNI (unless turned into a pure routed solution) is a single flooding domain regardless of what the fabric and microsegmentation vendors are telling you. QED.
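The arithmetic behind that statement is trivial, but it might help to see it spelled out. The sketch below uses the 12,000 VMs from the question and a hypothetical split into twelve 1K-host segments; the function name and the numbers for the split are illustrative assumptions, and the model ignores everything except who receives a flooded frame.

```python
def hosts_hit_by_broadcast(segment_sizes, source_segment):
    """Other hosts receiving one flooded frame sent within source_segment.

    Assumes flooding reaches every host in the sender's segment and
    nobody outside it (routed boundaries between segments).
    """
    return segment_sizes[source_segment] - 1


one_big_segment = [12000]        # all VMs in a single VNI
smaller_segments = [1000] * 12   # same VMs split into twelve segments

print(hosts_hit_by_broadcast(one_big_segment, 0))   # 11999
print(hosts_hit_by_broadcast(smaller_segments, 0))  # 999
```

A misbehaving host hurts everyone in its own segment either way; the only thing you control is how many neighbors it has.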

Long story short: Bridging doesn’t scale. Keep your failure domains small.


    Good podcast from the “pushers” :-)
    I like when they talk about overlay networking vs fabric.
    1. Is VXLAN with EVPN better than SPB in campus?
    2. VXLAN needs just IP transport in the underlay - a well-known and well-tested technology. SPB is newer, and might thus contain more bugs on the transport side. EVPN is newer on the edge side. Choose your poison ;)
