One of the attendees of my Building Next-Generation Data Center online course tried to figure out whether you can build larger broadcast domains with VXLAN than you could with VLANs. Here’s what he sent me:
I'm trying to understand differences or similarities between VLAN and VXLAN technologies in a view of (*cast) domain limitation.
The only difference between VLAN-based fabric and VXLAN-based fabric is the core transport. VLAN-based fabric uses STP/MLAG in the fabric core, TRILL/SPB/… based fabrics use their own routing protocols, and VXLAN uses IP routing. Edge flooding and learning behavior remains the same.
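To make the "same edge behavior" point concrete, here is a minimal sketch of flood-and-learn bridging at a fabric edge. The `EdgeBridge` class and the port names are hypothetical and not taken from any vendor implementation; the `core` port stands for whatever carries the traffic across the fabric (a VLAN trunk, a TRILL/SPB link, or a VXLAN tunnel).

```python
from dataclasses import dataclass, field
from typing import Dict, Set

BROADCAST = "ff:ff:ff:ff:ff:ff"


@dataclass
class EdgeBridge:
    """Hypothetical fabric edge device doing classic flood-and-learn bridging."""
    ports: Set[str]
    mac_table: Dict[str, str] = field(default_factory=dict)

    def forward(self, in_port: str, src_mac: str, dst_mac: str) -> Set[str]:
        """Return the set of ports the frame is sent out of."""
        # Learning: remember which port the source MAC address lives behind.
        self.mac_table[src_mac] = in_port

        out_port = self.mac_table.get(dst_mac)
        if dst_mac == BROADCAST or out_port is None:
            # Broadcast or unknown unicast: flood to every port except the ingress one.
            return self.ports - {in_port}
        if out_port == in_port:
            # Destination sits behind the ingress port: nothing to forward.
            return set()
        return {out_port}


bridge = EdgeBridge(ports={"host1", "host2", "core"})
unknown_unicast = bridge.forward("host1", "aa", "bb")    # "bb" not learned yet: flooded
reply = bridge.forward("core", "bb", "aa")               # "aa" already learned: single port
known_unicast = bridge.forward("host1", "aa", "bb")      # "bb" learned behind "core"
broadcast = bridge.forward("host2", "cc", BROADCAST)     # always flooded
```

Nothing in the forwarding logic depends on what the `core` port is made of, which is why swapping the transport does not change the size of the flooding domain.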
I covered the basics of TRILL and SPB (in case anyone is still interested) in the Data Center Infrastructure for Networking Engineers webinar. Roger Lapuh did a deeper dive into SPB in his presentation in the Leaf-and-Spine Fabrics webinar. His presentation is accessible with a free ipSpace.net subscription.
EVPN is a different story as it’s IP-aware… but keep in mind that EVPN became the SIP of networking: every implementation supports a different subset of features.
Several EVPN-based fabrics support ARP proxy at the fabric edge, reducing the number of broadcasts caused by ARPs… assuming someone is not ARPing for a non-existent IP address, in which case you’d probably see those ARPs flooded. Test, test, test… and make sure you also test all possible crazy scenarios.
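A rough sketch of the edge ARP proxy decision, assuming the edge device keeps a table of known IP-to-MAC bindings (for example, learned from EVPN MAC/IP advertisements). The `handle_arp_request` function and the table layout are hypothetical; real implementations differ in what they do with unknown targets, which is exactly why you have to test.

```python
from typing import Dict, Optional, Tuple


def handle_arp_request(
    target_ip: str, known_hosts: Dict[str, str]
) -> Tuple[str, Optional[str]]:
    """Decide what a hypothetical fabric edge does with an ARP request.

    Returns ("reply", mac) when the edge can answer locally,
    or ("flood", None) when the target IP address is unknown.
    """
    mac = known_hosts.get(target_ip)
    if mac is not None:
        # Known binding: answer at the edge, no broadcast crosses the fabric.
        return ("reply", mac)
    # Unknown (possibly non-existent) target: this sketch assumes the
    # request is still flooded through the whole segment.
    return ("flood", None)


# Hypothetical binding table on one edge device.
known = {"10.0.0.10": "00:00:5e:00:53:01"}
existing = handle_arp_request("10.0.0.10", known)
missing = handle_arp_request("10.0.0.99", known)
```

A host scanning a range of non-existent addresses hits the second branch every time, so the proxy helps only with well-behaved traffic.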
An EVPN-based fabric could implement pure IP transport and turn off flooding altogether, turning what looks like a VLAN into a stable routed IP network (admittedly doing routing on host routes). I don’t think any vendor is brave enough to do that.
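Here is what that hypothetical pure routed behavior could look like, assuming the edge holds one host route per known endpoint. The `route_packet` function and the next-hop names are made up for illustration; the point is that an unknown destination has no matching route and is dropped instead of flooded.

```python
import ipaddress
from typing import Dict, Optional, Union

IPAddress = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


def route_packet(dst_ip: str, host_routes: Dict[IPAddress, str]) -> Optional[str]:
    """Return the next hop for dst_ip, or None (drop) when no host route exists."""
    # Exact-match lookup on the full address is equivalent to a /32 (or /128) route.
    return host_routes.get(ipaddress.ip_address(dst_ip))


# Hypothetical host routes pointing at remote edge devices.
routes = {
    ipaddress.ip_address("10.0.0.10"): "edge-1",
    ipaddress.ip_address("10.0.0.20"): "edge-2",
}
known_destination = route_packet("10.0.0.20", routes)
unknown_destination = route_packet("10.0.0.99", routes)
```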
If you’re an ipSpace.net subscriber and want to learn more about EVPN, watch the EVPN Technical Deep Dive webinar.
Yes, we know and understand why we should keep VLAN size limited (let’s say 1K hosts/guests/) but what about VXLAN segment size?
Same limitations apply - although EVPN-based fabrics (whether using VXLAN or MPLS or GRE or …) could reduce the amount of ARP traffic, there’s nothing stopping a single host from blasting the network with a gazillion RARPs per second (because why not) and impacting everyone else in the same segment.
Am I right that from business risk perspective I should keep VXLAN domain small as well because someone or something can impact all my 12.000 VM's in one VXLAN? Or is this technology resistant against broken frames/packets, flooding…?
A single flooding domain is a single failure domain. A VXLAN VNI (unless turned into a pure routed solution) is a single flooding domain regardless of what the fabric and microsegmentation vendors are telling you. QED.
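A back-of-the-envelope sketch of why segment size matters: every broadcast sent by one misbehaving host is delivered to every other host in the same flooding domain. The `flood_impact` function is hypothetical, and the 1,000 broadcasts per second rate is an arbitrary assumption, not a measurement.

```python
def flood_impact(hosts_in_segment: int, broadcasts_per_second: int) -> dict:
    """Estimate the blast radius of one host flooding a single segment."""
    # Everyone except the sender receives (and has to process) each broadcast.
    receivers = hosts_in_segment - 1
    return {
        "affected_hosts": receivers,
        "frames_per_host_per_second": broadcasts_per_second,
        "total_deliveries_per_second": receivers * broadcasts_per_second,
    }


# Assumed rate: one host sending 1,000 broadcasts per second.
small_segment = flood_impact(hosts_in_segment=1_000, broadcasts_per_second=1_000)
large_segment = flood_impact(hosts_in_segment=12_000, broadcasts_per_second=1_000)
```

The per-host load is the same in both cases; what grows with the segment is the number of hosts hit by a single misbehaving endpoint (999 versus 11,999) and the total number of frames the fabric has to replicate.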
Long story short: Bridging doesn’t scale. Keep your failure domains small.