Andy sent me this question:
I'm currently playing around with BGP & VXLANs and wondering: is there anything preventing someone from building a virtual IXP with VXLAN? This would then be a large layer 2 network - but why has nobody built this until now, or why don't internet exchanges provide this?
There was at least one IXP that was running on top of VXLAN. I wanted to do a podcast about it with people who helped them build it in early 2015, but one of them got a gag order.
In the meantime, several IXPs deployed VXLAN in production including:
- INEX (they also open-sourced their management software) – pointer provided by Anonymous, more information from Nick Hillard in the comments;
- LONAP – pointer provided by Blake, more information from Will Hargrave in the comments;
- Equinix in several metro fabrics.
Want to know why you need an L2 network to run an IXP? I wrote about that in 2012.
This leads me to another topic: IXPs are mostly local; nobody has yet stretched a single layer 2 VLAN across the whole of America or Europe. I've tried finding some information, but I don't know what I am missing. What prevents somebody from building such a large layer 2 network?
Point-to-point layer-2 networks spanning continents have been a reality since (at least) Frame Relay days, and there’s at least one SP offering L2-over-VXLAN across the US, and they might be using EVPN as the control plane. The trick to make these things work is to keep the L2 domain small and to minimize the impact of potential stupidities or a bad hair day in either the customer network or the transport infrastructure.
Large L2 domains spanning continents or countries? It has been tried many times before, and failed miserably every single time. I’m positive someone will try to do it again now that you can move VMs across the continent.
Of course, latency may be an issue, but if you have a fairly flat design, STP should not be your problem ...
How about the fact that a single endpoint could bring down the whole network with a broadcast storm? All it takes is a broken NIC.
Keep in mind that even the regular broadcast traffic caused by ARP gets so damaging in large L2 domains that IXPs like AMS-IX had to deploy ARP Sponge to limit its impact.
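To make the ARP Sponge idea a bit more tangible, here's a minimal sketch of the logic as I understand it: count broadcast ARP requests for an address that nobody answers, and once the count crosses a threshold, start answering on behalf of that address so the requesters stop flooding the segment. This is a toy simulation, not the AMS-IX code; the class name, the threshold value, and the sponge MAC address are all assumptions made up for illustration.

```python
from collections import defaultdict

SPONGE_MAC = "02:00:00:00:00:01"  # hypothetical locally administered MAC
QUERY_THRESHOLD = 3               # assumed value; a real deployment would tune it


class ArpSpongeSketch:
    """Toy model of a sponge that answers ARP for apparently dead addresses."""

    def __init__(self, threshold=QUERY_THRESHOLD):
        self.threshold = threshold
        self.pending = defaultdict(int)  # target IP -> unanswered requests seen
        self.sponged = set()             # IPs the sponge currently answers for

    def on_request(self, target_ip):
        """Handle one broadcast ARP request; return a MAC to reply with, or None."""
        if target_ip in self.sponged:
            return SPONGE_MAC
        self.pending[target_ip] += 1
        if self.pending[target_ip] > self.threshold:
            # Nobody seems to own this address: start absorbing the queries.
            self.sponged.add(target_ip)
            return SPONGE_MAC
        return None

    def on_traffic_from(self, source_ip):
        """The real owner showed signs of life: stop answering on its behalf."""
        self.pending.pop(source_ip, None)
        self.sponged.discard(source_ip)


if __name__ == "__main__":
    sponge = ArpSpongeSketch(threshold=3)
    # Five requests for an address nobody answers (documentation prefix).
    print([sponge.on_request("192.0.2.10") for _ in range(5)])
    # The owner comes back, so the sponge steps aside again.
    sponge.on_traffic_from("192.0.2.10")
    print(sponge.on_request("192.0.2.10"))
```

The first three requests go unanswered by the sponge, the fourth and fifth get the sponge MAC, and once traffic from the real owner is seen the counter starts from scratch. Note that the sponge only reduces the volume of repeated ARP broadcasts; it does nothing against a real broadcast storm.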
Long story short: Friends don’t let friends build large layer-2 domains, even more so if said domain spans more than a single site. Or as Ethan Banks once said, nuked earth is not a nice sight.
Want to know more?
- Lukas Krattiger and I will talk about multi-site and multi-pod data center fabrics (and how to build them in a relatively sane way) in another live session of Leaf-and-Spine Fabric Architectures webinar on March 29th;
- You’ll find even more information about data center fabrics in the Designing and Building Data Center Fabrics online course;
- Dinesh Dutt will talk about EVPN-with-VXLAN details in the second part of EVPN Technical Deep Dive webinar on April 5th.