My tweet about the latest proof of my “layer-2 = single failure domain” claim has raised numerous questions about the use of bridging (aka switching) within Internet Exchange Points (IXPs). Let’s see why most IXPs use L2 switching and why L2 switching is the simplest solution to the problem they’re solving.
What is an Internet Exchange Point?
This section is a gross oversimplification intended for readers who have never been exposed to this topic. Please listen to the Packet Pushers Show 24 for a more in-depth IXP discussion.
Quick summary: an IXP is a place where ISPs (members of the IXP) exchange traffic.
Only a few very large transit providers are considered Tier 1 networks (see Renesys blog for yearly updates); everyone else has to buy transit to the rest of the Internet from one or more of the bigger fish in the pond.
The smaller providers are thus interested in minimizing the amount of transit traffic they pay for, and peering agreements are usually a good mechanism for doing so. However, there are usually tens or hundreds of ISPs operating in a given geographical area, and private peering between them would result in an N-squared full mesh problem. It’s thus in the interest of almost everyone to meet in a common place, connect to a shared infrastructure – Internet Exchange Point – and exchange traffic.
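To put rough numbers on the full mesh problem, here is a minimal Python sketch. The function names are made up for illustration, and it counts only physical interconnects: a private-peering full mesh needs one link per pair of ISPs, while a shared exchange needs one connection per member. BGP sessions across the exchange are still set up between the ISPs that choose to peer.

```python
def full_mesh_links(isps: int) -> int:
    """Private links needed if every ISP connects directly to every other ISP."""
    return isps * (isps - 1) // 2


def ixp_links(isps: int) -> int:
    """Connections needed if every ISP connects once to a shared exchange."""
    return isps


for isps in (10, 100, 300):
    print(f"{isps} ISPs: {full_mesh_links(isps)} private links "
          f"vs {ixp_links(isps)} IXP connections")
```

With 100 ISPs the full mesh already needs 4950 private links, compared to 100 connections to the shared infrastructure.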
How does an IXP work?
To keep things simple, let’s gloss over the details, and assume that every ISP participating in an IXP brings its own router to premises owned by the IXP, and connects it to a shared network infrastructure.
Each ISP has its own AS number and uses BGP to exchange routes with other ISPs. ISPs might decide to peer with everyone, or with a select set of peers, and accept all routes or just a few routes from their peers.
An ISP can also decide to implement local transit agreements across an IXP infrastructure, or prefer routes from one of the peering partners over routes received from another peering partner.
In the example in the following diagram, AS 3 receives two paths toward AS X, one from AS 2, one from AS 4. It might prefer the route through AS 4, whereas AS 1 cannot use that route, since it’s not peering with AS 4 (unless AS 2 is willing to provide transit services).
To summarize: each ISP participating in an IXP might have its own BGP routing policy, resulting in an individualized view of the local parts of the Internet.
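The example can be modeled in a few lines of Python. This is only a sketch: the peering sets and preference values are assumptions made up to match the scenario described above (AS 1 peers only with AS 2, AS 3 peers with both AS 2 and AS 4 and prefers AS 4), and a single preference number stands in for real BGP path selection.

```python
# Hypothetical model of the example: neighbors that advertise a route to AS X.
ADVERTISERS_OF_X = {"AS2", "AS4"}

# BGP sessions each ISP has chosen to set up across the IXP (assumed).
PEERS = {
    "AS1": {"AS2"},
    "AS3": {"AS2", "AS4"},
}

# Per-ISP preference for each peer (higher wins), a stand-in for local preference.
LOCAL_PREF = {
    "AS1": {"AS2": 100},
    "AS3": {"AS2": 100, "AS4": 200},
}


def best_next_hop(isp: str):
    """Return the peer this ISP would use to reach AS X, or None if it has no route."""
    candidates = PEERS[isp] & ADVERTISERS_OF_X
    if not candidates:
        return None
    # Sort first so that ties are broken deterministically.
    return max(sorted(candidates), key=lambda peer: LOCAL_PREF[isp].get(peer, 0))


for isp in ("AS1", "AS3"):
    print(isp, "reaches AS X via", best_next_hop(isp))
```

AS 1 ends up with AS 2 as its only option, while AS 3 picks AS 4: two members of the same exchange, two different views of how to reach AS X.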
IP- or MAC-based forwarding in an IXP?
In the previous diagrams, the IXP infrastructure was drawn as a symbolic Ethernet cable, and some very early IXPs were actually implemented that way, using either thick coax or Ethernet hubs.
Today we could use L2 or L3 switches to implement the IXP infrastructure. Ethernet-based IXP design is obvious and simple (while we’re still glossing over details): all ISP routers connect to a switched LAN.
We all know bridging doesn’t scale, so one might want to implement IP-based IXP infrastructure – all ISP routers would be connected to an IXP router, exchange BGP routes with it, and potentially still run BGP between themselves to support various routing policies.
This scenario might work as long as all ISPs share the same routing policy. IP uses hop-by-hop destination-only forwarding (tunnels are obvious exceptions triggering the scholastic “Is MPLS tunneling?” problem), and thus it’s impossible for the IXP router to forward packets from different ISPs according to each ISP’s preferred routing policy.
Going back to our example: if the IXP router decides to prefer the route to AS X going through AS 2, it’s impossible for AS 3 to forward the packets toward AS X through AS 4. While the router in AS 3 might decide to prefer the path advertised by AS 4, once the IP packets leave it and arrive at the IXP router, the IXP router will make its own independent forwarding decision and send the packets to AS 2.
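The difference between the two designs can be sketched in Python. The function names and the one-entry forwarding table are hypothetical, chosen only to mirror the example: on a switched LAN the frame is delivered to whichever next-hop router the sender picked, whereas an IXP router looks only at the destination and ignores what the sender would have preferred.

```python
def forward_over_l2(sender_choice: str) -> str:
    """Switched IXP: the frame goes to the MAC address of the next hop
    chosen by the sending ISP, so the sender's routing policy is honored."""
    return sender_choice


def forward_over_l3(sender_choice: str, ixp_fib: dict, destination: str) -> str:
    """Routed IXP: the IXP router does its own destination-only lookup;
    the sender's preferred next hop plays no role in the decision."""
    return ixp_fib[destination]


# Assumed state: the IXP router prefers the route to AS X through AS 2,
# while AS 3 would rather send its traffic through AS 4.
ixp_fib = {"AS X": "AS2"}
as3_choice = "AS4"

print("L2 IXP delivers AS 3 traffic to", forward_over_l2(as3_choice))
print("L3 IXP delivers AS 3 traffic to", forward_over_l3(as3_choice, ixp_fib, "AS X"))
```

In the switched case the traffic from AS 3 reaches AS 4 as intended; in the routed case it lands on AS 2 regardless of what AS 3 wanted.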
Conclusion: Internet Exchange Points are one of the rare scenarios where large L2 domains actually make sense, and once they grow and get distributed across multiple locations (example: AMS-IX, LINX), they get exposed to the same set of problems all large L2 networks face, including occasional meltdowns.