In the previous blog post in this series, we figured out that you might not need link-layer addresses on point-to-point links. We also started exploring whether you need network-layer addresses on individual interfaces but didn’t get very far. We’ll fix that today and discover the secrets behind IP address-per-interface design.
In the early days of computer networking, there were three common addressing paradigms:
In the world of ubiquitous Ethernet and IP, it’s common to think that one needs addresses in packet headers in every layer of the protocol stack. We have MAC addresses, IP addresses, and TCP/UDP port numbers… and low-level addresses are assigned to individual interfaces, not nodes.
Turns out that’s just one option… and not exactly the best one in many scenarios. You could have interfaces with no addresses, and you could have addresses associated with nodes, not interfaces.
Years ago I wrote a series of blog posts comparing transparent bridging and IP routing, and creating How Networks Really Work materials seemed like a perfect opportunity to make that information more structured, starting with Transparent Bridging Fundamentals.
In the previous video in this series, I described how path discovery works in source routing and virtual circuit environments. I couldn’t squeeze the discussion of hop-by-hop forwarding into the same video (it would make the video way too long); you’ll find it in the next video in the same section.
After (hopefully) agreeing on what routing, bridging, and switching are, let’s focus on the first important topic in this area: how do we get a packet across the network? Yet again, there are three fundamentally different technologies:
- Source node knows the full path (source routing)
- Source node opened a path (virtual circuit) to the destination node and uses that path to send traffic
- The network performs hop-by-hop destination-address-based packet forwarding.
More details in the Getting Packets Across the Network video.
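The three approaches listed above can be sketched on a toy four-node path. This is only an illustration: the node names, the label values, and the table layouts are made up for the example and do not describe any real protocol.

```python
# Toy topology A - R1 - R2 - B. All names and labels are hypothetical.
LINKS = {"A": ["R1"], "R1": ["A", "R2"], "R2": ["R1", "B"], "B": ["R2"]}


def source_routed(path):
    """Source routing: the sender puts the full hop list into the packet."""
    visited = [path[0]]
    for next_hop in path[1:]:
        # Each node just pops the next hop from the header and forwards.
        if next_hop not in LINKS[visited[-1]]:
            raise ValueError(f"{next_hop} is not adjacent to {visited[-1]}")
        visited.append(next_hop)
    return visited


def virtual_circuit(src, first_label, circuit_table):
    """Virtual circuit: the path was set up in advance; packets carry a label.

    circuit_table maps (node, incoming label) to (next node, outgoing label).
    """
    node, label = src, first_label
    visited = [node]
    while (node, label) in circuit_table:
        node, label = circuit_table[(node, label)]
        visited.append(node)
    return visited


def hop_by_hop(src, dst, forwarding_tables):
    """Hop-by-hop: every node looks up the destination address on its own."""
    node = src
    visited = [node]
    while node != dst:
        node = forwarding_tables[node][dst]
        visited.append(node)
    return visited


CIRCUIT = {("A", 10): ("R1", 20), ("R1", 20): ("R2", 30), ("R2", 30): ("B", 0)}
FIBS = {"A": {"B": "R1"}, "R1": {"B": "R2"}, "R2": {"B": "B"}}

path_sr = source_routed(["A", "R1", "R2", "B"])
path_vc = virtual_circuit("A", 10, CIRCUIT)
path_hbh = hop_by_hop("A", "B", FIBS)
```

All three calls traverse the same nodes. What changes is where the path knowledge lives: in the packet header, in per-circuit state created before the first packet is sent, or in per-destination tables on every node.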
When I still cared about CCIE certification, I was always tripped up by the weird scenario with (A) mismatched ARP and MAC timeouts and (B) default gateway outside of the forwarding path. When done just right you could get persistent unicast flooding, and I’ve met someone who reported average unicast flooding reaching ~1 Gbps in his data center fabric.
One would hope that we wouldn’t experience similar problems in modern leaf-and-spine fabrics, but one of my readers managed to reproduce the problem within a single subnet in FabricPath with anycast gateway on spine switches when someone misconfigured a subnet mask in one of the servers.
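To see why mismatched timers matter, here is a rough back-of-the-envelope model. It assumes the server's own traffic never crosses the switch in question (the asymmetric-path case), so the switch relearns the server's MAC address only when the router refreshes its ARP entry. The 14400-second ARP timeout and 300-second MAC aging time are example values, and the function is a sketch rather than a model of any particular platform.

```python
def flooded_seconds(arp_timeout_s, mac_aging_s, duration_s, step_s=60):
    """Estimate how long a switch floods unicast frames sent to one server.

    Assumption: the switch sees frames sourced from the server only when
    the server answers the router's ARP request.
    """
    arp_learned_at = 0   # router last resolved the server's MAC address
    mac_learned_at = 0   # switch last saw a frame sourced from the server
    flooded = 0
    for now in range(0, duration_s, step_s):
        if now - arp_learned_at >= arp_timeout_s:
            # ARP entry expired: the router re-ARPs, the server replies,
            # and the switch relearns the MAC address from that reply.
            arp_learned_at = now
            mac_learned_at = now
        if now - mac_learned_at >= mac_aging_s:
            # MAC entry aged out while the router still uses its ARP entry:
            # the destination is unknown to the switch, so it floods.
            flooded += step_s
    return flooded


mismatched = flooded_seconds(arp_timeout_s=14400, mac_aging_s=300, duration_s=14400)
matched = flooded_seconds(arp_timeout_s=300, mac_aging_s=300, duration_s=14400)
```

In this model the mismatched timers produce flooding for most of each ARP cycle, while timers set to the same value produce none. Real networks are messier because any frame the server sends through the switch resets the MAC aging timer.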
If you’re working solely with IP-based networks, you’re probably quick to assume that hop-by-hop destination-only forwarding is the only packet forwarding paradigm that makes sense. That’s not true: even today’s networks use a variety of forwarding mechanisms, most of them called some variant of routing or switching.
What exactly is the difference between the two, and what is bridging? I’m answering these questions (and a few others like what’s the difference between data-, control- and management planes) in the Bridging, Routing and Switching Terminology video.
While packets should never be reordered in transit in transparent bridging, there’s no such guarantee in IP networks, and IP applications should tolerate out-of-order packets.
One of my regular readers who designs and builds networks supporting VoIP applications disagreed with that, citing numerous real-life examples.
Of course he was right, but let’s get the facts straight first:
Let’s agree for a millisecond that you can’t find any other way to migrate your workload into a public cloud than to move the existing VMs one-by-one without renumbering them. Doing a clumsy cloud migration like this will get you the headaches and the cloud bill you deserve, but that’s a different story. Today we’ll talk about being clumsy the right and the wrong way.
There are two ways of solving today’s challenge:
Found this “gem” describing the differences between layer-2 and layer-3 on an unnamed $vendor web site.
Layer 2 is mainly concerned with the local delivery of data frames between network devices on the same network or local area network (LAN).
So far so good…
TL&DR: It’s 2020, and VXLAN with EVPN is all the rage. Thank you, you can stop reading.
On a more serious note, I got this question from Johannes Spanier after he read my do we need complex data center switches for NSX underlay blog post:
Would you agree that for smaller NSX designs (~100 hypervisors) a much simpler Layer2 based access-distribution design with MLAGs is feasible? One would have two distribution switches and redundant access switches MLAGed together.
I would still prefer VXLAN for a number of reasons:
A long while ago I got into a hilarious Tweetfest (note to self: don’t… not that I would ever listen) starting with:
Which feature and which Cisco router for layer2 extension over internet 100Mbps with 1500 Bytes MTU
The knee-jerk reaction was obvious: OMG, not again. The ugly ghost of BRouters (or is it RBridges or WAN Extenders?) has awoken. The best reply in this category was definitely:
I cannot fathom the conversation where this was a legitimate design option. May the odds forever be in your favor.
A dozen “this is a dumpster fire” tweets later the problem was rephrased as:
Got an interesting set of questions from a networking engineer who got stuck with the infamous “let’s push the **** down the stack” challenge:
So I am a rather green network engineer trying to solve the typical layer two stretch problem.
I could start the usual “friends don’t let friends stretch layer-2” or “your business doesn’t really need that” windmill fight, but let’s focus on how the vendors are trying to sell him the “perfect” solution:
One of my readers sent me a question along these lines…
VXLAN Network Identifier is 24 bits long, giving us 16 million separate segments. However, we have to map VNI into VLANs on most switches. How can we scale up to 16 million segments when we have run out of VLAN IDs? Can we create a separate VTEP on the same switch?
VXLAN is just an encapsulation format and does not imply any particular switch architecture. What really matters in this particular case is the implementation of the MAC forwarding table in switching ASIC.
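The numbers behind the question are easy to check. The sketch below compares the 24-bit VNI space with the 12-bit 802.1Q VLAN ID space and adds a hypothetical per-switch mapping table. `SwitchVniMap` is an illustrative name, not a vendor API, and the assumption that VLAN IDs are only locally significant on each switch is mine, not something the question states.

```python
VNI_BITS = 24       # VXLAN Network Identifier width
VLAN_ID_BITS = 12   # 802.1Q VLAN ID width

vni_space = 2 ** VNI_BITS        # 16,777,216 possible segments
vlan_space = 2 ** VLAN_ID_BITS   # 4096 VLAN ID values
usable_vlans = vlan_space - 2    # VLAN IDs 0 and 4095 are reserved


class SwitchVniMap:
    """Hypothetical per-switch VNI-to-VLAN table (illustration only)."""

    def __init__(self, max_vlans=usable_vlans):
        self.max_vlans = min(max_vlans, usable_vlans)
        self.vni_to_vlan = {}

    def attach(self, vni):
        """Map a VNI to a locally significant VLAN ID on this switch."""
        if not 0 <= vni < vni_space:
            raise ValueError("VNI does not fit into 24 bits")
        if vni in self.vni_to_vlan:
            return self.vni_to_vlan[vni]
        if len(self.vni_to_vlan) >= self.max_vlans:
            raise RuntimeError("this switch has run out of local VLAN IDs")
        vlan = len(self.vni_to_vlan) + 1   # hand out VLAN IDs starting at 1
        self.vni_to_vlan[vni] = vlan
        return vlan


leaf1, leaf2 = SwitchVniMap(), SwitchVniMap()
vlan_on_leaf1 = leaf1.attach(5_000_000)
vlan_on_leaf2 = leaf2.attach(9_000_000)   # different VNI, same local VLAN ID
```

Under that assumption, the fabric as a whole can use far more than 4094 segments as long as no single switch has to terminate more than 4094 of them. Whether a given switch can do better than that depends on its hardware, which is the point of the answer above.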
Greg Cusanza in #BRK3192 just announced #Azure Extended Network, for stretching Layer 2 subnets into Azure!
As I know a little bit about how networking works within Azure, and I’ve seen something very similar a few times in the past, I was able to figure out what’s really going on behind the scenes in a few seconds… and got reminded of an old Russian joke I found somewhere on Quora: