Category: IP routing
TL&DR: You get unequal-cost multipath for free with distance-vector routing protocols. Implementing it in link state routing protocols is an order of magnitude more CPU-consuming.
Continuing our exploration of the Unequal-Cost Multipath world, why was it implemented in EIGRP decades ago, but not in OSPF or IS-IS?
Ignoring for the moment the “does it make sense” dilemma: finding downstream paths (paths strictly shorter than the current best path) is a side effect of running distance vector algorithms.
In the previous video in this series, I described how path discovery works in source routing and virtual circuit environments. I couldn’t squeeze the discussion of hop-by-hop forwarding into the same video (it would make the video way too long); you’ll find it in the next video in the same section.
One of my readers sent me this question:
My job required me to determine if one IP address is unicast or anycast. Is it possible to get this information from the bgp dump?
TL&DR: Not with anything close to 100% reliability. An academic research paper (HT: Andrea di Donato) documents a false-positive rate of around 10%.
If you’re not familiar with IP anycast: it’s a brilliant idea of advertising the same prefix from multiple independent locations, or the same IP address from multiple servers. Works like a charm for UDP (that’s how all root DNS servers are built) and supposedly pretty well across distant-enough locations for TCP (with a long list of caveats when used within a data center).
Every now and then I’m getting questions along the lines “why doesn’t X support unequal-cost multipathing (UCMP)?” for X in [ OSPF, BGP, IS-IS ].
To set the record straight: BGP does support some rudimentary form of unequal-cost multipathing with the DMZ Bandwidth community, but it only works across multiple egress points from a single autonomous system. Follow-up nerd knobs described how to use the same community over EBGP sessions; not sure whether anyone implemented that part (comments welcome).
Now that we know what regions and availability zones are, let’s go back to Daniel Dib’s question:
As I understand it, subnets in Azure span availability zones. Do you see any drawback to this? Does subnet matter if your VMs are in different AZs?
Wait, what? A subnet is stretched across multiple failure domains? Didn’t Ivan claim that’s ridiculous?
TL&DR: What I claimed was that a single layer-2 network is a single failure domain. Things are a bit more complex in public clouds. Keep reading and you’ll find out why.
Regardless of the technology used to get packets across the network, someone has to know how to get from sender to receiver(s), and as always you have multiple options:
- Almighty controller
- On-demand dynamic path discovery (example: probing)
- Participation in a routing protocol
For more details, watch Finding Paths Across the Network video.
This podcast introduction was written by Nick Buraglio, the host of today’s podcast.
In today’s evolving landscape of whitebox, brightbox, and software routing, a small but incredibly comprehensive routing platform called FreeRTR has quietly been evolving out of a research and education service provider network in Hungary.
We had the usual gloomy December weather during the end-of-year holidays, and together with the partial lockdown (with confusing ever-changing rules only someone in Balkans could dream up) it managed to put me in OCD mood… and so I decided to remove broken links from the old blog posts.
While doing that I figured out how fragile our industry is – I encountered a graveyard of ideas and products that would make Google proud. Some of those blog posts were removed, I left others intact because they still have some technical merits, and I made sure to write sarcastic update notices on product-focused ones. Consider those comments Easter eggs… now go and find them ;))
Imagine the following network running OSPF as the routing protocol. PE1–P1–PE2 is the primary path and PE1–P2–PE2 is the backup path. What happens on PE1 when the PE1–P1 link fails? What happens on PE2?
The second question is much easier to answer, and the answer is totally unambiguous as it only involves OSPF:
There are numerous technologies you can use to implement fast reroute, from the most complex to the easiest one:
In previous blog posts in this series we discussed whether it makes sense to invest into fast failover network designs, the topologies you can use in such designs, and the fault detection techniques. I also hinted at different fast failover implementations; this blog post focuses on some of them.
Hardware-based failover changes the hardware forwarding tables after a hardware-detectable link failure, most likely loss-of-light or transceiver-reported link fault. Forwarding hardware cannot do extensive calculations; the alternate paths are thus usually pre-programmed (more details below).
A while ago (eons before AWS introduced Gateway Load Balancer) I discussed the intricacies of AWS and Azure networking with a very smart engineer working for a security appliance vendor, and he said something along the lines of “it shows these things were designed by software developers – they have no idea how networks should work.”
In reality, at least some aspects of public cloud networking come closer to the original ideas of how IP and data-link layers should fit together than today’s flat earth theories, so he probably wanted to say “they make it so hard for me to insert my virtual appliance into their network.”
In the blog post introducing fast failover challenge I mentioned several typical topologies used in fast failover designs. It’s time to explore them.
Fast failover is (by definition) adjustment to a change in network topology that happens before a routing protocol wakes up and deals with the change. It can therefore use only locally available information, and cannot involve changes in upstream devices. The node adjacent to the failed link has to deal with the failure on its own without involving anyone else.
From historical perspective, any idea why OSPF guys invented their own transport protocol instead of just relying upon TCP?
I wasn’t there when OSPF was designed, but I have a few possible explanations. Let’s start with the what functionality should the transport protocol provide reasons:
In the introductory fast failover blog post I mentioned the challenge of fast link- and node failure detection, and how it makes little sense to waste your efforts on fast failover tricks if the routing protocol convergence time has the same order of magnitude as failure detection time.
Now let’s focus on realistic failure detection mechanisms and detection times. Imagine a system connecting a hardware switching platform (example: data center switch or a high-end router) with a software switching platform (midrange router):