Category: IP routing
The simplest way to implement layer-3 forwarding in a network fabric is to offload it to an external device1, be it a WAN edge router, a firewall, a load balancer, or any other network appliance.
In 2018 I tried to figure out whether the rush to deploy new routing protocols in leaf-and-spine fabrics is anything more than another blob of hype (RIFT, OpenFabric, BGP), considering OSPF got the job done for AWS. Those discussions probably sounded like a bunch of smart kids trying to measure outside temperature with a moist finger, so the only recommendation I could give in 2021 was “use the best tool for the job, keeping in mind you’re not Google or Microsoft”
It’s always better to measure than to have opinions, and a group of academics did just that. They developed Sybil – a tool to measure routing protocol performance in leaf-and-spine fabrics – and Dip Singh used it to compare BGP to IS-IS and OpenFabric.
It took vendors like Cisco years to start supporting routing protocols between MLAG-attached routers and a pair of switches in the MLAG cluster. That seems like a no-brainer scenario, so there must be some hidden complexities. Let’s figure out what they are.
We’ll use the familiar MLAG diagram, replacing one of the attached hosts with a router running a routing protocol with both members of the MLAG cluster (for example, R, S1, and S2 are OSPF neighbors).
A while ago, I wrote a blog post explaining why we should (mostly) disable ICMP redirects, triggering a series of comments discussing the root cause of ICMP redirects. A few of those blamed static routes, including:
Put another way, the presence or absence of ICMP Redirects is a red herring, usually pointing to architectural/design issues instead. In this example, using vPC Peer Gateway or, better yet, running a minimal IGP instead of relying on static routes eliminates ICMP Redirects from both the problem and solution spaces simultaneously.
Unfortunately, that’s not the case. You can get suboptimal routing that sometimes triggers ICMP redirects in well-designed networks running more than one routing protocol.
Imagine you built a layer-2 fabric with tons of VLANs stretched all over the place. Now the users want to exchange traffic between those VLANs, and the obvious question is: which devices should do layer-2 forwarding (bridging) and which ones should do layer-3 forwarding (routing)?
There are four typical designs you can use to solve that challenge:
- Exchange traffic between VLANs outside of the fabric (edge routing)
- Route on core switches (centralized routing)
- Route on ingress (asymmetric IRB)
- Route on ingress and egress (symmetric IRB)
This blog post is an overview of the design models; we’ll cover each design in a separate blog post.
After discussing network addressing and switching, routing, and bridging in the How Networks Really Work webinar, it was high time for a deep dive into routing protocols, starting (as always) with an overview.
Mark Seery wrote a fantastic must-read article explaining why routing will never be a solved problem.
You might want to enjoy it as a relaxing antidote after a painful exposure to SD-WAN (or SD-something-else) brainwashing.
Early bridges implemented a single bridging domain across all ports. Within a few years, we got multiple bridging domains within a single device (including bridging implementation in Cisco IOS). The capability to have multiple bridging domains stretched across several devices was still missing… until the modern-day Pandora opened the VLAN box and forever swamped us in the complexities of large-scale bridging.
Network terminology was easy in the 1980s: bridges forwarded frames between Ethernet segments based on MAC addresses, and routers forwarded network layer packets between network segments. That nirvana couldn’t last long; eventually, a big-enough customer told Cisco: “I don’t want to buy another box if I already have your too-expensive router. I want your router to be a bridge.”
Turning a router into a bridge is easier than going the other way round1: add MAC table and dynamic MAC learning, and spend an evening implementing STP.
When I started implementing the netlab VLAN module, I encountered (at least) three different ways of configuring physical interfaces and bridging domains even though the underlying packet forwarding operations (and sometimes even the forwarding hardware) are the same. That confusopoly is guaranteed to make your head spin for years, and the only way to figure out what’s going on behind the scenes is to go back to the fundamentals.
A friend of mine working for a mid-sized networking vendor sent me an intriguing question:
We have a product using an old ASIC that has 12K forwarding entries, and would like to extend its lifetime. I know you were mentioning some useful tricks, would you happen to remember what they were?
This challenge has no perfect solution, but there are at least three tricks I’ve encountered so far (as always, comments are most welcome):
Most large content providers use some sort of egress traffic engineering on edge web proxy/caching servers to optimize the end-user experience (avoid congested transit autonomous systems) and link utilization on egress links.
I was planning to write a blog post about the tricks they use for ages, and never found time to do it… but if you don’t mind watching a video, the Source Routing on the Edge presentation Oliver Herms had at iNOG::14v does a pretty good job explaining the concepts and a particular implementation.
Every now and then someone tells me how much better the global Internet would be if only we were using recursive layers (RINA) and hierarchical addresses. I always answer “that’s a business problem, not a technical one, and you cannot solve business problems by throwing technology at them”, but of course that has never persuaded anyone who hasn’t been running a large-enough business for long enough.
Syed Khalid Ali left the following question on an old blog post describing the use of IBGP and EBGP in an enterprise network:
From an enterprise customer perspective, should I run iBGP or iBGP+IGP (OSPF/ISIS/EIGRP) or IGP while doing mutual redistribution on the edge routers. I was hoping if you could share some thoughtful insight on when to select one over the another?
We covered tons of relevant details in the January 2022 Design Clinic, here’s the CliffNotes version. Keep in mind that the road to hell (and broken designs) is paved with great recipes and best practices, and that I’m presenting a black-and-white picture because I don’t feel like transcribing the discussion we had into an oversized blog post. People wrote books on this topic; I’m pretty sure you can search for Russ White and find a few of them.
Finally, there’s no good substitute for understanding how things work (which brings me to another webinar ;).
One of my readers sent me an intriguing challenge based on the following design:
- He has a data center with two core switches (C1 and C2) and two Cisco Nexus edge switches (E1 and E2).
- He’s using static default routing from core to edge switches with HSRP on the edge switches.
- E1 is the active HSRP gateway connected to the primary WAN link.
The following picture shows the simplified network diagram: