ICMP Redirects and Suboptimal Routing

A while ago, I wrote a blog post explaining why we should (mostly) disable ICMP redirects, triggering a series of comments discussing the root cause of ICMP redirects. A few of those blamed static routes, including:

Put another way, the presence or absence of ICMP Redirects is a red herring, usually pointing to architectural/design issues instead. In this example, using vPC Peer Gateway or, better yet, running a minimal IGP instead of relying on static routes eliminates ICMP Redirects from both the problem and solution spaces simultaneously.

Unfortunately, that’s not the case. You can get suboptimal routing that sometimes triggers ICMP redirects in well-designed networks running more than one routing protocol.

Suboptimal Routing and ICMP Redirects

I badly mismanaged the details in the original version of this section. Fortunately, I have attentive readers like Henk who are quick to set me straight. Thank you!

Let’s define suboptimal routing first. In the context of this blog post, suboptimal routing happens when a router has to send a packet back to the ingress interface because the upstream router (or host) sent it to a suboptimal next hop.

Suboptimal routing was a big deal in the days of early Ethernet networks and CPU-based packet forwarding, and the designers of IPv4 tried to solve the simpler case (host using a default route pointing to a suboptimal first-hop router) with ICMP redirects. The rules for generating ICMP redirects are set out in RFC 792; here’s a slightly rephrased version:

If the next hop router and the host identified by the internet source address of the datagram are on the same network, a redirect message is sent to the host.

In most sane cases, that means that the routers send ICMP redirects to directly-connected hosts. As we’ll be talking about suboptimal routing between core and edge routers, we shouldn’t worry about ICMP redirects, but (somewhat surprisingly) not disabling them might kill the performance of your network. Here’s what’s going on behind the scenes.

It was easy to check for ICMP redirects in the days of software-based packet forwarding – the packet forwarding software could compare ingress and egress interfaces after performing the FIB lookup, do an additional check on the source IP address, and punt the packet to the IP process to generate the ICMP redirect if needed[1].
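Here is a minimal sketch of that software check, assuming a toy list-based FIB. The names `Route`, `lookup`, and `should_send_redirect` are made up for illustration and do not come from any real network operating system; real implementations apply more conditions than the two shown here.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address, IPv4Network, ip_address, ip_network


@dataclass(frozen=True)
class Route:
    """Hypothetical FIB entry: prefix, next hop, and outgoing interface."""
    prefix: IPv4Network
    next_hop: IPv4Address
    out_if: str


def lookup(fib, dst):
    """Longest-prefix match over a tiny list-based FIB (None if no match)."""
    matches = [r for r in fib if dst in r.prefix]
    return max(matches, key=lambda r: r.prefix.prefixlen) if matches else None


def should_send_redirect(fib, connected, in_if, src, dst):
    """Simplified redirect check.

    1. The packet would leave through the interface it arrived on.
    2. The source address and the next hop are on the subnet connected
       to that interface (the rephrased RFC 792 rule).
    """
    route = lookup(fib, dst)
    if route is None or route.out_if != in_if:
        return False
    subnet = connected[in_if]
    return src in subnet and route.next_hop in subnet


# Illustrative data: a default route pointing to another router on eth0.
fib = [Route(ip_network("0.0.0.0/0"), ip_address("10.0.0.2"), "eth0")]
connected = {"eth0": ip_network("10.0.0.0/24")}

# Directly connected host: both conditions are met.
local_host = should_send_redirect(
    fib, connected, "eth0", ip_address("10.0.0.10"), ip_address("203.0.113.5"))

# Host behind another router: same interface in and out, but no redirect.
remote_host = should_send_redirect(
    fib, connected, "eth0", ip_address("192.168.1.10"), ip_address("203.0.113.5"))
```

The second call is the interesting one: the packet still hairpins through the ingress interface, even though no redirect would ever be generated.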

Now imagine trying to emulate the above algorithm with an ASIC. Complex-enough ASICs would be able to perform all the necessary checks in hardware and send the packet to the CPU only when absolutely needed. Less capable ASICs (or hasty implementations) would send all packets that would have to exit through the ingress interface to the CPU whenever ICMP redirects are enabled on the interface (just in case an ICMP redirect has to be sent).
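The difference between the two kinds of ASICs can be modeled in a few lines. This is a sketch of the hypothetical behavior described above, not of any particular chipset; the function name and the `asic_checks_source` flag are assumptions made for illustration.

```python
def punted_to_cpu(in_if, out_if, redirects_enabled,
                  src_on_ingress_subnet, asic_checks_source):
    """Return True if this simplified model sends the packet to the CPU."""
    if not redirects_enabled or in_if != out_if:
        # Nothing to check: the packet stays in the hardware forwarding path.
        return False
    if asic_checks_source:
        # Capable ASIC: punt only when a redirect really has to be generated.
        return src_on_ingress_subnet
    # Less capable ASIC: punt every hairpinned packet, just in case.
    return True
```

In this model, a hairpinned packet from a remote source is punted only by the less capable ASIC, and disabling ICMP redirects on the interface keeps it in hardware in both cases.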

Now would be a perfect time to leave a comment telling me to stop worrying because all modern ASICs deal with the above now-hypothetical scenario, and all vendors implemented it correctly for ages, but you simply cannot tell me the details due to the NDAs you had to sign.

Old-timers might remember the ip route-cache same-interface command that disabled that behavior on Cisco IOS; the more things change the more they stay the same.

Back to Suboptimal Routing

Now let’s get back to the simple data center network that triggered the discussion, and imagine that:

  • E1 and E2 are routers connected to the global Internet. For whatever reason they have the full BGP table.
  • C1 and C2 are core layer-3 switches. They are not expensive enough to be able to install the entire BGP table into the forwarding ASIC.
  • E1, E2, C1, and C2 are connected to a shared transit VLAN.

Abstract layer-3 connectivity

You could use a variety of mechanisms to make C1 and C2 work with incomplete routing information. In most cases, you’d run an IGP between the four devices, keep the complex stuff limited to E1/E2, and advertise the default route from E1 and E2 toward C1/C2.

The traffic sent through C1/C2 toward the Internet will sometimes land on the wrong edge router – the core switches simply don’t have enough information to select the optimal forwarding path. The edge router receiving such traffic has to forward it to the other edge router. There are several ways you could meet that requirement:

  • Use the shared VLAN between E1, E2, C1, and C2 to forward the misdirected traffic;
  • Add another VLAN connecting E1 and E2.

If you use the shared VLAN to forward the misdirected traffic between E1 and E2, the egress interface on the first-hop edge router matches the ingress interface. Depending on how packet forwarding is implemented in that router, the packet might have to be switched by the CPU (punted to the slow forwarding path).
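The two options can be compared with a toy forwarding table on E1. This is a rough sketch: the interface names and the documentation prefix 198.51.100.0/24 (standing in for a destination that E1 reaches through E2) are assumptions, not part of the original design.

```python
from ipaddress import ip_address, ip_network

# Hypothetical interface names on E1.
TRANSIT = "vlan-transit"     # shared VLAN with E2, C1, and C2
INTER_EDGE = "vlan-e1-e2"    # dedicated VLAN between E1 and E2
UPLINK = "uplink-isp"        # E1's own Internet uplink


def egress_interface(routes, dst):
    """Longest-prefix match; routes maps prefix -> outgoing interface."""
    addr = ip_address(dst)
    best = max((p for p in routes if addr in p), key=lambda p: p.prefixlen)
    return routes[best]


def hairpins(routes, in_if, dst):
    """True if the packet would leave through the interface it arrived on."""
    return egress_interface(routes, dst) == in_if


# Option 1: E1 reaches E2 over the shared transit VLAN.
shared_vlan_design = {
    ip_network("0.0.0.0/0"): UPLINK,
    ip_network("198.51.100.0/24"): TRANSIT,
}

# Option 2: E1 reaches E2 over a separate inter-edge VLAN.
extra_vlan_design = {
    ip_network("0.0.0.0/0"): UPLINK,
    ip_network("198.51.100.0/24"): INTER_EDGE,
}

# Misdirected packet arriving from C1 over the transit VLAN.
shared_result = hairpins(shared_vlan_design, TRANSIT, "198.51.100.7")
extra_result = hairpins(extra_vlan_design, TRANSIT, "198.51.100.7")
```

With the shared VLAN the misdirected packet hairpins, while the extra VLAN gives it a different egress interface, so the ingress/egress comparison never matches.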

Takeaways:

  • ICMP redirects have nothing to do with static routes.
  • Suboptimal routing might not trigger ICMP redirects, but could result in degraded performance if ICMP redirects are not disabled.
  • While it’s possible to design networks that avoid suboptimal routing, it can still happen in well-thought-out designs.
  • Disable ICMP redirects on all segments that don’t have directly-connected hosts, and everywhere you use a first-hop redundancy protocol or anycast gateway[2].

Finally, would it be possible to generate ICMP redirects on E1 and E2 even though there are no hosts connected to that segment? Of course, but I’ll leave the details as an exercise for the interested reader.


  1. The code generating an ICMP redirect has to allocate memory for the additional packet, and that was usually a Mission Impossible in the (interrupt-driven) fast forwarding path.

  2. The proof of the last claim is left as an exercise for the reader.

2 comments:

  1. Since ICMP redirects can be used for MITM attacks, hosts are often configured to ignore them. In this case, ICMP redirects should also be disabled on segments with directly-connected hosts.

  2. I'm not sure I understand the topology. Are C1 and C2 bridges or routers? You don't call them routers, so maybe they are not routers? You give them different colours, so maybe they are not routers? You talk about BGP on C1 and C2, so maybe they are routers? All 4 devices are in a picture called "Layer-3 connectivity", so maybe C1 and C2 are routers? Also, real shared ethernet doesn't exist anymore. For that picture to work, there needs to be a bridge in between those 4 devices. Another clue that all four E1, E2, C1, C2 are routers?

    If C1 and C2 are bridges, then the hosts (which are beneath C1 and C2 in that picture) pick the default-gateway: E1 or E2. And they set the mac-address in their outgoing frames to E1's or E2's mac-address. C1 and C2, as they are bridges, can only forward based on that. In this case, having redirects is good. Because the hosts will switch next-hop for individual destinations as required. In this case, keep redirects on E1 and E2 enabled. That will help optimize traffic.

    If C1 and C2 are routers, everything changes.

    A very old rule of thumb that I learned 3 decades ago: Hosts listen to redirects. Routers do not. So if this is still true, then having redirects enabled on E1 and E2 won't help, because C1 and C2 won't listen/use the redirects anyway.

    But if it is that simple, why would E1 and E2 even try to send redirects to C1 and C2? My first Google search hits this page:
    https://www.cisco.com/c/en/us/support/docs/ip/routing-information-protocol-rip/13714-43.html#topic1

    When Are ICMP Redirects Sent?
    Cisco routers send ICMP redirects when all of these conditions are met:
    1. The interface on which the packet comes into the router is the same interface on which the packet gets routed out.
    2. The subnet or network of the source IP address is on the same subnet or network of the next-hop IP address of the routed packet.
    <2 more conditions, deleted>

    This is just IOS-XE. I don't know if other OSes do the same thing. (I don't even know what the OS I work on does, sorry).

    Look at rule 2. "The subnet or network of the source IP address is on the same subnet or network of the next-hop IP address of the routed packet." That means that in your scenario, when C1 and C2 are routers, the interfaces of E1 and E2 will not be in the same subnet as the ip-addresses of the hosts below C1 and C2. And thus that 2nd rule here won't apply. And thus E1 and E2 will not send redirects to C1 and C2. Even when E1 and E2 have redirects enabled.

    Your Friendly Router Vendor has already taken care of the issue, it seems. :)

    Am I missing something?
