Interim Forwarding Loops in OSPF or IS-IS Networks

One of my readers sent me this question (slightly rephrased):

Assume you have A, B, and C connected in a triangle (with an alternate longer path to C). What happens if C loses its links to A and B? Won’t the traffic to C loop between A and B for a while?

As always, it depends.

Here’s the network topology diagram we’ll use (thanks to ASCIIFlow Infinity):

+-------+       +-------+
|   A   +-------+   B   +-----+
+--+----+       +----+--+     |
   |                 |        |
   |    +-------+    |    +---+---+
   +----+   C   +----+    | Slow  |
        +---+---+         +---+---+
            |                 |
            +-----------------+

The actual sequence of events happening on a router obviously depends on the details of its control-plane implementation, but it’s reasonable to expect something along these lines:

  • Forwarding entries using interface X are removed as soon as interface X goes down;
  • Lacking alternate ECMP entries to the same destination, the router would install pre-computed backup entries (for example, the results of LFA computation) into the forwarding table;

    We’ll ignore the details of how the backup entries are installed. Ideally, the control-plane software changes the next-hop groups, not the actual forwarding entries.

  • Lacking pre-computed backup entries, the router will recalculate the main routing table (there might be alternate routes with higher administrative distance) and repopulate the forwarding table if such routes exist.

There’s nothing else the routing/forwarding table manipulation software can do at that moment.
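The three steps above can be sketched as a small Python model. It is only an illustration: the `FibEntry` structure, the `handle_interface_down` function, the interface names, and the outcome strings are all made up for this example and do not describe any particular vendor's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FibEntry:
    # Active next hops as (interface, neighbor) pairs; more than one means ECMP.
    next_hops: list
    # Pre-computed backup next hop (for example, an LFA), or None.
    backup: Optional[tuple] = None


def handle_interface_down(fib, rib_alternates, interface):
    """Hypothetical FIB reaction to an interface failure.

    fib:            dict prefix -> FibEntry
    rib_alternates: dict prefix -> next hops of routes with a higher
                    administrative distance (assumed to be available)
    Returns a dict prefix -> short description of what happened.
    """
    outcome = {}
    for prefix in list(fib):
        entry = fib[prefix]
        # A backup pointing out of the failed interface is useless.
        if entry.backup is not None and entry.backup[0] == interface:
            entry.backup = None
        # Step 1: remove next hops using the failed interface.
        survivors = [nh for nh in entry.next_hops if nh[0] != interface]
        if len(survivors) == len(entry.next_hops):
            outcome[prefix] = "unaffected"
            continue
        entry.next_hops = survivors
        if survivors:
            outcome[prefix] = "remaining ECMP next hops"
        elif entry.backup is not None:
            # Step 2: no ECMP alternative, install the pre-computed backup.
            entry.next_hops, entry.backup = [entry.backup], None
            outcome[prefix] = "backup installed"
        else:
            # Step 3: fall back to whatever else the routing table has.
            alternates = [nh for nh in rib_alternates.get(prefix, [])
                          if nh[0] != interface]
            if alternates:
                entry.next_hops = alternates
                outcome[prefix] = "RIB alternate installed"
            else:
                del fib[prefix]
                outcome[prefix] = "no route, traffic dropped"
    return outcome


# Router A loses its (hypothetically named) interface "to_C".
fib_a = {
    "C": FibEntry([("to_C", "C")], backup=("to_B", "B")),
    "Y": FibEntry([("to_C", "C")]),
}
print(handle_interface_down(fib_a, {}, "to_C"))
```

In this toy run, the prefix with an LFA backup switches to B immediately, while the prefix without one disappears from the forwarding table until the routing protocol supplies a new route.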

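In our scenario, A and B each consider the other a loop-free alternate toward C. If you configured LFA on them, the FIB manipulation software would install the alternate routes (A => B and B => A), resulting in a temporary forwarding loop… unless you configured LFA Downstream Path, which accepts a backup next hop only if it is strictly closer to the destination than the router itself (a condition A and B cannot both satisfy for each other).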

If you haven’t configured LFA, A and B will have no usable route to C (we’ll ignore summary/default routes for the moment) and the traffic sent to C will be dropped.
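To see why A and B pick each other as backups, here is a minimal sketch of the two RFC 5286 checks (the basic loop-free condition and the downstream-path condition) applied to the pre-failure topology. The link costs are assumptions made for this example; the article does not specify them, only that the path through Slow is longer.

```python
import heapq

# Assumed link costs (not given in the article): the triangle links are
# cheap, the detour through "Slow" is expensive.
LINKS = {
    ("A", "B"): 1,
    ("A", "C"): 1,
    ("B", "C"): 1,
    ("B", "Slow"): 10,
    ("Slow", "C"): 100,
}


def build_graph(links):
    """Turn the undirected link list into an adjacency dictionary."""
    graph = {}
    for (u, v), cost in links.items():
        graph.setdefault(u, {})[v] = cost
        graph.setdefault(v, {})[u] = cost
    return graph


def distances(graph, source):
    """Plain Dijkstra: shortest distance from source to every node."""
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist[node]:
            continue  # stale queue entry
        for nbr, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(queue, (nd, nbr))
    return dist


def is_lfa(graph, s, n, d):
    """Loop-free condition: D(N,D) < D(N,S) + D(S,D)."""
    dist_n, dist_s = distances(graph, n), distances(graph, s)
    return dist_n[d] < dist_n[s] + dist_s[d]


def is_downstream(graph, s, n, d):
    """Downstream-path condition: D(N,D) < D(S,D)."""
    return distances(graph, n)[d] < distances(graph, s)[d]


graph = build_graph(LINKS)
for s, n in (("A", "B"), ("B", "A")):
    print(s, "->", n, "LFA:", is_lfa(graph, s, n, "C"),
          "downstream:", is_downstream(graph, s, n, "C"))
```

With these assumed costs, both LFA checks succeed (hence the A => B and B => A backups), while both downstream checks fail because A and B are equally far from C. The calculation protects against a single failure; it knows nothing about C losing both links at once.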

At the same time, the routing protocols monitoring interface X kick in and start their work:

  • A new router LSA/LSP is generated (assuming we’re not dealing with P2P links) unless there have been so many recent changes that the LSA generation is throttled;
  • The new LSA/LSP describing the local topology change is flooded. These update packets might be delayed based on any OSPF/IS-IS packet pacing configured on the device;
  • Asynchronously to that, the SPF process eventually runs based on how the SPF timers are configured, generates new best routes, and sends them to the routing table.
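How long "eventually" is depends on the SPF timers. The sketch below models one common approach, exponential backoff, in a deliberately simplified form. The function name, the parameter names, the default values (in milliseconds), and the reset rule are assumptions for illustration, not any vendor's exact algorithm.

```python
def spf_run_times(events, start=50, hold=200, maximum=5000):
    """Return the times (ms) at which SPF would run in this simplified model.

    events:  times (ms) at which topology changes are detected
    start:   delay before the first SPF run after a quiet period
    hold:    initial minimum gap between consecutive runs, doubled after
             every back-to-back run
    maximum: upper bound for the gap; also used here as the quiet period
             after which the backoff resets (a simplifying assumption)
    """
    runs = []
    current_hold = hold
    for t in sorted(events):
        if runs and t <= runs[-1]:
            # An SPF run is already scheduled and will cover this change.
            continue
        if not runs or t - runs[-1] > maximum:
            # Quiet network: react quickly and reset the backoff.
            current_hold = hold
            runs.append(t + start)
        else:
            # Busy network: wait at least current_hold since the last run.
            runs.append(max(t + start, runs[-1] + current_hold))
            current_hold = min(current_hold * 2, maximum)
    return runs


# Changes detected at 0, 10, 100, 300, and 400 ms.
print(spf_run_times([0, 10, 100, 300, 400]))  # [50, 250, 650]
```

The first run happens 50 ms after the first change, which is exactly the window in which the neighbor's updated LSA may or may not have arrived.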

Now for the “it depends” part:

  • If the SPF process is run before the changed LSAs are received from the OSPF/IS-IS neighbor, we’ll get a temporary loop (assuming LFA hasn’t already generated one) that will disappear the second time the SPF process is run;
  • If the changed LSA is received before the SPF process is run (and LFA was not used), there will be no forwarding loop.
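The two outcomes above can be reproduced with a toy SPF run over each router's view of the link-state database. This is a sketch under stated assumptions: the link costs are invented, the LSDB is reduced to "router advertises neighbor at cost", and C's own LSA is left stale in every view because the two-way check already removes the failed links once A or B stops advertising them.

```python
import heapq


def initial_lsdb():
    """Pre-failure LSDB; all costs are assumptions made for this example."""
    return {
        "A": {"B": 1, "C": 1},
        "B": {"A": 1, "C": 1, "Slow": 10},
        "C": {"A": 1, "B": 1, "Slow": 100},
        "Slow": {"B": 10, "C": 100},
    }


def lsdb_view(updated_routers):
    """LSDB in which only the listed routers have re-advertised without C."""
    lsdb = initial_lsdb()
    for router in updated_routers:
        lsdb[router] = {n: c for n, c in lsdb[router].items() if n != "C"}
    return lsdb


def next_hop(lsdb, source, dest):
    """Dijkstra over the LSDB; return source's first hop toward dest."""
    best = {source: 0}
    queue = [(0, source, None)]
    while queue:
        cost, node, first = heapq.heappop(queue)
        if cost > best[node]:
            continue  # stale queue entry
        if node == dest:
            return first
        for nbr, link_cost in lsdb.get(node, {}).items():
            # Two-way check: a link counts only if both ends advertise it.
            if node not in lsdb.get(nbr, {}):
                continue
            new_cost = cost + link_cost
            if new_cost < best.get(nbr, float("inf")):
                best[nbr] = new_cost
                heapq.heappush(
                    queue, (new_cost, nbr, nbr if first is None else first))
    return None


# Case 1: each router runs SPF knowing only about its own failed link.
early = (next_hop(lsdb_view(["A"]), "A", "C"),
         next_hop(lsdb_view(["B"]), "B", "C"))
# Case 2: both routers have received each other's updated LSA.
late = (next_hop(lsdb_view(["A", "B"]), "A", "C"),
        next_hop(lsdb_view(["A", "B"]), "B", "C"))
print("SPF before neighbor LSA:", early)  # ('B', 'A') -> loop
print("SPF after neighbor LSA: ", late)   # ('B', 'Slow') -> no loop
```

In the first case A still believes B can reach C directly and vice versa, so they point at each other; the second SPF run, triggered by the neighbor's LSA, moves the traffic onto the path through Slow.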

Please don’t read the above paragraphs as “LFA is bad.” It is not. This blog post evaluates the consequences of a rare event (multiple link loss) from the perspective of understanding RIB, FIB, and SPF. If the network loses a single link (A => C or B => C), using LFA results in faster convergence.

For even more details, read the Microloops! blog post by Russ White.

1 comment:

  1. It is always worth connecting a sniffer during failure and recovery tests to see what is going on under the hood (in my work, it is a standard part of the system development process).
    It is exciting to see something interesting and then assess whether it is as expected. I do see all kinds of microloops during such tests (it's interesting to see an OSPF split-brain scenario where the DR/BDR are in one part of the split network and the other part needs to re-elect the DR/BDR, and then what happens after we merge the two halves of the network).

    In real business we simply measure the re-convergence time to see if we comply with real requirements.