OSPF Areas and Summarization: Theory and Reality

While most readers, commenters, and Twitterati agreed with my take on the uselessness of OSPF areas and inter-area summarization in the 21st century, a few of them pointed out that theory and practice don’t always match. Unfortunately, most counterexamples boiled down to broken implementations or vendor “optimizations.”

Broken OSPF Implementation

Someone (name and other details withheld for obvious reasons) described a problem they faced in their data center network. They had to use OSPF areas due to the network size. Switches from most vendors worked like a charm without route summarization, but a particular vendor’s switches needed seconds to recompute the topology and install the new routes in the forwarding table.

They tried using BGP, and the problem disappeared, proving it must have been a broken OSPF implementation.

Potential workarounds:

  • Use BGP;
  • Use stub areas, either with aggressive summarization or as totally stubby areas (no inter-area routes inserted into an area).

The second workaround requires inter-spine links to avoid black holes after leaf-to-spine link failures. The proof is left as an exercise for the reader, or you could cheat and watch the Leaf-and-Spine Fabric Designs webinar (more specifically, the Route Summarization and Link Aggregation video in the Layer-3 Fabrics with Non-redundant Server Connectivity section).
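
If you'd rather not do the proof on paper, here is a minimal Python sketch of the black-hole scenario. Everything in it is an assumption made for illustration: a hypothetical fabric with two spines (S1, S2) and three leaves (L1 to L3), leaves that see only a default route from every attached spine, and spines that never send transit traffic through another leaf. It is a toy model of the forwarding outcome, not an OSPF implementation.

```python
from collections import deque

# Hypothetical two-spine, three-leaf fabric (names are illustrative only).
SPINES = ["S1", "S2"]
LEAVES = ["L1", "L2", "L3"]


def build_fabric(inter_spine_link):
    """Return the set of links: every leaf to every spine, plus an optional S1-S2 link."""
    links = {frozenset((spine, leaf)) for spine in SPINES for leaf in LEAVES}
    if inter_spine_link:
        links.add(frozenset(("S1", "S2")))
    return links


def fail_link(links, a, b):
    """Return a copy of the link set with the a-b link removed."""
    return links - {frozenset((a, b))}


def spine_can_deliver(spine, dst_leaf, links):
    """Assumption: a spine knows specific routes but never transits a leaf.

    It can deliver if it is attached to the destination leaf, or if it can
    reach another spine that is, walking only inter-spine links.
    """
    seen = {spine}
    queue = deque([spine])
    while queue:
        node = queue.popleft()
        if frozenset((node, dst_leaf)) in links:
            return True
        for other in SPINES:
            if other not in seen and frozenset((node, other)) in links:
                seen.add(other)
                queue.append(other)
    return False


def delivery_by_spine(src_leaf, dst_leaf, links):
    """The source leaf has only a default route, so ECMP may pick any attached spine."""
    return {
        spine: spine_can_deliver(spine, dst_leaf, links)
        for spine in SPINES
        if frozenset((src_leaf, spine)) in links
    }


# Fail the S1-L3 link, then check L1 -> L3 traffic with and without an inter-spine link.
for inter_spine in (False, True):
    links = fail_link(build_fabric(inter_spine), "S1", "L3")
    print("inter-spine link:", inter_spine, delivery_by_spine("L1", "L3", links))
```

In this model, L1 keeps hashing some flows toward S1 because the default route hides the failure. Without the inter-spine link, S1 has nowhere to send them; with it, S1 can hand them to S2.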

Vendor Licensing

Jochen Bartl sent me this message:

Another sad but valid reason to use summarization could also be due to licensing. There are still vendors out there who put artificial limitations on their data center gear. Nexus 5600, for example, supports only 256 dynamic routes in the LAN_BASE_SERVICES_PKG license. Ordering and getting budget for a license upgrade can become quite a challenge on its own in some organizations ;-)

Well done, Cisco! Seeing you guys supporting clean and straightforward network designs is so lovely.
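
For a rough feel of how summarization keeps you under a route-count cap like the 256 routes mentioned above, here is a small Python sketch using the standard `ipaddress` module. The address block and the number of subnets are made up for illustration. Note that `collapse_addresses` performs exact aggregation only, while an OSPF area range can also cover unused address space, so treat this as a conservative estimate.

```python
import ipaddress
from itertools import islice

ROUTE_LIMIT = 256  # dynamic-route cap quoted in the message above

# Hypothetical addressing plan: 300 consecutive /24 subnets carved from 10.0.0.0/15.
block = ipaddress.ip_network("10.0.0.0/15")
specifics = list(islice(block.subnets(new_prefix=24), 300))

# Merge adjacent prefixes into the smallest set of exactly equivalent aggregates.
summaries = list(ipaddress.collapse_addresses(specifics))


def fits(routes, limit=ROUTE_LIMIT):
    """True if the route count stays within the licensed limit."""
    return len(routes) <= limit


print(len(specifics), "specific routes, fits:", fits(specifics))
print(len(summaries), "summarized routes, fits:", fits(summaries))
for network in summaries:
    print("  ", network)
```

The result depends heavily on how contiguous the addressing plan is: scattered subnets won't collapse nearly as well as the tidy block assumed here.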

Old Gear

Finally, there’s always a dinosaur hiding in a dusty closet. As an anonymous commenter wrote:

Got some customers still using very old routers managing critical services with 7+ yrs uptime that can’t process a large DB and cannot be touched whatever is the issue/project.

Deal with these as you would with a broken OSPF implementation: isolate them in a stub area and hope they fail soon. It’s even better if they’re so old you can’t replace them anymore.
