While most readers, commenters, and Twitterati agreed with my take on the uselessness of OSPF areas and inter-area summarization in the 21st century, a few of them pointed out that, in practice, theory and practice are not the same. Unfortunately, most of those counterexamples failed due to broken implementations or vendor “optimizations”.
Broken OSPF Implementation
Someone (name and other details withheld for obvious reasons) described the problem they faced in their data center network: while switches from most vendors worked like a charm without route summarization (they had to use OSPF areas due to the network size), a particular vendor’s switches needed seconds to recompute the topology and install the new routes in the forwarding table.
They tried using BGP and the problem disappeared, proving that it must have been a broken OSPF implementation.
There are two workarounds you can use when facing a broken OSPF implementation:
- Use BGP;
- Use stub areas, either with aggressive summarization or as totally stubby areas (no inter-area routes inserted into an area).
The second workaround requires inter-spine links to avoid black holes after leaf-to-spine link failures. The proof is left as an exercise to the reader, or you could cheat and watch the Leaf-and-Spine Fabric Designs webinar (more specifically, the Route Summarization and Link Aggregation video in the Layer-3 Fabrics with Non-redundant Server Connectivity section).
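If you'd rather see the black hole than prove it, here is a minimal Python sketch. It is a toy model, not a routing protocol simulator: the two-spine, three-leaf topology, the node names, and the helper functions are all made up for illustration. It assumes leaves hold only a summary route pointing at every spine they are still connected to, and that leaves never act as transit between spines.

```python
from collections import deque

def build_links(spines, leaves, inter_spine=False):
    """Return the undirected links of a full leaf-and-spine mesh."""
    links = {frozenset((s, l)) for s in spines for l in leaves}
    if inter_spine:
        links |= {frozenset((a, b)) for a in spines for b in spines if a != b}
    return links

def spine_can_reach(spine, dst_leaf, links, spines):
    """BFS from a spine toward dst_leaf using only spines as transit nodes.

    Leaves are never transit here: they only know the summary route,
    which points back up to the spines.
    """
    seen, queue = {spine}, deque([spine])
    while queue:
        node = queue.popleft()
        if frozenset((node, dst_leaf)) in links:
            return True
        for other in spines:
            if other not in seen and frozenset((node, other)) in links:
                seen.add(other)
                queue.append(other)
    return False

def black_holing_spines(src_leaf, dst_leaf, links, spines):
    """Spines src_leaf may pick via the summary route that then drop the traffic."""
    candidates = [s for s in spines if frozenset((src_leaf, s)) in links]
    return [s for s in candidates
            if not spine_can_reach(s, dst_leaf, links, spines)]

SPINES = ["S1", "S2"]          # hypothetical node names
LEAVES = ["L1", "L2", "L3"]
failed = frozenset(("S1", "L2"))  # the leaf-to-spine link that fails

without_link = build_links(SPINES, LEAVES) - {failed}
with_link = build_links(SPINES, LEAVES, inter_spine=True) - {failed}

print(black_holing_spines("L1", "L2", without_link, SPINES))  # ['S1']
print(black_holing_spines("L1", "L2", with_link, SPINES))     # []
```

In this model L1 keeps sending part of the traffic for L2 toward S1 because the summary route hides the failure. Without the inter-spine link S1 has nowhere to forward it; with the link S1 can hand it over to S2.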
Jochen Bartl sent me this message:
Another sad but valid reason to use summarization could be also due to licensing. There are still vendors out there who put artificial limitations on their data center gear. Nexus 5600 for example supports only 256 dynamic routes in the LAN_BASE_SERVICES_PKG license. Ordering and getting budget for a license upgrade can become quite a challenge on its own in some organizations ;-)
Well done, Cisco! It’s so nice to see you guys supporting clean and simple network designs.
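To put rough numbers on the licensing argument, here is a back-of-the-envelope sketch using Python's standard `ipaddress` module. Only the 256-route limit comes from the message above; the number of areas and the addressing plan (one contiguous /19 of /24 subnets per area) are assumptions picked purely for illustration.

```python
import ipaddress

ROUTE_LIMIT = 256  # dynamic-route limit from the message quoted above
AREAS = 16         # hypothetical number of areas

# Hypothetical addressing plan: every area owns one contiguous /19 out of 10.0.0.0/8.
area_blocks = list(ipaddress.ip_network("10.0.0.0/8").subnets(new_prefix=19))[:AREAS]

# Without summarization every /24 in every area shows up as a separate route.
specifics = [subnet
             for block in area_blocks
             for subnet in block.subnets(new_prefix=24)]

# With summarization each area collapses into a single advertised prefix.
summaries = [
    next(ipaddress.collapse_addresses(block.subnets(new_prefix=24)))
    for block in area_blocks
]

print(len(specifics), len(specifics) <= ROUTE_LIMIT)  # 512 False
print(len(summaries), len(summaries) <= ROUTE_LIMIT)  # 16 True
```

The collapse only works this neatly because the made-up plan is perfectly contiguous; a real network with scattered prefixes would need more summary ranges per area.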
Finally, there’s always a dinosaur hiding in a dusty closet. As an anonymous commenter wrote:
Got some customers still using very old routers managing critical services with 7+ yrs uptime that can't process a large DB and cannot be touched whatever is the issue/project.
Deal with these like you would with a broken OSPF implementation: isolate them in a stub area, and hope they fail soon (even better if they’re so old you can’t even replace them anymore).