Category: design
OSPF Areas and Summarization: Theory and Reality
While most readers, commenters, and Twitterati agreed with my take on the uselessness of OSPF areas and inter-area summarization in the 21st century, a few of them pointed out that theory and practice don’t always match. Unfortunately, most of the counterexamples boiled down to broken implementations or vendor “optimizations.”
Do We Still Need OSPF Areas and Summarization?
One of my ExpertExpress design discussions focused on WAN network design and the need for OSPF areas and summarization (the customer had random addressing, and the engineers wondered whether it would make sense to renumber the network to get better summarization).
I’ve been struggling with the question of whether we still need OSPF areas and summarization in 2016 for quite a while. Here are my thoughts on the topic; please share yours in the comments.
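To illustrate what’s at stake, here’s a minimal Python sketch (with a purely hypothetical addressing plan): prefixes allocated contiguously per area collapse into a single summary an ABR could advertise, while randomly assigned prefixes don’t collapse at all, which is exactly why renumbering tends to be the price of admission for useful summarization.

```python
# Minimal sketch with a hypothetical addressing plan: contiguous per-area prefixes
# can be summarized into a single advertisement, randomly assigned ones cannot.
import ipaddress

# Contiguous allocation: sixteen /24s carved out of 10.1.0.0/20
contiguous = [ipaddress.ip_network(f"10.1.{i}.0/24") for i in range(16)]

# "Random" allocation: prefixes scattered all over the address space
scattered = [ipaddress.ip_network(p) for p in
             ("10.1.3.0/24", "10.7.12.0/24", "10.42.9.0/24", "10.99.1.0/24")]

print(list(ipaddress.collapse_addresses(contiguous)))  # [IPv4Network('10.1.0.0/20')]
print(list(ipaddress.collapse_addresses(scattered)))   # still four separate prefixes
```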
Using BGP in Leaf-and-Spine Fabrics
In the Leaf-and-Spine Fabric Designs webinar series we started with the simplest possible design: non-redundant server connectivity with bridging within a ToR switch and routing across the fabric.
After I explained the basics (including routing protocol selection, route summarization, link aggregation and addressing guidelines), Dinesh Dutt described how network architects use BGP when building leaf-and-spine fabrics.
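To give you an idea of what such a design typically looks like, here’s my simplified sketch (not Dinesh’s recipe; all names and ASNs are made up) of one common numbering scheme: every leaf gets its own private ASN, the spines share another one, and every leaf runs eBGP with every spine.

```python
# Simplified sketch of a common eBGP numbering scheme for a leaf-and-spine fabric
# (hypothetical names and ASNs): one private ASN per leaf, one shared by the spines,
# and a full mesh of leaf-to-spine eBGP sessions.
SPINE_ASN = 65000
LEAF_ASN_BASE = 65001
SPINES = ["spine1", "spine2"]
LEAVES = ["leaf1", "leaf2", "leaf3", "leaf4"]

def leaf_asn(index: int) -> int:
    """Unique private ASN per leaf: leaf1 -> 65001, leaf2 -> 65002, ..."""
    return LEAF_ASN_BASE + index

# Every leaf peers with every spine
sessions = [(leaf, leaf_asn(i), spine, SPINE_ASN)
            for i, leaf in enumerate(LEAVES)
            for spine in SPINES]

for leaf, asn, spine, spine_asn in sessions:
    print(f"{leaf} (AS{asn}) <--eBGP--> {spine} (AS{spine_asn})")
```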
Why Is Stretched ACI Infinitely Better than OTV?
Eluehike Chedu asked an interesting question after my explanation of why a stretched ACI fabric (or one of the alternatives, see below) is the least horrible way of stretching a subnet: What about OTV?
Time to go back to the basics. As Dinesh Dutt explained in our Routing on Hosts webinar, there are (at least) three reasons why people want to see stretched subnets:
Scaling L3-Only Data Center Networks
Andrew wondered how one could scale the L3-only data center networking approach I outlined in this blog post and asked:
When dealing with guests on each host, if each host injects a /32 for each guest, by the time the routes are on the spine, you're potentially well past the 128k route limit. Can you elaborate on how this can scale beyond 128k routes?
Short answer: it won’t.
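A back-of-the-envelope calculation (with numbers I made up for illustration) shows how quickly you get there, and why per-rack summarization would solve the problem only if the guests never moved (or kept their addresses) across rack boundaries:

```python
# Back-of-the-envelope sketch (hypothetical numbers): /32 host routes seen by a
# spine switch when every host injects one route per guest.
RACKS = 50
HOSTS_PER_RACK = 40
GUESTS_PER_HOST = 100
SPINE_ROUTE_LIMIT = 128_000            # the limit mentioned in the question

host_routes = RACKS * HOSTS_PER_RACK * GUESTS_PER_HOST
print(f"/32 routes on the spine: {host_routes:,}")   # 200,000 -> well over the limit

# Summarizing each rack into a single prefix on the ToR switch would shrink that
# to RACKS routes, but only if guests never move between racks.
print(f"per-rack summaries: {RACKS}")
```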
Optimize Your Data Center: Use Distributed File System
Let’s continue our journey toward a two-switch data center. What can we do after virtualizing the workload, getting rid of legacy technologies, and reducing the number of server uplinks to two?
How about replacing dedicated storage boxes with a distributed file system?
In late September, Howard Marks will talk about software-defined storage in my Building Next Generation Data Center course. The course is sold out, but if you register for the spring 2017 session, you’ll get access to the recording of Howard’s talk.
Awesome Response: Complexity Sells
Russ White wrote an awesome response to my Complexity Sells post:
[…] What we cannot do is forget that complexity is real, and we need to learn to manage it. What we must not do is continue to think we can play in the land of dragons forever, and not get burnt. […]
Now go and read the whole blog post ;)
Optimize Your Data Center: Reduce the Number of Uplinks
Remember our journey toward a two-switch data center? So far we virtualized the workload and got rid of legacy technologies.
Time for the next step: read a recent design guide from your favorite hypervisor vendor and reduce the number of server uplinks to two.
Not good enough? Building a bigger data center? There’s exactly one seat left in the Building Next Generation Data Center online course.
Optimize Your Data Center: Ditch the Legacy Technologies
In our journey toward a two-switch data center we already virtualized the servers.
It’s time for the next step: get rid of legacy technologies like six 1GE interfaces per server or two FC interface cards in every server.
Need more details? Watch the Designing Private Cloud Infrastructure webinar. How about an interactive discussion? Register for the Building Next-Generation Data Center course.
OpenStack Networking, Availability Zones and Regions
One of my ExpertExpress engagements focused on networking in a future private cloud that might be built with OpenStack. The customer planned to deploy multiple data centers, and I recommended that they do everything they can to make sure those data centers don’t become a single failure domain.
Next step: translate that requirement into OpenStack terms.
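The usual answer (and this is my sketch, not a complete design) is to make every data center its own OpenStack region with its own control plane, so that one region failing cannot take down the other. Here’s a minimal openstacksdk example, assuming a hypothetical clouds.yaml entry named mycloud with regions dc1 and dc2:

```python
# Minimal sketch (hypothetical cloud and region names): each data center is a
# separate OpenStack region with its own control plane, so the regions remain
# independent failure domains and workloads are placed in one DC explicitly.
import openstack

dc1 = openstack.connect(cloud="mycloud", region_name="dc1")
dc2 = openstack.connect(cloud="mycloud", region_name="dc2")

for conn in (dc1, dc2):
    # List the compute availability zones visible in each region
    zones = [az.name for az in conn.compute.availability_zones()]
    print(conn.config.region_name, zones)
```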
Let’s Focus on Realistic Design Scenarios
An engineer working for a large system integrator sent me this question:
Since you are running a detailed series on leaf-and-spine fabrics these days, could you please tell me whether design scenarios like the Facebook and LinkedIn data centers are also covered?
Short answer: No.
Unexpected Recovery Might Kill Your Data Center
Here’s an interesting story I got from one of my friends:
- A large organization used a disaster recovery strategy based on stretched IP subnets and restarting workloads with unchanged IP addresses in a secondary data center;
- One day they experienced a WAN connectivity failure in the primary data center, and their disaster recovery plan kicked in.
However, while they were busy restarting the workloads in the secondary data center (and managed to get most of them up and running), the DCI link unexpectedly came back to life.
Optimize Your Data Center: Virtualize Your Servers
A month ago I published the video where I described the idea that “two switches is all you need in a medium-sized data center”. Now let’s dig into the details: the first step you have to take to optimize your data center infrastructure is to virtualize all servers.
For even more details, watch the Designing Private Cloud Infrastructure webinar, or register for the Building Next-Generation Data Center course.
Video: All You Need Are Two Switches
I’ve been telling you for years to build small-to-midsized data centers with two switches ;) A few weeks ago I turned my presentation on that topic into a webinar, and the first video from that webinar (now part of Designing Private Cloud Infrastructure) is already public.
BGP or OSPF? Does Topology Visibility Matter?
One of the comments added to my Using BGP in Data Centers blog post said:
With symmetric fabric… does it make sense for a node to know every bit of fabric info or is reachability information sufficient?
Let’s ignore for the moment that the large non-redundant layer-3 fabrics in which the BGP-in-the-data-center movement started don’t need more than endpoint reachability information, and focus on the bigger issue: is the knowledge of network topology (which OSPF provides and BGP doesn’t) beneficial?
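To make the difference tangible, here’s a toy Python sketch (hypothetical four-node fabric): a node with the full topology (the OSPF model) can prune a failed link from its local database and immediately compute an alternate path, while a node with reachability-only information (the BGP model) simply loses the route and has to wait for another neighbor to advertise the prefix.

```python
# Toy sketch: topology knowledge (OSPF-like link-state database) versus
# reachability-only information (BGP-like RIB) on a hypothetical four-node fabric.
from collections import deque

topology = {
    "leaf1":  {"spine1", "spine2"},
    "leaf2":  {"spine1", "spine2"},
    "spine1": {"leaf1", "leaf2"},
    "spine2": {"leaf1", "leaf2"},
}

def shortest_path(graph, src, dst):
    """Plain BFS; good enough to find an alternate path after a link failure."""
    queue, seen = deque([[src]]), {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in graph[path[-1]] - seen:
            seen.add(nxt)
            queue.append(path + [nxt])
    return None

# OSPF-like node: remove the failed leaf1-spine1 link locally and recompute at once
after_failure = {node: set(nbrs) for node, nbrs in topology.items()}
after_failure["leaf1"].discard("spine1")
after_failure["spine1"].discard("leaf1")
print(shortest_path(after_failure, "leaf1", "leaf2"))   # ['leaf1', 'spine2', 'leaf2']

# BGP-like node: it only ever knew a next hop; after the failure the route is
# withdrawn and the node waits for another neighbor to advertise the prefix.
rib = {"leaf2-prefix": "via spine1"}
rib.pop("leaf2-prefix")                                  # session down -> withdrawn
print(rib)                                               # {} until spine2 re-advertises
```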