Category: design
Why Is Stretched ACI Infinitely Better than OTV?
Eluehike Chedu asked an interesting question after my explanation of why a stretched ACI fabric (or one of the alternatives, see below) is the least horrible way of stretching a subnet: What about OTV?
Time to go back to the basics. As Dinesh Dutt explained in our Routing on Hosts webinar, there are (at least) three reasons why people want to see stretched subnets:
Scaling L3-Only Data Center Networks
Andrew wondered how one could scale the L3-only data center networking approach I outlined in this blog post and asked:
When dealing with guests on each host, if each host injects a /32 for each guest, by the time the routes are on the spine, you're potentially well past the 128k route limit. Can you elaborate on how this can scale beyond 128k routes?
Short answer: it won’t.
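Here’s a quick back-of-the-envelope sketch (all the numbers below are my own illustrative assumptions, not Andrew’s environment) showing how quickly per-VM host routes blow past a 128K route table, and why the usual escape hatch is summarization at the leaf (rack) layer:

```python
# Back-of-the-envelope math: every VM injected as a /32 host route means the
# spine RIB/FIB grows linearly with the total number of VMs in the fabric.
# All values below are assumptions for illustration only.

vms_per_host = 40        # assumed average VM density per hypervisor
hosts_per_leaf = 40      # assumed servers attached to each leaf (ToR) switch
leafs = 100              # assumed number of leaf switches

host_routes = vms_per_host * hosts_per_leaf * leafs
print(f"Host routes on the spine: {host_routes:,}")   # 160,000 -> well past 128K

# The usual way out: summarize at the leaf and advertise one prefix per rack
# toward the spine, keeping the /32s local -- at the cost of constraining VM
# mobility (a /32 moved to another rack is hidden behind the summary route).
print(f"Routes on the spine with per-rack summarization: {leafs:,}")
```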
Optimize Your Data Center: Use Distributed File System
Let’s continue our journey toward a two-switch data center. What can we do after virtualizing the workload, getting rid of legacy technologies, and reducing the number of server uplinks to two?
How about replacing dedicated storage boxes with a distributed file system?
In late September, Howard Marks will talk about software-defined storage in my Building Next Generation Data Center course. The course is sold out, but if you register for the spring 2017 session, you’ll get access to the recording of Howard’s talk.
Awesome Response: Complexity Sells
Russ White wrote an awesome response to my Complexity Sells post:
[…] What we cannot do is forget that complexity is real, and we need to learn to manage it. What we must not do is continue to think we can play in the land of dragons forever, and not get burnt. […]
Now go and read the whole blog post ;)
Optimize Your Data Center: Reduce the Number of Uplinks
Remember our journey toward a two-switch data center? So far we:
Time for the next step: read a recent design guide from your favorite hypervisor vendor and reduce the number of server uplinks to two.
Not good enough? Building a bigger data center? There’s exactly one seat left in the Building Next Generation Data Center online course.
Optimize Your Data Center: Ditch the Legacy Technologies
In our journey toward a two-switch data center we covered:
It’s time for the next step: get rid of legacy technologies like six 1GE interfaces per server or two FC interface cards in every server.
Need more details? Watch the Designing Private Cloud Infrastructure webinar. How about an interactive discussion? Register for the Building Next-Generation Data Center course.
OpenStack Networking, Availability Zones and Regions
One of my ExpertExpress engagements focused on networking in a future private cloud that might be built using OpenStack. The customer planned to deploy multiple data centers, and I recommended that they do everything they can to make sure they don’t make them a single failure domain.
Next step: translate that requirement into OpenStack terms.
Let’s Focus on Realistic Design Scenarios
An engineer working for a large system integrator sent me this question:
Since you are running a detailed series on leaf-and-spine fabrics these days, could you please tell me whether the design scenarios of Facebook and LinkedIn data centers are also covered?
Short answer: No.
Unexpected Recovery Might Kill Your Data Center
Here’s an interesting story I got from one of my friends:
- A large organization used a disaster recovery strategy based on stretched IP subnets and restarting workloads with unchanged IP addresses in a secondary data center;
- One day they experienced a WAN connectivity failure in the primary data center, and their disaster recovery plan kicked in.
However, while they were busy restarting the workloads in the secondary data center (and managed to get most of them up and running), the DCI link unexpectedly came back to life.
Optimize Your Data Center: Virtualize Your Servers
A month ago I published the video where I described the idea that “two switches is all you need in a medium-sized data center”. Now let’s dig into the details: the first step you have to take to optimize your data center infrastructure is to virtualize all servers.
For even more details, watch the Designing Private Cloud Infrastructure webinar, or register for the Building Next-Generation Data Center course.
Video: All You Need Are Two Switches
I’ve been telling you to build small-to-midsized data centers with two switches for years ;) A few weeks ago I turned the presentation I had on that topic into a webinar, and the first video from that webinar (now part of Designing Private Cloud Infrastructure) is already public.
BGP or OSPF? Does Topology Visibility Matter?
One of the comments added to my Using BGP in Data Centers blog post said:
With symmetric fabric… does it make sense for a node to know every bit of fabric info or is reachability information sufficient?
Let’s ignore for the moment that the large non-redundant layer-3 fabrics where the BGP-in-the-data-center movement started don’t need more than endpoint reachability information, and focus on a bigger issue: is knowledge of network topology (as provided by OSPF but not by BGP) beneficial?
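To make the distinction concrete, here’s a toy sketch (my own illustration, not something from the original blog post) of what a node “knows” in each model: a link-state node holds the whole fabric graph and can recompute an alternate path locally, while a path-vector node only holds the reachability information its neighbors chose to advertise.

```python
# Toy 2-spine / 3-leaf fabric described as an adjacency list (hypothetical names).
from collections import deque

topology = {
    "leaf1": {"spine1", "spine2"},
    "leaf2": {"spine1", "spine2"},
    "leaf3": {"spine1", "spine2"},
    "spine1": {"leaf1", "leaf2", "leaf3"},
    "spine2": {"leaf1", "leaf2", "leaf3"},
}

def shortest_path(graph, src, dst):
    """Plain BFS -- all fabric links treated as equal cost."""
    queue, seen = deque([[src]]), {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in graph[path[-1]] - seen:
            seen.add(nxt)
            queue.append(path + [nxt])
    return None

# Link-state view: leaf1 holds the full graph, so it can work out on its own
# what happens when spine1 dies and switch to the alternate path immediately.
without_spine1 = {n: nbrs - {"spine1"} for n, nbrs in topology.items() if n != "spine1"}
print(shortest_path(without_spine1, "leaf1", "leaf3"))   # ['leaf1', 'spine2', 'leaf3']

# Path-vector view: leaf1 merely knows "leaf3's prefix is reachable via spine1
# and spine2" and has to wait for a withdrawal before it stops using a dead path.
bgp_rib_on_leaf1 = {"leaf3-prefix": {"spine1", "spine2"}}
print(bgp_rib_on_leaf1)
```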
Using BGP in Data Center Fabrics
While large data centers increasingly use BGP as the routing protocol within their fabrics, enterprise engineers tend to shy away from that idea because they think BGP is too complex/scary/hard-to-configure/obsolete/unknown/whatever.
It’s time to fix that.
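To show it’s not that scary, here’s a minimal sketch of the usual design (in the spirit of RFC 7938): one private ASN per leaf, a shared ASN on all spines, and eBGP sessions over the point-to-point fabric links. The snippet below just renders that pattern into FRR/IOS-style configuration stubs; every name, ASN and IP address in it is made up for illustration.

```python
# Render skeleton BGP configurations for a hypothetical 2-spine / 4-leaf fabric.
# Assumed design (RFC 7938 style): every leaf gets a unique private ASN,
# all spines share one ASN, and eBGP runs on every leaf-spine link.

SPINE_ASN = 64600
LEAF_ASN_BASE = 64701              # leaf-1 -> 64701, leaf-2 -> 64702, ...
SPINES = ["spine-1", "spine-2"]
LEAFS = ["leaf-1", "leaf-2", "leaf-3", "leaf-4"]

def leaf_bgp_config(leaf_index: int) -> str:
    """Build an FRR/IOS-style BGP stanza for one leaf switch."""
    asn = LEAF_ASN_BASE + leaf_index
    lines = [f"router bgp {asn}"]
    for spine_index in range(len(SPINES)):
        # Hypothetical fabric addressing: one small subnet per leaf-spine link.
        peer_ip = f"10.{spine_index}.{leaf_index}.1"
        lines.append(f" neighbor {peer_ip} remote-as {SPINE_ASN}")
    lines += [
        " address-family ipv4 unicast",
        "  redistribute connected",     # advertise the server-facing subnets
        " exit-address-family",
    ]
    return "\n".join(lines)

for index, leaf in enumerate(LEAFS):
    print(f"! ----- {leaf} -----")
    print(leaf_bgp_config(index))
```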
The Grumpy Old Network Architects and Facebook
Nuno wrote an interesting comment to my Stretched Firewalls across L3 DCI blog post:
You're an old school, disciplined networking leader that architects networks based on rock-solid, time-tested designs. But it seems that the prevailing fashion in network design and availability go against your traditional design principles: inter-site firewall clustering, inter-site vMotion, DCI, etc.
Not so fast, my young padawan.
Let’s define prevailing fashion first. You might define it as Kool-Aid peddled by snake oil salesmen, or as cool network designs by people who know what they’re doing. If we stick with the first definition, you’re absolutely right.
Now let’s look at the second camp: how people who know what they’re doing build their networks (Amazon VPC, Microsoft Azure or Bing, Google, Facebook, and a number of other large-scale networks). You’ll find L3 down to the ToR switch (or even the virtual switch), and absolutely no inter-site vMotion or clustering – because they don’t want to bet their service, ads or likes on the whims of a technology that was designed to emulate thick yellow cable.
This isn't the first time that readers have asked you about these technologies, and it won't be the last. Vendors will continue to market them despite their shortcomings, and customers will continue to eat them up.
As long as there is someone willing to believe in fairy tales and Santa Claus, there will be someone dressed in a red coat and a fake beard yelling “Ho, Ho, Ho!”
Enterprise IT managers sometimes act like small kids. They don’t want to hear that they have people and process problems, and they love to believe that the next magical bit of technology will solve whatever it is that bothers them. Vendors obviously love to exploit these cravings and sell them ever-more-complex solutions.
I'd like to think that vendors will also continue to work out the kinks and over time the technology will become rock solid and time-tested.
I am positive you can make any technology almost-rock-solid. You can also make pigs fly (see RFC 1925 sect. 2.3). However, have you included the fuel costs in your TCO?
Also, the more complex a technology is, the likelier it is to crash down like a house of cards, and you’ll be left with an incomprehensible mix of bits and pieces that will be impossible to put back together (see also: You can’t reformat your data center).
Nuno concluded his comment with a question:
Are you too stuck on past, traditional designs and not being open to new ways of building IT? I get that IT is very cyclical, and these new trends may die in the future...or thrive, and the customers may either fail...or succeed.
I am very open to new ways of building IT. I preach the need for meaningful SDN (not the centralized control plane crap), network automation, and proper application architecture. I just refuse to believe in fairy tales, and solving non-technical problems with technology.
Finally…
Looking for more red pills? Explore my SDN webinars, Designing Active/Active Data Centers webinar, and vMotion-related blog posts.
Presentation: All You Need Are Two Switches
I was asked to present a data-center-related talk last week and decided to focus on one of my favorite topics: because most people don’t have more than a few hundred servers in their data center, they don’t need more than two switches (or more than a rack of servers).
Not surprisingly, an equipment reseller sitting in the room was not amused.
The video and the slide deck are already online, but there’s a minor challenge: the whole event was in Slovenian ;) However, I plan to record the same topic in English once my SDN travels stop.