Scaling L3-Only Data Center Networks

Andrew wondered how one could scale the L3-only data center networking approach I outlined in this blog post and asked:

When dealing with guests on each host, if each host injects a /32 for each guest, by the time the routes are on the spine, you're potentially well past the 128k route limit. Can you elaborate on how this can scale beyond 128k routes?

Short answer: it won’t.
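
To see why, run the numbers. Here’s a minimal back-of-the-envelope sketch in Python; the per-host guest count and fabric size are my assumptions, not Andrew’s figures:

    # Back-of-the-envelope host-route math. Every number below is an
    # illustrative assumption, not data from Andrew's environment.
    vms_per_host = 40       # guests per hypervisor host
    hosts_per_rack = 40     # 1RU servers in a rack
    racks = 100             # a mid-sized leaf-and-spine fabric

    host_routes = vms_per_host * hosts_per_rack * racks    # one /32 per guest
    print(f"Host routes on the spine: {host_routes:,}")    # 160,000
    print(f"Over a 128k route limit:  {host_routes > 128_000}")   # True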

Optimize Your Data Center: Use Distributed File System

Let’s continue our journey toward a two-switch data center. What can we do after virtualizing the workload, getting rid of legacy technologies, and reducing the number of server uplinks to two?

How about replacing dedicated storage boxes with a distributed file system?

In late September, Howard Marks will talk about software-defined storage in my Building Next Generation Data Center course. The course is sold out, but if you register for the spring 2017 session, you’ll get access to the recording of Howard’s talk.

Optimize Your Data Center: Reduce the Number of Uplinks

Remember our journey toward a two-switch data center? So far we:

  • virtualized the servers;
  • got rid of the legacy technologies.

Time for the next step: read a recent design guide from your favorite hypervisor vendor and reduce the number of server uplinks to two.

Not good enough? Building a bigger data center? There’s exactly one seat left in the Building Next Generation Data Center online course.

Optimize Your Data Center: Ditch the Legacy Technologies

In our journey toward a two-switch data center we covered the first step: virtualizing the servers.

It’s time for the next step: get rid of legacy technologies like six 1GE interfaces per server or two FC interface cards in every server.
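
A rough per-server comparison shows why. The interface counts come from the paragraph above; the FC and Ethernet speeds are my assumptions, so adjust them to whatever you’re running:

    # Per-server bandwidth and cabling, legacy vs. converged.
    # Interface counts are from the text; link speeds are assumptions.
    legacy_lan_gbps = 6 * 1     # six 1GE NICs
    legacy_san_gbps = 2 * 8     # two FC HBAs, assuming 8 Gbps FC
    legacy_cables = 6 + 2

    converged_gbps = 2 * 10     # two 10GE uplinks carrying LAN and IP storage
    converged_cables = 2

    print(f"Legacy:    {legacy_lan_gbps + legacy_san_gbps} Gbps over {legacy_cables} cables")
    print(f"Converged: {converged_gbps} Gbps over {converged_cables} cables")

Roughly the same bandwidth, a quarter of the cabling, and (assuming you move storage to IP) no separate FC fabric to maintain.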

Need more details? Watch the Designing Private Cloud Infrastructure webinar. How about an interactive discussion? Register for the Building Next-Generation Data Center course.

Unexpected Recovery Might Kill Your Data Center

Here’s an interesting story I got from one of my friends:

  • A large organization used a disaster recovery strategy based on stretched IP subnets and restarting workloads with unchanged IP addresses in a secondary data center;
  • One day they experienced a WAN connectivity failure in the primary data center, and their disaster recovery plan kicked in.

However, while they were busy restarting the workloads in the secondary data center (and managed to get most of them up and running), the DCI link unexpectedly came back to life.
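
The punch line is easy to guess: the original workloads were still running in the primary data center (only its WAN link had failed), so the returning DCI link presumably joined two live copies of the same subnet. A minimal sketch of that condition, with made-up addresses:

    # Sketch of the failure condition: the same addresses alive at both ends
    # of the stretched subnet once the DCI link returns. Addresses are made up.
    primary_dc   = {"10.1.1.10", "10.1.1.11", "10.1.1.12"}  # still running; only the WAN link failed
    secondary_dc = {"10.1.1.10", "10.1.1.11"}               # restarted copies with unchanged IPs

    duplicates = primary_dc & secondary_dc
    if duplicates:
        print(f"Duplicate IPs on the stretched subnet: {sorted(duplicates)}")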

Optimize Your Data Center: Virtualize Your Servers

A month ago I published the video where I described the idea that “two switches is all you need in a medium-sized data center”. Now let’s dig into the details: the first step you have to take to optimize your data center infrastructure is to virtualize all servers.

For even more details, watch the Designing Private Cloud Infrastructure webinar, or register for the Building Next-Generation Data Center course.

BGP or OSPF? Does Topology Visibility Matter?

One of the comments added to my Using BGP in Data Centers blog post said:

With symmetric fabric… does it make sense for a node to know every bit of fabric info or is reachability information sufficient?

Let’s ignore for the moment that the large non-redundant layer-3 fabrics where the BGP-in-the-data-center movement started don’t need more than endpoint reachability information, and focus on a bigger issue: is the knowledge of network topology (as provided by OSPF but not by BGP) beneficial?
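
To make the difference concrete, here’s a minimal Python sketch (with a made-up four-node leaf-and-spine topology) of what each protocol hands a node: a link-state database it can run SPF over, versus a bare list of prefixes and next hops:

    # What a node "knows" with a link-state protocol vs. plain reachability.
    # Topology, metrics, and prefixes below are made up for illustration.
    import heapq

    # OSPF-like view: the whole fabric graph; every node can run SPF over it.
    lsdb = {
        "leaf1":  {"spine1": 1, "spine2": 1},
        "leaf2":  {"spine1": 1, "spine2": 1},
        "spine1": {"leaf1": 1, "leaf2": 1},
        "spine2": {"leaf1": 1, "leaf2": 1},
    }

    def spf(graph, source):
        """Dijkstra over the link-state database: cost from source to every node."""
        dist, pq = {source: 0}, [(0, source)]
        while pq:
            cost, node = heapq.heappop(pq)
            if cost > dist[node]:
                continue
            for neighbor, metric in graph[node].items():
                if cost + metric < dist.get(neighbor, float("inf")):
                    dist[neighbor] = cost + metric
                    heapq.heappush(pq, (cost + metric, neighbor))
        return dist

    # BGP-like view: just prefixes and the next hops they were learned from.
    rib = {"10.1.2.0/24": "spine1", "10.1.3.0/24": "spine2"}

    print(spf(lsdb, "leaf1"))  # the node can reason about the entire topology...
    print(rib)                 # ...or it merely knows where to forward packets

The question from the comment boils down to whether the first data structure buys a fabric node anything when the second one is all it ultimately needs to forward packets.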

The Grumpy Old Network Architects and Facebook

Nuno wrote an interesting comment to my Stretched Firewalls across L3 DCI blog post:

You're an old school, disciplined networking leader that architects networks based on rock-solid, time-tested designs. But it seems that the prevailing fashion in network design and availability go against your traditional design principles: inter-site firewall clustering, inter-site vMotion, DCI, etc.

Not so fast, my young padawan.

Let’s define prevailing fashion first. You might define it as Kool-Aid peddled by snake oil salesmen, or as cool network designs by people who know what they’re doing. If we stick with the first definition, you’re absolutely right.

Now let’s look at the second camp: how the people who know what they’re doing build their networks (Amazon VPC, Microsoft Azure or Bing, Google, Facebook, and a number of other large-scale networks). You’ll find L3 down to the ToR switch (or even the virtual switch), and absolutely no inter-site vMotion or clustering – because they don’t want to bet their service, ads, or likes on the whims of a technology that was designed to emulate a thick yellow cable.

Want to know how to design an application to work over a stable network? Watch my Designing Active-Active and Disaster Recovery Data Centers webinar.

This isn't the first time that readers have asked you about these technologies, and it won't be the last. Vendors will continue to market them despite their shortcomings, and customers will continue to eat them up.

As long as there are people willing to believe in fairy tales and Santa Claus, there will be someone dressed in a red coat and a fake beard yelling “Ho, Ho, Ho!”

Enterprise IT managers sometimes act like small kids. They don’t want to hear that they have people and process problems, and they love to believe that the next magical bit of technology will solve whatever it is that bothers them. Vendors obviously love to exploit these cravings and sell them ever-more-complex solutions.

I'd like to think that vendors will also continue to work out the kinks and over time the technology will become rock solid and time-tested.

I am positive you can make any technology almost-rock-solid. You can also make pigs fly (see RFC 1925 sect. 2.3). However, have you included the fuel costs in your TCO?

Also, the more complex a technology is, the likelier it is to crash down like a house of cards, and you’ll be left with an incomprehensible mix of bits and pieces that will be impossible to put back together (see also: You can’t reformat your data center).

Nuno concluded his comment with a question:

Are you too stuck on past, traditional designs and not being open to new ways of building IT? I get that IT is very cyclical, and these new trends may die in the future...or thrive, and the customers may either fail...or succeed.

I am very open to new ways of building IT. I preach the need for meaningful SDN (not the centralized control plane crap), network automation, and proper application architecture. I just refuse to believe in fairy tales or in solving non-technical problems with technology.

Finally…

Looking for more red pills? Explore my SDN webinars, Designing Active/Active Data Centers webinar, and vMotion-related blog posts.

Presentation: All You Need Are Two Switches

I was asked to present a data-center-related talk last week and decided to focus on one of my favorite topics: because most people don’t have more than a few hundred servers in their data center, they don’t need more than two switches (or a rack of servers).
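
A quick feasibility check of that claim, with numbers that are purely my assumptions (server count, port density):

    # Quick feasibility check for the two-switch claim. All numbers are assumptions.
    servers = 300               # "a few hundred servers"
    ports_per_switch = 384      # server-facing ports on one high-density switch

    # Each server gets one uplink to each of the two switches.
    ports_needed_per_switch = servers
    print(f"Fits on two switches: {ports_needed_per_switch <= ports_per_switch}")  # True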

Not surprisingly, an equipment reseller sitting in the room was not amused.

The video and the slide deck are already online, but there’s a minor challenge: the whole event was in Slovenian ;) However, I plan to record the same topic in English once my SDN travels stop.
