Category: design
Using BGP in Data Center Fabrics
While large data centers increasingly use BGP as the routing protocol within their fabrics, enterprise engineers tend to shy away from that idea because they think BGP is too complex/scary/hard-to-configure/obsolete/unknown/whatever.
It’s time to fix that.
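To show just how little there is to fear, here’s a minimal sketch (mine, not from any particular vendor’s documentation) of the usual eBGP numbering in a leaf-and-spine fabric: every leaf gets its own private ASN and runs one eBGP session to every spine. The ASNs, addresses and FRR-like output syntax are made-up examples.

```python
# Minimal sketch of eBGP numbering in a leaf-and-spine fabric: every leaf
# gets its own private ASN and one eBGP session to every spine.
# All ASNs, addresses and the FRR-like output syntax are made-up examples.

SPINE_ASN = 65000                                  # spines commonly share one ASN
LEAF_ASNS = {"leaf1": 65001, "leaf2": 65002, "leaf3": 65003}
SPINE_PEERS = {"spine1": "10.0.0.1", "spine2": "10.0.0.3"}

def leaf_bgp_config(leaf: str) -> str:
    """Render the BGP stanza for one leaf switch."""
    lines = [f"router bgp {LEAF_ASNS[leaf]}"]
    for spine, addr in sorted(SPINE_PEERS.items()):
        lines.append(f"  neighbor {addr} remote-as {SPINE_ASN}   ! uplink to {spine}")
    lines.append("  address-family ipv4 unicast")
    lines.append("    redistribute connected   ! advertise server-facing subnets")
    return "\n".join(lines)

if __name__ == "__main__":
    for leaf in LEAF_ASNS:
        print(leaf_bgp_config(leaf), end="\n\n")
```

Run it and you get one short, completely predictable BGP stanza per leaf – hardly the scary protocol of enterprise lore.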
The Grumpy Old Network Architects and Facebook
Nuno wrote an interesting comment to my Stretched Firewalls across L3 DCI blog post:
You're an old school, disciplined networking leader that architects networks based on rock-solid, time-tested designs. But it seems that the prevailing fashion in network design and availability goes against your traditional design principles: inter-site firewall clustering, inter-site vMotion, DCI, etc.
Not so fast, my young padawan.
Let’s define prevailing fashion first. You might define it as Kool-Aid peddled by snake oil salesmen, or as cool network designs by people who know what they’re doing. If we stick with the first definition, you’re absolutely right.
Now let’s look at the second camp: how people who know what they’re doing build their network (Amazon VPC, Microsoft Azure or Bing, Google, Facebook, a number of other large-scale networks). You’ll find L3 down to ToR switch (or even virtual switch), and absolutely no inter-site vMotion or clustering – because they don’t want to bet their service, ads or likes on the whims of technology that was designed to emulate thick yellow cable.
This isn't the first time that readers have asked you about these technologies, and it won't be the last. Vendors will continue to market them despite their shortcomings, and customers will continue to eat them up.
As long as there are people willing to believe in fairy tales and Santa Claus, there will be someone dressed in a red coat and a fake beard yelling “Ho, Ho, Ho!”
Enterprise IT managers sometimes act like small kids. They don’t want to hear that they have people and process problems, and love to believe that the next magical bit of technology will solve whatever it is that bothers them. Vendors obviously love to exploit these cravings and sell them ever-more-complex solutions.
I'd like to think that vendors will also continue to work out the kinks and over time the technology will become rock solid and time-tested.
I am positive you can make any technology almost-rock-solid. You can also make pigs fly (see RFC 1925, section 2, truth 3). However, have you included the fuel costs in your TCO?
Also, the more complex a technology is, the likelier it is to crash down like a house of cards, and you’ll be left with an incomprehensible mix of bits and pieces that will be impossible to put back together (see also: You can’t reformat your data center).
Nuno concluded his comment with a question:
Are you too stuck on past, traditional designs and not being open to new ways of building IT? I get that IT is very cyclical, and these new trends may die in the future...or thrive, and the customers may either fail...or succeed.
I am very open to new ways of building IT. I preach the need for meaningful SDN (not the centralized control plane crap), network automation, and proper application architecture. I just refuse to believe in fairy tales, and solving non-technical problems with technology.
Finally…
Looking for more red pills? Explore my SDN webinars, Designing Active/Active Data Centers webinar, and vMotion-related blog posts.
Presentation: All You Need Are Two Switches
I was asked to present a data-center-related talk last week and decided to focus on one of my favorite topics: because most people don’t have more than a few hundred servers in their data center, they don’t need more than two switches (or a rack of servers).
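To put numbers behind that claim, here’s a back-of-the-envelope calculation; the port counts are my hypothetical examples, not figures from the talk:

```python
# Back-of-the-envelope port math: how many dual-homed servers fit behind a
# single pair of ToR switches. All port counts are made-up examples.

ports_per_switch = 64        # e.g. a dense 64-port switch
breakout = 4                 # each port split into 4 lower-speed server ports
uplink_ports = 4             # ports kept for uplinks / inter-switch link

server_ports = (ports_per_switch - uplink_ports) * breakout
# A dual-homed server uses one port on each switch, so the pair supports
# as many servers as a single switch has server-facing ports.
print(f"One pair of switches connects up to {server_ports} dual-homed servers")
# -> 240 servers, more than enough for "a few hundred servers" environments
```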
Not surprisingly, an equipment reseller sitting in the room was not amused.
The video and the slide deck are already online, but there’s a minor challenge: the whole event was in Slovenian ;) However, I plan to record the same topic in English once my SDN travels stop.
Building Carrier-Grade Cloud Infrastructure
During one of my SDN workshops, an attendee asked me “How do you build carrier-grade (5 nines) cloud infrastructure with VMware NSX?”
Short answer: You don’t… and it’s the wrong question anyway.
Designing Active-Active and Disaster Recovery Data Centers
A year ago I was a firm believer in the unlimited powers of Software-Defined Data Centers and their ability to simplify workload migrations. After all, if you can use an API to create any data center object, what’s stopping you from moving a workload running in one data center to another location?
As always, there’s a huge difference between theory and reality.
How Complex Is Your Data Center?
Sometimes it seems like the networking vendors try to (A) create solutions in search of problems, (B) boil the ocean, (C) solve the scalability problems of Google or Amazon instead of focusing on real-life scenarios or (D) all of the above.
Bryan Stiekes from HP decided to take a step in the right direction: let’s ask the customers how complex their data centers really are. He created a data center complexity survey and promised to share the results with me (and you), so please do spend a few minutes of your time filling it in. Thank you!
Private and Public Clouds, and the Mistakes You Can Make
A few days ago I had a nice chat with Christoph Jaggi about private and public clouds, and the mistakes you can make when building a private cloud – the topics we’ll be discussing in the Designing Infrastructure for Private Clouds workshop @ Data Center Day in Berne in mid-September.
The German version of our talk has been published on Inside-IT; those of you not fluent in German will find the English version below.
Cumulus Linux Data Center Architectures
After introducing the concepts of Cumulus Linux in the Data Center Fabrics update session, Dinesh Dutt described the typical data center architectures implemented with Cumulus Linux and the lessons everyone should learn from large-scale web properties.
Can You Avoid Networking Software Bugs?
One of my readers sent me an interesting reliability design question. It all started with a catastrophic WAN failure:
Once a particular volume of encrypted traffic was reached, the data center WAN edge router crashed, and the backup router that took over crashed as well. The traffic then failed over to the second DC, and you can guess what happened then...
Obviously they’re now trying to redesign the network to avoid such failures.
Save the Date: Designing Infrastructure for Private Clouds Workshop in Switzerland
Gabi Gerber (the wonderful mastermind behind the Data Center Day event) is helping me bring my Designing Infrastructure for Private Clouds workshop (one of the best Interop 2015 workshops) to Switzerland.
This is the only cloud design workshop I’m running in Europe in 2015. If you’d like to attend it, this is your only chance – register NOW.
So You Need ISSU on Your ToR Switch? Really?
During Dinesh Dutt’s Cumulus Linux presentation in the Data Center Fabrics webinar, someone asked an unexpected question: “Do you have In-Service Software Upgrade (ISSU) on Cumulus Linux?” and we both went like “What? Why?”
Dinesh is an honest engineer and answered: “No, we don’t do it” with absolutely no hesitation, but we both kept wondering, “Why exactly would you want to do that?”
Case Study: Scale-Out Cloud Infrastructure
I helped several customers design scale-out private or public cloud infrastructure. In every case, I tried to start with a reasonably small pod (sized around what they’d consider an acceptable loss unit – another great term I inherited from Chris Young), connect it to a shared L3 backbone (either within a data center or across multiple data centers), and then address the inevitable desire for stretched layer-2 connectivity.
You’ll find a summary of these designs in my next ExpertExpress case study: Scale-Out Private Cloud Infrastructure, and if you need more details, I’m usually available for online consulting.
How Do I Start My IPv6 Addressing Plan?
One of my readers was reading the Preparing an IPv6 Addressing Plan document on the RIPE web site and found that it proposes two approaches to IPv6 addressing: encode location in the high-order bits and subnet type in the low-order bits (the traditional approach), or encode subnet type in the high-order bits and location in the low-order bits (totally counterintuitive to most networking engineers). His obvious question was: “Is anyone using type-first addressing in a production network?”
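To make the difference tangible, here’s a small sketch (mine, not from the RIPE document) that builds per-site /48 prefixes both ways; the example /32 allocation and the 8-bit location and type fields are assumptions:

```python
# Sketch of the two IPv6 addressing-plan approaches: encode (location, type)
# or (type, location) in the bits between the allocation and the subnets.
# The /32 allocation and the 8-bit location/type fields are made-up examples.
import ipaddress

ALLOCATION = ipaddress.IPv6Network("2001:db8::/32")  # example allocation
LOC_BITS = TYPE_BITS = 8                             # 8 + 8 bits -> /48 per (location, type)

def subnet_prefix(location: int, subnet_type: int, type_first: bool) -> ipaddress.IPv6Network:
    """Build the /48 prefix for a (location, type) pair, either location-first
    (traditional) or type-first."""
    if type_first:
        bits = (subnet_type << LOC_BITS) | location
    else:
        bits = (location << TYPE_BITS) | subnet_type
    shift = 128 - ALLOCATION.prefixlen - LOC_BITS - TYPE_BITS
    base = int(ALLOCATION.network_address) | (bits << shift)
    return ipaddress.IPv6Network((base, ALLOCATION.prefixlen + LOC_BITS + TYPE_BITS))

# Location 3, subnet type 5 (say, "server LAN"), both ways:
print(subnet_prefix(3, 5, type_first=False))  # 2001:db8:305::/48  (location first)
print(subnet_prefix(3, 5, type_first=True))   # 2001:db8:503::/48  (type first)
```

The appeal of type-first addressing is that all subnets of a given type aggregate into a single prefix, so one ACL or routing-policy entry can cover that type across every location.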
The Terastream project seems to be using the service-first format; if you’re doing something similar, please leave a comment!
Design Challenge: Multiple Data Centers Connected with Slow Links
One of my readers sent me this question:
What is best practice to get a copy of the VM image from DC1 to DC2 for DR when you have subrate (155 Mbps in my case) Metro Ethernet services between DC1 and DC2?
The slow link between the data centers effectively rules out any ideas of live VM migration; to figure out what you should be doing, you have to focus on business needs.
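Some back-of-the-envelope math (my assumptions, not the reader’s numbers) shows what such a link can realistically do:

```python
# Rough transfer-time math for copying a VM image across a 155 Mbps link.
# The image size and usable-bandwidth fraction are made-up assumptions.

link_mbps = 155                 # Metro Ethernet service between DC1 and DC2
usable_fraction = 0.7           # assume ~70% goodput after overhead and other traffic
image_gb = 100                  # example VM image size

usable_bps = link_mbps * 1e6 * usable_fraction
transfer_seconds = image_gb * 8e9 / usable_bps
print(f"Copying a {image_gb} GB image takes roughly {transfer_seconds / 3600:.1f} hours")
# -> roughly 2 hours per 100 GB image
```

Bulk copies measured in hours are fine for scheduled replication; anything that has to move gigabytes in seconds is not going to happen over that link.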
Last Chapter of Data Center Design Case Studies Is Published
A few days ago I completed the last chapter in the Data Center Design Case Studies book: building disaster recovery and active-active data centers. It focuses on application behavior and business needs, not on the underlying technologies; the networking technology part tends to be way easier to solve than the oft-ignored application-level challenges.