Category: design

You're Responsible for Resiliency of Your Public Cloud Deployment

Enterprise environments usually implement “mission-critical” applications by pushing high-availability requirements down the stack until they hit networking… and then blame the networking team when the whole house of cards collapses.

Most public cloud providers are not willing to play the same stupid blame-shifting game - they live or die by their reputation, and maintaining a stable service is their highest priority. They will do their best to implement a robust and resilient infrastructure, but will not do anything that could impact its stability or scalability… including the snake oil the virtualization and networking vendors love to sell to their gullible customers. When you deploy your application workloads into a public cloud, you become responsible for the resiliency of your own application, and there’s no magic button that could allow you to push the problems down the stack.

read more see 1 comments

Public Cloud Cannot Change the Laws of Physics

Listening to public cloud evangelists and marketing departments of vendors selling over-the-cloud networking solutions or multi-cloud orchestration systems, you could start to believe that migrating your workload to a public cloud would solve all your problems… and if you’re gullible enough to listen to them, you’ll get the results you deserve.

Unfortunately, nothing can change the fundamental laws of physics, networking, or application architectures:

read more add comment

Do We Need Math in Networking?

I should have known better, but I couldn’t resist being pulled into a Twitter spat around the question “whether networking engineers need to know something about math” a long while ago.

Before going into the details, let’s start with Wikipedia definition: “Engineering is the use of scientific principles to design and build machines, structures, and other things” including “specific emphasis on particular areas of applied mathematics, applied science, and types of application”.

So feel free to believe that you don’t need any math or other science (because there’s very little science behind what we do in networking) in your job, in which case you might want to stop reading… but then at least please think twice about your job title.

read more see 5 comments

Figure Out What Problem You’re Trying to Solve

A long while ago I got into an hilarious Tweetfest (note to self: don’t… not that I would ever listen) starting with:

Which feature and which Cisco router for layer2 extension over internet 100Mbps with 1500 Bytes MTU

The knee-jerk reaction was obvious: OMG, not again. The ugly ghost of BRouters (or is it RBridges or WAN Extenders?) has awoken. The best reply in this category was definitely:

I cannot fathom the conversation where this was a legitimate design option. May the odds forever be in your favor.

A dozen “this is a dumpster fire” tweets later the problem was rephrased as:

read more add comment

You Don't Need IP Renumbering for Disaster Recovery

This is a common objection I get when trying to persuade network architects they don’t need stretched VLANs (and IP subnets) to implement data center disaster recovery:

Changing IP addresses when activating DR is hard. You’d have to weigh the manageability of stretching L2 and protecting it, with the added complexity of breaking the two sites into separate domains [and subnets]. We all have apps with hardcoded IP’s, outdated IPAM’s, Firewall rules that need updating, etc.

Let’s get one thing straight: when you’re doing disaster recovery there are no live subnets, IP addresses or anything else along those lines. The disaster has struck, and your data center infrastructure is gone.

read more see 1 comments

Disaster Recovery and Failure Domains

One of the responses to my Disaster Recovery Faking blog post focused on failure domains:

What is the difference between supporting L2 stretched between two pods in your DC (which everyone does for seamless vMotion), and having a 30ms link between these two pods because they happen to be in different buildings?

I hope you agree that a single broadcast domain is a single failure domain. If not, let agree to disagree and move on - my life is too short to argue about obvious stuff.

read more add comment

Stretched VLANs and Failing Firewall Clusters

After publishing the Disaster Recovery Faking, Take Two blog post (you might want to read that one before proceeding) I was severely reprimanded by several people with ties to virtualization vendors for blaming virtualization consultants when it was obvious the firewall clusters stretched across two data centers caused the total data center meltdown.

Let’s chase that elephant out of the room first. When you drive too fast on an icy road and crash into a tree who do you blame?

  • The person who told you it’s perfectly OK to do so;
  • The tire manufacturer who advertised how safe their tires were?
  • The tires for failing to ignore the laws of physics;
  • Yourself for listening to bad advice

For whatever reason some people love to blame the tires ;)

read more see 10 comments

Disaster Recovery Faking, Take Two

An anonymous (for reasons that will be obvious pretty soon) commenter left a gem on my Disaster Recovery Test Faking blog post that is way too valuable to be left hidden and unannotated.

Here’s what he did:

Once I was tasked to do a DR test before handing over the solution to the customer. To simulate the loss of a data center I suggested to physically shutdown all core switches in the active data center.

That’s the right first step: simulate the simplest scenario (total DC failure) in a way that is easy to recover (power on the switches). It worked…

read more see 11 comments

Worth Reading: Anycast DNS in Enterprise Networks

Anycast (advertising the same IP address from multiple servers/locations) has long been used to implement scale-out public DNS services (the whole root DNS system runs on massive anycast), but it’s not as common in enterprise networks.

The blog posts written by Tom Bowles should get you there. He started with the idea and described his implementation using Infoblox DNS.

Want to know even more? I covered numerous load balancing mechanisms including anycast in Data Centers Infrastructure for Networking Engineers webinar.

add comment

Redundant BGP Connectivity on a Single ISP Connection

A while ago Johannes Weber tweeted about an interesting challenge:

We want to advertise our AS and PI space over a single ISP connection. How would a setup look like with 2 Cisco routers, using them for hardware redundancy? Is this possible with only 1 neighboring to the ISP?

Hmm, so you have one cable and two router ports that you want to connect to that cable. There’s something wrong with this picture ;)

read more see 2 comments
Sidebar