Designing Active-Active and Disaster Recovery Data Centers
A year ago I was a firm believer in the unlimited powers of Software-Defined Data Centers and their ability to simplify workload migrations. After all, if you can use an API to create any data center object, what’s stopping you from moving a workload running in one data center to another location?
As always, there’s a huge difference between theory and reality.
Reality Distortion Field Has Failed
Being a slightly skeptical eternal optimist, I created a workshop description for Interop Las Vegas 2015 which still sounded pretty positive and mentioned SDDC as a potential solution.
In December 2014, reality hit… hard. I was running a workshop for a global organization that was sold on a simple idea: using SDDC (from the vendor that created the acronym), it’s easy to pick up your toys (= application workload), pack them in a large bag, walk away to a different sandbox (= public cloud), drop them out of the bag, and continue playing.
During the workshop we identified numerous obstacles and missing orchestration components, and concluded that it was impossible to achieve what they had planned to do. The best they could do at that time was to manually recreate the network infrastructure (= subnets) and services (= firewalls and load balancers) in a second virtualized environment (disaster recovery data center or public cloud), and afterwards restart the VMs from the failed data center in that environment.
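To make the gap concrete, here’s a rough sketch of that manual recovery sequence expressed as pseudo-orchestration in Python. Every function, class, and value in it is hypothetical (no real product exposes this API); the only point is the ordering: networks first, services second, workloads last.

```python
# Hypothetical sketch of the manual recovery sequence described above.
# Each function stands for a task an operator performed by hand in the
# secondary environment; none of them maps to a real product API.

from dataclasses import dataclass


@dataclass
class Subnet:
    name: str
    cidr: str


@dataclass
class VM:
    name: str
    subnet: str


def recreate_subnet(subnet: Subnet) -> None:
    # Stand-in for manually defining the subnet in the DR environment.
    print(f"create subnet {subnet.name} ({subnet.cidr})")


def recreate_firewall_rules(subnet: Subnet) -> None:
    # Stand-in for re-entering the firewall policies tied to that subnet.
    print(f"apply firewall rules for {subnet.name}")


def recreate_load_balancer(vip: str, members: list[VM]) -> None:
    # Stand-in for rebuilding the load-balancing service.
    print(f"create load balancer {vip} with {len(members)} members")


def restart_vm(vm: VM) -> None:
    # Stand-in for restarting the VM from replicated storage.
    print(f"restart VM {vm.name} in subnet {vm.subnet}")


def recover(subnets: list[Subnet], vms: list[VM], vip: str) -> None:
    # Order matters: networks, then services, then workloads.
    for subnet in subnets:
        recreate_subnet(subnet)
        recreate_firewall_rules(subnet)
    recreate_load_balancer(vip, vms)
    for vm in vms:
        restart_vm(vm)


if __name__ == "__main__":
    recover(
        subnets=[Subnet("web", "10.1.1.0/24"), Subnet("app", "10.1.2.0/24")],
        vms=[VM("web-1", "web"), VM("app-1", "app")],
        vip="10.1.1.100",
    )
```

Every step in that list was done by hand, which is exactly why the process was slow, error-prone, and impossible to test regularly.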
The only approach that would have done what my customer wanted at that time was automated application deployment using tools like Cloudify, but that solution was further from their grasp than Alpha Centauri: they were a traditional enterprise IT shop with manual, non-repeatable server creation and application deployment processes.
After three days we had to conclude that there was nothing SDDC could do for them to solve their immediate workload migration problems, and that they should focus on automating their application development and deployment processes (yeah, I know I sound like Captain Obvious).
It seems that NSX 6.2 and SRM 6.1 might be a step in the right direction, but I still have to read the documentation to figure out the “minor” details.
Adjusting to Reality
Based on that traumatic experience, I decided to refocus my Interop presentation on what works in real life today, and not surprisingly, the best answer is “proper application architecture”.
Anyway, the Interop workshop documented numerous challenges you might encounter on that journey (including finite bandwidth, non-zero latency, unpredictable failures, bad application architectures, and vendors promoting obviously stupid things), and it turned into a fantastic experience for the attendees even though it was scheduled just before the evening party and I ran way overtime.
An updated version of that workshop is now becoming a webinar with a more appropriate title: Designing Active-Active and Disaster Recovery Data Centers. Due to its length, I split the webinar into two live sessions: the first one on October 14th, the second one on November 11th.
To register for the live webinar session, go to the webinar description page (subscribers can obviously register free of charge).
Keeping Past Promises
I promised the people who bought the Designing Private Cloud Infrastructure webinar in the past (before October 1st, 2015) access to the contents of this webinar. If you’re one of them, you’ll get a notification that you have access to the new webinar and will be able to register for the live sessions.