Stretched clusters: almost as good as heptagonal wheels

Some people are changing round wheels to heptagonal format because they will roll better. Some other people are building stretched high-availability clusters – clusters of servers stretched over multiple data centers. Unfortunately only one of these claims is false.

Similar to the stretched firewalls design, stretched tightly coupled HA clusters are vulnerable – lose the inter-DC link for long enough (depending on how the cluster heartbeat is configured, a few seconds could be enough) and you have a total disaster on your hands.

Best case – partitioned cluster. Most HA clustering solutions available today (including Microsoft’s WSFC) handle cluster partitioning (some members get isolated from the rest) the same way: the isolated minority is shut down and the services they offered are restarted on the remaining nodes. The methods used to figure out which part constitutes the minority differ, but they all use variants of voting mechanisms and quorums (with disks or file systems thrown in sometimes to make the total number of votes odd).
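
To make the voting idea concrete, here is a minimal sketch of a strict-majority quorum check. It is not the algorithm of any particular product; the function name, the two-nodes-per-DC layout and the placement of the witness vote in DC A are all assumptions made for illustration.

```python
def has_quorum(votes_reachable: int, total_votes: int) -> bool:
    """A partition may keep running only if it sees a strict majority of all votes."""
    return votes_reachable > total_votes // 2


# Hypothetical layout: two nodes per data center plus one witness
# (disk or file share) located in DC A, for an odd total of 5 votes.
TOTAL_VOTES = 5
dc_a_votes = 2 + 1  # two nodes plus the witness
dc_b_votes = 2      # two nodes; the witness is unreachable after the DCI failure

dc_a_survives = has_quorum(dc_a_votes, TOTAL_VOTES)  # DC A keeps running
dc_b_survives = has_quorum(dc_b_votes, TOTAL_VOTES)  # DC B shuts itself down
```

Without the witness the total would be four votes, each side would see exactly two, and neither would have a majority, which is why the extra vote is thrown in.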

Did I mention the services were restarted? That’s a lengthy outage right there, without even taking into consideration the excessive load those services place on the remaining members of the cluster (in a balanced two-DC design, a DCI link outage would probably take down approximately half of the cluster nodes).
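
A back-of-the-envelope sketch of that extra load, assuming the work is spread evenly and every restarted service lands on a surviving node. The node counts and utilization figures are made-up illustration values, not measurements.

```python
def utilization_after_failover(total_nodes: int, failed_nodes: int,
                               utilization_before: float) -> float:
    """Average per-node utilization once the survivors absorb all the work."""
    surviving = total_nodes - failed_nodes
    if surviving <= 0:
        raise ValueError("no surviving nodes left to run the services")
    return utilization_before * total_nodes / surviving


# Hypothetical balanced design: 8 nodes, 4 per data center, each 40% busy.
after = utilization_after_failover(total_nodes=8, failed_nodes=4,
                                   utilization_before=0.40)

# The same cluster at 60% per node cannot absorb the loss of one data center.
overloaded = utilization_after_failover(8, 4, 0.60) > 1.0
```

Losing half of the nodes doubles the load on the rest, so anything running above 50% before the outage has nowhere to go.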

Worst case – split brain. If you manage to misconfigure the quorum algorithm (or the shared storage used as a missing vote), both parts of the partitioned cluster would think they are the remaining majority. Both parts would restart all services that were running on the “lost” half, so you’ll get two copies of each service, each copy writing to its own local copy of the storage, completely ruining your data in the process. Recovering from a split brain event takes hours (at least), starting with a complete shutdown and data restore (you do have backups, don’t you?).
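
A minimal sketch of how one such misconfiguration could play out, assuming a strict-majority rule and a witness vote that both sides can still "see" because each data center reaches its own local copy of the storage. The names, the layout and the vote counts are hypothetical.

```python
def has_quorum(votes_reachable: int, total_votes: int) -> bool:
    """Strict majority of all configured votes."""
    return votes_reachable > total_votes // 2


def surviving_partitions(partitions: dict[str, int], total_votes: int) -> list[str]:
    """Names of the partitions that believe they hold the majority."""
    return [name for name, votes in partitions.items()
            if has_quorum(votes, total_votes)]


TOTAL_VOTES = 5  # four nodes plus one witness (hypothetical layout)

# Correct setup: only DC A can reach the single witness.
correct = {"DC-A": 2 + 1, "DC-B": 2}

# Misconfigured setup: each side counts a witness vote from its local storage copy.
misconfigured = {"DC-A": 2 + 1, "DC-B": 2 + 1}

survivors_ok = surviving_partitions(correct, TOTAL_VOTES)
survivors_bad = surviving_partitions(misconfigured, TOTAL_VOTES)
split_brain = len(survivors_bad) > 1  # more than one "majority" means split brain
```

The vote math itself is fine in both cases; the damage comes from a vote that is counted twice.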

Hoping it won’t happen doesn’t help. Think twice before writing “if we lose our DCI link, the split brain will be the least of our problems” in the comments. Do you really want to bet your whole IT infrastructure on a WAN link?

What can I do to stop it? Sometimes not much, because everyone else crazed by the flat-earth nirvana stories won’t listen. In this case make sure you document your objections and predictions – at least you’ll have an “I told you so” document.

Sometimes it might help to ask the stretched cluster designers what happens if the DCI link does go down and the cluster partitions. They just might pause and reconsider.

Best case, you’re working in an organization where apps, server and networking people actually talk to each other and work together to solve the business problems (stretched clusters are not solving business problems, they are kludges around bad application architecture). That’s the perfect place to start the scale-out application architecture discussion and the role load balancers (I can’t make myself say Application Delivery Controller) play in it. If you want to learn more about that, you’ll find plenty of information in my Data Center Interconnects webinar (recording).

And just in case you want to listen to my ramblings – here are a few things I said about stretched clusters, the best idea since the heptagonal wheels.

6 comments:

  1. Ivan, I have told you before that "Heptagon is the new Round".

    It's simple science my friend. Heptagons are made up of a series of straight lines. As anyone with even an elementary knowledge of geometry knows "A straight line is the shortest path between two points". If you take that basic fact, then we can deduce that a curved line between those same two points would be much slower.

    Our heptagonal solution is simply much more "cutting edge" than the rounder wheel solutions implemented by "legacy vendors". I can concede that there are benefits to efficiency gained with rounder wheels, but I believe that this can be achieved better by implementing more straight sides to our solutions.

    On that point, I would also like to announce that in 18 months we will release our "Decagonal" form factor. Our engineers are also currently working on a "Tetracontagonal" solution with an ultimate goal of making the same solution scalable to a "Hectogonal" solution in the next 3-5 years.

    The future is here now. "Get on the Heptagon". You don't want to be left behind waiting for a standard implementation of this.

    Thank you for your time,

    Kurt (@networkjanitor)

  2. Ivan Pepelnjak, 10 June, 2011 11:04

    This is way better than my post =-X ... and using more solid arguments than many vendor claims we see these days :-P

  3. Almost two months later this post still makes me laugh.

  4. Thank you for this article. I'm going to get my clients to read this page before they insist on active/active vSphere cluster.
    Cheers!
    e1@vmware.com

  5. How do you build HA mail service (for example MS Exchange) on active/active datacenters which do not have stretched VLANs/subnets?

    Replies
    1. How about using features Microsoft introduced more than 2 years ago?

      http://blog.ipspace.net/2011/06/multisite-clusters-done-right-by-none.html



Ivan Pepelnjak, CCIE#1354, is the chief technology advisor for NIL Data Communications. He has been designing and implementing large-scale data communications networks as well as teaching and writing books about advanced technologies since 1990. See his full profile, contact him or follow @ioshints on Twitter.