High availability fallacies

I’ve already written about the stupidities of risking the stability of two data centers to enable live migration of “mission critical” VMs between them. Now let’s take the discussion a step further – after hearing how critical the VM the server or application team wants to migrate is, you might be tempted to ask “and how do you ensure its high availability the rest of the time?” The response will likely be along the lines of “We’re using VMware High Availability” or, even more proudly, “We’re using VMware Fault Tolerance to ensure even a hardware failure can’t bring it down.”

I have some bad news for the true believers in virtualization-supported high availability – quite a few of them probably don’t understand how it works. VMware HA is a great solution, but the best it can do is restart a VM after it crashes or after the hypervisor host fails (and because it works at the VM level, it usually can’t detect a hung service). The VM has to go through a full power-up process, and all the services the VM runs have to perform whatever recovery procedures they need before the VM (and its services) are fully operational.
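
To make the gap concrete: detecting a hung service requires an application-level probe, which is exactly what VM-level HA cannot do. Here’s a minimal sketch of such a probe in Python; the health URL and service name are made-up placeholders for illustration, not anything VMware HA ships:

    import subprocess
    import time
    import urllib.request

    HEALTH_URL = "http://127.0.0.1:8080/health"  # hypothetical app health endpoint
    SERVICE = "myapp"                            # hypothetical service name

    def healthy(timeout=3):
        # Return True only if the application actually answers its health check.
        try:
            with urllib.request.urlopen(HEALTH_URL, timeout=timeout) as resp:
                return resp.status == 200
        except OSError:
            return False  # connection refused, timeout, HTTP error ...

    while True:
        if not healthy():
            # The hypervisor still sees a perfectly healthy VM at this point;
            # only a probe like this one can notice the hung service.
            subprocess.run(["systemctl", "restart", SERVICE], check=False)
        time.sleep(10)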

VMware FT is an even more interesting case. It runs two parallel copies of the same VM (and ensures they're continuously synchronized) – a perfect solution if you’re running a very lengthy procedure and don’t want a hardware failure to interrupt it. Unfortunately, software failures happen more often than hardware ones ... and if the VM crashes, both copies (running in sync) will crash simultaneously. Likewise, if the application service running in the VM crashes (or hangs), it will do so in both copies of the VM.

Update 2011-08-09: As expected, an interesting Twitter discussion followed this blog post. Among other remarks, Duncan (Yellow Bricks) Epping rightly pointed out that the VMware HA/FT products function exactly as described. That’s absolutely true – VMware’s documentation is extremely precise in describing how HA and FT work.

You can read more about high availability fallacies in an article I wrote for SearchNetworking (the title is a bit misleading) ... and remember: scale-out application architecture combined with load balancers is still the only way to reach true high availability.
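
To make that last claim concrete, here’s a minimal sketch (in Python, with made-up backend addresses) of the behavior a load balancer adds on top of a scale-out architecture: a dead instance is skipped on the very next request instead of causing an outage while a VM restarts. A real load balancer obviously adds proper health probes and connection handling, but the principle is the same.

    import socket

    # Made-up addresses of three scaled-out application instances
    BACKENDS = [("10.0.0.11", 80), ("10.0.0.12", 80), ("10.0.0.13", 80)]
    _next = 0  # round-robin pointer

    def pick_backend():
        # Return the first healthy backend, starting at the round-robin pointer.
        global _next
        for i in range(len(BACKENDS)):
            host, port = BACKENDS[(_next + i) % len(BACKENDS)]
            try:
                # Crude health check: can we open a TCP session?
                with socket.create_connection((host, port), timeout=1):
                    _next = (_next + i + 1) % len(BACKENDS)
                    return host, port
            except OSError:
                continue  # instance down: skip it, no user-visible outage
        raise RuntimeError("all backends down")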

Even more information

You’ll find in-depth discussions of high-availability architectures, impacts of vMotion and various types of data center interconnects in my webinars: Data Center 3.0 for Networking Engineers (recording), Data Center Interconnects (recording) and VMware Networking Deep Dive (recording or live session). All three webinars are also available as part of the yearly subscription.

10 comments:

  1. Fully agreed. A typical example is Marathon everRun VM. At the end of the day, it's the end users who feel cheated when their services in the VM hang and the VM HA or FT is not application-aware.

  2. Don't forget that with FT we can sync with a slightly different time! So we can adjust for a crash when we change something on the first VM.

  3. Ivan Pepelnjak, 08 August 2011 10:37

    I'm not sure what you're trying to tell me. FT syncs every single I/O operation (including KVM events). This blog post has a good introductory explanation:

    http://lonesysadmin.net/2011/04/19/vmware-fault-tolerance-determinism-and-smp/

  4. Ivan, the problem is that creating a high-availability solution for the front end is a no-brainer. Put more than two instances and an LB in front. Done.

    The problem is to provide an HA solution for anything that has to do with persistent local data. This may include the database in a (relatively) modern 3-tier app, but it also includes more traditional enterprise applications (Exchange being an example).

    It is not even worth discussing how to provide resiliency for the front end. It's done. Focus your energies on the back end.

    Massimo.

  5. Ivan Pepelnjak, 08 August 2011 19:36

    We totally agree - the back end is a tough nut to crack. However, until you solve the DB (more precisely, ACID data store) problem, you won't have a truly HA application. VMware HA or Windows failover clusters buy you nothing but an automatic restart after a hardware failure. The DB service still has to restart (and roll back all pending transactions) after every failure, which takes a significant amount of time.

    However, both SQL Server and MySQL offer a redundant server configuration in which the second server can take over immediately when the first one fails. High-end MySQL offers an even better distributed solution. So the problems can be solved ... but it's easier to offload them to someone else and believe in unicorn tears.
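
    Here's a minimal sketch of what "take over immediately" means from the application's point of view; the host names are made up, and a real deployment would use the database driver's built-in failover support instead of raw sockets:

        import socket

        # Hypothetical primary/standby pair of database servers
        DB_ENDPOINTS = [("db-primary.example.com", 3306),
                        ("db-standby.example.com", 3306)]

        def connect_with_failover(endpoints=DB_ENDPOINTS, timeout=2):
            # Open a connection to the first reachable server in the list.
            last_err = None
            for host, port in endpoints:
                try:
                    # Failover becomes a quick reconnect instead of a VM restart
                    # followed by full database crash recovery.
                    return socket.create_connection((host, port), timeout=timeout)
                except OSError as err:
                    last_err = err
            raise ConnectionError("no database server reachable") from last_err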

  6. None of the things you are referring to, Ivan, provides consistent failover, to the best of my knowledge. The reason it starts sooner on the other side is that it has lost all the transactions the application thinks have been committed. It's good if you are hosting an application that shares pictures... not good if you deal with money.

    Having said this, there is clearly a trend toward making the back end more "scale-out" friendly... but there is a long way to go.

    My 2 cents.

  7. Ivan Pepelnjak, 08 August 2011 20:06

    MySQL Cluster provides true failover: when a data node dies, at least one other node already has all its data. If I remember correctly, it's supported in a single-IP-subnet configuration (with database replication recommended for long-distance needs).

    SQL Server provides database mirroring (which can be synchronous if you want to retain total consistency).

    And we (yet again) agree that the backend has a long way to go ;)
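
    The crucial difference is when the client is told "committed": with synchronous mirroring the primary acknowledges a commit only after the mirror has stored it, so a failover cannot lose an acknowledged transaction. A self-contained toy sketch of that rule in Python (all names are illustrative, not any real database's API):

        # Toy model of synchronous commit: acknowledge the client only
        # after BOTH copies have the transaction durably stored.
        primary_log = []   # stands in for the primary's transaction log
        standby_log = []   # stands in for the mirror's transaction log

        def replicate_to_standby(txn):
            # In real mirroring this is a network round trip to the mirror.
            standby_log.append(txn)
            return True    # the mirror's acknowledgement

        def commit(txn):
            primary_log.append(txn)            # durable on the primary
            if not replicate_to_standby(txn):  # block for the mirror's ack
                raise RuntimeError("mirror did not acknowledge; cannot commit")
            return "committed"                 # only now is the client told 'done'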

  8. Duncan Yellow Bricks, 09 August 2011 08:09

    I am reading this article again... The funny thing is that I understand what you are trying to get at, but this is only true in an ideal world where applications are specifically written to support a setup that includes load balancers and a shared database. Although everyone wants this to be true, the reality is that we are nowhere near this ideal world.

    In most enterprise organizations I have been in, at least 80% of the applications that are essential to day-to-day line-of-business operations don't support this kind of setup. This is one of the reasons HA is so widely adopted today. On top of that, there is a substantial cost associated with load balancers and a shared database configuration (yes, it needs to be clustered/distributed as well), which might be more than the SLA requires. In those cases vSphere HA / FT / VM and App Monitoring are the way to go: five clicks and it is configured, no special skills needed to enable it... just point and click.

    Once again, I agree that a vFabric load-balanced setup (shameless plug :)) would be ideal, but there are far too many legacy apps out there. Even in the largest enterprise orgs the IT department cannot control this; even the line of business cannot control it... the main reason being that the suppliers are not taking the time to invest.

    Go vSphere HA

    Duncan
    yellow-bricks.com

  9. Duncan Yellow Bricks, 09 August 2011 09:14

    You are making a lot of assumptions here. You are assuming that all critical applications have a huge database. Many applications used on a day-to-day basis have a small database. Many apps used at financial institutions, for instance, are simple apps that just calculate what a mortgage would cost. Although it might be a 20 MB app, it is essential to the line of business; you might not think it is critical, but they feel it is.

    Unfortunately, critical doesn't equal a current or mature application architecture.

  10. We are planning to use VMware FT to run a redundant Citrix NetScaler VPX for our internet-facing applications (10-30k req/sec).
    We could go for NetScaler's traditional cluster setup, but that would require buying 2x licenses. With our existing FT license we get just as much reliability at no extra cost.
    If the software inside that VM were to die, we would be in exactly the same situation as if it were running on a dedicated box.



Ivan Pepelnjak, CCIE#1354, is the chief technology advisor for NIL Data Communications. He has been designing and implementing large-scale data communications networks as well as teaching and writing books about advanced technologies since 1990. See his full profile, contact him or follow @ioshints on Twitter.