VSAN: As Always, Latency Is the Real Killer
When I wrote my stretched VSAN post, I thought VSAN used asynchronous replication across the WAN. Duncan Epping quickly pointed out that it uses synchronous replication, and I fixed the blog post.
The “What about latency?” question immediately arose somewhere in my subconscious, but before I could add that thought to the blog post, Anders Henke wrote a lengthy comment that totally captured what I was thinking, so I’m including it in its entirety:
Note that any kind of synchronous replication also suffers from the extra network latency; because of that, VMware's VSAN must be designed for a local network only.
They do have dedicated products for asynchronous remote replication, and one can probably combine VSAN with them. But please don't ignore the added latency and physics :-)
As a worst-case example: your HDD has an average rotational latency of around 2-3ms, the time until a sector can be read or written. Even assuming the sector is written instantly once it's under the head, an average write operation still takes around 2ms.
If you're doing replication in the metro area with a millisecond of overall round-trip network latency, that latency is added to every write request: your remote HDD probably won't have its data committed in 2ms, but in 2+1=3ms.
You could use SSDs in the hypervisor hosts for local caching, reducing the overall latency to (almost) WAN latency, but then the difference between local VSAN and stretched VSAN would be even worse. See also below.
Also note that VMware claims the maximum latency supported for stretched VSAN is 5 msec. By now you should be able to figure out what that does to your write performance.
Depending on what your application actually does and how often synchronous data is forced onto disk, sync replication in this setup may functionally decrease the overall hard disk performance by up to 50%.
Of course, in real life the various write-back caches in operating systems, hypervisors, RAID controllers and hard disks lie about having something "really" written to disk, so that 50% is the worst case for "every single sector/block is forced to disk". But even if a write is not forced to disk, the network latency still accrues before the remote system can acknowledge the write. Overall, the network latency is simply added on top of the access time.
Even with a standard OLTP mix (70% read, 30% write), the impact of high-latency writes is obvious: read performance doesn't change, but write performance gets noticeably worse.
If your application doesn't cope well with extra write latency and you still require synchronous writes, you may need to switch from HDD to SSD, reducing the local access time from 2ms to close to zero and leaving you with pure network latency.
With more distant remote locations the problem gets worse: 3ms is negligible in the world of WAN, but if your 2ms hard disk suddenly needs 5ms before data is committed, that's a considerable slowdown.
And when your top-notch high performance database's average write latency suddenly jumps from 0.1ms (SSD) to 3.1ms (remote SSD), someone will probably notice (+3000%).
Summary: As always, think before you jump, don't believe in the bandwidth fairy, and consider all the implications of stretched technologies… but if you're a regular reader of my blog, you probably know that by now ;)
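To put some concrete numbers on Anders' back-of-the-envelope math, here's a quick sketch (my own illustrative model, not anything VMware publishes): it assumes a synchronously replicated write completes only after the remote copy acknowledges it, ignores caching, queuing and protocol overhead, and reuses the figures from the comment above, 2 ms HDD access time, 0.1 ms SSD access time, and 1/3/5 ms of round-trip network latency. The function names are mine.

```python
# Back-of-the-envelope model of synchronous replication latency:
# a write completes only once the remote copy has acknowledged it,
# so effective write latency = local media latency + network RTT.
# Caching, queuing and protocol overhead are deliberately ignored.

def sync_write_latency(media_ms: float, rtt_ms: float) -> float:
    """Effective write latency with synchronous replication."""
    return media_ms + rtt_ms

def mixed_avg_latency(media_ms: float, rtt_ms: float, write_ratio: float = 0.3) -> float:
    """Average I/O latency for a read/write mix; reads are served locally."""
    read_ms = media_ms
    write_ms = sync_write_latency(media_ms, rtt_ms)
    return (1 - write_ratio) * read_ms + write_ratio * write_ms

for name, media in [("HDD (2 ms)", 2.0), ("SSD (0.1 ms)", 0.1)]:
    for rtt in [0.0, 1.0, 3.0, 5.0]:   # local, metro, Anders' SSD example, max supported
        w = sync_write_latency(media, rtt)
        avg = mixed_avg_latency(media, rtt)
        print(f"{name:12} RTT {rtt:3.0f} ms -> write {w:4.1f} ms, "
              f"70/30 mix avg {avg:4.2f} ms")
```

In this simple model, an HDD write at the 5 ms maximum supported RTT goes from 2 ms to 7 ms, and even the 70/30 mix sees its average I/O latency climb from 2 ms to 3.5 ms; for SSDs the relative hit is far larger (0.1 ms to 5.1 ms), which is exactly Anders' point.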
I understand what he is trying to say, but we are forgetting that we are trying to solve a business problem here. Any stretched storage platform has the same challenge when it comes to latency, yet NetApp MetroCluster, EMC VPLEX, 3PAR and others are still relatively popular solutions. Why? Simply because in many cases it is 10x easier to provide this level of resiliency through an infrastructure-level solution than to rely on third-party application providers to change their whole architecture to give you the resiliency you need. As you know, getting large vendors to change their application architecture isn't easy and can take years... if it happens at all.
These types of solutions are developed for relatively short distances and thus still relatively low latency. Sure, it has been validated to tolerate a 5ms hit, but that doesn't mean this would be acceptable from a customer's point of view; that decision is up to the customer. The same applies to bandwidth: what can you afford, what is available in your region / between sites, and so on.
Stretched infrastructures are not easy to architect, or deploy for that matter, but I truly believe that with Virtual SAN we've made the storage aspects 10x easier to manage and deploy than they have ever been.
That was not the point I was trying to get across. I do think customers are concerned about latency; at the same time, they are concerned about availability. It is up to them to figure out which is more important.
What alternative do you have? I am not sure what point people are trying to make with discussions like these; it is not as if it's easy to get a whole application architecture changed.
I'm positive most people reading about stretched VSAN never considered the impact of the additional latency (even I initially thought it didn't matter).
More information: VMware Virtual SAN 6.1 Stretched Cluster Bandwidth Sizing (PDF): http://www.vmware.com/files/pdf/products/vsan/vmware-virtual-san-6.1-stretched-cluster-bandwidth-sizing.pdf