Before Talking about vMotion across Continents, Read This

I expect to hear a lot about the “wonderful” idea of moving running VMs 100 msec away (across the continent) in the upcoming weeks. I would recommend you read a few of my older blog posts before considering it… and don’t waste time trying to persuade the true believers with technical arguments – talk with whoever will foot the bill or walk away.
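To put the 100 msec figure in perspective, here is a back-of-the-envelope sketch of what that round-trip time does to a single TCP session. The function names, the 64 KiB window, and the 1 Gbit/s link are illustrative assumptions, not figures from any vendor; the only fixed input is the 100 msec RTT mentioned above.

```python
# Rough arithmetic only: how RTT limits a single TCP session.
# Window size and link speed below are assumed for illustration.

def window_limited_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput of one TCP session: one window per round trip."""
    return window_bytes * 8 / rtt_seconds


def bandwidth_delay_product_bytes(link_bps: float, rtt_seconds: float) -> float:
    """Bytes that must be in flight to keep a link of the given speed full."""
    return link_bps * rtt_seconds / 8


if __name__ == "__main__":
    rtt = 0.100  # 100 msec, the distance discussed in this post

    # Assumed: a 64 KiB receive window (no window scaling).
    mbps = window_limited_throughput_bps(64 * 1024, rtt) / 1e6
    print(f"64 KiB window at 100 msec RTT: {mbps:.2f} Mbit/s at most")

    # Assumed: a 1 Gbit/s link between the two sites.
    mbytes = bandwidth_delay_product_bytes(1e9, rtt) / 1e6
    print(f"Window needed to fill 1 Gbit/s at 100 msec RTT: {mbytes:.1f} MB")
```

With those assumptions, an unscaled window caps a session at roughly 5 Mbit/s, and filling a 1 Gbit/s link needs about 12.5 MB in flight. Sessions that survive the move still pay that latency on every round trip to whatever they left behind.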

The Basics

Challenges of Long-Distance vMotion and Stretched VLANs

Why Long-Distance vMotion Doesn’t Make Sense

A Few Odd Bits and Pieces

Latest blog posts in Disaster Recovery series

7 comments:

  1. I have read somewhere that NSX/network properties can be vMotioned with the VM, so I guess that with NSX (I am not an NSX specialist at all) you can vMotion things like the BGP/OSPF announcements as well? That would solve some of the issues, right?

    I assume that when you vMotion a VM to another vCenter, Storage vMotion is implied, so the VM ends up using the storage at the other site. In the end, you will have migrated the VM to another PoP of yours, and the VM does not keep any links or references to the old DC (so we don't have the L2 link brokenness).

    From my perspective, the feature is really nice. But they should have called it "vMigrate" or "Live-Export" or something else, not vMotion again, because my reasons for using vMotion within a vCenter aren't the same as my reasons for using vMotion between data centers.

    If we all agree that spanning an L2 network for DR reasons is not a smart idea, I can't find many use cases where you'd want to LIVE migrate a VM to another data center for production reasons.



    Replies
    1. What you're talking about has been supported for ages, and even simpler if you're willing to re-address the VM at the other end.

      What all the marketing gurus are so excited about is moving a running VM across the continent and retaining all application sessions while doing so. As you said, not exactly a smart idea, but that never bothered some people.
    2. Intel was (or still is) talking about disaggregating servers via PCI over fiber. I wonder if you could do kind of the same thing by pulling the network stack out of the VM into the host. Some piece of software (or multiple instances of said software, running in threads or whatever) would handle all VM network state outside the VM, while the drivers in the VM would be more like wrappers. So ARP, DNS, TCP windowing, etc. would be handled by the external piece of software, and it would communicate with the VM over some channel, kind of like a software PCI channel. Disaggregated VMs. Then you could probably simplify the whole stack in the hypervisor, from OVS (or now VPP) up through the VM network stack. Then synchronizing network state between sites could become easier. Or not.
    3. Disaggregating the TCP stack might be the least of the problems. The real problems are bandwidth, latency, and the broken aggregation boundaries you get when you start moving IP addresses outside of their subnets.
  2. Haha, as I read the news about this latest VMware announcement I thought "Ivan will love this", opened your site, and here's your blog post! Quick reaction. It might be worth reposting this blog post every now and then as a preventive measure. Lest we forget.
  3. Agree with all of your statements on the 'wonderful' (2014/09) merits.

    I wanted to point out an additional 'political' advantage. Customers don't like the 'idea' of downtime, so I expect long-distance and cross-vCenter vMotion to be used as migration tools.

    I've recently been involved in a scenario where new servers were pre-staged and third-party tools were used to stream data between environments over the production interfaces.

    Service providers (especially large ones) are typically change-averse. When cloud product v.1 is end-of-sale and cloud product v.3 is current, migrating between them is currently very difficult (both are VMware environments, but on separate infrastructure).
  4. I want to vMotion things across the continent, because it sounds cool!

    It might also be convenient to throw random workloads that need some spare cycles onto remote physical server resources that might happen to have spare cycles.
