
vMotion Enhancements in vSphere 6

VMware announced several vMotion enhancements in vSphere 6, ranging from “finally” to “interesting”.

vMotion across virtual switches. Finally. The tricks you had to use previously were absolutely bizarre.

vMotion across routed networks. Finally someone learned how to spell routing. What really bothers me about this one is that vMotion across routed networks worked forever (probably relying on proxy ARP), it just wasn't supported. I was always wondering what the real reason for the lack of support was – maybe they had to implement VRF-like functionality to ensure vMotion traffic uses a different routing table than iSCSI or NFS traffic.

vMotion across vCenter servers. This one is a clear illustration of how stupid the long-distance vMotion ideas were. If you wanted to do vMotion across multiple data centers, all of them had to use a single vCenter, making them a single management-plane failure domain (not to mention the minor challenge of losing control of all but one data center if the DCI link fails).

Long-distance vMotion, which now tolerates 100 msec RTT. As expected, it took approximately 10 femtoseconds before a VMware EVP started promoting vMotion between East- and West Coast (details somewhere in the middle of this blog post).

Note to VMware: just because you fixed your TCP stack (which is good), long-distance vMotion makes absolutely no more sense than it did before… not that I would ever expect some people promoting it to understand the nuances of why that’s so.
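The announcement does not say what exactly changed in the TCP stack, so the following is only a back-of-the-envelope sketch of why RTT matters for any TCP-based bulk transfer: a single connection cannot move more than one window of data per round trip. The 100 msec RTT comes from the text above; the 64 KB window and the 10 Gbps link are illustrative assumptions, not VMware figures.

```python
def max_tcp_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on single-connection TCP throughput: one window per RTT."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_bytes * 8 / rtt_seconds


def required_window_bytes(link_bps: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return link_bps * rtt_seconds / 8


if __name__ == "__main__":
    rtt = 0.100  # 100 msec RTT, the tolerance quoted for long-distance vMotion

    # Assumption: a 64 KB window (TCP without window scaling).
    capped = max_tcp_throughput_bps(65535, rtt)
    print(f"64 KB window at 100 ms RTT: {capped / 1e6:.2f} Mbps at most")

    # Assumption: a 10 Gbps link between the two sites.
    needed = required_window_bytes(10e9, rtt)
    print(f"Window needed to fill 10 Gbps at 100 ms RTT: {needed / 1e6:.0f} MB")
```

A stack that can keep enough data in flight removes this particular bottleneck; it does nothing about the failure-domain and traffic-flow objections raised above.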

15 comments:

  1. "vMotion across virtual switches." Could you explain this one in a bit more detail? I'm a bit dense this morning, and the article you linked to skips over it too. Is this just changing the portgroup(s) the VM connects to during a vMotion?

    Replies
    1. It was impossible prior to vSphere 6 to vMotion a VM from a vSwitch to a vDS (or back) ... or between two clusters that were using different vDS switches (two instances of Nexus 1000V, for instance).

  2. VMware vSphere, at least the ESX part, is based on Linux, and Linux has supported multiple routing tables (something VRF-like) for many years. But I still remember times in the 90s when that was not supported.

    Replies
    1. ESXi is 100% VMkernel, no Linux.

  3. Ivan, can you please provide some details behind your statement that "long-distance vMotion makes absolutely no more sense than it did before"? I'd like to learn why this is a bad idea.

    Thx

    Replies
    1. http://blog.ipspace.net/search/label/vMotion

    2. Having other people quoting your blog posts in replies to comments on your blog - priceless ;) Thank you!

    3. Hello Ivan,
      Your blog posts and webinars are absolute gold.
      I understand why long-distance vMotion doesn't make sense - large layer 2 extension/traffic hair-pinning/storage issues/extending failure domain, etc. The question is, how do we get around it? What is the alternative - assuming we are not going to get better designed applications in the near future? What would you recommend for geographical redundancy/DR?

      Looking forward to chapter 9 of your "Data Center Design Case Studies" book!

    4. So does long distance vMotion truly never make sense?
      What about disaster avoidance? I understand the objections about Layer 2 connections between datacenters, but not about long-distance vMotion and desire to provide continuity to non-redundantly designed applications.

      Suppose we do a long-distance vMotion across vSwitches with no true L2 extension, and instead have, say, an OSPFD instance on the VM announcing its routed /32, or a proxy ARP gateway.

      Maybe long-distance vMotion has great uses, but we aren't sufficiently imaginative to give them fair consideration.

  4. Hi Ivan,

    We started a discussion yesterday at sigs but I assume this infrastructure is broken, right? http://wahlnetwork.com/wn/wp-content/uploads/2014/08/vmotion-graphic-650x401.png

  5. Well, it probably is a single failure domain (due to L2 network at the bottom). Whether that's broken seems to be debatable (at least in some circles ;).

  6. vMotion across routed networks sounds great. However, won't you still need some sort of L2 extension method in place for everything to work once everything is vMotioned over, or at least to work without a whole lot of other changes?

    Replies
    1. Now you could actually do fully supported vMotion across routed infrastructure (with L2 VM segment being implemented with VXLAN).

      Previously you could use VXLAN to implement overlay L2 VM segment, but were supposed to have L2 connectivity between vMotion interfaces of vSphere hosts.

    2. Previously, to do VXLAN with DVS and vMotion, I think both clusters needed to be in the same vCenter and use the same DVswitch.

      That was pretty restrictive and limited, especially if you are considering disaster recovery or avoidance to be a goal.

      What happens if your one DVswitch gets corrupted, or a script kiddie hacks into vCenter and sends mass force-poweroff and delete-VM commands? Now both clusters are down!


    3. James,
      Your vCenter should probably be pretty well isolated from the Script Kiddies. If they can get into your vCenter they can probably get into your core routers and "reload" or get into your PDUs and power everything off.

