Coping with long-distance vMotion requests

During the last Data Center webinar I got an interesting question when describing the inherent problems of long-distance vMotion: “OK, I understand all the implications, but how do I persuade my server admins?”

The best answer I’ve heard so far came from an old battle-hardened networking guru: “Well, let them try”.

A bit more background: he’s running a really tight shop, with no inter-DC bridging. The application and server people in his company got infected with the anywhere-anytime “vision” and wanted to move live virtual machines between data centers. He decided to give them the tools they needed (read: more than enough rope).

Unintentionally they made the experiment more interesting by moving a virtual machine really far away from its data and all the other servers it was communicating with (for our purpose, the farther the better) ... and failed spectacularly. After a while, they came back saying “Now we understand what you were trying to tell us ... can we sit down and figure out what we can do?”

You don’t have to enable inter-DC vMotion to make them understand the problems. Just use a delay emulator like WANem and traffic shaping (to limit the bandwidth) and do the test in your lab.
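Before you even fire up the emulator, a back-of-the-envelope calculation gives you a rough idea of what to expect. The sketch below is only an illustration: the function name, the simple model (sequential round trips plus serialization time) and all the numbers are my assumptions, not measurements from any real application.

```python
def transaction_time_ms(round_trips, rtt_ms, payload_bytes, bandwidth_mbps):
    """Rough estimate of one transaction's duration in milliseconds.

    Assumes the round trips happen one after another (a chatty
    application) and that the payload is sent at the full link rate.
    """
    latency_ms = round_trips * rtt_ms
    # bytes * 8 = bits; Mbps * 1000 = bits per millisecond
    transfer_ms = payload_bytes * 8 / (bandwidth_mbps * 1000)
    return latency_ms + transfer_ms


# Hypothetical application: 200 sequential request/response exchanges
# and 1 MB of data per transaction.
local = transaction_time_ms(round_trips=200, rtt_ms=0.5,
                            payload_bytes=1_000_000, bandwidth_mbps=1000)
remote = transaction_time_ms(round_trips=200, rtt_ms=20,
                             payload_bytes=1_000_000, bandwidth_mbps=100)

print(f"same data center:  {local:.0f} ms")
print(f"after the move:    {remote:.0f} ms")
print(f"slowdown factor:   {remote / local:.1f}x")
```

Plug in the round-trip count and payload size you observe in a packet capture of your own application, then use the same RTT and bandwidth values when configuring the delay emulator and the traffic shaper.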

You might find the following steps handy when replicating this feat:

  • Understand the application traffic flows. The meshier the structure, the better.
  • Document in advance what you expect to happen. Try to educate the server admins and the application people. If nothing else, make them aware why you’re opposed to inter-DC vMotion (it’s not because you’re lazy or because you don’t have a clue how to implement it).
  • Make your boss (or CIO) aware of your predictions. You can’t say “I told them so” afterwards if your boss is not aware of the issue.
  • Make sure the VM is moved under real load. The inter-DC link should already be heavily loaded when the tromboned traffic starts winding around the network.
  • Enjoy the show ;)
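To document your predictions, it might help to write down the traffic flows and work out how much of that traffic lands on the inter-DC link once a VM moves. Here is a minimal sketch, assuming you can describe the flows as a table of server pairs and their bandwidth; the server names, the data center labels and the numbers are all made up for illustration.

```python
def inter_dc_mbps(flows, placement):
    """Sum the bandwidth of all flows whose endpoints sit in different data centers."""
    return sum(mbps for (a, b), mbps in flows.items()
               if placement[a] != placement[b])


# Hypothetical flow matrix: (server, server) -> average Mbps between them.
flows = {
    ("web", "app"): 200,
    ("app", "db"): 400,
    ("app", "cache"): 150,
    ("web", "cache"): 50,
}

# Everything starts in the same data center ...
before = {"web": "DC1", "app": "DC1", "db": "DC1", "cache": "DC1"}
# ... and then someone moves the application server to the other one.
after = dict(before, app="DC2")

print("inter-DC traffic before the move:", inter_dc_mbps(flows, before), "Mbps")
print("inter-DC traffic after the move: ", inter_dc_mbps(flows, after), "Mbps")
```

The more servers the moved VM talks to, the more flows end up crossing the inter-DC link, which is why a meshy application makes the experiment so convincing.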

5 comments:

  1. Rob Marković, 03 December, 2010 08:32

    Why not throw in some WAN optimization at the right parts?

  2. WAN optimization will not reduce the latency and it's questionable what the compression factor would be. Also, 1Gbps+ WAN optimization products are still rare.

  3. You could try Exinda's 1 Gbps+ products - 6060 (up to 1 Gbps), 8060 (up to 5 Gbps), 10060 (up to 20 Gbps) for visibility and QoS Assurance (control) and then intelligently accelerate the traffic that needs to be accelerated. Check the products out at www.exinda.com

  4. Could you get me in touch with someone who could answer in-depth questions?

  5. http://blog.exinda.com/contact-us/ ask for a product manager to contact you



Ivan Pepelnjak, CCIE#1354, is the chief technology advisor for NIL Data Communications. He has been designing and implementing large-scale data communications networks as well as teaching and writing books about advanced technologies since 1990. See his full profile, contact him or follow @ioshints on Twitter.