How many VM moves per second (or per minute) do you see in a medium-sized data center, and how many in a large one? What would be a reasonable maximum?
Let's start with a few assumptions:
- A hypervisor host has two 10GE uplinks and uses one of them for vMotion;
- vMotion has reasonably small overhead;
- Typical VM size is 4GB = 32 Gbit.
Based on those assumptions, moving that VM across the 10GE uplink would take 3.2 seconds at line rate (32 Gbit / 10 Gbps), or approximately 4 seconds once you add the vMotion overhead.
A hypervisor doing its best to get rid of all its VMs (meaning you placed it into maintenance mode) would thus move roughly one VM every four seconds, for an expected VM move frequency of ~0.25 Hz.
Now for the worst-case scenario: all hypervisors on a ToR switch panic and start shuffling VMs around (within a ToR switch). A typical ToR switch would have ~40 hosts connected to it, resulting in ~10 VM moves per second.
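The arithmetic so far can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the constants are the assumptions listed above, the 4-second move time is the rounded figure from the text (not a measured vMotion number), and the function names are made up for illustration.

```python
# Back-of-the-envelope model of the host-bound vMotion rate.
# All constants are the assumptions from the text, not measured values.
VM_SIZE_GBIT = 4 * 8      # 4 GB VM = 32 Gbit
UPLINK_GBPS = 10          # one 10GE uplink dedicated to vMotion
MOVE_TIME_S = 4.0         # rounded-up move time including overhead (assumption)
HOSTS_PER_TOR = 40        # typical number of hosts per ToR switch (assumption)


def line_rate_transfer_s(size_gbit: float, link_gbps: float) -> float:
    """Time to push a VM across the link with zero overhead."""
    return size_gbit / link_gbps


def moves_per_second(hosts: int, move_time_s: float) -> float:
    """Aggregate move rate when every host moves one VM at a time."""
    return hosts / move_time_s


line_rate_s = line_rate_transfer_s(VM_SIZE_GBIT, UPLINK_GBPS)
per_host_hz = moves_per_second(1, MOVE_TIME_S)
per_tor_hz = moves_per_second(HOSTS_PER_TOR, MOVE_TIME_S)

print(f"Line-rate transfer time: {line_rate_s:.1f} s")
print(f"Per-host move rate:      {per_host_hz:.2f} moves/s")
print(f"Per-ToR move rate:       {per_tor_hz:.1f} moves/s "
      f"({per_tor_hz * 60:.0f} moves/min)")
```

Changing `VM_SIZE_GBIT` or `HOSTS_PER_TOR` shows how quickly the estimate shifts with larger VMs or denser racks.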
Evacuating a ToR switch
Now for another scenario: a ToR switch fails, and even though all the servers connected to it have a redundant 10GE uplink going to another ToR switch, you decide to evacuate them and shut them down (just in case the other switch fails as well).
A typical ToR switch has 160 Gbps of uplink capacity (4 x 40GE), and you don’t want to use it all for vMotion. Let’s assume vMotion can consume 100 Gbps of that capacity. That's the equivalent of ten 10GE links at 0.25 VM moves per second each, resulting in 2.5 VM moves per second (or 4-5 VM moves per second if you’re willing to consume all the uplink bandwidth).
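The uplink-bound numbers follow from the same assumptions. The sketch below (again illustrative, with hypothetical function names) treats the vMotion bandwidth budget as a number of 10GE-equivalent links, each completing one 4 GB move every four seconds, and compares that with the overhead-free line-rate figure.

```python
# Uplink-bound vMotion rate when evacuating hosts behind a ToR switch.
# Constants are the assumptions from the text, not measured values.
VM_SIZE_GBIT = 32            # 4 GB VM
MOVE_TIME_ON_10GE_S = 4.0    # rounded move time over one 10GE link (assumption)


def moves_with_overhead(budget_gbps: float) -> float:
    """Move rate assuming each 10 Gbps of budget completes one move per 4 s."""
    ten_ge_equivalents = budget_gbps / 10
    return ten_ge_equivalents / MOVE_TIME_ON_10GE_S


def moves_at_line_rate(budget_gbps: float) -> float:
    """Upper bound: move rate with zero vMotion overhead."""
    return budget_gbps / VM_SIZE_GBIT


for budget in (100, 160):
    print(f"{budget} Gbps budget: "
          f"{moves_with_overhead(budget):.1f} moves/s with overhead, "
          f"{moves_at_line_rate(budget):.1f} moves/s at line rate")
```

With the full 160 Gbps, the two functions bracket the "4-5 VM moves per second" range quoted above.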
Conclusion: I would be highly surprised if anyone is seeing more than a dozen or so moves per second in a typical enterprise environment with a few thousand VMs. Have you seen more?