Do We Need Bare Metal Servers in Public and Private Clouds?

Whenever I compared VMware NSX and Cisco ACI a few years ago (in the late 2010s, in case you’re reading this in a far-away future), someone would inevitably ask “and how would you connect a bare metal server to a VMware NSX environment?”

While NSX-T has had that capability since release 2.5 (more about that in a later blog post), let’s start with the big question: why would you need to?

Hardware-assisted server virtualization has come a very long way since the early days of VMware ESX, and paravirtualization significantly reduced the impact of virtualization on I/O operations – the last figures I’ve seen claimed that the virtualization tax (the performance drop when running on a hypervisor) has shrunk to a few percent. Combine that with the increased flexibility server virtualization gives you, like the ability to increase the resources available to a VM on the fly or to migrate a running instance to another physical server, and one has to wonder why we’re still so obsessed with owning or renting physical servers.
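
If you’re wondering whether your own VMs already benefit from paravirtualized I/O, here’s a minimal sketch that checks it from inside a Linux guest (assuming the standard /proc and /sys layout; the script and its function names are my own illustration, not part of any hypervisor toolkit):

```python
# Minimal sketch (assumptions: Linux guest, standard /proc and /sys layout).
# Checks whether we are running as a guest and whether the I/O devices the
# guest sees are paravirtualized (virtio) rather than emulated hardware.
from pathlib import Path

def running_under_hypervisor() -> bool:
    """Guest CPUs expose a 'hypervisor' flag in /proc/cpuinfo."""
    return "hypervisor" in Path("/proc/cpuinfo").read_text().split()

def virtio_devices() -> list[str]:
    """List the paravirtualized (virtio) devices visible to this guest."""
    virtio_root = Path("/sys/bus/virtio/devices")
    if not virtio_root.exists():
        return []
    return sorted(device.name for device in virtio_root.iterdir())

if __name__ == "__main__":
    print("Running as a guest:", running_under_hypervisor())
    print("Paravirtualized devices:", virtio_devices() or "none found")
```

An empty device list usually means the guest is stuck with fully emulated (or passed-through) devices instead of paravirtualized ones.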

One of my public cloud provider customers told me years ago that they decided to run a single VM on a dedicated hypervisor host whenever their tenants would ask for a bare metal server. The ability to add RAM or CPU cores, or to migrate the tenant VM to another physical server for maintenance or upgrade reasons was way more valuable than virtualization-related loss of performance.

I could find only a few reasons that would justify bare metal servers:

  • Licensing limitations - if you’re forced to deal with a software company that wants to charge you for every CPU socket their software could potentially run on till the heat death of the universe, you’d better get rid of that software, or run it in a way where even their lawyers couldn’t dream up a reason to charge you more.
  • Nested virtualization - You want to run your own hypervisor and need some functionality that cannot be provided with nested virtualization… or you want to run nested virtualization yourself (hint: hardware is usually not able to implement turtles-all-the-way-down designs). A quick way to check what your hosts support is sketched right after this list.
  • Very high I/O performance - While I haven’t heard anyone complaining (too much) about the impact of virtualization on CPU performance (assuming a single-VM environment with no noisy neighbors and a CPU core dedicated to the hypervisor), the software implementation of I/O operations could pose an interesting challenge. For example, no one sane would run high-frequency trading applications on top of virtual switches.
  • Security - Hypervisor exploits remain a recurring nightmare for some security practitioners… although the impact of those hypothetical exploits is greatly reduced if you’re running a single VM on a hypervisor host.
  • Mindset - There’s nothing better than the warm and fuzzy feeling of having your own server… even if it’s just one of many sitting in a data center at an undisclosed location.
  • Marketing - Your architecture (or a kludge implemented for some other customer) supports bare metal servers… and of course you start seeing them everywhere.
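
As mentioned in the nested virtualization bullet above, here’s a minimal sketch that checks what a Linux/KVM host offers (assumptions: Intel or AMD CPU, standard /proc and /sys paths; the script is my own illustration, not a vendor tool):

```python
# Minimal sketch (assumptions: Linux host running KVM on an Intel or AMD CPU).
# Checks whether the CPU exposes hardware virtualization extensions and whether
# the KVM module has nested virtualization enabled.
from pathlib import Path

def hw_virtualization_flags() -> set[str]:
    """Return virtualization-related CPU flags (vmx = Intel VT-x, svm = AMD-V)."""
    tokens = Path("/proc/cpuinfo").read_text().split()
    return {flag for flag in ("vmx", "svm") if flag in tokens}

def nested_virtualization_enabled() -> bool:
    """Check the KVM module parameter that enables nested virtualization."""
    for module in ("kvm_intel", "kvm_amd"):
        param = Path(f"/sys/module/{module}/parameters/nested")
        if param.exists():
            return param.read_text().strip() in ("1", "Y", "y")
    return False

if __name__ == "__main__":
    print("Hardware virtualization flags:", hw_virtualization_flags() or "none exposed")
    print("Nested virtualization enabled:", nested_virtualization_enabled())
```

Run the same check inside a cloud VM to see whether the provider exposes the virtualization extensions to the guest; if it doesn’t, running your own hypervisor means asking for bare metal (or a dedicated host).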

Have I missed anything? Please write a comment…

3 comments:

  1. I think another use case for a bare metal server would be to host some proprietary IVR system with a very old ISA or PCI card with some sort of onboard DSP. I find that even if you do PCI passthrough within VMware, you lose a lot of features on that VM... like vMotion and resource ballooning.

    Mario

  2. It's questionable whether it makes sense to mention "cloud" and "old ISA/PCI card" in the same sentence. Architecting "bare-metal cloud instance support" around such a use case might be marketing nirvana, but it's also technological stupidity.

    I would keep that system in an isolated corner (quarantine seems to be a popular word these days) and connect it to my cloud via an L3 or (worst case) L2 gateway.

  3. The HFT space uses bare metal, kernel bypass, etc.

    Also the scientific research community.
