In the last couple of days you might have heard that recent performance stress-testing on the new vSphere platform has smashed through the psychological barrier of 1 million IOPS. Now, you might dismiss this as (for want of a better phrase) a “pissing contest” between VMware and its noisy neighbours. But I think its significance is best understood without the marketing goggles that afflict so much FUD, anti-FUD and counter-FUD content – you know, the sort of “my dad’s bigger than your dad” knock-about stuff that circulates. I also think it’s fair to say that it’s unlikely any customer is driving this many IOPS to a single VM. Instead, see this sort of performance work as a kind of drag race – it’s a way of demonstrating what vSphere is capable of driving if hardware and money were no object. It’s also a demonstration that neither the platform nor the virtual machine is the real barrier to virtualization.
For me the announcement is part of a much bigger picture. For some time I have been saying that, more or less, the compute limits that afflicted embryonic virtualization in the 2001–2003 period have finally been put to rest. Let’s just think about that for a second, and go on a nostalgic trip down memory lane. [This is where I put aside my Zimmer frame, settle back in my comfy chair and pull a tartan rug over my knees.] Do you remember when an ESX 2.x VM had just 2 vCPUs and 3GB of RAM? Back then SQL and Exchange “nagmins” could rightly argue that no paltry VM could ever compete with their physical servers….
So VMware took the VM down to the gym, gave it some steroids and an aggressive workout regime. By the end of the decade we now have a VM that supports 64 vCPUs, 1TB of RAM and is capable of 1 M-i-l-l-i-o-n I-O P-e-r S-e-c-o-n-d (Yes, it’s important to wheel out your Dr Evil impression whenEVER you say this!). For me these numbers are more psychologically significant than anything else, in the sense that they send a message to any application owner (aka virtualization naysayer) that whatever their service requires, we can deliver it from the hypervisor. Of course, nothing will come of nothing (as Shakespeare’s King Lear famously said) – if you do have a big-and-beefy VM that makes these sorts of demands on a regular basis, the underlying physical server must have the physical resources to match. So what we’re really saying is that the limits have shifted elsewhere – away from the software limits of the VM, back to the physical world… I think that’s important, because then the only argument against virtualizing a resource-intensive VM is whether it makes economic sense. For me that’s a no-brainer. It’s unlikely that even one of these resource-intensive VMs needs those resources all day long. So the principles of server consolidation play well even with resource-intensive VMs – they can be blended with less contentious VMs, or you can simply reduce the consolidation ratio to such a degree that the physical hardware isn’t over-taxed. Once you accept this rationale, it makes logical sense to look at the other resources and services a VM requires and ask whether they are delivered in a similarly efficient fashion. For me that’s the message at the heart of the software-defined datacenter concept.
The other important issue is appreciating the overhead incurred to achieve these levels of performance. As we all know, vSphere is a platform designed for running VMs and little else. It’s not a generic, vanilla operating system that’s been retrofitted for virtualization. If you want another silly analogy, for me it’s the difference between doing time travel in Doctor Who’s TARDIS and doing time travel in a John DeLorean car with some crackpot professor at the wheel. Being a Brit, I’d pick the TARDIS any day. You see, the John DeLorean car was never designed for time travel, as can be seen from Michael J. Fox’s facial expressions most of the time – it was designed to rescue the failing Irish economy in the 1970s. That really didn’t work out too well.
Anyway, I digress. The overheads. The point is that the VMkernel can achieve these sorts of looney-tunes performance stats without needing eight times the resources that other platforms require. The truth is any vendor can achieve mind-boggling performance numbers if they choose to chuck a load of hardware at the problem and spec up the VMs as if they were running a laptop OS from 1999. What these stats show is that this sort of performance is achievable simply by having a modern hypervisor sitting on modern hardware. No need to cook up some bleeding-edge lab with a load of guys in white lab coats.
The joke I’ve been making with folks in the community about these new VM maximums and specs is that the “Monster VM” is soooooo last year. He’s been replaced within 12 months by the “MegaBeast VM”. The thing about this sort of work is that it isn’t just pointless point-scoring. It’s really about future-proofing the hypervisor so that whatever gets chucked at it over the next decade, it is more than capable of absorbing it. Although right now we’re not pushing this sort of workload, there may come a time when some scaled-out/scaled-up application crosses our paths and places an unexpected strain on our resources, and we don’t want the hypervisor to be in the way. The other scenario I can see is mega-scale virtual desktop solutions, where extremely high or dense virtual desktop consolidation means a lot of little IOPS could turn into one big headache.