
This blog post is also available in an audio format here. Selected blog posts are narrated and distributed via my podcast “The Chinwag”, which is available in both audio-only and video-only formats.

This article is dedicated to Jane Rimmer, one of the leaders of the London VMware User Group, who a couple of weeks ago asked what VMware means by the “Software Defined Datacenter”. Jane often spots typos in my blog posts that I then rush to correct – she once said she would be my professional proof-reader. So I asked her to look this one over. Here you go, Jane!

Last week I spent most of my time working through the “Launch Readiness” material that’s privately available on MyLearn. It’s part of the process that people call “onboarding”. It’s a funny kind of word, “onboarding”. The first meaning likens the company to a great ocean-going liner, with me as a passenger being brought on board. For me it’s the other meaning that’s been more prevalent – trying to take on board a great deal of information about the company, both in terms of the larger-scale vision and the technical and practical ways we intend to deliver it. So part of the onboarding has been understanding the concept of the “Software Defined Datacenter”. It’s a term that many people have asked to be defined. So I want to use this blog post to outline what the phrase means to me, and to explain why I think we have adopted it.

In the previous decade I was one, amongst many, who spearheaded the drive to adopt virtualization. As a VMware Certified Instructor I used to get the delegates to log on to what was called the “Virtual Datacenter”. I liked the term at the time. The datacenter was often in the US and we were accessing the servers, the network and the storage all remotely. The students rather liked the VDC and were impressed by the fact that at the beginning of the week we had nothing, but by the end we had built out a whole VMware environment. The less said about keyboard repeats and trying to type into an ILO/DRAC console session the better! As instructors we liked them because each VDC was consistent – and you could depend on the lab guys to reset it properly at the end of each class. It was better than bellowing over the noise of servers, switches and an FC-SAN at the back of the room!

Little did I think at the time that the abbreviation VDC would become a term in vCloud Director. In case you don’t know, you get “Provider” vDCs (which offer up the resources – storage, network and compute) and logical “Organization” vDCs that represent the tenants that consume those resources. At the time I remember how impressed my students were with how quickly new VMs could be created, especially once a library of templates had been established – we would call them vApps nowadays.
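If it helps to picture the relationship, here’s a minimal conceptual sketch in Python – emphatically not the vCloud Director API, just an illustration of the idea with class and attribute names I’ve invented for the purpose: provider vDCs offer up pooled resources, and organization vDCs are the tenant-sized slices consumed from them.

```python
# Conceptual sketch only - these names are mine, not the vCloud Director API.
# A Provider vDC pools physical capacity; Organization vDCs carve out
# tenant-sized allocations from that pool and deploy vApps into them.

class OrgVDC:
    """A tenant-facing slice of a provider's resources."""
    def __init__(self, org_name, cpu_ghz, ram_gb, storage_tb):
        self.org_name = org_name
        self.cpu_ghz = cpu_ghz
        self.ram_gb = ram_gb
        self.storage_tb = storage_tb
        self.vapps = []                      # what the tenant actually deploys

    def deploy_vapp(self, vapp_name):
        self.vapps.append(vapp_name)


class ProviderVDC:
    """Aggregated compute, memory and storage on offer to tenants."""
    def __init__(self, name, cpu_ghz, ram_gb, storage_tb):
        self.name = name
        self.cpu_ghz = cpu_ghz
        self.ram_gb = ram_gb
        self.storage_tb = storage_tb
        self.org_vdcs = []

    def allocate(self, org_name, cpu_ghz, ram_gb, storage_tb):
        """Carve a tenant allocation out of the provider's pool."""
        if cpu_ghz > self.cpu_ghz or ram_gb > self.ram_gb or storage_tb > self.storage_tb:
            raise ValueError("Provider vDC cannot satisfy this allocation")
        self.cpu_ghz -= cpu_ghz
        self.ram_gb -= ram_gb
        self.storage_tb -= storage_tb
        org = OrgVDC(org_name, cpu_ghz, ram_gb, storage_tb)
        self.org_vdcs.append(org)
        return org


# One provider pool, two tenants consuming slices of it.
gold = ProviderVDC("Gold-Cluster", cpu_ghz=200, ram_gb=2048, storage_tb=50)
acme = gold.allocate("ACME", cpu_ghz=40, ram_gb=256, storage_tb=5)
acme.deploy_vapp("web-tier-vapp")
widgets = gold.allocate("Widgets-Inc", cpu_ghz=20, ram_gb=128, storage_tb=2)
```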

Somewhere along that five-day course I would discuss the fact that if the bottlenecks no longer existed in the provisioning of new servers, different bottlenecks would be exposed elsewhere. I went as far as to say that those bottlenecks would most likely be process ones. The old bottlenecks once disguised and obscured the subterranean bottlenecks beneath. I guess the idea really came from an experience around networking. Generally, when you remove a bottleneck in the network stack, all it serves to do is expose a different bottleneck elsewhere. Paradoxically, the efficiencies we implement in one part of our operations only serve to reveal inefficiencies elsewhere. Incidentally, this isn’t an excuse to maintain the status quo – but a reason to be always pushing at open and unopened doors in a ceaseless quest to improve what we do. If it helps to imagine yourself in the datacenter as a character in “Lord of the Rings”, that’s fine with me – but don’t tell your senior management; they might look at you strangely.

The key to understanding the “Software Defined Datacenter” is to look back at the previous decade, and that term the “Virtual Datacenter”. So far the success of virtualization has largely been restricted to and circumscribed by “compute” virtualization. The last decade was largely a narrative of taking physical servers and converting them into virtual machines. That was a resounding success. However, as we all begin to mature into the new virtual datacenter, it’s becoming increasingly apparent that “compute” virtualization isn’t going to drive the same efficiencies going forward on its own. Now, that’s not to say there aren’t still legs in the compute virtualization model. Everyone is a different number of miles down the road that is their virtualization journey. There are, surprisingly, a significant number of folks who are only just starting. That’s something our community often forgets. At the other end of the spectrum we see increasing cases of 90% virtual and, in some cases, 100% virtual estates. But putting that success aside for one moment, the truth is that in many respects the VM remained, until recently, stubbornly tied to the physical world. This is apparent in three main areas – network, storage and disaster recovery. Over time we have created silos of networking and storage because the technologies demanded them, and they have been coupled to silos of expertise as well.

I remember when I was an instructor I used to warn my students that the course would be about the physical world as well as the virtual. After all, it wasn’t as if our server rooms were suddenly empty, resembling some ghost town from the era of the Wild West with nothing but tumbleweed drifting through. There was a reason. Much of the course was about how the VM accesses resources in the physical world. Where would it be stored? How would it communicate with other VMs on other hosts and with other servers and users on the wider network? How would we back it up and make it available elsewhere? If a physical server died, where would the VM live? Much of this had to be done within the context of the existing physical environment, with virtualization being the “new kid” in the datacenter and the physical world being the older big brother.

A good example of how tied the VM was to the physical world is networking. In the world of vSphere we create “Virtual Switches”, either “standard” ones or “distributed” ones. Yet despite their name these “virtual” switches are very much tied to the physical world. One of the first administrative decisions you must make is which vmnics to assign, and how many, given your needs for fault tolerance. There is support for VLANs via the VLAN tagging process – but VLANs themselves are not defined or created at the vSwitch level. Paradoxically, the Virtual LAN is defined in the physical switch. Despite its name a VLAN is actually quite a physical construct – it’s created in the firmware of the physical switch. Even advanced features such as IP-hash load-balancing require 802.3ad “Link Aggregation” to be enabled on the physical switch. For the most part we still live in a world of subnets and IP addresses where the VM is tied to constructs that limit its portability. True, I have met customers with stretched VMware HA clusters who can move VMs across sites – and who have the pre-requisite networking and storage to allow that to happen. But they are a relatively small number compared to customers who use vMotion and HA within a site. The VXLAN project is about trying to liberate customers from the constraints of the physical datacenter, whose design around physical silos goes back to the early 90s and the rise of client-server x86 computing.
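To make that point concrete, here’s a small conceptual sketch in Python – the names are mine, not the vSphere API or PowerCLI – showing how even a “virtual” switch is really a bundle of references to physical things: the vmnic uplinks it binds to, and VLAN IDs that only mean something because the physical switch defines them.

```python
# Conceptual sketch with invented names - not the vSphere API or PowerCLI.
# Even a "virtual" switch is anchored to the physical world: its uplinks are
# physical vmnics, and its VLAN tags only work if the upstream physical
# switch already has those VLANs defined and trunked.

PHYSICAL_SWITCH_VLANS = {101, 102, 200}      # lives in the switch firmware, not in vSphere


class VirtualSwitch:
    def __init__(self, name, uplinks):
        # Redundancy comes from assigning more than one physical vmnic.
        if len(uplinks) < 2:
            print(f"Warning: {name} has a single uplink - no NIC redundancy")
        self.name = name
        self.uplinks = uplinks               # e.g. ["vmnic2", "vmnic3"]
        self.portgroups = {}

    def add_portgroup(self, name, vlan_id):
        # The 802.1Q tag is applied here, but the VLAN itself lives upstream.
        if vlan_id not in PHYSICAL_SWITCH_VLANS:
            raise ValueError(f"VLAN {vlan_id} is not defined on the physical switch")
        self.portgroups[name] = vlan_id


vswitch1 = VirtualSwitch("vSwitch1", uplinks=["vmnic2", "vmnic3"])
vswitch1.add_portgroup("Production", vlan_id=101)

try:
    vswitch1.add_portgroup("DMZ", vlan_id=999)   # the physical switch knows nothing of VLAN 999
except ValueError as err:
    print(err)
```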

If the network is not your kettle of fish, then look to storage instead. Let’s say you create a VM with three virtual disks (OS, Application, Temp/Log); each one of those virtual disks is likely to be located on a different datastore or, if you have vSphere 5, in a “datastore cluster”. A great deal of complexity comes from this very simple best practice. What RAID level should be used to back each LUN or Volume? How many spindles should back each LUN/Volume for the given workload? And of course, the old chestnut – how many VMs should you place on a single LUN? Even this little tale disguises the other storage concerns – features like vMotion, HA, DRS and FT all require shared storage, with the same datastore being presented to all the hosts. Woe betide any storage admin who doesn’t present and number the LUNs/Volumes to the right hosts, or accidentally de-presents them.
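Just to show how quickly that “very simple best practice” fans out into planning decisions, here’s a rough back-of-the-envelope sketch in Python. The figures and names are invented – the point is the kind of arithmetic (RAID write penalty, spindle count, VMs per LUN) a storage team ends up doing for every datastore.

```python
# Back-of-the-envelope sketch with invented, illustrative numbers - the point
# is the kind of per-LUN questions being asked, not the specific values.

IOPS_PER_SPINDLE = {"15k_fc": 180, "10k_sas": 140, "7.2k_sata": 80}
RAID_WRITE_PENALTY = {"raid10": 2, "raid5": 4, "raid6": 6}


def spindles_needed(frontend_iops, write_ratio, disk_type, raid_level):
    """Roughly how many spindles must back a LUN for a given workload?"""
    penalty = RAID_WRITE_PENALTY[raid_level]
    # Each write costs 'penalty' back-end IOs on mirrored/parity RAID; reads cost one.
    backend_iops = frontend_iops * (1 - write_ratio) + frontend_iops * write_ratio * penalty
    return int(-(-backend_iops // IOPS_PER_SPINDLE[disk_type]))   # ceiling division


# The old chestnut: how many VMs on a single LUN?
vms_on_datastore = 20
iops_per_vm = 60          # assumed average per VM
write_ratio = 0.3         # assumed 30% writes

print(spindles_needed(vms_on_datastore * iops_per_vm, write_ratio, "10k_sas", "raid5"))
# -> 17 spindles behind that one datastore, before you even pick a RAID level for the next LUN
```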

Of course all this new “complexity” that produced massive efficiency was a great opportunity for a little-known guy from the UK. In truth the early adopters of virtualization benefited career-wise from understanding the dependencies and requirements needed to get all the ducks in a row to make this technology sing. We could leave this state of play as is. Wait until we reach 100% virtualization, then fold our hands and proclaim that our work is done. We’d be dead wrong. Here’s why. It’s precisely this over-dependency on the physical world that will actually prevent virtualization hitting 100%. All of us have rubbed up against limitations and restrictions at the network and storage layers – whether they be technical ones or, increasingly for some, political ones.

Last year saw the introduction by VMware of the “Monster VM”; this year we are introducing what I’ll call the “MegaBeast VM”. The barrier to adopting ever-greater levels of virtualization isn’t really the VM anymore. It was once – do you remember when in ESX 2.x the maximum-sized VM was just 2 vCPUs and 3GB RAM? The barriers to virtualization and cloud computing reside in the storage and network layers. The solution to this challenge is not to stop at compute virtualization but to extend virtualization to the other resources the VM needs to be a first-class entity in the datacenter. We’ve talked for many years about virtual appliances – and to some degree the “Software Defined Datacenter” is a more rounded way of talking about those embryonic ideas. So if you look at the new components of vSphere you can see clearly that the term is a quick way to talk about the other resources we are virtualizing. At the network layer we are talking about vShield Edge, Endpoint and App. At the storage layer we are talking about the VSA and vSAN. The difference you will find with VMware is that while we have offerings in these areas, our competitors don’t. They are still leveraging moribund technologies located in the operating system layer that customers have never been really satisfied with, and as a consequence customers have been forced to blend in solutions from third parties. The important thing is that there wasn’t a choice; it was something that was foisted upon them. Now here’s where VMware is playing things smart. The APIs that allow technologies such as vShield to operate don’t lock out our partners. So if you are building a vCloud Director cloud on top of vSphere 5.1 and want to use another vendor’s technologies, like F5 Networks’ BIG-IP, then you’re free to do so. The choice is yours, not one created by a “good enough” component that forces you to put your hand in your pocket for your wallet. So this “Software Defined Datacenter” isn’t a “2001: A Space Odyssey” monolith; it’s much more modular than that. SOFTware is like that. It shouldn’t be like HARDware – implacable or inflexible. The truth is the VM isn’t “naked”, as some people have put it – it’s surrounded by other network services: VLANs, firewalls, load-balancers and security services such as IDS.

For me the “storage array” and its management has been slowly disappearing over the last three or four years. Firstly, I was one of the early adopters of virtual storage appliances when I was given NFR licenses for what was then LeftHand’s iSCSI VSA, which I used to write my very first “Site Recovery Manager” book. Secondly, I’ve been a long-time devotee of technologies such as Dell’s “Integration Tools for VMware Edition” (DIT-VE), EMC’s “Virtual Storage Integrator” (VSI) and NetApp’s “Virtual Storage Console” (VSC). These plug-ins to vCenter all do essentially similar tasks relative to their vendor, such as provisioning new LUNs/Volumes, cloning VMs for virtual desktops and managing the storage vendor’s snapshot technologies for the fast restore of damaged VMs. The interesting thing about these plug-ins is that the more you use them, the more you forget that the array even exists, and you log in less and less to the storage vendor’s management tools. Storage starts to become just another resource you allocate. The centre of my world currently is vCenter, and if I can have one less window open to manage my lab – with fewer clicks and wizards to complete – I’m going to choose that every time. I can see this happening more and more as I begin my first tentative steps on my cloud journey – those physical resources will start to become less physical to me. For me that’s very much like compute virtualization. In the early days, like my students, I was worried about which ESX host my VM was running on. Within a few short months, I began to care less and less. Now I barely think of it at all – in fact, when students used to ask me, it felt somewhat quaint and esoteric to care anymore.

The other area of the “Software Defined Datacenter” resides in a field that’s very close to home for me – disaster recovery. Site Recovery Manager (SRM) recently received a massive uplift in version 5.0, which introduced automated failback and vSphere Replication. For me vSphere Replication is a perfect illustration of the vision. Up until its introduction, customers wanting virtual DR needed matching arrays at both sites, with the smallest unit of replication being the LUN or the Volume. That introduces a whole set of costs and complexities. With vSphere Replication you can replicate just what you need – the VM and nothing else. It’s “VM-aware”, as I used to say to my students. At a single stroke it both solves a problem and introduces new possibilities, because the replication doesn’t reside down in the storage controller and firmware of the array. You can already see the fruits of this appearing very quickly within the lifetime of the vSphere 5.x release. So now vSphere Replication is available as part of vSphere without needing to purchase SRM, and vSphere Replication now supports automated failback the same as array-based replication. vSphere Replication is just software. There is a virtual appliance you download and configure, and off you go. It’s precisely that kind of flexibility that a software defined datacenter is aiming to deliver.
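The granularity point is easiest to see side by side. Below is a conceptual sketch in Python – invented names, not the SRM or vSphere Replication APIs – contrasting array-based replication, where the smallest thing you can protect is the whole LUN, with per-VM replication, where you protect only the VMs you actually care about.

```python
# Conceptual sketch with invented names - not the vSphere/SRM APIs.
# The contrast is the unit of replication: the whole LUN versus the individual VM.

datastore_on_lun12 = ["payroll-db", "web01", "web02", "test-scratch", "old-template"]


def array_based_replication(lun_contents):
    """Array replication protects the LUN, so everything on it goes across the wire."""
    return list(lun_contents)


def vm_level_replication(lun_contents, protected_vms, rpo_minutes=15):
    """vSphere Replication-style: pick individual VMs and an RPO; nothing else moves."""
    return [vm for vm in lun_contents if vm in protected_vms]


print(array_based_replication(datastore_on_lun12))
# ['payroll-db', 'web01', 'web02', 'test-scratch', 'old-template']

print(vm_level_replication(datastore_on_lun12, protected_vms={"payroll-db"}))
# ['payroll-db'] - just what you need, with no requirement for matching arrays at both sites
```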

There will be some, of course, who will say that this vision of the “Software Defined Datacenter” is overly ambitious. I’ve got a couple of things to say about that. Firstly, that’s the point of a vision – ambition. I’ve often likened cloud to the mission to the moon. It took a grand vision to harness and focus the brainpower and hard work of engineers towards a goal. If your vision is an ambition that is easy to achieve, it probably isn’t much of a vision. Secondly, we have been here before. There were many naysayers who said that virtualization would be a tactical technology that wouldn’t go mainstream and would remain corralled into running legacy OSes and applications such as Windows NT, or confined to the dev/test world. Those people were proved wrong. The truth is that IT and business departments so loved virtualization that they could not wait to use it in production. The last couple of years have been about consolidating the position of virtualization as a mainstream, strategic technology in the datacenter. We’ve gone from questioning whether Tier 1 applications should reside in a VM to adopting “virtualization first” policies. Given how conservative the world of the datacenter and infrastructure can be, that’s a terrific amount of change in a very short time. Pretty much like the Wright Brothers inventing powered flight one day, and the jet engine arriving the next. Finally, despite what some media pundits say, this doesn’t put VMware on a collision course with its partners like Cisco or EMC. It’s precisely those partners that will help make this vision a reality.

Anyway, that’s about it for this little polemic. In my next couple of blog posts I’m going to get further down into the weeds, talking about what’s new in vSphere 5.1 and the Cloud Suite.