There are a number of storage enhancements in vSphere 5.1.
VMFS and VOMA
vSphere 5.1 increases the number of ESXi hosts that can simultaneously access a shared file on a VMFS5 volume from 8 to 32, essentially bringing it in line with NFS. In real terms this should get around the “linked clone” restriction that limited you to 8 ESXi hosts per cluster when supporting VMware View linked clones. The improvement also helps vCloud Director when it’s used to create vApps in a more “Test/Dev” style deployment as a replacement for the now end-of-life Lab Manager product. It means the fast-provisioning process for vApps now scales better.
Included in this release is a utility called the “vSphere On-disk Metadata Analyzer”, or VOMA, which resides on the ESXi 5.1 host as a command-line tool. It runs in a read-only, check-only fashion and is intended as a diagnostic tool that reads and validates the integrity of the VMFS metadata. Of course, problems in the VMFS file system are extremely rare, so you might find yourself only using this tool in conjunction with VMware support.
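By way of illustration, a check against a VMFS datastore looks something like the sketch below. The naa device name is a placeholder for one of your own devices, and as I understand it the volume needs to be quiet (VMs powered off or migrated away) before you run the check.
# Find the device and partition backing the datastore you want to check
esxcli storage vmfs extent list
# Run a read-only metadata check against that partition (the naa ID is a placeholder)
voma -m vmfs -f check -d /vmfs/devices/disks/naa.600508b1001c577e11e3767be8a05297:1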
Space Efficient (SE) Sparse Disk
I talked about this briefly in the opening part of this series. Let me just repeat that in case you missed it before:
This is a feature that has done the rounds of nearly all the top bloggers this year – including Gabrie van Zanten and Jason Boche – Thin Provision Reclamation. In the bad old days the only way to recoup “orphaned” free disk space caused by file deletes was to run a VM through a truly horrible sdelete.exe process, which wrote out any deleted files (and ballooned up a thin virtual disk at the same time), followed by the one-two upper-cut of a Storage VMotion to claw back the space. In recent months there have been 3rd-party tools and flings to handle this issue; it is now being addressed directly in the platform. vSphere 5.1 introduces a new SE Sparse VMDK format. Where this issue has been particularly apparent is in the area of virtual desktops and linked clones. It might also find an application in the general vSphere environment now that PowerCLI 5.1 supports the creation of linked clones – without the need for View or vCD.
But it also has an application in vCloud Director when vCD is used to rapidly build out a test/dev or lab-management environment. The reclaim process has two stages. In the first stage a scan identifies the stranded space, a SCSI UNMAP command is issued to the virtual SCSI controller, and the vmkernel marks those blocks as free. In the second stage a SCSI UNMAP command is sent to the array so the space is released there as well. If the volume is an NFS datastore, an RPC “truncate” call is used rather than a SCSI UNMAP to the array.
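Incidentally, if you want to check whether the block device backing a datastore will honour that second-stage UNMAP, esxcli can report the VAAI “Delete Status” for it – the naa ID below is just a placeholder for one of your own devices.
# Report VAAI primitive support for a device; “Delete Status: supported” indicates it accepts SCSI UNMAP
esxcli storage core device vaai status get -d naa.60a98000572d54724a34642d71325763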
All Paths Down State
It’s a rare occurrence, given the amount of redundancy and fault-tolerance built into storage connections, to lose all the paths to a LUN/volume or even an entire array. But there is this thing called Murphy’s Law that has to be recognised. There are a couple of cases where things can get confused with storage. Say our friendly storage admins remove a LUN/volume backing a VMFS/NFS datastore without the appropriate work being done in vSphere first (evacuate the VMs with Storage VMotion, then remove or unmount the datastore from the hosts). Another case could be a failure in the network layer that cuts off access to an iSCSI array, or indeed a scenario where the array dies altogether or becomes inaccessible – say in a site-wide outage.
Previously, processes like hostd (the ESXi host management service) would sit there and scan, waiting and hoping for the storage to return. That could consume worker threads and potentially disconnect the host from vCenter. vSphere 5.0 introduced a new condition in the storage layer called “Permanent Device Loss” (PDL), which instructs hostd to give up and time out its attempts to access the storage. vSphere 5.1 builds on that intelligence to deal with transient situations where storage access has been lost – but not forever. You can see this enhancement in the same context as the way Site Recovery Manager added a quicker method of failing over to the recovery site. Previously SRM would try to synchronise with the Protected Site as if it were a planned failover. Of course, in the context of the loss of an entire site, there’s little point in attempting a sync when the protected site might be gone. So SRM 5.0.1 added an option to do a faster failover by not bothering to execute those sync steps in the recovery plan.
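From what I can tell, the new APD behaviour in ESXi 5.1 is governed by a pair of advanced settings – Misc.APDHandlingEnable and Misc.APDTimeout (140 seconds by default) – which you can inspect or tweak from the shell. Treat the sketch below as an illustration of where the knobs live rather than a recommendation to change them.
# Confirm the new APD handling is switched on (1 = enabled)
esxcli system settings advanced list -o /Misc/APDHandlingEnable
# View the timeout applied to non-VM I/O once a device enters the APD state (default is 140 seconds)
esxcli system settings advanced list -o /Misc/APDTimeout
# Set it back to the default if it has been changed
esxcli system settings advanced set -o /Misc/APDTimeout -i 140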
Storage Protocol Enhancements
There are improvements in both the iSCSI and FC stacks. There’s now full support for jumbo frames on both the hardware and software iSCSI adapters. In terms of setup, you can set the MTU size in the GUI, and the vmkernel ports that back the vSwitch configuration will pick this up. On the FC side of the house, 16Gb FC is now supported (subject to the HCL requirements). Previously, 16Gb HBAs were supported but they had to be run in 8Gb mode.
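If you prefer the command line to the GUI, the same MTU change can be made with esxcli. In the sketch below vSwitch1 and vmk1 are placeholders for whichever standard vSwitch and VMkernel port carry your iSCSI traffic, and of course the physical network has to support a 9000-byte MTU end to end.
# Raise the MTU on the standard vSwitch used for iSCSI (vSwitch1 is a placeholder)
esxcli network vswitch standard set -v vSwitch1 -m 9000
# Raise the MTU on the VMkernel port bound to the iSCSI adapter (vmk1 is a placeholder)
esxcli network ip interface set -i vmk1 -m 9000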
Advanced IO Management (IODM part of ESXCLI)
ESXi 5.1 gets a massive uplift to its ESXCLI command set – but that’s a whole other story. From a storage perspective there are new commands that allow you to gather information about FC frame loss, LIP resets and physical resets. As well as that, there are ESXCLI commands to gather information about SSD drives, including media wear-out, temperature and reallocated sector counts. This one is quite interesting. Many SSD drives have a reserved pool of blocks that act as a Plan B should a block become unavailable, so if your SSD drive starts using those reallocated sectors it could be an indication of a potential problem. SSD vendors will be able to add their own plug-ins for vendor-specific metrics. If you do have SSD drives, the command below will pull out the information:
esxcli storage core device smart get -d <device_id>
Right now these options are available in ESXCLI only and aren’t visible in vCenter – which kind of makes sense, as they are mainly diagnostic options.
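On the FC side, my understanding is that the IODM data sits under the esxcli storage san namespace. The sketch below shows the sort of calls involved – vmhba2 is a placeholder for one of your own adapters, and it’s worth checking esxcli storage san fc --help on your build for the exact sub-commands and options.
# List the host's FC adapters and their current link state
esxcli storage san fc list
# Pull the I/O device management statistics (frame errors, link failures and so on) for one adapter
esxcli storage san fc stats get -A vmhba2
# Review recent FC events such as LIP resets
esxcli storage san fc events get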
SIOC/Storage DRS
There are a number of changes surrounding SIOC and Storage DRS. First up, SIOC is now turned on by default in a “statistics only” mode. If I understand it rightly, this means that should you decide to enable SIOC fully at a later stage, the system doesn’t need to spend time building up the kind of information you would need to configure it correctly. The default congestion threshold is 30ms. That’s quite high for SSD-backed datastores, which would reach their thresholds much earlier – so something to bear in mind if you are using SSD. There is also an automatic threshold detection process: once SIOC is fully enabled, it uses the collected stats to work out the “peak” IOPS, measures the latency, and then sets the SIOC threshold at 90% of that value by default.
Storage DRS gets an update as well. Firstly, there’s interoperability with vCloud Director. That’s us fulfilling the promise of making sure the cloud suite has good integration points where they are needed – it also means that SDRS is aware of the “linked clones” that vCD can generate. Inside SDRS the “datastore correlation detector” can now work out whether a datastore is backed by the same or different spindles as another datastore – that’s quite important to avoid a Storage VMotion event that needlessly moves a VM from one datastore to another within the same RAID group or aggregate, with no discernible improvement in performance. This detection used to require a VASA provider from the storage vendor, but is now baked into vSphere 5.1. Finally, there has been an improvement in the logic behind observed latency. Previously latency was measured from the ESXi host to the storage – now it is measured from the Guest OS’s virtual SCSI controller and back again. This should give SDRS an even more accurate picture of the storage performance it’s seeking to improve.