Why I’m moving away from vVols on IBM SVC storage

Virtual Volumes, or vVols, sound like a pretty nice idea: We present a pool of storage to vCenter, which in turn gets control of storage events within that pool via something called VASA providers. Benefits of this include the following:

  • vVols allow for policy-based storage assignment.
  • We get to use an “inverted” snapshotting method, where snapshot deletions (i.e. commits of snapshotted data), which are by far the most common operation, are almost instantaneous, at the cost of more expensive rollbacks.
  • vCenter gets access to internal procedures in the storage solution instead of having to issue regular SCSI commands to the controllers.

As presented by VMware, the solution should be pretty robust: The VASA providers present an out-of-band configuration interface to vCenter, while the actual data channel is completely independent of them. As recommended by VMware, the VASA providers themselves should also be stateless, meaning that if they are lost completely, recovery should only be a matter of deploying new ones, which should read metadata about the storage from the storage itself and present it back to vCenter.

So what’s the drawback?

If your VASA providers are offline, you can’t make changes to vVol storage, and any vVol-based VMs that aren’t actively running become unavailable. Not being able to make changes to vVol storage is a pretty big deal, because guess what: Snapshots are a vVol storage change. And snapshots are pretty much a requirement for VM backups, which are a daily recurring task in any production environment.

I’ve been presenting vVols from our V9000 and V7000 storage solutions via IBM Spectrum Control Base Edition for quite some time now, and have really liked it. Except when it stopped working. Because it did. Several times. Firmware update on the SAN? Spectrum Control stopped working. HA failover between Spectrum Control nodes? Not reliable. Updates to the operating system on a Spectrum Control node? At least once I couldn’t get the node back online, and had to restore a VM backup. And right now I’m having an issue where some necessary metadata string apparently contains untranslatable Unicode characters because someone – possibly even me – used the Swedish letters å, ä, and Ä somewhere without thinking.
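For anyone wanting to hunt down such characters before they bite, here is a minimal Python sketch. It assumes you can export the relevant names (VM names, pool names, policy names and so on) as a list of strings; the function name and the sample names are made up for illustration, and nothing here talks to Spectrum Control or vCenter.

```python
def find_non_ascii(names):
    """Return {name: [non-ASCII characters]} for every name that is not pure ASCII."""
    flagged = {}
    for name in names:
        # Collect each distinct character outside the 7-bit ASCII range.
        bad = sorted({ch for ch in name if ord(ch) > 127})
        if bad:
            flagged[name] = bad
    return flagged


# Hypothetical object names; replace with an export from your own environment.
sample = ["prod-db01", "Lagring-Malmö", "test_vm"]
for name, chars in find_non_ascii(sample).items():
    print(f"{name}: {' '.join(chars)}")
```

Whether renaming the flagged objects actually fixes a broken provider is a separate question; the sketch only tells you where to look.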

I’ve opened a case with IBM support to get things running again, and as soon as they are, I’m migrating everything off my vVols on SVC and replacing those storage pools with regular LUNs. From now on I’m sticking to vSAN when I want the benefits of modern object storage for my virtualization environment.