The paravirtual SCSI controller and the blue screen of death

For driver reasons, the default disk controller in VMware guests is an emulated LSI card. However, once you install VMware Tools in Windows (and immediately after installing the OS in most modern Linux distributions), it’s possible to slightly lower the overhead for disk operations by switching to the paravirtual SCSI controller (“pvscsi”).

I’m all for lower overhead, so my server templates are already converted to use the more efficient controller, but I have quite a lot of older Windows servers that still run the LSI controller, so I’ve made it a habit to switch controllers when I have them down for manual maintenance. There is a perfectly good way of switching Windows system drives to a pvscsi controller in VMware, and it’s well documented, so up until a couple of days ago, I had never encountered any issues.

However, when I tried to do the switch on a domain controller the other day, it refused to boot – it just came up in a blue screen of death during the boot process. Naturally, I have backups, so that wasn’t much of an issue – thanks to Veeam it took all of three minutes to have the machine back in its previous configuration. The issue kept bugging me, though, so the next day I restored a copy of the machine in a closed-off lab environment to experiment a little.

What I noticed was that restoring the LSI controller and running a “Last known good configuration” from the Windows boot menu returned the computer to working order. Also, the fact that I got as far as a blue screen of death meant Windows actually saw the system drive through the pvscsi controller. Booting into a command prompt through the rescue environment did not help: In diskpart only the system drive was visible. Booting into Safe Mode through the Windows boot menu resulted in the same blue screen. But when I tried booting into Directory Services Restore Mode (DSRM), everything fell into place. DSRM is essentially Safe Mode, with the addition that Active Directory is not loaded. Note that this means you need to dig out the DSRM password you created when promoting the server to a domain controller. Since Windows booted with no issues into the DSRM environment, what had happened became obvious to me:

I use a secondary drive for storing AD data. When you move the system drive to a pvscsi controller, a quirk of Windows is that all other drives are taken offline until you manually bring them online. So when the Active Directory services started and couldn’t find their data locally, one of them crashed horribly, taking the entire computer down with it. Getting the computer to run properly with a pvscsi controller was simply a matter of onlining the data drive from DSRM.

In conclusion: It’s possible to upgrade a Windows domain controller from the default emulated LSI SCSI controller to the paravirtual SCSI controller even if you have the AD data on a secondary volume. The steps are as follows:

  • Make a backup. Seriously. Always do this.
  • Add a SCSI controller from the vSphere web client and make sure its type is “Paravirtual SCSI”.
  • Look in the Windows device manager and confirm that the driver has loaded.
  • Shut down the server and change the type of the primary SCSI controller to pvscsi. The second controller can be removed, or the secondary vdisk can be given a SCSI ID that puts it on the secondary controller – likely (1:0).
  • Start the DC and be quick on the F8 key to show the Windows boot menu.
  • Select the Directory Services Restore Mode option and press Enter.
  • After the computer has booted, log in with the DSRM password for the server.
  • In Server manager select the local server, select Disk and File management, and set the secondary drive Online.
  • Reboot the server into normal operation.
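
The onlining step near the end of the list can also be done from a command prompt with diskpart instead of Server Manager. Below is a minimal Python sketch of that idea, not something taken from the procedure above: it picks the offline disks out of `list disk` output and generates a diskpart script that brings them online. The function names and the sample output are illustrative assumptions, and the parsing assumes the English-language diskpart column layout. Clearing the read-only attribute is included on the assumption that the disk may have been flagged read-only when it was taken offline.

```python
import re

# Illustrative sample of English-locale `list disk` output.
# This is an assumed example, not output captured from the server in the post.
SAMPLE_LIST_DISK = """\
  Disk ###  Status         Size     Free     Dyn  Gpt
  --------  -------------  -------  -------  ---  ---
  Disk 0    Online           60 GB      0 B
  Disk 1    Offline          40 GB      0 B
"""


def find_offline_disks(list_disk_output):
    """Return the numbers of all disks whose status column reads Offline."""
    pattern = re.compile(r"^\s*Disk\s+(\d+)\s+Offline\b", re.MULTILINE)
    return [int(number) for number in pattern.findall(list_disk_output)]


def build_online_script(disk_numbers):
    """Build diskpart script text that brings each given disk online."""
    lines = []
    for number in disk_numbers:
        lines.append(f"select disk {number}")
        lines.append("online disk")
        # Assumption: the disk may also be marked read-only; clear that flag.
        lines.append("attributes disk clear readonly")
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    offline = find_offline_disks(SAMPLE_LIST_DISK)
    print(build_online_script(offline), end="")
```

Saved to a text file, the generated script can be run from an elevated prompt in DSRM with `diskpart /s <file>`. Check the disk numbers against `list disk` on the actual server first, since they depend on how the virtual disks are attached.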