vCenter VM Hardware Upgrade results in Hung vCenter services

Yesterday, while moving a new vCenter virtual machine from the ESX 3.5 host it was created on to a new ESXi 5.0 host and upgrading its virtual hardware, we found ourselves with a VM that refused to start any services.

The virtual machine is running:

  • Windows Server 2008 R2 SP1
  • vCenter 5.0 Update 1
  • SQL Server 2008 R2 SP1 (10.50.2792)
  • and the whole suite of vCenter services (vum, syslog, dump, web service).

The virtual machine was created on an ESX 3.5 host (Build 604481) and was configured as a VM Version 4. The target platform was a new ESXi 5.0 Update 1 host (Build 623860). We cold-migrated the vCenter VM to the new host via a shared VMFS3 datastore.

At this point, the virtual machine was running fine as a VM Version 4 on the ESXi 5.0 Update 1 host.

I then started the upgrade process, with the installation of the VMware Tools, to ensure I had all the proper drivers in the VM. I then powered off the virtual machine, and upgraded the hardware to VM Version 8.

vCenter - VM Version 8

The system restarted, but the various services were not coming up. I could not open the network settings, and I could not uninstall the VMware Tools because the Windows Installer service was not running. My data and database log disks were not visible, and I could not open the Disk Management control panel.

After much troubleshooting, including restarting the virtual machine in safe mode and various other tests, my colleague found this very interesting article: "Windows Server 2008 computer hang during startup while “applying computer settings” and services configured to start automatically fail to start" http://support.microsoft.com/kb/2004121

The following two paragraphs are taken from the Microsoft Support Article.

Cause

The problems described in the symptoms section occur because of a lock on the Service Control Manager (SCM) database. As a result of the lock, none of the services can access the SCM database to initialize their service start requests. To verify that a Windows computer is affected by the problem discussed in this article, run the following command from the Command Prompt:

[box]sc querylock

The output below would indicate that the SCM database is locked:

QueryServiceLockstatus - Success
IsLocked : True
LockOwner : .\NT Service Control Manager
LockDuration : 1090 (seconds since acquired)

[/box]
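For anyone who wants to run this check across several machines, here is a minimal Python sketch that parses the text printed by sc querylock. It assumes the output uses the "Key : Value" layout shown above; the function names parse_querylock and is_scm_locked are my own and not part of any Microsoft tool.

```python
import subprocess
import sys


def parse_querylock(output: str) -> dict:
    """Collect the 'Key : Value' lines of `sc querylock` output into a dict.

    Keys are lower-cased; lines without a colon (such as the status line)
    are ignored. The layout is assumed from the sample output above.
    """
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip().lower()] = value.strip()
    return fields


def is_scm_locked(output: str) -> bool:
    """Return True when the output reports IsLocked as true (any letter case)."""
    return parse_querylock(output).get("islocked", "").lower() == "true"


if __name__ == "__main__" and sys.platform == "win32":
    # Only attempt the real command on Windows, where sc.exe exists.
    result = subprocess.run(["sc", "querylock"], capture_output=True, text=True)
    print("SCM database locked:", is_scm_locked(result.stdout))
```

If the script reports a lock shortly after boot, and the lock duration keeps growing on repeated runs, the machine is likely showing the symptom described in the article.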

Let me fix it myself

You can modify the behavior of HTTP.SYS to depend on another service being started first. To do this, perform the following steps:
[box]

  • Open Registry Editor
  • Navigate to HKLM\SYSTEM\CurrentControlSet\Services\HTTP and create the following Multi-string value: DependOnService
  • Double click the new DependOnService entry
  • Type CRYPTSVC in the Value Data field and click OK.
  • Reboot the server

[/box]

NOTE: Please ensure that you make a backup of the registry / affected keys before making any changes to your system.
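The steps above, including the backup, could also be scripted with the built-in reg.exe tool. The following Python sketch only builds and prints the two command lines by default; it runs them only on Windows when started with --apply from an elevated prompt. The backup file name, the --apply switch and the build_commands helper are my own assumptions for illustration. Note that /f overwrites an existing DependOnService value without asking, so check first whether one is already present.

```python
import subprocess
import sys

# Registry key named in the Microsoft article.
HTTP_KEY = r"HKLM\SYSTEM\CurrentControlSet\Services\HTTP"


def build_commands(backup_file: str) -> list:
    """Return the reg.exe command lines: export the key, then add the value."""
    # Back up the HTTP service key before touching it (/y overwrites the file).
    export_cmd = ["reg", "export", HTTP_KEY, backup_file, "/y"]
    # Create DependOnService as a multi-string value containing CRYPTSVC.
    add_cmd = [
        "reg", "add", HTTP_KEY,
        "/v", "DependOnService",
        "/t", "REG_MULTI_SZ",
        "/d", "CRYPTSVC",
        "/f",  # no confirmation prompt; replaces any existing value
    ]
    return [export_cmd, add_cmd]


if __name__ == "__main__":
    apply_changes = sys.platform == "win32" and "--apply" in sys.argv
    for cmd in build_commands("http_service_backup.reg"):
        print(" ".join(cmd))
        if apply_changes:
            subprocess.run(cmd, check=True)
```

A reboot is still required afterwards, as in the manual procedure.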

After making the registry modification and a final restart, the virtual machine was working again as expected. This was a very strange error that I have never heard of anyone else running into. So here it is, summarized, and may it be useful someday to someone else…