VirtualCenter (VC) is the heart of any large VMware ESX environment. ESX can still run without it, but features such as VMware Distributed Resource Scheduler (DRS) and VMotion are not available, which makes host management more difficult.
When trouble occurs with VirtualCenter it is important to have it working properly as soon as possible. This article will give you some tips for troubleshooting VirtualCenter and help you to resolve any problems that occur with it.
Restarting a stopped Windows service
Typically, when a problem occurs within VirtualCenter, its Windows service stops. If you have an enterprise monitoring system, you should set both the VirtualCenter Server and License Server services to alert you when this happens. Depending on the source of the problem, restarting the service may bring it back up (for example, if the database was down or network connectivity to the database server was lost).
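If you do not have a monitoring system, a small script can check the service state for you. The sketch below only parses the text printed by the Windows `sc query` command and pulls out the state name. The function name `parse_sc_state` is made up for this example, and the service name `vpxd` used in the usage note is an assumption; confirm the actual name in the service's properties on your server.

```python
import re


def parse_sc_state(sc_query_output):
    """Return the state name (e.g. RUNNING or STOPPED) from `sc query` output.

    Returns None when no STATE line is present, for example when the
    service name passed to `sc query` does not exist.
    """
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_query_output)
    return match.group(1) if match else None
```

On the VC server you could feed it the output of `sc query vpxd` (service name assumed) and raise an alert whenever the result is anything other than RUNNING.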
Beginning with VC version 2.0.2, the recovery option for the service is set to automatically restart in the event of a failure. If this is not the case on your server, then you can manually set it by editing the properties of the VirtualCenter Server service. Select the Recovery tab and then change the First, Second and Subsequent Failure options to "Restart the Service." VirtualCenter will subsequently keep trying to restart itself and will be able to recover from any temporary problems on its own.
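The same recovery settings can also be applied from the command line with the Windows `sc failure` command, which is handy if you manage more than one VC server. The sketch below only builds the argument list; it does not run anything. The helper name `build_recovery_command`, the service name `vpxd`, the 60-second restart delay and the one-day failure-count reset are all assumptions for illustration.

```python
def build_recovery_command(service_name="vpxd", delay_ms=60000, reset_seconds=86400):
    """Build the argv list for an `sc failure` call.

    The three restart actions correspond to the First, Second and
    Subsequent Failure options on the service's Recovery tab.
    """
    actions = "/".join(["restart/%d" % delay_ms] * 3)
    return [
        "sc", "failure", service_name,
        "reset=", str(reset_seconds),   # seconds before the failure count resets
        "actions=", actions,            # action/delay pairs, delay in milliseconds
    ]
```

Passing the returned list to `subprocess.call` from an administrator session on the VC server would apply the settings; check the result afterwards on the Recovery tab.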
General VirtualCenter troubleshooting
The first place to look when troubleshooting VirtualCenter problems is the log files. The main VC log file is vpxd-#.log, which should give you a good idea of the cause of the problem. It is located in C:\Windows\Temp\VPX for VC 2.0.x and in %allusersprofile%\Application Data\VMware\VMware VirtualCenter\Logs for VC 2.5.x. Multiple logs are kept and rotated automatically.
Check the vpxd-index file to see which numbered log file is currently in use, or sort the files by modified date. You should also check the Windows Event Log for any errors. You can enable detailed database logging by editing the vpxd.cfg file, which is located in the %allusersprofile%\Application Data\VMware\VMware VirtualCenter directory. To do this, insert the trace lines below and then restart the VirtualCenter service, because vpxd.cfg is read only at VirtualCenter startup. To enable the most detailed logging, change the log level to trivia (extended verbose) in either vpxd.cfg or the VMware Infrastructure Client.
In addition to the logs on the VC server, you may want to check the VC-related logs on the ESX server. They are /var/log/vmware/vpx/vpxa-#.log (VC agent) and /var/log/vmware/hostd-#.log (host agent).
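Sorting the rotated vpxd logs by modified date, as suggested above, is easy to script. The following is a minimal sketch; the function name `newest_vpxd_log` is made up for this example, and the directory you pass in should be the log location for your VC version.

```python
import glob
import os


def newest_vpxd_log(log_dir):
    """Return the path of the most recently modified vpxd-*.log file.

    Returns None if the directory contains no matching log files.
    """
    candidates = glob.glob(os.path.join(log_dir, "vpxd-*.log"))
    if not candidates:
        return None
    # The file with the latest modification time is the one in use.
    return max(candidates, key=os.path.getmtime)
```

For VC 2.0.x you would call it as `newest_vpxd_log(r"C:\Windows\Temp\VPX")`.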
How to fix VirtualCenter database issues
Many VirtualCenter problems are caused by issues with the database that VC uses to store its information. It's usually best to let a qualified database administrator (DBA) handle database maintenance issues if possible. Some common database problems and how to handle them are below:
- Loss of connectivity to the database server
Check for any network issues on your VC server, including speed/duplex mismatches and defective ports or cables. Any momentary loss of connectivity between VC and the database server can cause VC to quit. Your network administrator should be able to determine if there are any issues with the physical network between the two servers. If your VC server, your database server or both are running in virtual machines, check for resource constraints on your hosts to ensure each virtual machine (VM) is getting the resources it needs to function properly.
- Database server is out of disk space
If you are using SQL Server, this is typically caused by the transaction logs filling up. If you are using the default Full Recovery Model, consider switching to the Simple Recovery Model, which greatly reduces the size of the transaction logs. You can also shrink the transaction log to reduce its size. See how in this KB article.
- Maximum tablespace size reached (Oracle only)
For Oracle databases, the tablespace that your database uses may not be set to auto-extend once it reaches its maximum size (defined when it was created). Many DBAs do not like to have this set to auto-extend by default. Have your DBA check this, and extend it if necessary. For more information see this
- Database authentication issues
If your log file shows database authentication errors, you should verify that your VC server can successfully authenticate with the database server. Open the ODBC connection on the VC server and click Test Connection to verify connectivity. If authentication fails, ensure that the database username/password is correct. If you need to change this on the VC server, you must run a repair of the VC installation to reset it. This is because the username/password is stored in the registry in encrypted form and can only be set when installing VC, or through the VC user interface while it is running. You can read more in this KB Article.
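For the loss-of-connectivity item above, a quick first check is whether the VC server can open a TCP connection to the database listener at all. This is a minimal sketch, not a replacement for the ODBC Test Connection step, since it does not authenticate or run a query. The function name `can_reach` and the host name in the usage note are hypothetical.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

For a SQL Server instance on its default port you might run `can_reach("dbserver.example.com", 1433)` from the VC server; an Oracle listener normally uses port 1521. Running the check in a loop can help catch the momentary drops described above.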
Possible problems with VirtualCenter upgrades
Another common problem can occur when upgrading VirtualCenter to a newer release. Each new release of VirtualCenter is a full version and will overwrite the previous version. Take care when upgrading and carefully answer the upgrade questions. Many users have inadvertently wiped out and re-initialized their databases while upgrading.
In later versions of VirtualCenter, the prompts have been changed to decrease the chances of this happening. As part of the VC server upgrade, the VC agent on each ESX server that it manages is also upgraded. Many times this does not complete successfully and afterwards some of your ESX servers will appear as disconnected in VC.
There are several preventative measures. With some versions of ESX 3.0.x, you have to ensure that the /tmp/vmware-root directory exists on each ESX server or the VC agent install may fail. This KB article has some options for dealing with this.
Fixing an ESX server that appears as disconnected
If some of your ESX servers appear as disconnected, first try restarting the hostd service on the server that is showing as disconnected. Log into the Service Console and type "service mgmt-vmware restart". Next, try restarting the vpx and authd services by typing "service vmware-vpxa restart" and "service vmware-vmkauthd restart".
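If you have many hosts to work through, the restart sequence can be scripted. The sketch below runs the three Service Console commands in order and records each exit code. It assumes a Python interpreter is available where you run it, the vpxa restart is written in the form used later in this article, and the names `RESTART_COMMANDS` and `restart_management_services` are made up for this example.

```python
import subprocess

# Restart order: host agent first, then the VC agent and authd.
RESTART_COMMANDS = [
    ["service", "mgmt-vmware", "restart"],
    ["service", "vmware-vpxa", "restart"],
    ["service", "vmware-vmkauthd", "restart"],
]


def restart_management_services(run=subprocess.call):
    """Run each restart command in order.

    `run` takes an argv list and returns an exit code; it can be
    replaced with a stub for a dry run. Returns a list of
    (command string, exit code) pairs.
    """
    results = []
    for cmd in RESTART_COMMANDS:
        results.append((" ".join(cmd), run(cmd)))
    return results
```

A non-zero exit code in the results points to the service that failed to restart, which is the one to investigate in the host logs.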
If the server still displays as disconnected, you can try manually installing the VC agent on the ESX server by following these steps:
- On the VC server, locate the upgrade folder under your VC server program directory. Open the bundleversion.xml file and look for the bundle ID that corresponds to your ESX version (e.g., ESX 3.0.x = 6, ESX 3.5.x = 7)
- Copy the appropriate vpx-upgrade file for your bundle ID to a temp directory on your ESX server (e.g., vpx-upgrade-esx-6-linux-
- Install the file by typing the following from the Service Console in the directory that you copied the file to: sh ./
- Once it completes, restart the vpx and hostd services on the ESX host by typing "service vmware-vpxa restart" and "service mgmt-vmware restart"
- If your server still shows disconnected, then a reboot of the ESX server is usually needed to recover from this.
Generating a log bundle
If you need to contact support, they will usually want you to generate a log bundle for their information. This is the VC equivalent of running vm-support on an ESX server, and it includes the log and configuration files that the support staff will need to help troubleshoot your problem.
You can run this from the Start menu on the VC server; it is located in the VMware folder and called "Generate Virtual Center Server log bundle." Alternatively, if your VirtualCenter server is running, you can create a log bundle for VC from the VI Client: select Administration, Export Diagnostic Data, uncheck any ESX hosts and make sure "Include information from VC Server and VI Client" is checked.
ABOUT THE AUTHOR: Eric Siebert is a 25-year IT veteran with experience in programming, networking, telecom and systems administration. He is a guru-status moderator on the VMware community VMTN forums and maintains VMware-land.com, a VI3 information site.