Resolving VMware ESX problems without pulling the plug


Eric Siebert, Contributor
06.24.2008


Differences between virtual machines and physical servers highlight the unique challenges of resolving virtual machine issues. On a physical server you can always pull the power plug as a last resort before restarting a server. But this strategy may not work on virtual machines, which only have virtual power switches. There are, however, a few toolkits available that either help prevent problems, or make your troubleshooting process easier. I'll discuss several of these in this tip, and give you step-by-step instructions on how to fix various common problems.

VMware Tools
The first set of tools you want to familiarize yourself with is VMware Tools, a set of enhanced drivers and applications that installs in the operating system of each of your virtual machines (VMs). As a best practice, make a habit of always installing VMware Tools to ensure the optimal performance and stability of your VM. Also, double-check that you're running the latest version of VMware Tools after you install any upgrades to ESX (incidentally, some ESX patches also require updates to VMware Tools). A column in the Virtual Machine view of the VMware Infrastructure Client (VI Client) shows the VMware Tools status of every VM: OK, out of date or not installed.

Virtual machine file types
As part of the troubleshooting process, you'll need to understand all the various file types involved with fixing a possible problem. Let's review the files associated with a virtual machine:

  • .nvram file – This file contains the CMOS/BIOS for the VM.
  • .vmdk files – These are the disk files that are created for each virtual hard drive in your VM. Three different types of files use the vmdk extension:
    • *-flat.vmdk file – This is the actual raw disk file that is created for each virtual hard drive.
    • *.vmdk file – This is the disk descriptor file, which describes the size and geometry of the virtual disk file.
    • *-delta.vmdk file – This is the differential file created when you take a snapshot of a VM (also known as a REDO log).
  • .vmx file – This file is the primary configuration file for a virtual machine. When you create a new virtual machine and configure its hardware settings, that information is stored in this file.
  • .vswp file – This is the VM swap file (earlier ESX versions had a per-host swap file) and is created to allow for memory overcommitment on an ESX server.
  • .vmss file – This file is created when a VM is put into Suspend (pause) mode and is used to save the suspend state.
  • .log file – This is the file that keeps a log of the virtual machine activity and is useful in troubleshooting virtual machine problems.
  • .vmxf file – This is a supplemental configuration file in text format for virtual machines that are in a team.
  • .vmsd file – This file is used to store metadata and information about snapshots.
  • .vmsn file - This is the snapshot state file, which stores the exact running state of a virtual machine at the time you take that snapshot.
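
As a quick aid when browsing a VM's directory, the file types above can be turned into a small lookup. The following is only a sketch: the function name and the sample file names are hypothetical, and the descriptions simply paraphrase the list above.

```python
# Suffix-to-description table paraphrasing the file-type list above.
# More specific suffixes come first so "-flat.vmdk" is matched before ".vmdk".
VM_FILE_TYPES = [
    ("-flat.vmdk", "raw virtual disk data"),
    ("-delta.vmdk", "snapshot differential (REDO log)"),
    (".vmdk", "disk descriptor"),
    (".nvram", "CMOS/BIOS settings"),
    (".vmxf", "supplemental configuration (teams)"),
    (".vmx", "primary configuration"),
    (".vswp", "VM swap file"),
    (".vmss", "suspend state"),
    (".vmsd", "snapshot metadata"),
    (".vmsn", "snapshot state"),
    (".log", "activity log"),
]


def classify_vm_file(filename):
    """Return a short description for a VM file, or "unknown"."""
    name = filename.lower()
    for suffix, description in VM_FILE_TYPES:
        if name.endswith(suffix):
            return description
    return "unknown"


# Hypothetical file names, for illustration only.
for example in ("web01.vmx", "web01-flat.vmdk", "web01.vmdk"):
    print(example, "->", classify_vm_file(example))
```

The ordering of the table matters: a flat or delta disk file also ends in ".vmdk", so the longer suffixes have to be checked first.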

Log files
Once you understand VM file types, you'll want to become very familiar with log files. Log files are the best resource for troubleshooting problems with virtual machines, and they are the first place you should check when problems occur.

The most important file is vmware.log. This is the main log file for the VM on the ESX server, and it is located in the VM's working directory. vmware.log is always the current working log for the VM; older log files are numbered incrementally, e.g. vmware-1.log.
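
When a VM has accumulated many log files, a plain alphabetical listing puts vmware-10.log ahead of vmware-2.log. The sketch below orders the names numerically, with the current log first. It assumes only the naming pattern described above; the function names are hypothetical, and it makes no claim about which numbered log is the most recent.

```python
import re


def log_sort_key(filename):
    """Sort key: the current log (vmware.log) first, then by log number."""
    if filename == "vmware.log":
        return 0
    match = re.fullmatch(r"vmware-(\d+)\.log", filename)
    if match is None:
        raise ValueError("not a VM log file name: %r" % filename)
    return int(match.group(1))


def order_vm_logs(filenames):
    """Return VM log file names with vmware.log first, then in numeric order."""
    return sorted(filenames, key=log_sort_key)


print(order_vm_logs(["vmware-10.log", "vmware.log", "vmware-2.log"]))
```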

You should also check /var/log/vmkernel and /var/log/vmware/hostd.log on the ESX host for any errors that may be related to the problem you are experiencing with your VM. Sometimes, restarting the hostd service (service mgmt-vmware restart) on the ESX host will resolve quirky problems with virtual machines. For more common problems, there are more specific techniques that will likely resolve your problem; I'll go over these next.

Problem: Can't shut down a virtual machine
Let's say you cannot shut down a VM using the VM power controls. You can use command line methods to manually kill the stuck VM. Several methods for doing this are listed below. Employ them only as a last resort, short of restarting your ESX host.

  1. The first option you should always try is the vmware-cmd command, which is the command line equivalent of the VI Client power controls.
    • Log in to the service console
    • Type "vmware-cmd -l" to get a list of all VMs and their paths
    • Check the VM state by typing "vmware-cmd //.vmx getstate", where //.vmx stands for the full path to the VM's .vmx file as shown in that list
    • To forcibly stop the VM, type "vmware-cmd //.vmx stop hard"
    • Check the VM state again; it should now be off
    • Type "vmware-cmd //.vmx start" to power on the VM

  2. The second option is to manually kill the VM's process by finding its process identifier (pid) and issuing the kill command to terminate it.
    • Log in to the service console
    • Type "vmware-cmd -l" to get a list of all VMs and their paths
    • Check the VM state by typing "vmware-cmd //.vmx getstate"
    • Type "ps -ef | grep " followed by the name of the VM
    • The second column is the pid of the vmkload_app process for the virtual machine; you can also type "ps -eaf" to see all running processes
    • Type "kill -9 " followed by that pid
    • Check the VM state again; it should now be off
    • Type "vmware-cmd //.vmx start" to power on the VM

  3. The last option is to use the vm-support command to try to force the VM to shut down.
    • Log in to the service console
    • Get the vmid of the VM you want to kill by typing "vm-support -x" or "cat /proc/vmware/vm/*/names"
    • Kill the VM and generate core dumps and logs by typing "vm-support -X " followed by that vmid
    • You will be prompted if you want to include a screenshot of the VM, send an NMI to the VM and send an abort command to the VM. You must answer yes to the abort question to kill the VM. The entire process will take about 5-10 minutes to run. It will create a tar archive in the directory.
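
For the second option, picking the pid out of the ps listing by eye is error-prone, especially because the grep process itself also matches the VM name. The sketch below shows one way to extract the pid from captured "ps -ef" output. It assumes the usual eight-column layout (UID, PID, PPID, C, STIME, TTY, TIME, CMD); the function name is hypothetical, and it only parses text, so the kill command is still issued by hand.

```python
def find_vm_pids(ps_output, vm_name):
    """Return the PIDs (second column of `ps -ef`) of processes whose
    command line mentions vm_name, ignoring the grep process itself."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Expect UID PID PPID C STIME TTY TIME CMD...; this also skips
        # the header row, whose second field is not a number.
        if len(fields) < 8 or not fields[1].isdigit():
            continue
        command = fields[7:]
        if command[0].endswith("grep"):
            continue
        if vm_name in " ".join(command):
            pids.append(int(fields[1]))
    return pids
```

Feed it the text saved from "ps -ef" and the VM name. An empty result means no matching process was found, in which case there is nothing to kill.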

Problem: Can't power on a virtual machine
Another common problem is that you cannot power on a VM. This can happen if the host server does not have enough resources for the VM to use. For example, if the VM has a memory reservation set and the ESX host does not have enough free physical memory to meet the reservation, the host cannot power on the VM. If this happens, you can remove the memory reservation from the VM, migrate the VM to another host with more free physical memory, or free up physical memory on the existing host.

Also, when a VM is powered on, it needs to create a vswp file in the working directory of the VM on the ESX host that is equal in size to the amount of RAM assigned to the VM (minus any memory reservation). If there is not sufficient disk space on your ESX host, you will not be able to power on the VM. A workaround is to set a memory reservation equal to the amount of RAM assigned to the VM, so that the vswp file is 0 bytes in size. It's important, however, to always leave additional disk space on your VMFS volumes for things like logs, swap files and snapshots.
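
The swap file arithmetic is simple enough to check before powering on a VM. The sketch below applies only the rule stated above (assigned RAM minus memory reservation); the function names and the sample numbers are hypothetical, and it ignores any other space the VM may need for logs or snapshots.

```python
def vswp_size_mb(configured_ram_mb, reservation_mb=0):
    """Size of the .vswp file in MB: assigned RAM minus the memory reservation."""
    if not 0 <= reservation_mb <= configured_ram_mb:
        raise ValueError("reservation must be between 0 and the assigned RAM")
    return configured_ram_mb - reservation_mb


def has_room_for_vswp(free_disk_mb, configured_ram_mb, reservation_mb=0):
    """True if the free disk space can hold the VM's swap file."""
    return free_disk_mb >= vswp_size_mb(configured_ram_mb, reservation_mb)


# A 4096 MB VM with a 1024 MB reservation needs a 3072 MB swap file.
print(vswp_size_mb(4096, 1024))
# With a full reservation the swap file is 0 MB, which is the workaround above.
print(vswp_size_mb(4096, 4096))
```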

Problem: Virtual machine encountering boot errors due to OS corruption
If a VM is having problems booting because of operating system corruption or a faulty configuration, a good way to deal with this is to add its virtual disk to another working VM so you can access the drive and make any needed repairs. First, make sure the problem VM is powered off. Next, add an additional drive to a working VM and browse to the problem VM's disk file. Boot the working VM; you can now access the drive of the problem VM to make any changes or corrections. When you are done, remove the drive from the working VM, add it back to the problem VM and try booting it again.

Problem: General virtual machine OS issues
For troubleshooting problems with the VM's operating system, I create a toolkit of ISO files that contain helpful troubleshooting applications that I can quickly mount on a VM's CD-ROM and use (or boot from) to make repairs to a VM. A few of the ISO files I use include:

  • Sysinternals utilities - Great utilities for troubleshooting Windows server problems.
  • GParted – A Linux-based disk partition editor.
  • Knoppix - A Linux-based live CD with many tools and applications.
  • Ultimate Boot CD - A live CD with many system repairs and testing tools.
  • UBCD4Win - A Windows-based live CD with many system repairs and testing tools.

Conclusions
These are just a few of the problems and techniques that you will use when troubleshooting virtual machine problems. The information in this article should help you the next time you experience a problem with a troublesome VM.

ABOUT THE AUTHOR: Eric Siebert is a 25-year IT veteran with experience in programming, networking, telecom and systems administration. He is a guru-status moderator on the VMware community VMTN forums and maintains VMware-land.com, a VI3 information site.
