Using vCenter Operations Manager to tame rowdy virtual machines

This second installment of the vCenter Operations Manager tutorial explains the interface that details your virtual environment's health.

VMware vCenter Operations Manager is a reporting tool designed to help system administrators find and eliminate virtual infrastructure issues. Here, we'll look at the interface, how it works and some tips to understand it.

(This is part two; part one covered vCOPs installation and setup.)

Exploring the vCOPs dashboard

To log in to the vCenter Operations Manager (vCOPs) system, use the URL you set up during installation. Use the login that you set up -- the domain admin account will work. It is also possible to use the application via the vSphere console if the vCOPs plugin is installed.

If a serial number is not entered, the system will work in evaluation mode. Evaluation mode doesn't expire, but it doesn't give access to the more advanced features, such as the customizable dashboard. To add the license, open vSphere client, Administration, followed by Licenses. Click Assets and you will see vCOPs listed. Double-click on this, and add the evaluation or fully licensed key. Restart the application from the vCOPs admin console (Status, Application controls, then Restart). This may take several minutes.

After logging in you will see the vCOPs user interface (UI).

Health weather map
Figure 1. The Dashboard tab shows the health weather map of your virtual environment, risk trends and workload strain.

Notice the Dashboard tab is split into three parts. (The Dashboard is only available in the standard and advanced editions.) The left-hand column lets you drill down through your infrastructure. (See Figure 1.) The middle section gives you the data you are interested in. Collapsing the left-hand column will give you more space. There are several tabs at the top, including Environment, Operations and Alerts. The tab on the far right gives a general overview of the infrastructure. If you see a gray tab, that means no data has been collected for this item.

Each one of the tabs across the top gives different metrics for the item selected on the left. In order to understand the metrics, you need to understand how the badges and data are made up and sorted.

The badge colors indicate the health of the system: Green means good; yellow means potential problem; orange means degraded performance; gray means unknown state; and X means powered off.

The badges at the top of the screen -- Health, Risk and Efficiency -- are the most important metrics. For Health and Efficiency, a higher score is better; for Risk, a lower score is better. The other metrics on the Dashboard tab are workload, anomalies and faults.

The Health badge summarizes the overall state of the environment. Beneath it, under the health weather map heading, is a grid of blocks that gives a pictorial view of the environment's overall health. It covers the last six hours so you can see historical information. Hover over any block that needs investigation to get the name of the item in question, or click on the block for more details and to see where the issue lies.

Workload reflects the load that the selected item is currently experiencing. Over time, the system will adapt and generate a range for normal operation. Anything outside of the normal range will cause the score to decrease accordingly.
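The idea of a learned normal range can be illustrated with a short Python sketch. This is not how vCOPs computes its ranges (its analytics are not described here); it simply assumes a range of the mean plus or minus two standard deviations over past samples. The function names and the sample CPU figures are hypothetical.

```python
from statistics import mean, stdev


def normal_range(history, k=2.0):
    """Return (low, high) bounds as mean +/- k standard deviations.

    Assumption for illustration only: vCOPs uses its own analytics,
    not necessarily this formula.
    """
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation; needs >= 2 points
    return mu - k * sigma, mu + k * sigma


def is_abnormal(value, history, k=2.0):
    """True if value falls outside the range learned from history."""
    low, high = normal_range(history, k)
    return value < low or value > high


# Hypothetical CPU utilization samples (percent) for one virtual machine.
cpu_history = [40, 42, 38, 41, 39, 43, 40, 41]

print(is_abnormal(41, cpu_history))  # a reading close to past behavior
print(is_abnormal(85, cpu_history))  # a reading far above past behavior
```

The longer the history, the more stable the range becomes, which is why a monitoring tool of this kind needs time to collect data before its scores are meaningful.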

Anomalies are events or changes that are unexpected. Small anomalies are normal, but if you get a spike in the anomalies section, you can infer that something is outside of the norm. A good example is when you get more than the "noise" level of alerts from vCenter, indicating something serious is happening or has happened. The alarm will let you know that something is wrong with the infrastructure.

Viewing overall virtual environment health

Key metrics

Figure 2. The Operations tab gives a more detailed look at key metrics on the processor and memory, as well as other metrics on your infrastructure.

The main window will give you a global score of how healthy the environment is. This is made up of the various metrics discussed above. To drill down, double-click on your World Health button; your view will then switch to the Operations view. (See Figure 2.) This will fill out and give you a better visual breakdown of the health of your infrastructure. You can click CPU, RAM and other metrics and see them at a global level.

On the Operations pane you will see a Top Offenders list that shows the items within the infrastructure that have problems. To see the issues on these items, click on the hyperlink.

It is possible to change the values at which the badges change color. There are also two different types of items that make up the workload badge. The first is the virtual machine and the second is the datastore. This means that you can customize the datastore badge to turn yellow at, say, 80% utilization and red at 90% utilization.
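The threshold logic in that example can be sketched in a few lines of Python. This is an illustration only: the function name and parameters are hypothetical, and in vCOPs the thresholds are set through the product's configuration screens, not through code like this.

```python
def badge_color(utilization_pct, yellow_at=80.0, red_at=90.0):
    """Map a utilization percentage to a badge color.

    Hypothetical helper: the defaults mirror the 80%/90% example
    in the text and are not vCOPs defaults.
    """
    if not 0 <= utilization_pct <= 100:
        raise ValueError("utilization must be between 0 and 100")
    if yellow_at >= red_at:
        raise ValueError("yellow threshold must be below red threshold")
    if utilization_pct >= red_at:
        return "red"
    if utilization_pct >= yellow_at:
        return "yellow"
    return "green"


# Hypothetical datastore utilization readings (percent).
for pct in (50, 80, 85, 90, 97):
    print(pct, badge_color(pct))
```

Checking the red threshold first ensures that a value above both thresholds is reported at the more severe level.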

Alerts can also be sent via email (SMTP) or SNMP. This can be configured from the vCOPs Administration console. Click SMTP/SNMP, then enable the service you want and enter the credentials as needed.

Finally, there is a section of predesigned reports you can run to give useful information, for example on oversized and undersized virtual machines. Run the report and download the PDF to see how your environment stacks up against an ideal one.

This was last published in September 2013

