vSphere memory performance is no different from other parts of performance management. It's all about pools of resources. If a pool is exhausted, performance suffers. Expanding the pool or reducing noncritical consumption restores performance.
There are four levels of memory pool to consider:
- Configured RAM on the VM
- The vSphere resource pool holding the VM
- Installed RAM on the ESXi hosts
- Total RAM on a DRS cluster
At each level, the pool can be exhausted and result in lowered performance.
How to pinpoint VM RAM saturation
In the end, performance is about an application inside a virtual machine (VM). The configured RAM on the VM is the maximum amount that the operating system -- and therefore the application -- can use. In this respect, a VM is no different from a physical server. Once all of the RAM available to the OS is in use, the OS starts paging to disk as a substitute for RAM.
To fix this, either modify the application to use less RAM or give the VM more configured RAM. The best way to identify VM RAM saturation is with the performance monitoring tools inside the VM, just as you would on a physical machine.
Figure 1 shows a Windows VM where a single application is using the VM's entire physical RAM. This is a classic example where increasing the VM configured RAM can help performance. In the image, the VM is using 1.2 GB of RAM but has only 1 GB allocated. A simple increase of the allocated RAM to 1.5 GB would greatly improve performance.
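The sizing arithmetic behind that example can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, 1.2 GB is approximated as 1,200 MB, and the 25% headroom is an assumed rule of thumb chosen because it reproduces the 1.5 GB figure above, not a VMware recommendation.

```python
import math


def paging_shortfall_mb(configured_mb: int, in_use_mb: int) -> int:
    """MB of guest memory demand that does not fit in configured RAM.

    Anything above zero has to be paged to disk by the guest OS.
    """
    return max(0, in_use_mb - configured_mb)


def suggested_configured_mb(in_use_mb: int, headroom: float = 0.25) -> int:
    """Configured RAM that covers current use plus a headroom fraction.

    The 25% default is an assumption made for this sketch.
    """
    return math.ceil(in_use_mb * (1 + headroom))


# Figure 1 values: roughly 1.2 GB in use on a VM configured with 1 GB.
shortfall = paging_shortfall_mb(configured_mb=1024, in_use_mb=1200)
target = suggested_configured_mb(in_use_mb=1200)
print(f"shortfall: {shortfall} MB, suggested configured RAM: {target} MB")
```

The values for in-use memory would come from the guest's own monitoring tools, such as Task Manager or Performance Monitor on Windows.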
One possible issue to watch for is vmmemctl, the memory balloon driver, using RAM. This driver is part of VMware Tools and is used to reclaim RAM from the VM. If ballooning is causing VM RAM saturation, then you must look at the ESXi server to resolve the problem.
Performance depends on RAM reclaim
ESXi allows RAM overcommit and, as a result, the ESXi server does not always hold all of the VM memory in RAM. When the ESXi host wants to use less RAM to hold a VM, it will reclaim physical RAM from the VM. Ballooning and VMkernel swapping reduce the physical RAM usage for the VM. Both ballooning and swapping are best seen with the vSphere performance graphs; if either of these is nonzero for a VM, then the VM is not getting all of its physical RAM. If a little RAM is being reclaimed, performance impact could be minimal. If a lot of RAM is being taken away from the VM and its applications, then application performance will suffer.
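As a rough sketch of that rule, the Python function below labels a VM's reclaim level from its ballooned and swapped values, as you would read them off the performance graphs. The function name and the 10% cut-off between "light" and "heavy" are assumptions made for illustration; the right threshold depends on the workload.

```python
def reclaim_level(configured_mb: int, ballooned_mb: int, swapped_mb: int,
                  heavy_fraction: float = 0.10) -> str:
    """Classify how much of a VM's RAM the host has reclaimed.

    heavy_fraction is a hypothetical threshold: the share of configured
    RAM above which reclaim is treated as likely to hurt the application.
    """
    reclaimed_mb = ballooned_mb + swapped_mb
    if reclaimed_mb == 0:
        return "none"   # the VM holds all of its RAM in physical memory
    if reclaimed_mb / configured_mb < heavy_fraction:
        return "light"  # impact could be minimal
    return "heavy"      # expect application performance to suffer


print(reclaim_level(1024, 0, 0))
print(reclaim_level(1024, 50, 0))
print(reclaim_level(1024, 574, 0))
```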
In Figure 2, you can see a VM with various amounts of reclaim. The workload remains the same throughout, with a simple application wanting 98% of the 1 GB of RAM allocated to the VM. At the start, all of the VM memory is in physical RAM and the amber Granted line shows 1 GB of physical RAM allowed for the VM. In the middle of the graph, a large amount of RAM is reclaimed with ballooning. Granted drops to 450 MB and ballooned climbs to 574 MB -- the memory that is no longer granted is reclaimed by ballooning. At the end, RAM is reclaimed so aggressively that ballooning is not enough and VMkernel swapping is used. Granted ends up at 50 MB, ballooned at 650 MB and swapped at 128 MB.
The performance of any application inside that VM will be very poor. Figure 3 looks back inside the VM, running exactly the same workload as in Figure 1, and much has changed. RAM usage is more than 1.5 GB and, because there is a lot of paging activity, CPU usage is much higher.
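The numbers quoted from Figure 2 can be checked with simple arithmetic. The Python sketch below uses hypothetical helper names and assumes 1 GB means 1,024 MB. It shows that the middle-of-graph values balance exactly, and how little of the VM's RAM is still granted by the end.

```python
CONFIGURED_MB = 1024  # 1 GB allocated to the VM


def not_granted_mb(configured_mb: int, granted_mb: int) -> int:
    """RAM the VM is configured with but is no longer granted."""
    return configured_mb - granted_mb


def granted_fraction(configured_mb: int, granted_mb: int) -> float:
    """Share of configured RAM still backed by granted physical RAM."""
    return granted_mb / configured_mb


# Middle of the graph: granted 450 MB, ballooned 574 MB.
print(not_granted_mb(CONFIGURED_MB, 450))           # matches the ballooned value
print(f"{granted_fraction(CONFIGURED_MB, 450):.0%}")

# End of the graph: granted 50 MB.
print(f"{granted_fraction(CONFIGURED_MB, 50):.0%}")
```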
Reasons for RAM reclaim
The usual cause of RAM reclaim is that an ESXi server's entire physical RAM is in use. Since the ESXi server cannot grow more RAM, it must share what it has with a group of VMs. Usually you will see a vCenter alarm on the ESXi server under "Host RAM utilization" before the ESXi server starts reclaiming RAM from VMs. The resolution is to shut down VMs or move VMs to another ESXi server.
Be watchful for VMs with a RAM limit -- the maximum amount of physical resource the ESXi server will deliver to the VM. A VM with a RAM limit will have RAM reclaimed even if the ESXi server has RAM to spare. The lower the limit, the more RAM is reclaimed from the VM. RAM limits on VMs are almost never a good idea as they can cause huge performance problems.
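One way to spot problem limits is to compare each VM's limit with its configured RAM. The Python sketch below works on plain values that you would export from your inventory; the function name is hypothetical, and representing "no limit" as None is a convention of this sketch rather than how vSphere stores the setting.

```python
from typing import Optional


def ram_withheld_by_limit_mb(configured_mb: int,
                             limit_mb: Optional[int]) -> int:
    """Configured RAM that can never be backed by physical RAM.

    limit_mb is None when the VM has no limit (an assumption of this
    sketch). A limit at or above configured RAM withholds nothing.
    """
    if limit_mb is None:
        return 0
    return max(0, configured_mb - limit_mb)


# A 4 GB VM limited to 1 GB: 3 GB must be reclaimed once the guest
# touches all of its configured memory, however idle the host is.
print(ram_withheld_by_limit_mb(4096, 1024))
print(ram_withheld_by_limit_mb(4096, None))
```

Any VM for which this returns a nonzero value is a candidate for having its limit removed.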
Know your workloads in a DRS cluster
A DRS cluster is a collection of ESXi servers that together run a group of VMs. The cluster must have enough aggregate resources for the VMs, and the VMs must be able to vMotion to any host. If every host in the cluster has excessive RAM utilization, then you must either upgrade the RAM in the hosts or reduce the demand for RAM in the VMs. If only a single ESXi host has high RAM utilization, or all except one, then there may be a configuration issue with that host. If VMs cannot be vMotioned onto or off the host, then DRS cannot balance the load. Make sure all of the hosts in the cluster can see the same datastores and networks and that they have the same type of CPU. One simple test is to vMotion a high-load VM and see if an error occurs.
Like the rest of performance management, there are a lot of factors at play and you need to know your workload. VMs that are not being granted the amount of RAM that they need will perform poorly. Good configuration control should mean that a DRS cluster provides a large pool of resources for a large population of VMs. Careful monitoring and good planning mean that the cluster never runs out of RAM for its workload and users are happy.
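The triage logic in this section can be sketched as a small Python function. It takes per-host RAM utilization as fractions between 0 and 1 and returns a verdict plus the hosts to look at. The function name, the verdict strings and the 90% threshold for "high" are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def diagnose_cluster(ram_utilization: Dict[str, float],
                     high: float = 0.90) -> Tuple[str, List[str]]:
    """Apply the cluster triage rules to per-host RAM utilization."""
    hot = sorted(h for h, u in ram_utilization.items() if u >= high)
    cool = sorted(h for h in ram_utilization if h not in hot)
    if not hot:
        return ("ok", [])
    if not cool:
        # Every host is busy: add RAM or reduce VM demand.
        return ("capacity", hot)
    if len(hot) == 1:
        # One busy host: VMs may be unable to vMotion off it.
        return ("check-host-config", hot)
    if len(cool) == 1:
        # One idle host: VMs may be unable to vMotion onto it.
        return ("check-host-config", cool)
    return ("mixed", hot)


print(diagnose_cluster({"esx1": 0.95, "esx2": 0.96, "esx3": 0.93}))
print(diagnose_cluster({"esx1": 0.95, "esx2": 0.50, "esx3": 0.40}))
print(diagnose_cluster({"esx1": 0.95, "esx2": 0.96, "esx3": 0.30}))
```

A "check-host-config" verdict is only a prompt to compare that host's datastores, networks and CPU type with the rest of the cluster; it does not prove a misconfiguration.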