
How do affinity and anti-affinity rules improve VM failover?

Proper VM placement can go a long way toward maintaining availability in the event of a failure. How can you use affinity and anti-affinity rules with DRS for better VM distribution?

The average administrator often spends considerable time and effort arranging VM distribution on host systems across a data center. Proper VM placement keeps related VMs on common hosts for better performance, gives them the most efficient access to storage and so on, but even the best-laid plans can be undone when a VM fails over. While it's certainly possible to fail over a VM to another host based simply on available resources, this kind of arbitrary failover placement is hardly a best practice. A VMware administrator can implement affinity and anti-affinity rules for Distributed Resource Scheduler to improve VM failover.

Important VMs rarely operate as a single instance in an enterprise data center. You can organize two or more instances of a VM into a cluster to enhance VM resilience. When a VM fails or a failover event occurs, the other nodes in the cluster can continue to support workload availability while the affected node restarts on another host and rejoins the cluster.

It's important to consider where a VM should -- or should not -- restart in your virtualized data center.

It's poor practice to start a VM node on any random host because the VM might require access to certain storage, network or compute resources, as well as access to other related VMs, such as a back-end database. This means an administrator must consider VM placement behavior in advance and apply rules that enforce that behavior.

Where to use affinity and anti-affinity rules

VMware administrators can employ rules in platforms like Distributed Resource Scheduler (DRS) and vSphere High Availability (HA) that guide the placement of VMs during scale-up or failover events. These rules come in two forms: affinity and anti-affinity. Affinity rules place specific VMs onto certain host systems -- or onto a member of a host group -- when a VM placement occurs. Anti-affinity rules force a VM instance onto a different host from other identical VMs during startup or failover.

It's worth pointing out that affinity and anti-affinity are not logical opposites. That is, an affinity for one host is not automatically anti-affinity for all other hosts. For example, you might use an affinity rule when it's important that a VM stays within a specific group of servers when it fails over. Anti-affinity rules, by contrast, prevent a VM from failing over to a host that already carries another instance of the same VM. This, in turn, prevents multiple instances of the same VM from sharing one host and creating a single point of failure that would compromise availability. Which type of rule to use depends on the VM placement goals and limitations.
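The difference between the two rule types can be sketched in a few lines of Python. This is a toy model for illustration only, not VMware code or the DRS algorithm: the `PlacementRules` class, the `candidate_hosts` function and the host and VM names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PlacementRules:
    # VM name -> set of host names the VM is tied to (VM-to-host affinity).
    host_affinity: dict = field(default_factory=dict)
    # Each entry is a set of VM names that must not share a host (anti-affinity).
    anti_affinity: list = field(default_factory=list)


def candidate_hosts(vm, hosts, placement, rules):
    """Return the hosts where `vm` could restart without breaking a rule.

    hosts:     host names that are currently up
    placement: dict of running VM name -> host name
    """
    allowed = rules.host_affinity.get(vm)  # None means no affinity rule applies
    # Other VMs that share an anti-affinity rule with this VM.
    peers = {
        other
        for group in rules.anti_affinity
        if vm in group
        for other in group
        if other != vm
    }
    # Hosts already running one of those peers are off limits.
    occupied = {placement[p] for p in peers if p in placement}
    return [
        h for h in hosts
        if (allowed is None or h in allowed) and h not in occupied
    ]


rules = PlacementRules(
    host_affinity={"db-1": {"esx2", "esx3"}},
    anti_affinity=[{"web-1", "web-2"}],
)
# Hypothetical scenario: esx1 has failed; web-1 and db-1 need new hosts.
surviving_hosts = ["esx2", "esx3", "esx4"]
placement = {"web-2": "esx2"}

web_candidates = candidate_hosts("web-1", surviving_hosts, placement, rules)
db_candidates = candidate_hosts("db-1", surviving_hosts, placement, rules)
print(web_candidates)
print(db_candidates)
```

In this sketch, web-1 is kept off esx2 only because web-2 already runs there, while db-1 is limited to its host group yet is free to share esx2 with web-2. The affinity rule restricts where db-1 can go without separating it from anything.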

While DRS emphasizes the use of rules for proper VM placement, HA's principal priority is to maintain availability. This means HA failover events might not always obey affinity and anti-affinity rules, which can cause problems in older versions of vSphere. With vSphere 6.0 and later, however, the administrator can specify how vSphere HA applies affinity and anti-affinity rules during failovers by editing the DRS rules and selecting the desired vSphere HA behaviors.

There are two types of rule enforcement for vSphere HA. If the administrator selects "HA should respect VM to Host affinity rules during failover," vSphere HA will try to place VMs covered by the rule onto the specified host systems; if that isn't possible, HA will fail over the VM to another available host. If the administrator selects "HA must respect VM anti-affinity rules during failover," vSphere HA will enforce VM separation. If VMs covered by the rule would end up on the same host -- violating the anti-affinity rule -- the failover is cancelled.
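The "should" versus "must" distinction can be illustrated with a short Python sketch. It only models the behavior described above and is not how vSphere HA is implemented; the `ha_failover` function, its parameters and the host names are hypothetical.

```python
def ha_failover(hosts, preferred_hosts=None, peer_hosts=frozenset()):
    """Pick a restart host for a VM, or return None if the failover is cancelled.

    hosts:           surviving hosts, in the order they would be tried
    preferred_hosts: hosts named by a VM-to-host affinity rule ("should" rule)
    peer_hosts:      hosts already running a VM from the same anti-affinity
                     rule ("must" rule)
    """
    # "Must" rule: hosts that would break VM separation are never used.
    permitted = [h for h in hosts if h not in peer_hosts]
    if not permitted:
        return None  # no compliant host exists, so the failover is cancelled

    # "Should" rule: use a preferred host when one is available...
    if preferred_hosts:
        for h in permitted:
            if h in preferred_hosts:
                return h
    # ...otherwise fall back to any other permitted host.
    return permitted[0]


# Affinity honored: a preferred host is still up.
honored = ha_failover(["h2", "h3"], preferred_hosts={"h3"})
# Affinity relaxed: no preferred host survives, so the VM restarts elsewhere.
relaxed = ha_failover(["h2"], preferred_hosts={"h3"})
# Anti-affinity enforced: the only surviving host runs a peer VM.
cancelled = ha_failover(["h2"], peer_hosts={"h2"})
```

The soft rule trades placement for availability, while the hard rule trades availability for separation, which is why the second setting can leave a VM powered off until a compliant host returns.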
