VMware Fault Tolerance is a high-availability feature that can be used within a VMware High Availability cluster. However, high availability is not synonymous with fault tolerance; there are meaningful differences between the two terms. Each setup requires different available resources and will affect virtual machines differently.
Knowing your company's uptime and data recovery needs is the first step to determining whether you need high availability, fault tolerance or neither. The next step is understanding the system requirements of VMware FT and HA and how each product works.
The resources presented in this guide to VMware FT and HA will help you understand your options, with some real-world use cases to compare to your IT infrastructure and needs. Once you've perused the guide, test your knowledge of VMware HA and FT in our pop quiz.
Table of contents:
Where VMware FT and HA differ
The key difference between VMware's Fault Tolerance (FT) and High Availability (HA) products is interruption to virtual machine (VM) operation in the event of an ESX/ESXi host failure. Fault-tolerant systems instantly transition to a new host, whereas high-availability systems will see the VMs fail with the host before restarting on another host.
VMware High Availability
VMware High Availability should be used to maintain uptime on important but non-mission-critical VMs. While HA cannot prevent VM failure, it will get VMs back up and running with very little disturbance to the virtual infrastructure. Consider the value of HA for host failures that occur in the early hours of the morning, when IT is not immediately available to resolve the problem.
In addition to handling VMs during an ESX/ESXi host failure, VMware High Availability can monitor an individual VM and restart it, provided a host in the cluster has enough resources to run it.
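The idea of restarting a VM only where enough resources remain can be illustrated with a small sketch. This is a toy model, not VMware's placement logic: the `Host` and `VM` classes, the resource fields and the "most free memory wins" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    name: str
    free_cpu_mhz: int
    free_mem_mb: int
    alive: bool = True


@dataclass
class VM:
    name: str
    cpu_mhz: int
    mem_mb: int


def pick_restart_host(vm: VM, hosts: List[Host]) -> Optional[Host]:
    """Return a surviving host that can fit the VM, or None if none can.

    Hypothetical rule: among hosts with enough free CPU and memory,
    prefer the one with the most free memory.
    """
    candidates = [
        h for h in hosts
        if h.alive and h.free_cpu_mhz >= vm.cpu_mhz and h.free_mem_mb >= vm.mem_mb
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda h: h.free_mem_mb)


hosts = [
    Host("esx01", free_cpu_mhz=4000, free_mem_mb=8192, alive=False),  # the failed host
    Host("esx02", free_cpu_mhz=1000, free_mem_mb=16384),  # too little free CPU
    Host("esx03", free_cpu_mhz=6000, free_mem_mb=12288),
]
vm = VM("app-server", cpu_mhz=2000, mem_mb=4096)

target = pick_restart_host(vm, hosts)
print(target.name if target else "no host has enough resources")
```

In this example the VM lands on `esx03`, because `esx01` is down and `esx02` lacks the CPU headroom. If no host qualifies, the function returns `None`, which mirrors the point that a restart depends on spare capacity in the cluster.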
VMware Fault Tolerance
VMware vSphere Fault Tolerance has been around since 2009. If your company cannot withstand downtime for end users, VMware FT or a similar tool is required. Don't use FT for load balancing -- its role is protecting VMs when an ESX server goes down.
How does VMware FT work?
VMware FT instantly moves VMs to a new host via vLockstep, which keeps a secondary VM in sync with the primary, ready to take over at any second, like a Broadway understudy. The VM's instructions and instruction sequence are the actor's lines, which pass to the understudy on a dedicated server backbone network. Heartbeats ping between the star and understudy on this backbone as well, for instantaneous detection of a failure.
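The understudy idea can be sketched in a few lines. The following is a toy simulation, not vLockstep itself: the `ToyVM` and `Secondary` classes, the `HEARTBEAT_TIMEOUT` value and the rule that each replayed instruction also counts as a heartbeat are simplifying assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

HEARTBEAT_TIMEOUT = 3  # hypothetical: silent ticks before the secondary takes over


@dataclass
class ToyVM:
    """A stand-in 'VM' whose entire state is a running total."""
    state: int = 0

    def execute(self, instruction: int) -> None:
        self.state += instruction


@dataclass
class Secondary:
    vm: ToyVM = field(default_factory=ToyVM)
    ticks_since_heartbeat: int = 0
    active: bool = False

    def replay(self, instruction: int) -> None:
        # Replaying the primary's instruction keeps both states identical.
        # Simplification: receiving an instruction also counts as a heartbeat.
        self.vm.execute(instruction)
        self.ticks_since_heartbeat = 0

    def tick_without_heartbeat(self) -> None:
        # No word from the primary; promote once the timeout is reached.
        self.ticks_since_heartbeat += 1
        if self.ticks_since_heartbeat >= HEARTBEAT_TIMEOUT:
            self.active = True


def run(instructions: List[int], primary_fails_after: int) -> Tuple[ToyVM, Secondary]:
    primary, secondary = ToyVM(), Secondary()
    for step, instruction in enumerate(instructions):
        if step < primary_fails_after:
            primary.execute(instruction)
            secondary.replay(instruction)
        else:
            secondary.tick_without_heartbeat()  # primary is silent
    return primary, secondary


primary, secondary = run([5, 7, 11, 0, 0, 0], primary_fails_after=3)
print(primary.state, secondary.vm.state, secondary.active)
```

Because the secondary replays the same instructions in the same order, its state matches the primary's at the moment of failure, so it can take over without a restart.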
How and when to use VMware FT
So your company's IT resources are mission-critical, and unplanned downtime is out of the question. Ramp up fault tolerance tools and you're done, right? Not so fast. VMware FT has stringent hardware requirements to take into account when requisitioning server hardware. Before you plan a fault-tolerant virtualized environment, check out your options for when and where to use FT.
New product features in VMware FT and HA
With a major overhaul to HA in vSphere 5 and murmurs of a soon-to-be-released new feature, we share some key points to know about VMware FT and HA road maps.
What's new? Faster failover in VMware HA, but no FT for SMP
VMware is planning a new high-availability design, called Virtual Machine Component Protection, for release in 2013. It aims to improve failover by choosing which VM on a host to vMotion according to specific failover conditions.
Unlike HA, VMware FT uses synchronous replication to prevent any service interruption in the event of a VM failure. Mission-critical applications need fault tolerance, but despite user interest, FT for symmetric multiprocessing (SMP) systems seems stuck in a VMware preview purgatory.
High availability in a heartbeat
VMware instituted new intelligence for High Availability in vSphere 5. If the master host becomes unavailable or orphaned from the network, an election process chooses a new master. If a host becomes orphaned from the cluster in vSphere 5's HA, heartbeats over the storage network are available as a backup check, which helps prevent false-positive failovers. Admins can choose their heartbeat data stores in the HA Clustering dialog boxes.
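The value of a second heartbeat channel is easier to see in code. The sketch below is a deliberately simplified decision table, not the actual HA agent logic: the `HostStatus` names, the `classify_host` function and the rule that only a fully silent host triggers restarts are assumptions for illustration.

```python
from enum import Enum


class HostStatus(Enum):
    HEALTHY = "healthy"
    UNREACHABLE_BUT_RUNNING = "off the network, still writing datastore heartbeats"
    FAILED = "failed"


def classify_host(network_heartbeat: bool, datastore_heartbeat: bool) -> HostStatus:
    """Combine both heartbeat channels into a single verdict."""
    if network_heartbeat:
        return HostStatus.HEALTHY
    if datastore_heartbeat:
        # Silent on the network but alive on storage: likely a network
        # problem, so declaring the host dead would be a false positive.
        return HostStatus.UNREACHABLE_BUT_RUNNING
    return HostStatus.FAILED


def should_restart_vms(status: HostStatus) -> bool:
    # Simplification: only restart VMs elsewhere when the host is silent
    # on every channel.
    return status is HostStatus.FAILED


for net, ds in [(True, True), (False, True), (False, False)]:
    status = classify_host(net, ds)
    print(net, ds, status.name, should_restart_vms(status))
```

With only a network heartbeat, the second and third cases would look identical, and VMs that are still running could be restarted elsewhere by mistake.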
Goodbye Legato, hello Fault Domain Manager
VMware also revamped the HA architecture in vSphere 5. Fault Domain Manager (FDM) took over for Legato Automated Availability Manager software, which was frustratingly complex. Now, admins have one master server, with all other servers in the HA cluster waiting in the wings to help in the event of a failure. If you're switching to vSphere 5 from an older version, make sure you have at least two shared data stores between all hosts in the HA cluster. What other changes can you expect? Heartbeats, simpler log and configuration files, and installs in under a minute, thanks to FDM.
The nitty gritty of vSphere 5's HA and FT setup
With the move from Legato to FDM come major HA architecture changes, even if the "look and feel" will be familiar to legacy users. Learn the responsibilities of Master and Slave hosts in a cluster. This tip also covers important advice for using FT now that it is properly compatible with VMware's Distributed Resource Scheduler (DRS).
Coming soon to VMware FT and HA
In video footage from a VMware User Group (VMUG) meeting in Italy, VMware CTO Steve Herrod discussed plans for the company's products, including Fault Tolerance with SMP support and native application-awareness for High Availability.
Tips and tricks for VMware FT and HA
Check out these real-world scenarios for VMware HA and FT implementation to better understand how and why you will use either tool.
How to configure a VMware HA cluster
VMware HA cluster hosts must be configured exactingly for failovers to occur as planned. Before implementing VMware HA, consider performing a test of HA capabilities on a virtual machine, run through the list of cluster requirements and learn about the common problem of a host in a cluster failing during configuration.
HA experiences from the field
VMware HA is not a 100% uptime tool, as described in the comparison of VMware HA and FT. If you aren't sure whether HA offers the right amount of uptime support for your infrastructure, check out these real-world examples of VMware HA in action.
What to do when good HA goes bad
If your infrastructure runs VI3-based HA, there are several situations where VMware HA can fail or run into communication issues. You might be creating these situations just by growing your IT infrastructure. Follow these tips to prevent or resolve HA issues, and determine if HA is for you.
Use cases for VMware FT
Are you using VMware HA and the current edition of vSphere, but users are demanding 100% application availability? Do critical application servers within your organization have a single point of failure? Each organization's reasons for implementing fault tolerance may be different, but these five general use cases offer guidance on whether you need FT now or in the future.