Fault tolerance goes beyond high availability to provide constant uptime for virtualized infrastructures. That means VMware Fault Tolerance requirements also go beyond those for high availability, because FT is based on networked pairs of synchronized VMs.
VMware High Availability (HA) ensures high uptime for important data center resources, restarting virtual machines (VMs) immediately when a failure occurs. Nevertheless, there will be a short period in which the service is not available. If you need constant uptime, you need VMware Fault Tolerance (FT).
VMware FT came out with vSphere 4. Critical VMs run as synchronized pairs: a primary VM on one host and its secondary VM on another host. VMware vLockstep technology keeps the two VMs in a synchronized state. If the primary VM fails, the secondary takes over with no downtime. The user experiences no interruption or loss of connection.
VMware FT's vLockstep technology ensures that all x86 instructions are executed in an identical sequence on both hosts. The primary VM leads this process: immediately after executing instructions, it sends them to the secondary VM over a dedicated FT logging network. Only the primary VM generates output.
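The lockstep idea can be illustrated with a toy model. The sketch below is not VMware's implementation. It assumes a made-up deterministic machine (ToyVM) and uses a plain Python list to stand in for the FT logging network. All names in it are hypothetical. It only shows why replaying the same events in the same order leaves both copies in the same state while a single copy produces output.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyVM:
    """Hypothetical deterministic machine: state depends only on the events applied."""
    emit_output: bool
    state: int = 0
    outputs: List[int] = field(default_factory=list)

    def apply(self, event: int) -> None:
        # Deterministic step: the same event sequence always yields the same state.
        self.state = self.state * 31 + event
        if self.emit_output:
            self.outputs.append(self.state)


def run_lockstep(events: List[int]):
    """Run a primary and a secondary ToyVM over the same ordered event log."""
    primary = ToyVM(emit_output=True)
    secondary = ToyVM(emit_output=False)  # the secondary's output is suppressed
    log_channel: List[int] = []  # stands in for the dedicated FT logging network

    for event in events:
        primary.apply(event)              # the primary executes first
        log_channel.append(event)         # then forwards what it executed
        secondary.apply(log_channel[-1])  # the secondary replays it in order
    return primary, secondary


if __name__ == "__main__":
    p, s = run_lockstep([3, 1, 4])
    print("states match:", p.state == s.state)
    print("secondary produced output:", bool(s.outputs))
```

Because the secondary already holds the same state, promoting it after a primary failure needs no restart in this model, which is the property the article attributes to FT.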
This FT logging network is essential both for synchronizing the paired VMs and for sending heartbeats between the hosts through ESXi. Heartbeats -- signals sent between hosts -- help detect failures immediately.
Preparing for a VMware FT cluster
VMware FT requirements include storage, hardware, network and host considerations. Start with a dedicated high-speed network; 1 Gb/sec is the minimum, but I recommend 10 Gb/sec to keep up with the VMs. Latency between the ESX hosts should be below a millisecond. Use the vmkping command to measure latency. You can choose the host for the secondary VM yourself or let VMware Distributed Resource Scheduler place it automatically.
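To turn the sub-millisecond guideline into a repeatable check, ping-style output can be parsed and compared against a limit. The sketch below assumes that the output contains reply lines with a "time=... ms" field, as ping-style tools usually print. The sample text, addresses and function names are illustrative, not captured from a real host. Note that these figures are round-trip times.

```python
import re
from typing import List

# Illustrative sample only; not captured from a real host.
SAMPLE_OUTPUT = """\
64 bytes from 10.0.0.2: icmp_seq=0 ttl=64 time=0.315 ms
64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=0.271 ms
64 bytes from 10.0.0.2: icmp_seq=2 ttl=64 time=0.402 ms
"""


def round_trip_times_ms(ping_output: str) -> List[float]:
    """Extract every 'time=<n> ms' value from ping-style output."""
    return [float(value) for value in re.findall(r"time=([\d.]+)\s*ms", ping_output)]


def meets_latency_target(ping_output: str, limit_ms: float = 1.0) -> bool:
    """Return True when the slowest reply is still under the limit."""
    times = round_trip_times_ms(ping_output)
    if not times:
        raise ValueError("no round-trip times found in the output")
    return max(times) < limit_ms


if __name__ == "__main__":
    print(round_trip_times_ms(SAMPLE_OUTPUT))
    print(meets_latency_target(SAMPLE_OUTPUT))
```

Checking the slowest reply rather than the average is a deliberately conservative choice here, since a single slow round trip can matter on a synchronization link.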
VMware FT requires at least two hosts for primary and secondary VMs, although I suggest a minimum of three hosts. The third host will allow the FT setup to be recreated as soon as there is a failure in one of the VMs. After the primary VM fails, the secondary VM automatically becomes primary. To maintain high availability of services, you'll want VMware FT to create a new secondary VM automatically -- requiring a third host in the FT network.
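The argument for a third host can be made concrete with a small model. The sketch below is a simplified illustration and not how vSphere makes placement decisions. The host names and the fail_over function are hypothetical. It promotes the surviving VM and looks for a spare host on which to recreate the secondary.

```python
from typing import List, Optional, Tuple


def fail_over(hosts: List[str], primary: str, secondary: str,
              failed_host: str) -> Tuple[str, Optional[str]]:
    """Return (primary_host, secondary_host) after failed_host goes down.

    The secondary host is None when no spare host remains, meaning the
    VM keeps running but is no longer protected.
    """
    if failed_host not in (primary, secondary):
        return primary, secondary  # the FT pair is unaffected

    # The surviving VM of the pair becomes (or stays) the primary.
    new_primary = secondary if failed_host == primary else primary

    # A new secondary needs a running host other than the primary's.
    spares = [h for h in hosts if h not in (failed_host, new_primary)]
    new_secondary = spares[0] if spares else None
    return new_primary, new_secondary


if __name__ == "__main__":
    print(fail_over(["esx1", "esx2"], "esx1", "esx2", "esx1"))
    print(fail_over(["esx1", "esx2", "esx3"], "esx1", "esx2", "esx1"))
```

With two hosts the model returns no secondary after a failure, so protection is lost until the failed host comes back. With three hosts the pair can be rebuilt right away.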
To make sure that host failures are detected, you can only use VMware FT in a VMware HA cluster. Your FT hosts must all connect to the same storage and all use the same software version. Use similar hardware for each VMware FT host as well -- VMware FT tolerates some slight hardware differences, but it is easier to use the same hardware for each host.
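The host requirements above lend themselves to a simple pre-flight checklist. The sketch below is a hypothetical helper, not part of any VMware tool. The Host record, its fields and the sample values are assumptions made for illustration, and in practice the inventory data would have to come from your own source.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Host:
    """Hypothetical inventory record for one candidate FT host."""
    name: str
    version: str
    datastores: Set[str]
    in_ha_cluster: bool


def ft_prerequisite_problems(hosts: List[Host]) -> List[str]:
    """Return human-readable problems; an empty list means all checks passed."""
    problems = []
    if len(hosts) < 2:
        problems.append("need at least two hosts")
    if len({h.version for h in hosts}) > 1:
        problems.append("hosts run different software versions")
    # Datastores visible to every host.
    shared = set.intersection(*(set(h.datastores) for h in hosts)) if hosts else set()
    if not shared:
        problems.append("no datastore is shared by all hosts")
    for h in hosts:
        if not h.in_ha_cluster:
            problems.append(f"{h.name} is not in an HA cluster")
    return problems


if __name__ == "__main__":
    hosts = [
        Host("esx1", "4.1", {"san01"}, True),
        Host("esx2", "4.0", {"san01", "local2"}, True),
    ]
    for problem in ft_prerequisite_problems(hosts):
        print(problem)
```

The checks cover only what the text states: an HA cluster, common storage and a common software version. Hardware similarity is left out because it is a recommendation rather than a hard rule.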
The VMDK files must be on shared storage: Fibre Channel, iSCSI or network-attached storage. The VMDK files must also be thick provisioned and eager zeroed. If a VM uses the wrong disk format for VMware FT, you can convert its disks using the vmkfstools command with the --diskformat eagerzeroedthick option.
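One way to apply the disk format option is to clone the disk into the required format. The sketch below only builds the command line and does not run it. It assumes the clone form of vmkfstools, where -i names the source disk and -d is the short form of --diskformat. The datastore paths and the function name are hypothetical, and the exact syntax should be checked against the vmkfstools help on your host before use.

```python
import shlex
from typing import List


def eager_zeroed_clone_command(source_vmdk: str, target_vmdk: str) -> List[str]:
    """Build (but do not run) a vmkfstools clone into eagerzeroedthick format."""
    return [
        "vmkfstools",
        "-i", source_vmdk,         # clone this existing virtual disk
        "-d", "eagerzeroedthick",  # disk format for the new copy
        target_vmdk,
    ]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    command = eager_zeroed_clone_command(
        "/vmfs/volumes/datastore1/app01/app01.vmdk",
        "/vmfs/volumes/datastore1/app01/app01-ezt.vmdk",
    )
    print(" ".join(shlex.quote(part) for part in command))
```

Cloning leaves the original disk untouched, which gives a fallback if something goes wrong, at the cost of temporarily needing space for both copies.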
VMware FT has some tough hardware requirements, especially considering that 10 Gb/sec networking won't be available by default in all data centers. Even if you have everything in place to run VMware FT VMs, be aware that it doesn't guarantee 100% uptime. VMware FT protects against VM failure, but it doesn't guarantee the availability of software running in these VMs. Therefore, invest in an FT infrastructure as one part of an overall plan to guarantee service availability.