Avoiding downtime with VMware Fault Tolerance and High Availability
By now you know what VMware Fault Tolerance (FT) is -- high availability for any supported vSphere guest OS between two vSphere servers. More specifically, FT is protection from a failed server (unplanned downtime). VMware offers other levels of protection for other levels of failures (server component, storage, data, and site) but FT protects your critical applications from a server failure. It sounds nice, but how can this help you in the real world, in a real data center? In my tip "Understanding VMware Fault Tolerance benefits and requirements," I covered how FT can help you and what the hardware requirements are.
In this tip, I will discuss the use cases for FT and the impact its stringent hardware requirements have on your future server hardware purchases.
Fault Tolerance use cases
VMware Fault Tolerance is easy to enable, but there is no need to use it on every virtual machine. Enabling it doesn't cost you anything extra, but turning it on everywhere will quickly deplete the resources on your ESX hosts. This is because when you enable FT, you are creating a second virtual machine with the same amount of RAM in use plus some additional CPU overhead, so protecting every VM would effectively double the number of VMs in your inventory.
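To make the resource cost concrete, here is a minimal Python sketch that estimates RAM demand when only some VMs are FT-protected. It assumes each protected VM needs a secondary with the same RAM, as described above, and it ignores the additional CPU overhead. The `VM` class, the function name, and the VM names and sizes are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    ram_gb: float
    ft_enabled: bool = False


def cluster_ram_demand_gb(vms):
    """Total RAM demand, counting each FT-protected VM twice
    (once for the primary, once for the secondary)."""
    return sum(vm.ram_gb * (2 if vm.ft_enabled else 1) for vm in vms)


# Hypothetical inventory: two critical VMs protected, one left unprotected.
vms = [
    VM("payroll", 8, ft_enabled=True),
    VM("blackberry", 4, ft_enabled=True),
    VM("file-server", 16),
]
print(cluster_ram_demand_gb(vms))  # 8*2 + 4*2 + 16 = 40
```

Comparing this figure with the RAM actually available across your hosts gives a rough idea of how many VMs you can afford to protect before enabling FT on them.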
So, if you don't enable it across the board, where do you enable it? Here are the most common uses for VMware Fault Tolerance:
- Anywhere you use VMware High Availability. If you are already using VMware High Availability (VMHA) on any virtual machines, enabling FT may be a good move for those VMs. While FT will take up double the resources for those VMs and may require a vSphere upgrade, by using FT those protected virtual machines won't have to reboot and the end users' applications will never miss a beat if and when an ESX server fails.
- Fault Tolerance on demand. At many organizations there are certain virtual machines that become critical at a specific time of the month or year. For example, the chief financial officer's virtual desktop at year end or the payroll virtual server at the end of each month. Enabling FT is easy enough that you can just right-click on these virtual machines when they become critical and turn it on.
While those two uses are valid, the next three use cases are more plausible, in my opinion.
- Any application server with a single point of failure (SPOF). Many times application servers start off small and grow larger and much more critical than expected. For example, a BlackBerry server might have started out serving a handful of users and, today, fulfills the BlackBerry needs for the entire company. Usually there aren't HA options offered for these types of servers or, if there are, they are complex or cost a great deal compared with the value that the application brings. These applications have a SPOF, and FT is the ideal option to provide HA for them.
- Any application that requires clustering where clustering is cost prohibitive. Certainly if you have a large group of enterprise email servers providing email for 5,000 users, you can justify an HA option costing $10,000. However, for small and medium-sized database servers, an Exchange messaging server supporting fewer than 1,000 users, or critical remote branch office servers like point of sale, FT can provide HA at a much lower cost than dedicated HA clustering options.
- Use FT in the future. Today, FT syncs the memory between two ESX hosts sharing the same storage (where the virtual machine disk is located). In the future, the HP/LeftHand P4000 storage appliance (and products from other companies) will be able to offer FT between sites that have high-speed connectivity to one another.
Hardware purchase implications of Fault Tolerance
As I discussed in my previous article, you can't run FT on just any vSphere-compatible server because FT uses special features of the CPU. That's why VMware FT requires Intel 31xx, 33xx, 52xx, 54xx, 55xx or 74xx series processors, or AMD 13xx, 23xx or 83xx series processors (or greater). While multi-processor support will likely be available in the future, today FT is available only on VMs that have a single virtual CPU.
Besides the requirements for particular CPUs, for a VM to be protected using Fault Tolerance, the CPUs in both the primary server and the secondary server must be the same or from the same category. (These CPU compatibility categories are broken down in VMware KB article 1008027, "Processors and guest operating systems that support VMware Fault Tolerance.")
That means that you could have two servers that are compatible with FT, but if one is a Xeon 7400 series and the other is a Xeon 5500 series, you still can't use FT.
Even if you buy a brand new server today, not all CPUs will be compatible. Of two brand new servers, one could be FT compatible and the other not.
So how do you avoid hardware incompatibilities with Fault Tolerance?
- Don't assume that a vSphere-compatible server is also FT compatible.
- Check the FT compatibility list and make sure that your servers are not only on it, but that they are also from the same CPU category.
- Run the free VMware SiteSurvey tool on your existing servers to see which are compatible and if they are from the same CPU category.
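The same-category rule in the checklist above can be sketched in a few lines of Python. This is only an illustration, not a replacement for SiteSurvey or the KB article: the host names and the category labels ("category-A", "category-B") are placeholders, and you would fill in the real category for each host's CPU from VMware KB article 1008027, using `None` for a host whose CPU is not FT compatible.

```python
from itertools import combinations


def ft_capable_pairs(hosts):
    """Return pairs of host names that could act as FT primary and secondary.

    `hosts` maps a host name to its FT CPU category label, or to None
    when the host's CPU is not FT compatible. The labels are placeholders
    to be filled in by hand from the FT compatibility list.
    """
    pairs = []
    for (a, cat_a), (b, cat_b) in combinations(sorted(hosts.items()), 2):
        # Both hosts must be FT compatible and share the same CPU category.
        if cat_a is not None and cat_a == cat_b:
            pairs.append((a, b))
    return pairs


# Hypothetical inventory for illustration only.
hosts = {
    "esx01": "category-A",
    "esx02": "category-A",
    "esx03": "category-B",  # FT compatible, but no partner in its category
    "esx04": None,          # vSphere compatible, not FT compatible
}
print(ft_capable_pairs(hosts))  # [('esx01', 'esx02')]
```

In this made-up inventory, esx03 is FT compatible on its own but cannot protect anything until a second host from the same category is added, which is exactly the situation to watch for when buying new hardware.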
Eric Siebert covers the SiteSurvey tool and everything that it checks for in his Master's Guide to VMware Fault Tolerance.
In the past, admins tried to keep their desktops and servers within the same respective families in order to simplify support and keep OS images compatible. Now, admins have a new reason to make sure that new servers are from the same CPU category as existing servers: Fault Tolerance. Unfortunately, as new server models come out, it isn't always possible to get the older model with the same category of CPU. Admins have already run into this problem with VMware VMotion. While Enhanced VMotion Compatibility (EVC) partially solves this for VMotion, there is no such solution for FT.
One more point about compatibility: FT can't protect VMs running just any operating system. VMware KB article 1008027 also points out that specific operating systems may not be supported with certain CPU architectures, or you may be required to reboot your OS when enabling FT.
In summary, for new hardware purchases, the solution is simple: make sure servers are vSphere compatible, FT compatible, and from the same CPU category as your existing servers (if possible).
About the author:
David Davis is the director of infrastructure at TrainSignal.com. He has a number of certifications including vExpert, VCP, CCIE #9369 and CISSP. Davis has also authored hundreds of articles and six different video training courses at Train Signal, with his most popular course being the VMware vSphere 4 video training course. His personal website is VMwareVideos.com. You can follow Davis on Twitter or connect with Davis on LinkedIn.