For those of us who have been in the trenches of IT for a long time, virtualization has brought tremendous benefits for protecting VMs and keeping applications running. Even so, only a solid VMware backup and restore plan will remove any doubts about recovery if disaster hits the data center.
Using technologies such as VMware's High Availability, DRS and vMotion allows us to keep our systems current and still provide near-100% availability. The availability and uptime service-level agreements that we can provide to our customers would have been unthinkable a decade ago. In addition, the improvements in hardware reliability and speed have made our jobs so much better and easier.
With all the good, though, comes a danger. It can be easy to become complacent when your infrastructure is on cruise control, when upgrades are routine and uneventful. However, this does not diminish the need to stay vigilant about how to handle major issues should they arise.
As we know, many of virtualization's capabilities are predicated on shared storage always being available, and storage redundancy technology further ensures reliability. But what happens when a large portion of your storage suddenly becomes unavailable? Cascading hardware failures can leave your VMs unavailable and your customers justifiably upset. No matter how well-designed your infrastructure, sometimes bad things happen. It is in these cases that the lessons we learned in the old, pre-virtualization days can save the day.
Keep backups and test restores
In the old days of IT -- also known as the 1990s -- backups and restores were part of the daily routine. Backups were made, servers crashed, servers were rebuilt, files were restored, and the cycle would repeat. With all the hardware redundancy and reliability now, it can be easy to forget regular backups. Nonetheless, backups should be an integral part of your virtual environment, in case of a systemic infrastructure failure. In addition to a backup that is stored on the same storage pool as your VMs, admins should keep another backup separate from the infrastructure in case a disaster hits the data center.
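One way to keep the two-copy rule honest is to audit it from an inventory of backup copies. The following is a minimal sketch, not a feature of any backup product: the `BackupCopy` record, the location labels and the VM names are all hypothetical, and it assumes you can export such a list from whatever backup platform you use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    vm_name: str
    location: str  # hypothetical label for where this copy is stored


def vms_missing_separate_backup(vm_names, copies, primary_pool):
    """Return VMs that have no backup copy outside the primary storage pool."""
    protected = {c.vm_name for c in copies if c.location != primary_pool}
    return sorted(name for name in vm_names if name not in protected)


# Illustrative data only; the names and locations are made up.
vms = ["web01", "db01", "mail01"]
copies = [
    BackupCopy("web01", "san-pool-a"),
    BackupCopy("web01", "offsite-vault"),
    BackupCopy("db01", "san-pool-a"),
]

missing = vms_missing_separate_backup(vms, copies, primary_pool="san-pool-a")
print(missing)  # ['db01', 'mail01']
```

Here db01 is flagged because its only copy shares the storage pool with the VMs, and mail01 because it has no backup at all.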
Backup systems that take advantage of vCenter connectivity and management capabilities, and of the simplicity of backing up virtual machine disks, have come a long way in the past few years. Take the time to pick a reliable, VMware-supported backup platform. Spend the necessary time up front to ensure you get solid, reliable and easily recoverable backups of your VMs. Off-site backups may be your last line of defense in a bad situation, so make sure they are good.
Carve out the time to do a periodic test restore of a VM. Not just a single file, but the whole machine. Remember the IT axiom: "Nobody cares about backups. They only care about restores!" Knowing that your restores work -- as well as being familiar with the restore process that works best in your environment -- will go a long way in getting your environment back online as quickly as possible.
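A simple log of when each VM last had a full test restore makes it harder for this task to slip. The sketch below is one hedged way to flag overdue machines. The 90-day interval, the VM names and the dates are assumptions chosen for illustration, so pick an interval that suits your environment.

```python
from datetime import date, timedelta


def overdue_restore_tests(last_tested, today, max_age_days=90):
    """Return VMs whose last full-VM test restore is too old or never happened.

    max_age_days is an assumed policy value, not a recommendation.
    """
    cutoff = today - timedelta(days=max_age_days)
    return sorted(
        name
        for name, tested_on in last_tested.items()
        if tested_on is None or tested_on < cutoff
    )


# Illustrative log only; None means the VM has never been test-restored.
last_tested = {
    "web01": date(2024, 5, 20),
    "db01": date(2024, 1, 10),
    "mail01": None,
}

overdue = overdue_restore_tests(last_tested, today=date(2024, 6, 1))
print(overdue)  # ['db01', 'mail01']
```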
Build a physical or virtual vCenter?
Abler pens than mine have made very good cases for virtualizing vCenter, but I believe it makes more sense to keep some servers as physical machines, and that includes vCenter Server. Having a server that is not tied to shared storage and hosts can provide a welcome source of stability.
Keeping the vCenter management tools separate from the operational environment lets you quickly and easily analyze breakdowns in that environment. It is easier to concentrate on getting your VMs restored or rebuilt when you do not have to rebuild vCenter first. Of course, that also requires a good backup and restore plan for your vCenter Server, as well as hardware redundancy.
Check the infrastructure from all angles
I encourage everyone to take a high-level view of their infrastructure and look for potential weak spots, particularly any single point of failure. Try not to get so caught up in the minute technical details that you miss the big picture. Ask yourself: If this entire system failed, what would I do? Use this question to guide improvements to your environment and keep your customers happy.
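The hunt for single points of failure can start with something as simple as a dependency list. The sketch below assumes a deliberately crude, hypothetical model: each VM lists groups of interchangeable components (hosts, storage arrays and so on), and any group with only one member is treated as a single point of failure. The component and VM names are invented, and a real review would need far more detail than this.

```python
def single_points_of_failure(requirements):
    """Map each non-redundant component to the VMs that depend on it.

    requirements: {vm_name: [set_of_interchangeable_components, ...]}
    A group containing exactly one component has no alternative.
    """
    affected = {}
    for vm, groups in requirements.items():
        for group in groups:
            if len(group) == 1:
                (component,) = group
                affected.setdefault(component, set()).add(vm)
    return {component: sorted(vms) for component, vms in affected.items()}


# Illustrative inventory only; every name here is hypothetical.
requirements = {
    "web01": [{"host-a", "host-b"}, {"array-1"}],
    "db01": [{"host-a", "host-b"}, {"array-1"}],
    "mail01": [{"host-c"}, {"array-1", "array-2"}],
}

spof = single_points_of_failure(requirements)
for component, vms in sorted(spof.items()):
    print(component, "->", vms)
```

In this made-up inventory, array-1 is the only storage behind web01 and db01, and host-c is the only host able to run mail01, which is exactly the kind of weak spot the question above is meant to surface.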