VMware Site Recovery Manager (SRM) is the company's disaster recovery product. It is an add-on to vSphere and requires a vSphere license at both the primary and backup locations.
Rackspace engineers Pranav Parekh and Ranjit Singh walked attendees at the October Boston VMUG conference through some tips on using SRM.
1. Prioritize your apps. When it comes to disaster recovery, the first decision is likely which apps you should protect. "Decide based on what's making you money," Parekh said. "Do that first, and everything else can wait." Session attendee Richard Leclair, systems engineer at a financial institution, said for his organization, that's part of the "process of going from protected to unprotected. We're working with application groups on priority one versus priority two apps."
SRM uses protection groups, which are array-based groups of VMs or datastores. A single protection group can be included in multiple recovery plans. Protection groups can be organized by business unit or technology, Parekh said, and users should decide what to fail over together and what to fail over separately.
2. Work backwards. Make sure the DR target is in good shape. The DR site should not be in the same data center as your primary environment. It's also ideal to have a test network at the DR site; Parekh said the preferred approach is a VLAN extension. SRM will orchestrate the IP changes. DR sites don't need to be one-to-one, either: you can have up to 10 production sites going to one DR site.
3. Read the fine print. The apps you've been using may not be supported by the vendor if they are moved using SRM. "Contact the app vendor. Tell them you are putting your VMs in an unclean state," Parekh said. "Sometimes complex apps may not be supported."
4. Map it out. Understand your architecture before you start moving anything with SRM. Understand the dependencies among your applications and systems, and map them out. "Documentation is key," Parekh said, adding that administrators should "set up a separate HA management cluster on both sides."
5. Replicate for real. Next, assign priorities to applications and put them into replication for SRM. Within each priority, you can organize further by the order in which you'd like individual VMs to be brought back up.
6. Stick to your path. Figure out how you'll approach the ongoing maintenance, migration and upgrade path at your DR site. When you're upgrading ESXi, for example, "upgrade at the DR site first to avoid compatibility issues" back at the main center, Parekh said.
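The mapping and prioritizing steps in tips 4 and 5 can be rehearsed on paper before anything is configured in SRM. What follows is a minimal Python sketch, not SRM's API: the VM names, the tier numbers and the `recovery_order` helper are all hypothetical. It assumes lower-numbered tiers are handled first and that a VM should never depend on a VM in a later tier. It uses the standard-library `graphlib` module (Python 3.9 or later) to order the VMs inside each tier so that dependencies come first.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical inventory: VM name -> (priority tier, VMs it depends on).
# Tier 1 is the most important. These names are illustrative only.
INVENTORY = {
    "db01": (1, set()),
    "app01": (1, {"db01"}),
    "web01": (1, {"app01"}),
    "report01": (2, {"db01"}),
    "wiki01": (2, set()),
}


def recovery_order(inventory):
    """Return VM names tier by tier, with dependencies first inside each tier."""
    for vm, (tier, deps) in inventory.items():
        for dep in deps:
            if dep not in inventory:
                raise KeyError(f"{vm} depends on unknown VM {dep}")
            # A VM cannot wait on something that is scheduled in a later tier.
            if inventory[dep][0] > tier:
                raise ValueError(
                    f"{vm} (tier {tier}) depends on lower-priority {dep}"
                )

    order = []
    for tier in sorted({t for t, _ in inventory.values()}):
        members = {vm for vm, (t, _) in inventory.items() if t == tier}
        # Keep only in-tier dependencies; earlier tiers are already in the list.
        graph = {vm: inventory[vm][1] & members for vm in sorted(members)}
        # static_order() raises graphlib.CycleError on circular dependencies.
        order.extend(TopologicalSorter(graph).static_order())
    return order


if __name__ == "__main__":
    print(recovery_order(INVENTORY))
```

A check like this will flag a priority-one app that quietly depends on a priority-two system, which is the kind of gap the documentation step is meant to catch.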
Storage is another consideration in using SRM. Automating DR with SRM will split the storage capacity, so that some is used to run the VM and some is used for replication. That will reduce the VM's journal size. SRM is storage-agnostic but requires array pairs for replication.
In addition, you may need to reinstall vCenter version 5.1 or 5.5 if you initially did the "simple" install. Version 5.1 doesn't offer a multisite option at all, and 5.5 may need to be reinstalled using the "custom" option to support SRM, according to Singh. Also, take note of VM boot priorities and dependencies, as boot times may have to be increased to support SRM use.
Finally, remember that you can test SRM without shutting down production. It may not be the real thing, but tests can expose the areas that need finessing. Leclair said his bank has installed SRM but hasn't yet experimented with it. "You don't know how well it works till there's a disaster," he said.