One of the key announcements from VMworld 2011 was news of "hypervisor-level" replication in the 5.0 version of Site Recovery Manager. At VMworld 2012, VMware decoupled the replication technology, offered it as a standalone product, and included it in the licensing for vSphere Essentials Plus and above.
We have seen vSphere Replication go through four release cycles and it now sits at version 5.8, but has it gained the traction VMware hoped it would? If not, why? Many will say the Veeams, Unitrends and Zertos of the world have cornered this market, but vSphere Replication has a couple of things going for it that the others don't: price and integration. That said, is vSphere Replication an enterprise IT product, or is it geared more toward smaller environments? Before we can answer those questions, let's have a look at how it works.
How vSphere Replication works
Integration is where vSphere Replication really shines. While most backup vendors rely on snapshots to perform replication, vSphere Replication uses an agent in each ESXi host. This agent, along with a replication appliance at both the local and target locations, allows vSphere Replication to replicate VMs without using a VMware snapshot or incurring its performance penalty. Once the initial synchronization copy, or seed, of a VM is complete, the agent, working with the VMkernel, uses a vSCSI filter to track only those blocks in a VMDK that have changed and sends them to a replication appliance at the target site, where they are checked for consistency. Only after all the data is deemed consistent and complete are redo logs generated and blocks committed to the replicated VM.
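The flow described above can be illustrated with a toy model. This is a minimal sketch and not VMware's implementation: the class names `SourceDisk` and `ReplicaDisk`, the in-memory block list and the completeness check are all assumptions made for illustration. It shows the two ideas from the paragraph: only blocks written since the last sync are sent, and the replica commits them only after the full set has arrived.

```python
class SourceDisk:
    """Toy stand-in for a protected VMDK (hypothetical, not a VMware API)."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks
        self.dirty = set()  # indexes of blocks written since the last sync

    def write(self, index, data):
        # A write filter would sit here and record which block changed.
        self.blocks[index] = data
        self.dirty.add(index)

    def collect_delta(self):
        # Hand over only the changed blocks, then reset the tracking set.
        delta = {i: self.blocks[i] for i in sorted(self.dirty)}
        self.dirty.clear()
        return delta


class ReplicaDisk:
    """Toy stand-in for the replica at the target site (hypothetical)."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks

    def apply(self, delta, expected_count):
        # Commit only if the whole delta arrived; otherwise change nothing.
        if len(delta) != expected_count:
            return False
        for index, data in delta.items():
            self.blocks[index] = data
        return True


source = SourceDisk(num_blocks=8)
replica = ReplicaDisk(num_blocks=8)

source.write(2, b"alpha")
source.write(5, b"beta")
delta = source.collect_delta()
committed = replica.apply(delta, expected_count=2)
print(sorted(delta), committed)
```

Because the replica is touched only when the delta is complete, an interrupted transfer in this model leaves the last good copy intact, which is the property the article attributes to the consistency check at the target site.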
How often this occurs depends on your recovery point objective (RPO) policy. The RPO can range anywhere between 15 minutes and 24 hours, meaning we could, if bandwidth and change rate allowed, have a cold copy of our VM sitting off-site that is just 15 minutes behind our production copy. A lot goes into the RPO policy calculations -- history, network bandwidth, changed data, etc. -- to determine whether the VM can meet its target. However, for the most part, this is all hidden from the end user.
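To get a rough feel for whether a given RPO is realistic, a back-of-the-envelope calculation helps. The sketch below is not how vSphere Replication computes its schedule; the function names and the sample figures are assumptions, and it ignores protocol overhead and competing traffic. It simply asks whether the data changed during one RPO window can cross the link before the window ends.

```python
def transfer_seconds(changed_mb, bandwidth_mbps):
    """Seconds to send changed_mb megabytes over a bandwidth_mbps (megabits/s) link."""
    return changed_mb * 8 / bandwidth_mbps


def can_meet_rpo(changed_mb, bandwidth_mbps, rpo_minutes):
    """True if one window's worth of changed data fits inside the RPO window."""
    return transfer_seconds(changed_mb, bandwidth_mbps) <= rpo_minutes * 60


# Hypothetical example: 3,000 MB changes per 15-minute window on a 50 Mbps link.
print(transfer_seconds(3000, 50))   # 480.0
print(can_meet_rpo(3000, 50, 15))   # True
print(can_meet_rpo(9000, 50, 15))   # False
```

If the answer is False for your busiest VMs, a 15-minute RPO is out of reach on that link no matter how the policy is set.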
What's missing from vSphere Replication?
When looking at all the features that vSphere Replication contains, one might conclude it is a solid replication platform -- which it is -- but how does it stand up when placed next to other replication products? Keep in mind that this article focuses on running vSphere Replication as a standalone product, without the presence of Site Recovery Manager (SRM). Adding SRM solves most of these issues.
- Snapshots: Initially, vSphere Replication supported only one restore point -- the most recent successful replication based on the RPO policy you selected. This certainly isn't adequate in most environments, given that your VM could contain corrupt in-guest data and still be replicated successfully. Since vSphere Replication 5.5, a feature called Multiple Points in Time (MPIT) has been supported. MPIT instructs vSphere Replication to keep a set number of snapshots for a set number of days on the replicated VM so we can revert to a point in time if corruption occurs. While this solves the corruption issue, the way in which we configure the times of replication -- utilizing an RPO -- can cause confusion when it is time to restore. The times the snapshots are taken and the dates on our snapshot labels all depend on the number of replication events, which in turn depends on the RPO selected and the data change rate. Confused? Don't worry; I am too! Basically, our point-in-time snapshots end up occurring at inconsistent times, adding a bit of chaos when it comes to recovery. Also, the reversion or failover to a point in time is performed manually. Most third-party replication products are more consistent with the timing of restore points and also automate failover to specific points in time.
- Scheduling: Another limitation of vSphere Replication is scheduling. We can't simply schedule a VM to perform a full replication, as vSphere Replication is driven solely by the RPO policy. VSphere Replication looks at changed data and network bandwidth, then creates its own replication schedule, meaning an RPO of one hour doesn't necessarily mean it will replicate every hour. This can be an issue for businesses that must perform a full replication of a VM at a certain point in time for compliance and regulatory reasons. Third-party replication products allow administrators to add the VM to separate jobs, each with a different purpose.
- Traffic compression/encryption: VSphere Replication does not provide any compression or WAN acceleration to speed up replication. Some vSphere license tiers include ways to throttle different types of network traffic, but other backup/replication products provide WAN acceleration. VSphere Replication also does not support any sort of data encryption, while most third-party replication software does.
- Automation and failover: When using vSphere Replication as a standalone product, you may find it lacking in automation and failover. Although we can fail over a VM, we can only fail over one VM at a time unless we are also running SRM. This can be very cumbersome and inefficient in large environments. Other replication vendors provide failover plans to orchestrate and automate the failover process. Another issue: vSphere Replication does little to customize our failed-over VMs, meaning these VMs have to be powered on first before the administrator can manually change IPs and remap networks. This is normally automated in third-party replication products.
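The Snapshots and Scheduling points above both come down to replication being driven by the RPO rather than by a clock. The sketch below shows why restore points can land at irregular times. It rests on a loud assumption that is mine, not VMware's documented behavior: each sync starts as late as it can while still finishing before the previous restore point becomes older than the RPO. The function name, the start date and the transfer durations are all hypothetical.

```python
from datetime import datetime, timedelta


def restore_point_times(first_start, rpo, transfer_durations):
    """Return the start time of each sync under a toy RPO-driven scheduler.

    Assumption for illustration only: each sync starts as late as possible
    while still finishing before the previous restore point exceeds the RPO.
    """
    starts = [first_start]
    for duration in transfer_durations:
        deadline = starts[-1] + rpo          # previous point must not be older than this
        starts.append(deadline - duration)   # start just early enough to finish in time
    return starts


rpo = timedelta(hours=1)
# Hypothetical transfer times; they vary with how much data changed.
durations = [timedelta(minutes=m) for m in (5, 20, 10)]
points = restore_point_times(datetime(2014, 1, 1, 8, 0), rpo, durations)
print([p.strftime("%H:%M") for p in points])
# ['08:00', '08:55', '09:35', '10:25']
```

Even with a one-hour RPO, the gaps in this model are 55, 40 and 50 minutes, so anyone expecting restore points on the hour would be looking for snapshots that don't exist.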
Does vSphere Replication make the cut?
How do you decide if vSphere Replication is right for your environment? Do you need WAN acceleration, or do you already have something in place for that? Is the confusion about scheduling and points in time a deal breaker, or are you simply looking to replicate your data in case of a site failure? How much automation do you need during a failover if the production site goes down? These are the questions you need to answer to make a decision on vSphere Replication.
In my opinion, vSphere Replication has a solid architecture and a lot of great features. I like the way it is "non-intrusive" to the guest during replication. I like the hooks it has into ESXi to utilize vSCSI filters. I like the way it utilizes an RPO policy to abstract the replication calculation. But, depending on the size of the environment, not having failover automation within vSphere Replication can be an issue. There is also a limit on the number of VMs (500) that vSphere Replication can support. For an SMB, the price -- included with vSphere Essentials Plus and above -- is very attractive and the feature set may be enough.
In market terms, vSphere Replication standalone is still young, being on only its third iteration since breaking away from SRM. The competitors are well-established and may have a richer feature set; some of them, such as Veeam, are already on their eighth iteration. I suggest looking at all the options out there, using a trial version to get a feel for each product, and then making the decision based on your needs and budget.