A properly architected vSphere Metro Storage Cluster (vMSC) environment should be able to accommodate a wide range of failures, from a fault in a single network cable up to and including the loss of an entire data center. Such a broad scope makes it impossible to detail every possible permutation, but the failures fall into three general categories.
First, VMware vMSC is effective at isolating and correcting single-host (single-server) failures within a data center. For example, vMSC can detect that a host has lost network connectivity, and its VMs can continue to run while awaiting network restoration. Similarly, when a single host fails completely, vSphere HA can use affinity rules to determine where to restart the VMs that were running on the failed host.
Second, VMware vMSC can address almost any practical issue that arises in storage resources duplicated between remote data centers. Although individual disk or disk group faults are typically rectified within the affected storage array, vMSC can identify the loss of an entire disk shelf within an array. It can also recognize the loss of connectivity between storage subsystems across data centers (the storage switches lose connectivity, but server connectivity remains), address the total loss of connectivity between data centers (both storage and host servers), and deal with a complete storage failure at one data center. In all of these scenarios, it's possible for the organization's VM workloads to remain running without any disruption.
Third, vMSC can address even more substantial or disruptive events at one data center and restart affected VMs at another. For example, vMSC can detect and respond to a permanent device loss (PDL) in a data center, a full compute failure at one data center, or the complete loss of an entire data center. With the correct implementation and configuration, all of the affected VMs in one data center can be successfully restarted in another without disruption.
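The three categories above can be summarized as a simple lookup from failure scenario to expected outcome. The following Python sketch is only an illustration of that grouping: the scenario labels and the Response names are hypothetical shorthand for the cases described in this article, not vSphere identifiers, and the actual behavior depends on how the cluster is configured.

```python
from enum import Enum


class Response(Enum):
    """Outcomes described in the article (names are illustrative only)."""
    KEEP_RUNNING = "VMs keep running"
    RESTART_PER_AFFINITY = "vSphere HA restarts VMs where affinity rules allow"
    RESTART_AT_OTHER_SITE = "VMs restart at the other data center"


# Hypothetical scenario labels, paraphrased from the three categories above.
EXPECTED_RESPONSE = {
    # Category 1: single-host failures within a data center
    "host_network_loss": Response.KEEP_RUNNING,
    "host_failure": Response.RESTART_PER_AFFINITY,
    # Category 2: storage issues across data centers
    "disk_shelf_loss": Response.KEEP_RUNNING,
    "inter_site_storage_link_loss": Response.KEEP_RUNNING,
    "inter_site_total_link_loss": Response.KEEP_RUNNING,
    "site_storage_failure": Response.KEEP_RUNNING,
    # Category 3: substantial events at one data center
    "permanent_device_loss": Response.RESTART_AT_OTHER_SITE,
    "site_compute_failure": Response.RESTART_AT_OTHER_SITE,
    "site_loss": Response.RESTART_AT_OTHER_SITE,
}


def expected_response(scenario: str) -> Response:
    """Return the outcome the article describes for a scenario label."""
    try:
        return EXPECTED_RESPONSE[scenario]
    except KeyError:
        raise ValueError(f"unknown scenario: {scenario!r}") from None
```

A table like this can serve as a starting point for a failure test plan, with each entry verified against the real environment rather than assumed.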
Proper recognition and response by VMware vMSC depend on correct settings, which influence the availability and recoverability of VMs in the aftermath of a failure. This means IT professionals need to pay particularly close attention to VM-to-host affinity rules, the selected responses to PDL, proper configuration of isolation addresses and heartbeat datastores, and the avoidance of split-brain scenarios, which can complicate recovery or introduce errors when failures occur.
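The settings listed above lend themselves to a pre-deployment checklist. The Python sketch below shows one way such a review might look. It is a minimal illustration under stated assumptions: the ClusterSettings fields are hypothetical names rather than vSphere API properties, and the "at least one per site" thresholds are assumptions chosen for the example, not requirements stated in this article.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ClusterSettings:
    """Hypothetical record of the settings named above (not a vSphere API type)."""
    vm_host_affinity_rules: List[str] = field(default_factory=list)
    pdl_response: str = ""  # empty string means no response was selected
    isolation_addresses: List[str] = field(default_factory=list)
    heartbeat_datastores: List[str] = field(default_factory=list)


def review(settings: ClusterSettings, sites: int = 2) -> List[str]:
    """Return a list of findings; an empty list means nothing was flagged."""
    findings = []
    if not settings.vm_host_affinity_rules:
        findings.append("no VM-to-host affinity rules defined")
    if not settings.pdl_response:
        findings.append("no PDL response selected")
    # Assumption for this sketch: at least one entry per site.
    if len(settings.isolation_addresses) < sites:
        findings.append("fewer isolation addresses than sites")
    if len(settings.heartbeat_datastores) < sites:
        findings.append("fewer heartbeat datastores than sites")
    return findings
```

An empty result only means the checklist items are present. It does not confirm that the values are correct for a given environment, so failure testing is still needed.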