Hyper-converged infrastructure represents a big change in the way a virtualization platform is architected. Like all architectural changes, it affects how the environment is operated. Converging storage and compute means that a change to one can affect the other. Before implementing a new architecture, it's important to understand its operational costs. Some tasks are likely to be easy, while others are complicated or simply new.
In conventional virtualization architecture, the VMs are stored on an array, which is a dedicated platform used for nothing but storage. The array is expected to be available at all times. The array and its storage network are often managed by a dedicated storage team. The VMs run on hypervisor hosts that share access to the storage array, usually over a dedicated storage network. The hypervisors and the VMs are looked after by the virtualization team. Keeping storage and compute separate allows changes to one without impact on the other. However, operational challenges often arise from having two separate teams, each with little understanding of the other's domain. In an ideal world, the two teams would work closely together, but in many organizations they conflict. Provisioning requests and performance troubleshooting are complicated and slowed by the communication needed between the teams.
In a hyper-converged architecture, the storage and compute are combined. Each hypervisor host has local storage, which is clustered and made available as redundant shared storage. The storage cluster is software running on each hypervisor host, either in the hypervisor itself or in a VM on the host. The cluster stores multiple copies of each piece of data. This provides redundancy and distributes the data across the cluster to provide performance. The data pieces are spread across multiple hypervisor hosts to ensure data availability if a single host is unavailable. Most hyper-converged systems default to two copies of each data item. This redundant storage means the usable capacity of the cluster is half the purchased capacity.
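The capacity arithmetic can be sketched in a few lines of Python. This is a minimal illustration rather than any vendor's sizing tool: the function name and the four-host, 10 TB-per-host cluster are hypothetical, and the sketch ignores overheads such as metadata or reserved spare space.

```python
def usable_capacity_tb(raw_tb: float, copies: int = 2) -> float:
    """Return usable capacity when every data item is stored `copies` times.

    Assumption: capacity is consumed only by data copies; metadata and
    reserved spare space are ignored.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return raw_tb / copies


# Hypothetical cluster: 4 hosts, each with 10 TB of local disk.
raw_tb = 4 * 10
print(usable_capacity_tb(raw_tb, copies=2))  # prints 20.0
```

With the default of two copies, the 40 TB purchased in this example leaves 20 TB for VMs.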
Management is merged
Another benefit of hyper-convergence is the consolidation of management. The virtualization team manages both the compute and the hyper-converged storage. This is generally positive, as there is no handoff between the virtualization and storage teams. The virtualization team can immediately provision the storage it needs and access the performance data it requires. The challenge is that the virtualization team must learn about storage. Most hyper-converged offerings simplify storage management by using a large amount of SSD to provide performance.
Removing a host
With conventional architecture, a hypervisor host can be taken out of service by migrating its VMs to other hosts. This usually takes several minutes, because the RAM contents of the VMs must be copied across the network.
Under a hyper-converged architecture, the hypervisor host is also part of the storage array. Before maintenance, the data stored on the host must be copied to another host, in addition to the RAM contents. Since a VM's disk is usually many times the size of its RAM, this copy can take hours, particularly as disks are much slower than RAM.
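A rough estimate shows why the disk copy dominates. The sketch below is only illustrative: the data sizes and the sustained copy rates are hypothetical figures chosen for the example, not measurements from any product.

```python
def copy_time_hours(data_gb: float, throughput_gb_per_s: float) -> float:
    """Time to copy `data_gb` at a sustained rate, in hours."""
    if throughput_gb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return data_gb / throughput_gb_per_s / 3600


# Hypothetical host: 256 GB of VM RAM and 8,000 GB of VM disk data.
# Assumed sustained rates: 1.0 GB/s for RAM migration, 0.5 GB/s for
# disk evacuation (limited by the disks rather than the network).
ram_hours = copy_time_hours(256, throughput_gb_per_s=1.0)
disk_hours = copy_time_hours(8000, throughput_gb_per_s=0.5)

print(f"RAM migration:   {ram_hours * 60:.1f} minutes")
print(f"Disk evacuation: {disk_hours:.1f} hours")
```

Under these assumptions the RAM migration finishes in minutes while the disk evacuation runs for hours, which matches the pattern described above.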
Taking a hyper-converged host out of the cluster without compromising data protection takes much longer than taking out a conventional host. One way to avoid this is to store a third copy of the data. During maintenance, the data is then still stored redundantly without a new copy having to be made. Of course, storing three copies of data means only a third of the total capacity can be used to store VMs. You may need to buy much more disk capacity to allow faster maintenance.
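The trade-off between two and three copies can be laid out as a small calculation. This is a sketch under a stated assumption: each copy of a data item lives on a different host, so taking one host offline removes at most one copy of any item. The function names are hypothetical.

```python
def usable_fraction(copies: int) -> float:
    """Fraction of purchased capacity available for VMs."""
    return 1 / copies


def copies_left_during_maintenance(copies: int) -> int:
    """Copies of an item still online when one host holding a copy is offline.

    Assumption: no two copies of the same item sit on the same host.
    """
    return copies - 1


for copies in (2, 3):
    left = copies_left_during_maintenance(copies)
    still_redundant = left >= 2  # redundant means at least two live copies
    print(copies, round(usable_fraction(copies), 2), left, still_redundant)
```

For the same usable capacity, three copies require 1.5 times the raw disk that two copies do, which is the extra purchase the paragraph above refers to.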
Hyper-converged architecture uses the distributed nature of the storage cluster to deliver performance. Usually each hypervisor can call on the entire storage cluster. A hypervisor host that is out of service does not contribute to that performance, so the storage cluster may not deliver its peak performance during maintenance. Ideally, the VM workload does not push the storage cluster to its limits in normal operation. At times when storage performance is critical, it may not be acceptable to undertake hypervisor host maintenance.
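As a first approximation, the performance lost during maintenance can be estimated from the number of hosts. The sketch below assumes every host contributes an equal share of storage performance and ignores any extra load from data being re-copied; real clusters may behave differently, and the function name is hypothetical.

```python
def remaining_performance_fraction(total_hosts: int, hosts_out: int = 1) -> float:
    """Fraction of peak storage performance left with `hosts_out` hosts offline.

    Assumption: performance scales linearly with the number of hosts in
    service, and rebuild traffic is ignored.
    """
    if total_hosts < 1 or not 0 <= hosts_out <= total_hosts:
        raise ValueError("hosts_out must be between 0 and total_hosts")
    return (total_hosts - hosts_out) / total_hosts


print(remaining_performance_fraction(8))  # prints 0.875
print(remaining_performance_fraction(4))  # prints 0.75
```

Under this simple model, smaller clusters lose a larger share of their performance when a single host is out of service.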
A common operational activity is patching the hypervisor. As with any piece of software, faults will be found, and patches will need to be applied to protect security, performance and availability. Each host must be taken out of service, patched and rebooted before returning to service. In conventional architecture, a cluster of eight hosts could be patched in a couple of hours at a pace of 15 minutes per host. With hyper-converged architecture, the time per host could be a couple of hours, meaning an eight-host cluster would take a whole day to patch.
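The patching comparison is simple arithmetic, sketched here in Python. It assumes hosts are patched strictly one at a time and takes "a couple of hours" per hyper-converged host to mean 120 minutes; both are assumptions for illustration.

```python
def patch_window_hours(hosts: int, minutes_per_host: float) -> float:
    """Total time to patch a cluster one host at a time, in hours."""
    return hosts * minutes_per_host / 60


# Conventional: 15 minutes per host (figure from the text).
print(patch_window_hours(8, 15))   # prints 2.0
# Hyper-converged: assumed 120 minutes per host.
print(patch_window_hours(8, 120))  # prints 16.0
```

Sixteen hours of serial work is what turns a two-hour maintenance window into a whole day.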