Organizations pay close attention to workload resilience and use clustering and other technologies to ensure application availability and maintain an efficient use of resources. As critical workloads proliferate across the enterprise, however, the same imperatives of resilience and efficiency must also extend to storage in order to ensure data availability and efficient storage use.
VMware vSAN supports data protection as RAID 0, RAID 1 or a combination of the two. But other RAID models allow for more efficient use of storage resources. VMware vSAN 6.2 adds support for RAID 5 and RAID 6 erasure coding to provide recoverability for disk groups while reducing the storage capacity consumed by protection. Both RAID techniques involve protection, capacity and performance tradeoffs that administrators should understand so they can choose between them according to the organization's resilience and data protection needs.
Both RAID and erasure coding aim to achieve the same data protection goals, but these two technologies aren't quite the same thing.
Erasure coding is a broad, generic term that encompasses any means of breaking up and partitioning data into segments that can be recovered if original segments fail or wind up missing. One widely used family is Reed–Solomon encoding, a group of approaches that augment X data values with Y new values computed from the X data values through the use of polynomials. The newly generated Y values are collectively known as erasure codes.
RAID 5 and RAID 6 typically fall under the generic umbrella of erasure coding because parity blocks are generated based on underlying data values. Data is normally spread out across multiple disks organized into a RAID group. Mathematical processes calculate parity, and the parity data is also spread out across those grouped disks. If a disk fails, the parity data can recover or reconstruct the lost data to a replacement -- or spare -- disk.
For example, the RAID 5 erasure code relies on basic parity: it augments X bit values with a new Y bit value computed with very simple exclusive or (XOR) binary math, which can be viewed as the simplest case under the Reed–Solomon umbrella. In vSAN, RAID 5 uses a minimum of four disks in a RAID group, with capacity equivalent to three disks of data and one disk of parity. Data and parity are spread out across all group disks. RAID 5 can recover one failed disk, using the data and parity on the remaining disks to recreate the missing data. RAID 5 groups in general can be as small as three disks or much larger than four.
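The XOR parity described above can be sketched in a few lines of Python. This is a minimal illustration only, not vSAN's implementation: the helper name xor_blocks and the byte values are made up for the example, and the stripe is assumed to hold three data blocks plus one parity block.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equal-length byte strings (hypothetical helper)."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Three data blocks with illustrative values; the parity block is their XOR.
data = [b"\x0f\xa0\x33", b"\xf0\x0a\x55", b"\x3c\x5c\xff"]
parity = xor_blocks(data)

# Simulate losing the second data block, then rebuild it by XORing
# the two surviving data blocks with the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])  # True
```

Because XOR is its own inverse, the same function both generates parity and rebuilds a single missing block. It cannot rebuild two missing blocks, which is the gap RAID 6 closes.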
By comparison, the RAID 6 erasure code takes this process a step further. RAID 6 enhances basic parity calculations with a second layer of mathematical calculations -- a second level of parity -- which is also spread across the group disks. In vSAN, RAID 6 uses a minimum of six disks in the RAID group, with capacity equivalent to four disks of data and two disks of parity, and the enhanced protection can recover up to two failed disks in the group. RAID 6 is used in critical storage situations that demand protection against the chance of multiple simultaneous disk failures -- for example, a second disk failing while the first failed disk is still rebuilding. RAID 6 groups in general can be as small as four disks or larger than six.
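The second layer of parity can be illustrated with a generic "P+Q" scheme, in which P is plain XOR parity and Q weights each data block by a power of 2 in the finite field GF(2^8). The sketch below is an assumption-laden teaching example, not vSAN's code: the field polynomial 0x11d, the function names and the four-block stripe are all choices made for illustration, and production implementations use lookup tables or SIMD instructions rather than bit-by-bit loops.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reducing by the polynomial 0x11d (assumed)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(a, n):
    """Raise a to the n-th power in GF(2^8) by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def gf_inv(a):
    """Multiplicative inverse: a^254 for any nonzero a in GF(2^8)."""
    if a == 0:
        raise ZeroDivisionError("zero has no inverse")
    return gf_pow(a, 254)


def encode_pq(data):
    """Return (P, Q): P is XOR parity, Q weights block i by 2**i in GF(2^8)."""
    length = len(data[0])
    p, q = bytearray(length), bytearray(length)
    for i, block in enumerate(data):
        coeff = gf_pow(2, i)
        for j, byte in enumerate(block):
            p[j] ^= byte
            q[j] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def recover_two(data, x, y, p, q):
    """Rebuild lost data blocks x and y from the surviving blocks plus P and Q."""
    # Zero-filled stand-ins contribute nothing to the partial parities.
    survivors = [bytes(len(p)) if i in (x, y) else b for i, b in enumerate(data)]
    p_part, q_part = encode_pq(survivors)
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    denom_inv = gf_inv(gx ^ gy)
    dx, dy = bytearray(len(p)), bytearray(len(p))
    for j in range(len(p)):
        pxy = p[j] ^ p_part[j]  # equals Dx ^ Dy
        qxy = q[j] ^ q_part[j]  # equals gx*Dx ^ gy*Dy
        dx[j] = gf_mul(denom_inv, qxy ^ gf_mul(gy, pxy))
        dy[j] = pxy ^ dx[j]
    return bytes(dx), bytes(dy)


# Four data blocks plus P and Q: a 4+2 stripe with illustrative contents.
data = [b"vSAN", b"RAID", b"six!", b"demo"]
p, q = encode_pq(data)

# Lose two data blocks at once and rebuild both.
damaged = [data[0], None, data[2], None]
d1, d3 = recover_two(damaged, 1, 3, p, q)
print((d1, d3) == (data[1], data[3]))  # True
```

Two independent parity equations give two unknowns a unique solution, which is why the group survives a double failure. The field multiplications are the extra work, compared with plain XOR, that the next paragraph refers to.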
RAID 6 used to involve a measurable storage performance penalty because of the additional computations needed for the second layer of calculations, which are more intensive than simple XOR. Today, the performance penalty is largely irrelevant with the advent of modern processor instruction sets, such as Supplemental Streaming SIMD Extensions 3 (SSSE3) and Advanced Vector Extensions 2 (AVX2), capable of processing such mathematical operations more efficiently.
In summary, VMware vSAN 6.2 implements both RAID 5 and RAID 6 based on Reed–Solomon encoding, adding to the existing support for RAID 1, also known as mirroring. To keep the parity calculations efficient, vSAN takes advantage of the Intel SSSE3 and AVX2 instruction sets.
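To make the capacity tradeoff between mirroring and erasure coding concrete, the arithmetic can be sketched as follows. The layouts are assumptions based on the minimum group sizes discussed above (three data plus one parity for RAID 5, four data plus two parity for RAID 6), the function name is hypothetical, and the figures ignore metadata and any other overhead.

```python
from fractions import Fraction


def raw_capacity_needed(usable_gb, data_parts, redundancy_parts):
    """Raw capacity consumed to store usable_gb under a data+redundancy layout."""
    total_parts = data_parts + redundancy_parts
    return usable_gb * Fraction(total_parts, data_parts)


# (data parts, redundancy parts) for each assumed layout.
layouts = {
    "RAID 1, tolerates 1 failure (two copies)": (1, 1),
    "RAID 5 (3+1), tolerates 1 failure": (3, 1),
    "RAID 1, tolerates 2 failures (three copies)": (1, 2),
    "RAID 6 (4+2), tolerates 2 failures": (4, 2),
}

for name, (data_parts, redundancy_parts) in layouts.items():
    raw = raw_capacity_needed(100, data_parts, redundancy_parts)
    print(f"{name}: {float(raw):.1f} GB raw per 100 GB usable")
```

Under these assumptions, protecting 100 GB against one failure takes 200 GB with mirroring but about 133 GB with RAID 5, and protecting it against two failures takes 300 GB with mirroring but 150 GB with RAID 6.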