VMware has put a lot of engineering into the vStorage Virtual Machine File System (VMFS). The robust file system scales well and is purpose-built to host virtual machines (VMs). While many VMware Infrastructure 3 and vSphere implementations use VMFS-3, some administrators don't understand the details that make it work like a champ. This article takes a deep dive into the inner workings of VMFS.
When can you use VMFS?
For ESX and ESXi host systems, VMFS is my file system of choice. But not every storage system can
take advantage of it. VMFS can be used on iSCSI or Fibre Channel storage systems in shared storage
configurations. VMFS can be used on local storage but is not as effective for aggregating
resources. Many storage devices can also communicate with ESX and ESXi over the Network File System
(NFS) protocol. But when it comes to the bigger picture, VMFS will
always be a first-class citizen, according to storage expert Stephen Foskett, who says that for
a purpose-built file system, VMFS does a great job.
Because not every storage device offers administrators the option of running either NFS or VMFS, I'll focus on scenarios where VMFS is used on iSCSI or Fibre Channel storage systems. As you plan new installations, deciding whether to go with VMFS or NFS storage will be based on many factors specific to your requirements and constraints.
VMFS is a clustered file system
Simply speaking, a clustered
file system is a disk resource that is mounted as a local disk to multiple computers
concurrently. Many VMware administrators come from Windows Server backgrounds and have experience
with Microsoft Cluster Server (MSCS). VMFS differs in that MSCS does not allow simultaneous access to
the drives across multiple servers. Whether you use MSCS or ESX on the same storage, the
configuration on the storage area network (zoning, multipath configuration, etc.) would be the
same, but the experience would be totally different.
Other clustered file systems include Lustre, Red Hat Global File System, Hadoop Distributed File System (HDFS) and IBM's General Parallel File System (GPFS).
VMFS has one primary distinction: There is no server or software that inherently controls access to the file system. VMFS runs on each host cooperatively and directly manages the file system namespace to regulate access to files by clients. Each VMFS volume has some space reserved on the file system for this "on board" coordination, as outlined in this VMware KB article on missing space from VMFS data stores because of hidden files.
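To make the "no coordinator" idea concrete, here is a toy sketch in which the lock records live on the shared volume itself and every host checks them directly. The `SharedVolume` class, the host names and the file names are all hypothetical, and the real VMFS on-disk format and locking protocol are not modeled here.

```python
class SharedVolume:
    """Toy stand-in for a LUN whose lock records are stored on the volume itself."""

    def __init__(self):
        # file name -> name of the host that currently owns the lock
        self.lock_records = {}

    def try_lock(self, host, file_name):
        # Record this host as owner only if no owner is recorded yet.
        # In this single-process toy the check-and-set is trivially atomic;
        # a real shared disk needs a storage-level mechanism to guarantee that.
        owner = self.lock_records.setdefault(file_name, host)
        return owner == host

    def unlock(self, host, file_name):
        # Only the owning host may release its lock.
        if self.lock_records.get(file_name) == host:
            del self.lock_records[file_name]
            return True
        return False


volume = SharedVolume()
first = volume.try_lock("esx01", "web01.vmdk")   # esx01 takes the lock
second = volume.try_lock("esx02", "web01.vmdk")  # esx02 is refused
other = volume.try_lock("esx02", "db01.vmdk")    # a different file is free
volume.unlock("esx01", "web01.vmdk")
retry = volume.try_lock("esx02", "web01.vmdk")   # now esx02 can take it
print(first, second, other, retry)
```

Note that no host asks a server for permission: each one reads and updates the records on the volume, which is also why some space on every volume is set aside for this bookkeeping.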
VMFS management zones
Because VMFS does not have a coordinator to maintain disk access, a lot can be done across
management zones with volumes formatted as VMFS. A management zone could be a single standalone
ESXi host (which is free and has no vCenter management or licensing costs), an ESX/ESXi host that
is managed by vCenter, or any combination of the two. This can include a logical unit
number (LUN) that is zoned to ESX and/or ESXi hosts across multiple vCenter Servers. This type of
management can work wonders for moving VMs across management zones, because you don't need VMware Converter and you
don't have to work directly with the Virtual Machine Disk (VMDK) file format, which can be slow and
cumbersome.
It's worth mentioning that this compatibility is not exclusive to VMFS. The image below shows a single LUN being accessed across management zones using the VMFS file system.
This configuration can be used in situations where licensing is not required or access is needed across management zones. Moreover, significant savings can be realized by avoiding vCenter licensing costs on designated hosts in situations where advanced features are not needed. Use this configuration only when circumstances warrant, however, as it can quickly become unmanageable. With unlicensed (free) ESXi hosts, VMotion, VMware High Availability, Distributed Resource Scheduler and other advanced features cannot be used. But other ESX or ESXi hosts in that cluster can run the advanced features.
VMFS volume composition
The data makeup of VMFS volumes is unique compared with other file systems. The contents of a LUN
are usually a collection of very small and very large files. The very large files can be the .VMDK
files for virtual disks, any snapshot files of virtual disks and memory swap files. The small files
include virtual machine (VM) logs, VM configuration files or BIOS files for the VM.
VMFS addresses this consistent disparity with a two-tier allocation scheme that uses file blocks and sub-blocks. The primary file blocks come in the well-known 1 MB, 2 MB, 4 MB or 8 MB sizes, and the size is selected when the volume is formatted. Be sure to check this blog post by Eric Siebert for more information on block size selection when formatting a VMFS-3 volume. Sub-blocks are smaller allocations within the file system that reduce the internal fragmentation that very small files would otherwise cause on the larger blocks.
In the common scenario where a VMFS volume is formatted with the 1 MB block size, there are a number of 64 KB sub-blocks that accommodate the smaller files. Each VMFS-3 volume will have 64 KB sub-blocks carved out of the primary blocks like any other resource on the file system. The figure below demonstrates this.
The larger block size will reduce fragmentation for larger files, and smaller files will use the sub-blocks to reduce fragmentation. This plays directly on the typical makeup of a virtual machine volume: very large and very small files. Clearly, the mixed-block size attribute of VMFS is one of the built-in efficiencies that help it to scale well. In this situation, it may make sense to format every VMFS-3 volume at 8 MB allocation units as the smaller files will use the built-in efficiencies.
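The effect of sub-blocks on small files can be estimated with some quick arithmetic. The sketch below uses a simplified model that is an assumption, not VMware's allocator: a file of 64 KB or less occupies one sub-block, and anything larger is rounded up to whole file blocks. The file sizes in `vm_files` are hypothetical.

```python
import math

KB = 1024
MB = 1024 * KB
SUB_BLOCK = 64 * KB  # VMFS-3 sub-block size described above


def allocated_bytes(file_size, block_size, use_sub_blocks=True):
    """Bytes consumed by one file under the simplified model."""
    if file_size == 0:
        return 0
    if use_sub_blocks and file_size <= SUB_BLOCK:
        return SUB_BLOCK
    # Larger files are rounded up to a whole number of file blocks.
    return math.ceil(file_size / block_size) * block_size


def wasted_bytes(file_sizes, block_size, use_sub_blocks=True):
    """Total allocated-but-unused space across a set of files."""
    return sum(
        allocated_bytes(size, block_size, use_sub_blocks) - size
        for size in file_sizes
    )


# Hypothetical VM folder: one 20 GB virtual disk and three small files.
vm_files = [20 * 1024 * MB, 3 * KB, 40 * KB, 9 * KB]

for block_mb in (1, 8):
    with_sub = wasted_bytes(vm_files, block_mb * MB)
    without = wasted_bytes(vm_files, block_mb * MB, use_sub_blocks=False)
    print(f"{block_mb} MB blocks: {with_sub // KB} KB wasted with sub-blocks, "
          f"{without // KB} KB without")
```

Under this model the waste from the small files is the same at 1 MB and 8 MB block sizes, because they never touch a full file block. Without sub-blocks, the waste would grow in step with the block size.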
Pluggable architecture
With vSphere, VMFS volumes can be accessed with enhanced functionality. The Pluggable
Storage Architecture (PSA) has allowed many organizations' VMware implementations to take a big
step forward, which Stephen Foskett equates with "big iron" in corporate data centers.
PSA allows vSphere Enterprise Plus installations to use the storage vendor's multipathing capabilities. Historically, VMFS-3 volumes that support multipathing have been bound to the ESX or ESXi multipathing policies. While adequate for basic failover, this does not leverage the logic of the software running the storage array to make the most of the storage. The PSA changes all that, making for a tremendous storage enhancement to VMFS-3 volumes.
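The "pluggable" part can be pictured as the host asking a policy object which path to use, so a different policy can be dropped in without changing the caller. The sketch below is a toy illustration only: the class names, the `select` method and the path names are hypothetical and do not reflect VMware's or any vendor's actual plug-in interface.

```python
class FixedPolicy:
    """Failover-only: always use the first live path in the list."""

    def __init__(self, paths):
        self.paths = list(paths)

    def select(self, dead=()):
        for path in self.paths:
            if path not in dead:
                return path
        raise RuntimeError("no live path")


class RoundRobinPolicy:
    """Rotate I/O across all live paths instead of leaving them idle."""

    def __init__(self, paths):
        self.paths = list(paths)
        self._next = 0

    def select(self, dead=()):
        # Try each path at most once per call, starting where we left off.
        for _ in range(len(self.paths)):
            path = self.paths[self._next % len(self.paths)]
            self._next += 1
            if path not in dead:
                return path
        raise RuntimeError("no live path")


paths = ["pathA", "pathB"]  # hypothetical path names

for policy in (FixedPolicy(paths), RoundRobinPolicy(paths)):
    chosen = [policy.select() for _ in range(4)]
    print(type(policy).__name__, chosen)
```

The fixed policy leaves the second path idle until the first one fails, which matches the "adequate for basic failover" behavior described above. A policy supplied by the array vendor could plug into the same call and apply whatever logic the array supports.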
Rick Vanover (VCP, MCTS, MCSA) is an IT infrastructure manager in Columbus, Ohio. Vanover has more than 12 years of IT experience and focuses on virtualization, Windows-based server administration and system hardware.
This was first published in September 2009
