Storage has entered the final phase of a long, miserable war that has seen every aspect of IT storage change. The storage wars started with the emergence of entirely new classes of storage, championed by startups. The second phase involved well-known technology companies, such as VMware, embracing or developing new storage technologies, like Virtual SAN or Virtual Volumes. The final phase is consolidation, where companies merge, are acquired and go out of business en masse. Combined, these phases have contributed to the long-term evolution of storage.
Changes in enterprise storage have trickled down to cause a revolution in small and medium-sized businesses and midmarket storage. This, in turn, is putting pressure back on enterprise storage as those midmarket vendors prove capable of meeting enterprise needs. Vendors, such as VMware, which have branched out horizontally may occupy any point along the market spectrum, as they have existing enterprise clients, and also provide the sort of novel storage offerings that enterprise vendors typically avoid until midmarket companies have tested them out first.
The "just add a shelf" style storage offerings provided by old guard companies like EMC and NetApp are dated and have become legacy; frankly, they're too expensive to compete, too miserable to manage and too inflexible in practice. A decade of bitter competition has changed the storage space and virtualization along with it. Scale-out storage, flash, hyper-convergence, object storage and the movement toward Infrastructure as Code (IAC) have taken some time to unfold, but their influence is everywhere in IT -- blatant and subtle.
The distinction between legacy storage and scale-out storage is perhaps the easiest to explain. Legacy storage interconnects shelves of disks with storage controllers. If you need more space, you add a shelf. If you need more speed, you add a shelf and, maybe, a controller. Both are expensive, take up a lot of room and only scale so far.
Once your legacy storage has reached a given maximum -- storage or performance -- you need to start all over again, buying new controllers and shelves, and engaging in the miserable process of data migration. Upgrades with legacy storage aren't smooth, simple or cheap. This has earned them the nickname "forklift upgrades," because at enterprise scale you end up bringing new equipment in by forklift and carting away the old in the same fashion.
Avoid a forklift upgrade with scale-out storage
Scale-out storage is different. You buy a storage unit based on a commodity x86 server that contains some hard drives. The processing power of the storage unit is sized to support the drives it is filled with. When you need more storage, you buy another storage unit and plug it in. The previous unit will find the new one, connect to it and expand your available storage. As your needs grow, simply buy more storage devices and add them. It couldn't be simpler.
Proper scale-out storage systems took a while to develop, but today they are designed to handle different node types in the same cluster. This means there are no forklift upgrades. The makeup of the cluster simply evolves over time as newer units work in harmony with the old. As the old units fall out of support or are no longer economical they can be removed without impacting running workloads.
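To make the contrast with forklift upgrades concrete, here is a minimal sketch of a cluster whose makeup evolves over time. It is a toy model, not any vendor's implementation: the `Node` and `ScaleOutCluster` names, the node names and the capacities are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    capacity_tb: float  # capacity this node contributes (hypothetical figure)


@dataclass
class ScaleOutCluster:
    nodes: dict = field(default_factory=dict)
    used_tb: float = 0.0

    @property
    def capacity_tb(self) -> float:
        # Total capacity is just the sum of whatever nodes are present.
        return sum(n.capacity_tb for n in self.nodes.values())

    def add_node(self, node: Node) -> None:
        # Joining a node grows the pool; nothing is migrated to a new array.
        self.nodes[node.name] = node

    def write(self, size_tb: float) -> None:
        if self.used_tb + size_tb > self.capacity_tb:
            raise RuntimeError("cluster full: add another node")
        self.used_tb += size_tb

    def retire_node(self, name: str) -> None:
        # Only retire a node if the remaining nodes can still hold the data.
        remaining = self.capacity_tb - self.nodes[name].capacity_tb
        if self.used_tb > remaining:
            raise RuntimeError("not enough capacity left to retire " + name)
        del self.nodes[name]


cluster = ScaleOutCluster()
cluster.add_node(Node("gen1-a", 24))
cluster.add_node(Node("gen1-b", 24))
cluster.write(40)
cluster.add_node(Node("gen2-a", 96))  # newer, denser node type joins
cluster.retire_node("gen1-a")         # old unit leaves; the data still fits
```

A real system would also rebalance and re-protect data when nodes join or leave; the sketch tracks only capacity to show why no wholesale replacement is needed.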
Flash storage has its pros and cons
Flash storage also changed everything. An all-flash system can pack the equivalent of five racks of legacy disk storage into a 2U device, at storage latencies you simply couldn't achieve with magnetic disk.
Flash has its downsides: it is more expensive than disk and, more importantly, it isn't as abundant as disk. Even if the price were right, we couldn't provide for the world's storage needs using flash because we simply do not have the planetary NAND fabrication capacity to meet demand. In fact, we probably couldn't meet one-tenth of global storage demand with flash today, even if we ignored the fact that the same facilities used to crank out flash are also the ones we rely on to make our RAM.
Hybrid flash storage is an option: "hot" data lives on flash, while "cold" data lives on magnetic disk. This delivers both performance and capacity with minimal rack space consumption. That matters because the rise of virtualization has meant that individual servers rarely run just one application; consequently, storage is increasingly asked to support multiple workloads, even from a single server.
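The hot/cold split can be sketched in a few lines. This is only an illustration of the idea, assuming that "hot" simply means "most frequently accessed"; real hybrid arrays use more elaborate heuristics. The function name, the block names and the access counts are hypothetical.

```python
def place_by_temperature(access_counts, flash_slots):
    """Return (hot, cold) sets: the most-accessed items go to flash.

    access_counts maps an item name to how often it was read or written.
    flash_slots is how many items the flash tier can hold (an assumption
    that every item is the same size, to keep the sketch short).
    """
    # Sort by descending access count; break ties by name for determinism.
    ranked = sorted(access_counts, key=lambda k: (-access_counts[k], k))
    hot = set(ranked[:flash_slots])
    cold = set(ranked[flash_slots:])
    return hot, cold


counts = {"db-index": 900, "vm-boot": 400, "logs-2014": 3, "backup": 1}
hot, cold = place_by_temperature(counts, flash_slots=2)
```

With these made-up numbers, the database index and VM boot data land on flash while old logs and backups stay on magnetic disk.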
Hyper-convergence runs storage and computing
Hyper-convergence is the storage technology that has had the biggest impact on virtualization, in no small part because of VMware's VSAN. Instead of separating storage and computing into discrete components, storage is run from the same servers that are running the hypervisors and, consequently, our workloads.
This, too, comes with a downside: As demands on the storage increase, more of the hyper-converged cluster's resources become dedicated to the storage rather than running workloads. Fortunately, if you need more performance you can just add another server to the cluster; this provides compute capacity, storage capacity and additional storage performance.
Perhaps most importantly, hyper-convergence implies that virtualization is the new normal. Running a single workload per server is considered quaint, reserved only for the most demanding workloads. Containerization is starting to challenge virtualization, but lots of people run their containers inside a VM as well.
This normalization of virtualization means a lot for VMware. VMware isn't a startup anymore; it is the default provider of critical pieces of a data center's infrastructure. If an architect is interested in using a competitor's product -- or something other than virtualization altogether -- they'll have to see how it stacks up against the default choice of VMware.
That's a powerful position to be in, of which VMware should be proud. Hyper-convergence has underscored the normalization of virtualization, but hyper-convergence in general -- and perhaps VSAN in particular -- is not far from this point of acceptance itself, perhaps within the next five years.
Object storage skips the middleman
Object storage is a whole new way to do storage. It assumes that applications should talk to storage directly instead of talking to the OS, having the OS request storage from the hypervisor and so forth.
Object storage is all about the application storing what it needs using simple "GET" and "PUT" requests. Essentially, developers can store terabytes of data using the same techniques they are familiar with for getting data in and out of databases. Since object storage isn't addressed in the same way as the block storage that underpins hypervisors and OSes, it can use different methods to synchronize data across a cluster of storage nodes. This results in extremely inexpensive storage when deployed at scale.
Object storage systems are even less constrained regarding cluster evolution than scale-out storage. Generally, you take any x86 server you can find, stuff it as full of disks and/or flash as you can, and add it to the cluster. "Deep and cheap" is the goal.
The significance of object storage is that it represents a broad change in application development and design. Legacy applications were designed to store data with the application. They assumed they owned the environment in which they operated and were the only workload on a given server. Windows administrators know the worst of these as applications that are hard-coded to store all data in a specific directory on the C drive, without the option to move it.
Object storage doesn't assume the OS from which the application executes can even "see" the storage the application uses. Calls to the application storage occur over the network. If the VM or container running the application dies, it doesn't actually matter because a new one can be spun up in seconds and it can resume working from the same data set.
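The pattern described in this section can be sketched as follows. The `ObjectStore` class is a toy in-memory stand-in for a networked object store, and `Worker`, the key name and the counter logic are hypothetical; a real deployment would issue the GET and PUT requests over the network to an actual object storage service.

```python
class ObjectStore:
    """Toy in-memory stand-in for a networked object store."""

    def __init__(self):
        self._objects = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]  # raises KeyError if the key is absent


class Worker:
    """Application instance that keeps no data of its own."""

    def __init__(self, store: ObjectStore):
        self.store = store

    def increment(self, key: str) -> int:
        # Read the current value from the store, or start from zero.
        try:
            current = int(self.store.get(key))
        except KeyError:
            current = 0
        self.store.put(key, str(current + 1).encode())
        return current + 1


store = ObjectStore()
first = Worker(store)
first.increment("jobs/processed")
first.increment("jobs/processed")
del first  # the VM or container running the application "dies"

replacement = Worker(store)  # a new instance is spun up
total = replacement.increment("jobs/processed")
```

Because all state lives behind `get` and `put`, the replacement instance picks up exactly where the first one stopped.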
Infrastructure as code provisions storage by policy
IAC is the ultimate evolution of storage. IAC is a movement which says only the application matters. Infrastructure -- be it storage, networking, physical servers, a hypervisor, an operating system or containerization software -- is simply addressed via an application programming interface.
With IAC, a developer requests a given amount of storage, CPU and RAM resources along with an appropriate environment in which to run her application. The infrastructure sets about provisioning this automatically and the developer pays little attention to which bits do what. Administrators don't approve each request manually; they set limits using policies, and developers are free to do what they wish within those limits.
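As a rough sketch of provisioning by policy, the following checks each request against limits an administrator set in advance, with no manual approval step. It does not represent any particular product's API: `Request`, `Policy`, `provision` and all of the numbers are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    owner: str
    cpu: int
    ram_gb: int
    storage_gb: int


@dataclass(frozen=True)
class Policy:
    # Per-owner ceilings set once by an administrator (hypothetical values).
    max_cpu: int
    max_ram_gb: int
    max_storage_gb: int


def provision(request: Request, policy: Policy, usage: dict) -> bool:
    """Approve automatically if the owner's running total stays in policy."""
    used = usage.get(request.owner, (0, 0, 0))
    new = (
        used[0] + request.cpu,
        used[1] + request.ram_gb,
        used[2] + request.storage_gb,
    )
    limits = (policy.max_cpu, policy.max_ram_gb, policy.max_storage_gb)
    if any(n > limit for n, limit in zip(new, limits)):
        return False  # over a limit: rejected, usage left unchanged
    usage[request.owner] = new
    return True


policy = Policy(max_cpu=16, max_ram_gb=64, max_storage_gb=2000)
usage = {}
ok = provision(Request("dev-team", 8, 32, 1500), policy, usage)
denied = provision(Request("dev-team", 4, 16, 1000), policy, usage)
```

The first request fits within every limit and is granted; the second would push the team's storage past the policy ceiling, so it is refused without anyone reviewing it by hand.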
In the VMware ecosystem, some of this is met through vRealize applications, some through the Photon stack, and some is delivered through third-party applications such as Puppet, while certain elements have yet to be fully developed into a proper IAC offering.
Although IAC is still a very young movement, it is a reflection of various hard realities within the tech sector that are not going to reverse course. Infrastructure, from switches to servers to software, is all rapidly becoming a commodity at all levels. The evolution of storage has driven -- and has been driven by -- the evolution of application delivery. This, in turn, has not only normalized virtualization, but made it the default option and a critical part of modern infrastructure.
At the same time, the evolution of storage, application delivery and virtualization has meant that management of all tiers must be as "invisible" as possible. Administrator skill sets must include automation and orchestration of all layers of data center infrastructure, not merely the herding of VMs.
The data center and the roles of those who run it are constantly changing. Networking and OS admins are probably out of luck because their verticals are taking longer to evolve. By the time they do, they may find the data center architecture positions have all been filled.
The battle for who ultimately takes on the role of data center automation architect seems set to occur between storage and virtualization administrators, as both are seeing their specialties commoditized today by advanced management tools and increasingly easy infrastructure.