Solid-state disks are transforming storage. Their combination of low latency, high throughput and moderate cost is producing a range of solutions that use SSDs as a cache in front of larger, cheaper -- but slower -- hard disks.
One of the main reasons SSDs are fast is that they can complete a read or write in tens of microseconds -- about a hundred times as fast as a hard disk. This low latency makes SSDs a great cache technology.
SSDs still a pricey proposition
Although prices continue to fall, SSDs with the lowest latency and highest throughput still have a high price per gigabyte. Using this expensive class of storage cost-effectively requires knowledge of your specific workload. You need enough of the expensive cache to deliver the application performance you want, but an oversized cache that doesn't improve application performance is money wasted. The real problem is that there are few tools for identifying the right disk cache size for your applications.
Historically, most storage arrays have used hard disks for capacity and a combination of lots of disks and some RAM for performance. Recently, we have seen the rise of economical arrays with all-flash storage -- not a hard disk to be found in them -- that provide much higher performance than an equivalently priced disk array. This presents a challenge for many customers who bought hard-disk-based arrays in the last few years.
Getting the most mileage out of an array
A storage array is a long-term purchase, usually a three- to five-year commitment. Customers want an economical way to get better performance out of their disk arrays so they can avoid replacing them prematurely. To meet that need, disk-array vendors add SSDs to their arrays as a cache, creating a performance tier that is larger than the available RAM. Accessing data held in the performance tier is vastly faster than accessing data in the disk tier. Sizing the RAM and SSD cache in your storage array is the new art of storage tuning.
A central concept with cache is the working set size. Applications usually access a small subset of their data far more frequently than the rest of the data. To understand this, think of an encyclopedia: Each page of its index is read a hundred times as often as any other page. Having the entire index available a hundred times faster will provide a huge benefit. Having some of the remaining pages available a hundred times faster will provide less benefit.
Objects in a storage array are much the same. A small amount of data, called hot data, is accessed much more frequently than the remaining data. The real benefit of the cache will come only if we have enough to hold all of the frequently accessed pages. If the hot data is 10 pages long and the cache can hold only five, then we must wait to retrieve the other five from slower media. Having enough cache to hold some normal pages will help, but not nearly as much as caching the index will.
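The effect of a cache that is too small for the hot data can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: it assumes a least-recently-used (LRU) eviction policy, a synthetic workload in which 90% of accesses go to 10 hot pages and the rest are spread over 990 cold pages, and hypothetical helper names (make_trace, lru_hit_ratio). Real arrays use their own caching policies and see very different access patterns.

```python
import random
from collections import OrderedDict


def make_trace(n_accesses, hot_pages=10, cold_pages=990, hot_fraction=0.9, seed=0):
    """Build a synthetic access trace (assumed workload, not measured data)."""
    rng = random.Random(seed)
    trace = []
    for _ in range(n_accesses):
        if rng.random() < hot_fraction:
            trace.append(rng.randrange(hot_pages))               # hot page
        else:
            trace.append(hot_pages + rng.randrange(cold_pages))  # cold page
    return trace


def lru_hit_ratio(trace, cache_pages):
    """Replay the trace through an LRU cache and return the fraction of hits."""
    cache = OrderedDict()  # keys are page ids, ordered oldest to newest
    hits = 0
    for page in trace:
        if page in cache:
            hits += 1
            cache.move_to_end(page)        # mark as most recently used
        else:
            cache[page] = True
            if len(cache) > cache_pages:
                cache.popitem(last=False)  # evict the least recently used page
    return hits / len(trace)


trace = make_trace(100_000)
ratios = {size: lru_hit_ratio(trace, size) for size in (5, 10, 20, 40)}
for size, ratio in ratios.items():
    print(f"cache of {size:>2} pages: hit ratio {ratio:.2f}")
```

In this assumed workload, a five-page cache cannot serve even half of the accesses, because at most half of the hot pages fit at any time. Once the cache is somewhat larger than the hot set, most accesses are served from cache, and further growth adds little.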
Calculating the proper cache size
The SSD cache will be most valuable when it holds the working set of application data. A cache that is too small to hold the working set will not deliver optimal performance. Let's consider an application with 500 GB of data. If 40% of the application's I/O uses only 20 GB of the data, then clearly a cache size of 20 GB will be a great improvement. Making 40% of the I/O run 100 times faster is probably going to be a good choice.
Would a cache size of 100 GB provide better performance? Possibly, but not by a lot. Maybe the application spreads the next 40% of its I/O evenly over 200 GB of its data. Then increasing the cache to 100 GB will not deliver the same level of improvement we got from the first 20 GB. The extra 80 GB of cache covers only 40% of that 200 GB, so it may accelerate only about 16% of the application's I/O, at four times the cost of the initial improvement.
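The arithmetic behind this example can be written down as a short calculation. The sketch below uses the figures from the text for the first two bands of data. The third band (the remaining 20% of I/O spread over the remaining 280 GB) is an assumption added to complete the 500 GB, as are the function names and the simplification that the cache always holds the hottest data first and that a cache hit costs one hundredth of a disk access.

```python
# Each band is (size in GB, share of total I/O), hottest first.
# The first two bands come from the example; the third is an assumption.
BANDS = [(20, 0.40), (200, 0.40), (280, 0.20)]


def io_fraction_cached(cache_gb, bands=BANDS):
    """Share of I/O served from cache, assuming the hottest data is cached first
    and I/O is spread evenly within each band."""
    remaining = cache_gb
    covered = 0.0
    for size_gb, io_share in bands:
        used = min(remaining, size_gb)
        covered += io_share * used / size_gb
        remaining -= used
        if remaining <= 0:
            break
    return covered


def relative_latency(hit_fraction, speedup=100):
    """Average I/O time relative to an uncached disk (1.0 means no improvement)."""
    return (1 - hit_fraction) + hit_fraction / speedup


for cache_gb in (20, 100):
    hit = io_fraction_cached(cache_gb)
    print(f"{cache_gb:>3} GB cache: {hit:.0%} of I/O cached, "
          f"average latency {relative_latency(hit):.3f}x disk, "
          f"{hit / cache_gb:.4f} I/O share per GB")
```

Under these assumptions, the first 20 GB captures 40% of the I/O, while the next 80 GB adds only about 16 percentage points, close to the figure given above. Each gigabyte of the larger cache does roughly a tenth of the work of each gigabyte in the first 20 GB.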
The challenge of hot data
The biggest challenge is that the amount of hot data is hard to measure. Even for a specific application type, different uses of the application -- and different customers -- will have very different profiles.
There are also multiple levels of hot. For the encyclopedia, the table of contents is tiny and used more than the index. The index is larger, and the set of all pages can be huge. As a result, cache benefit does not grow smoothly with size: As the cache gets larger, there is a series of steps where performance improves a lot, separated by a series of flats where performance won't increase much as cache size increases. If your performance tier, like SSD, is expensive, you need to measure your workload to work out the right amount to buy.
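One common way to see these steps and flats is to compute a hit-ratio curve from a recorded access trace. The sketch below uses LRU stack distances, a standard technique for evaluating every cache size in a single pass over the trace. It is not a method described by any particular vendor. The trace here is synthetic and mirrors the encyclopedia: the page counts (2 table-of-contents pages, 20 index pages, 500 body pages), the access mix and the function names are all assumptions for illustration.

```python
import random
from collections import Counter


def lru_hit_ratio_curve(trace, max_pages):
    """Return hit ratios for LRU caches of 1..max_pages pages in one pass.

    An access hits in a cache of N pages exactly when the page sits within
    the top N entries of the LRU stack (its stack distance is at most N).
    """
    stack = []                   # most recently used page is at index 0
    distance_counts = Counter()  # stack distance -> number of accesses
    for page in trace:
        if page in stack:
            distance_counts[stack.index(page) + 1] += 1
            stack.remove(page)
        stack.insert(0, page)    # first-time accesses count as misses
    curve, hits = [], 0
    for size in range(1, max_pages + 1):
        hits += distance_counts[size]
        curve.append(hits / len(trace))
    return curve


def encyclopedia_trace(n_accesses, seed=1):
    """Synthetic trace: 50% contents, 35% index, 15% body pages (assumed mix)."""
    rng = random.Random(seed)
    trace = []
    for _ in range(n_accesses):
        r = rng.random()
        if r < 0.50:
            trace.append(rng.randrange(0, 2))     # table of contents
        elif r < 0.85:
            trace.append(rng.randrange(2, 22))    # index
        else:
            trace.append(rng.randrange(22, 522))  # body pages
    return trace


curve = lru_hit_ratio_curve(encyclopedia_trace(20_000), max_pages=120)
for size in (2, 5, 30, 60, 120):
    print(f"{size:>3} pages: hit ratio {curve[size - 1]:.2f}")
```

With a trace like this, the curve climbs quickly while the cache is absorbing the table of contents and the index, then flattens once only body pages remain uncached. Running the same calculation on a trace from your own workload is one way to find where the flats begin before deciding how much of the expensive tier to buy.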