According to Gartner, the market for all-flash storage is going to approach $4B in 2015. Given these high stakes and the ongoing brouhaha between hardware-centric flash appliances and new software-centric flash storage arrays, it’s no surprise that there’s substantial misinformation out there. To attempt to shed some light on the controversy, George Crump of Storage Switzerland and I hosted a seminar last week entitled The Five Lies Told about All-Flash Storage Systems. For those who prefer to skim a blog (nonlinear access) rather than listen to a webcast, I’m going to summarize my thoughts below (but you still need to tune in to the session to pick up George’s insights).
Lie #1 – All flash storage systems are too expensive
Today, when you buy performance storage, you typically quantify your purchase in dollars per GB raw, that is, the total purchase price divided by the total GBs of storage media in the array. With all-flash storage typically going for somewhere between $15 and $30/GB raw, flash tends to be ~3-5X more expensive than comparable arrays of fast spindles (not counting any flash caching).
However, when you buy a disk-based backup appliance, you are instead likely to quantify your purchase in dollars per GB usable. This is because deduplication and compression so materially change the equation that you are better served weighing how much of your data you will be able to store on a particular appliance (that is, the value that accrues to you) rather than how much raw space the appliance has (how much the hardware costs the vendor).
The stunning result that Pure’s customers have seen is that their average $/GB usable with all-flash is typically less than their $/GB usable with 15K hard drives! Our end users are often paying $2-5/GB raw for performance disk, but find their $/GB usable is in the $5-15 range. Why? Lack of thin provisioning. RAID-10 for performance workloads. Short stroking for same. Incremental array software licensing fees for critical features. And so on.
With the Pure Storage FlashArray, we have effectively flipped these ratios: the FlashArray costs more than a disk array in terms of $/GB raw, but with deduplication and compression delivering material savings (we typically see 5-10X for our target virtualization and database workloads), customers are getting the benefits of all flash at $5-10/GB usable. No need to take our word for how much data reduction could impact your workload: we have a software tool that you can download to quantify your data reduction rate. (Note: The reason this is an apples-to-apples comparison is that these data reduction techniques appear to be generally incompatible with mechanical hard drives running performance workloads: deduplication is so random I/O intensive that traditional disk cannot hope to accomplish it with submillisecond latencies. See Not Your Momma’s Deduplication and Why Flash Changes Everything.)
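To make the raw-versus-usable arithmetic concrete, here is a minimal sketch in Python. The function name and the specific inputs (usable fractions, the $4 and $20 raw prices) are illustrative assumptions chosen to fall inside the ranges quoted above; they are not measurements from any particular array.

```python
def usable_cost_per_gb(raw_cost_per_gb, usable_fraction, data_reduction=1.0):
    """Convert $/GB raw into $/GB usable.

    usable_fraction: share of raw capacity left for data after RAID,
        short stroking, spares and so on (0 < usable_fraction <= 1).
    data_reduction: combined dedupe + compression ratio (1.0 = none).
    """
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    if data_reduction < 1:
        raise ValueError("data_reduction must be >= 1")
    return raw_cost_per_gb / (usable_fraction * data_reduction)


# Hypothetical performance disk: $4/GB raw, RAID-10 keeps 50% of capacity,
# short stroking keeps 80% of what remains, no data reduction.
disk_usable = usable_cost_per_gb(4.0, usable_fraction=0.5 * 0.8)

# Hypothetical all-flash array: $20/GB raw, 80% of raw capacity usable
# after RAID, 5X data reduction.
flash_usable = usable_cost_per_gb(20.0, usable_fraction=0.8, data_reduction=5.0)

print(f"disk:  ${disk_usable:.2f}/GB usable")
print(f"flash: ${flash_usable:.2f}/GB usable")
```

With these assumed inputs the disk array lands at roughly $10/GB usable and the flash array at roughly $5/GB usable, which is the "flipped ratio" described above. Plug in your own raw price, RAID overhead, and measured data reduction rate to see where your workload falls.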
Moreover, the above analysis doesn’t take into account any of the savings from flash’s greater power efficiency—the typical disk array costs as much to rack and power over three years as it does to procure, and as a result, mechanical storage is likely 40% of your data center power budget. Nor does this analysis count any savings from server consolidation, lowering your DRAM footprint, or saving money on enterprise software licensing fees (via server consolidation). Net net: you may well find that you can have your cake (flash’s >10X performance boost) and eat it too (save money versus your current spend on performance disk).
Lie #2 – All flash storage systems are not reliable
The claim, typically from the incumbent disk vendors, is that flash has not yet been proven for enterprise workloads. In reality, solid-state storage is more reliable than mechanical disk.
Studies by CMU and Google have shown that hard drives are less reliable than is typically thought: While the data sheet may list annual failure rates “of at most .88%,” typical replacement rates range from 2 to 4% in the first year, going up to 6 to 8% in later years, “and up to 13% observed on some systems.” And Intel has shown that the lower failure rates of solid-state drives in laptops justify the additional cost relative to mechanical disk. While our own experience is anecdotal, Pure has deployed thousands of SSDs, both internally and at customer sites, for 2+ years of hard labor. In all that time we have had one SSD get into a state that we couldn’t heal with our software.
At the same time, flash is a profoundly different medium than spinning disk, and to get the greatest longevity and performance out of it, you cannot treat it like a hard drive. Yet traditional disk-centric designs still often update data in place (rather than amortizing wear evenly), fail to align writes with the underlying flash geometry (often leading to write amplification), and fail to employ inline deduplication (and hence write the same data repeatedly, further wearing the flash). Pure employs a collection of techniques we refer to as our FlashCare™ technology for staying within the sweet spot of flash memory, extending its life and enhancing its performance.
Lie #3 – All flash storage systems are not highly available (HA)
In fairness, the first generation of flash form factors (server PCIe flash cards and hardware-centric flash appliances) did not provide native HA. It is this next generation of software-centric all-flash enterprise arrays that offers HA. When you insert flash cards or SSDs into servers, or you directly connect a server to an unshared flash appliance (DAS), HA is an exercise left to the application programmer. The Googles and Facebooks of the world have had the luxury of crafting entirely new software architectures from scratch to deliver HA without shared storage. But the vast majority of enterprises need shared storage that accommodates their existing mission-critical applications: VMware, Oracle, Microsoft, SAP, and so on.
What we are seeing today is the emergence of true all-flash arrays: storage appliances that have comparable data management features to traditional disk arrays, including HA, snapshots, flash-specific RAID layouts, replication, and so on. These designs feature fully redundant hardware, generally made up of commodity components, combined with advanced controller software that automatically heals around any problems within the underlying hardware. Since data reduction (a.k.a. dedupe and compression) allows vendors like Pure to offer feature-rich all-flash arrays for less than the cost of these typically bare-bones flash appliances, we can expect the market to vote with its feet in the months and years ahead.
Lie #4 – All flash arrays are for the high performance fringe
In fact, for the random IO demanded by virtualization and databases, disk today appears slower than tape did twenty years ago. Thus it is high time to get mechanical storage out of the latency path of performance workloads.
Such work has already been happening within the public cloud: Facebook, Google, Amazon, and Microsoft are all using flash extensively to free their applications from the inherent latency of mechanical systems. For random IO, disk spends >95% of its time and energy seeking and rotating, and <5% actually doing the useful work of transferring data.
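A back-of-the-envelope calculation shows where that time goes. The sketch below is in Python, and the drive parameters (3.5 ms average seek, 15K RPM, 150 MB/s media transfer rate, 8 KB random IO) are assumed ballpark figures for a fast enterprise drive, not numbers from any data sheet.

```python
def random_io_breakdown(seek_ms, rpm, io_kb, transfer_mb_per_s):
    """Split the service time of one random disk IO into its parts (ms)."""
    # On average the platter must turn half a revolution to reach the data.
    rotation_ms = 0.5 * 60_000 / rpm
    # Time actually spent moving bytes off the platter.
    transfer_ms = io_kb / 1024 / transfer_mb_per_s * 1000
    total_ms = seek_ms + rotation_ms + transfer_ms
    return {
        "seek_ms": seek_ms,
        "rotation_ms": rotation_ms,
        "transfer_ms": transfer_ms,
        "total_ms": total_ms,
        "transfer_share": transfer_ms / total_ms,
        "iops": 1000 / total_ms,
    }


# Assumed figures for a 15K RPM performance drive doing 8 KB random reads.
breakdown = random_io_breakdown(seek_ms=3.5, rpm=15_000, io_kb=8,
                                transfer_mb_per_s=150)

print(f"transfer share of service time: {breakdown['transfer_share']:.1%}")
print(f"approximate random IOPS per drive: {breakdown['iops']:.0f}")
```

Under these assumptions the transfer itself is on the order of 1% of the service time, comfortably inside the <5% figure, and the drive tops out at a couple of hundred random IOPS no matter how fast its sequential transfer rate is.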
The reason flash has been relegated to the performance enthusiast to date goes back to Lie #1—Flash has been too expensive. The mainstream enterprise will embrace flash when it is packaged in a form factor that delivers the reliability, scalability, and plug compatibility they require at a price point that is at or below what they are spending on disk. Thanks to the wonders of purpose-built storage for flash combined with high-performance data reduction, that vision is now reality.
Lie #5 – All flash arrays have to compromise performance to achieve affordability
This topic is typically raised by vendors that do not support deduplication and compression, or else have been unable to deliver high performance implementations of the same.
First, consider compression. Thanks to Moore’s Law and the engineering efforts of Intel and their brethren, compression has gotten really fast: at Pure, we see compression performance on the order of 700KB/ms for a single processor core (out of our 12-core controller). With that level of performance, it turns out that “CPU is cheaper than the flash.” That is, it makes economic sense to upgrade to a faster CPU so that you can use less flash and still meet the performance demands of your workload. And clearly if your comparison point is disk arrays, compression plus flash is much faster than waiting for disk to seek and rotate.
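To put the per-core figure in more familiar units, here is a small Python sketch. Only the 700KB/ms and 12-core numbers come from the text; the linear scaling across all cores, the 2X compression ratio, the 10 TB data set, and the $20/GB raw flash price are hypothetical inputs used purely for illustration.

```python
def kb_per_ms_to_mb_per_s(kb_per_ms):
    """KB/ms -> MB/s in decimal units: x1000 for ms->s, /1000 for KB->MB."""
    return kb_per_ms * 1000 / 1000


def flash_saved_gb(logical_gb, compression_ratio):
    """Raw flash you no longer need once data is compressed by the ratio."""
    return logical_gb - logical_gb / compression_ratio


per_core_mb_s = kb_per_ms_to_mb_per_s(700)   # figure quoted in the text

# Upper bound only: assumes every one of the 12 cores does nothing but
# compress and that throughput scales linearly, which a real controller
# will not achieve.
upper_bound_mb_s = per_core_mb_s * 12

# Hypothetical economics: 10 TB of logical data, 2X compression,
# flash priced at $20/GB raw.
saved_gb = flash_saved_gb(10_000, compression_ratio=2.0)
saved_dollars = saved_gb * 20

print(f"{per_core_mb_s:.0f} MB/s per core, <= {upper_bound_mb_s:.0f} MB/s per controller")
print(f"flash avoided: {saved_gb:.0f} GB (~${saved_dollars:,.0f})")
```

The point of the comparison is the order of magnitude: if the flash avoided is worth far more than the price difference of a faster processor, then spending on CPU is the cheaper way to meet the workload.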
Most of this false debate over having to compromise flash performance to deliver economics, however, has centered on deduplication. Deduplication is so random IO intensive that it has proved incompatible with performance workloads on disk. With flash, writes are expensive and reads very cheap, and there is no random IO penalty. Thus for flash, it simply does not make sense to write the same data over and over again the way it is done on performance disk (to preserve contiguity). Which is to say that the ideal algorithms for deduplication on flash can actually accelerate performance for write-intensive workloads, as well as save wear and tear. (For flash, dedupe really should happen inline: if you write the data to flash first, and then dedupe it later, you increase flash wear.) Given the savings in flash footprint and reduced program/erase cycles, dedupe on flash is a no-brainer for virtualization and database workloads. (The market leader EMC has likely ended this debate with their acquisition of XtremIO, which like Pure is building a next-generation storage architecture purpose-built for flash that incorporates deduplication.)
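To illustrate why inline dedupe reduces flash writes, here is a toy sketch in Python of content-addressed block storage. It is a generic textbook approach, not Pure's (or anyone's) actual implementation: the class name, the fixed 4 KB block size, and the choice to trust a SHA-256 match without a byte-by-byte verify are all assumptions made for brevity.

```python
import hashlib


class InlineDedupStore:
    """Toy block store that dedupes before anything is written to 'flash'."""

    def __init__(self):
        self.blocks = {}       # digest -> block bytes (stands in for flash)
        self.refcount = {}     # digest -> number of logical addresses using it
        self.lba_map = {}      # logical block address -> digest
        self.flash_writes = 0  # physical block writes actually performed

    def write(self, lba, data):
        digest = hashlib.sha256(data).digest()
        if self.lba_map.get(lba) == digest:
            return  # same content rewritten in place: nothing to do
        self._release(lba)
        if digest not in self.blocks:
            # First time this content is seen: the only case that costs
            # a physical write (and a program/erase cycle later on).
            self.blocks[digest] = bytes(data)
            self.refcount[digest] = 0
            self.flash_writes += 1
        self.refcount[digest] += 1
        self.lba_map[lba] = digest

    def _release(self, lba):
        """Drop the old mapping for lba and free unreferenced blocks."""
        old = self.lba_map.pop(lba, None)
        if old is None:
            return
        self.refcount[old] -= 1
        if self.refcount[old] == 0:
            del self.refcount[old]
            del self.blocks[old]

    def read(self, lba):
        return self.blocks[self.lba_map[lba]]


store = InlineDedupStore()
zero_block = b"\x00" * 4096
for lba in range(100):          # 100 logical writes of identical content
    store.write(lba, zero_block)
store.write(100, b"\x01" * 4096)  # one block of different content

print(f"logical writes: 101, physical writes: {store.flash_writes}")
```

In this example 101 logical writes turn into 2 physical writes, because the duplicate check happens before the data reaches the media. A post-process design would have written all 101 blocks first and reclaimed the duplicates later, which is exactly the extra wear described above.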