Last week we introduced a new feature at purestorage.com that I’m quite excited about: the Dedupe Ticker. The ticker is a live feed from the call-home data repository here at Pure Storage Support, and it shows the average data reduction results across all Pure Storage FlashArrays deployed at customer sites with call-home enabled. We did this for a simple reason: to provide transparency into the dedupe results that the FlashArray delivers for customers, and to start encouraging the industry to be more honest, open, and transparent about real data reduction results.
What Are Good Data Reduction Results?
One of the challenging things about data reduction is that it is inherently variable. We see many vendors in the market hiding behind very generalized marketing claims about the efficacy of their data reduction technologies, often claiming “5-10x” reduction as a typically achievable result. But what does that mean? The reality is that data reduction varies a lot by workload, and it is really important to test vendor technologies on your data. That said, we’ve deployed enough FlashArrays here at Pure across a wide range of workloads to be able to give customers pretty good expectations before they dive into a deployment. We also make available our PureSize tool to allow customers to analyze their own data. Here are some example Pure Storage customers across our typical use cases:
As you can see, results vary by customer, but we’ve typically seen database workloads achieve 2-5x reduction, server virtualization achieve 4-8x reduction (depending on what is inside the VMs), and VDI workloads achieve 5-10x reduction (depending on persistent vs. non-persistent desktops). All of the results above led to successful deployments and happy customers, but understanding the expected level of data reduction before deployment is critical to that success.
Does Data Reduction Include Thin Provisioning?
One of the sillier areas of confusion in the marketplace right now is whether different vendor-promised data reduction rates include thin provisioning or not. Some vendors incorporate it into their reported data reduction rates, others do not. Savings related to thin provisioning are useful to understand, but in our minds counting them as “dedupe” is a little less than honest. It is also bad operational advice: if thin provisioning savings are included in the dedupe rate, the rate will decline as the host volume fills. If only actual data reduction is included, the rate stays relatively stable as the volume fills and can therefore be used to rationally plan capacity growth. For clarity, Pure Storage provides both numbers: both are available right on the front page of the FlashArray GUI, and both are available in the new Dedupe Ticker. Want to know EXACTLY how we calculate data reduction rates? We’re transparent about that too.
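To make the capacity-planning point concrete, here is a minimal sketch of the arithmetic. It is not the formula the FlashArray uses: the function name, the 1,000 GB volume, and the constant 4x reduction are all assumptions chosen for illustration. It simply compares a ratio that counts only written data against one that also counts provisioned-but-unwritten space.

```python
def reduction_ratios(provisioned_gb, written_gb, stored_gb):
    """Return (data_reduction, total_reduction) for one volume.

    data_reduction  = host-written data / physical space consumed
    total_reduction = provisioned size / physical space consumed,
                      i.e. thin provisioning savings are folded in.
    Both definitions are illustrative assumptions, not a vendor formula.
    """
    data_reduction = written_gb / stored_gb
    total_reduction = provisioned_gb / stored_gb
    return data_reduction, total_reduction


PROVISIONED_GB = 1000   # hypothetical thin-provisioned volume size
ASSUMED_RATIO = 4.0     # hypothetical, constant dedupe + compression ratio

for written_gb in (100, 250, 500, 1000):
    stored_gb = written_gb / ASSUMED_RATIO
    data, total = reduction_ratios(PROVISIONED_GB, written_gb, stored_gb)
    print(f"{written_gb:>5} GB written: "
          f"data reduction {data:.1f}x, incl. thin provisioning {total:.1f}x")
```

Under these assumptions the data-reduction figure stays at 4.0x at every fill level, while the figure that includes thin provisioning falls from 40.0x to 4.0x as the volume fills, which is why only the first is a sound basis for a growth plan.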
Why Different Data Reduction Technologies are Not “All the Same”
We’ve noticed a disturbing trend in the industry: every vendor is now talking about data reduction technologies as if they were a “checkbox” item on a product spec sheet, and as if all data reduction technologies were the same. Our findings in the field are quite different: we believe that data reduction technologies differ widely by vendor, and that this is a core differentiation for Pure Storage. Here are a few dimensions to consider:
- Data reduction type: Deduplication, Compression, Pattern Removal, or all. Some vendors will do dedupe but not compression, some will do compression but not dedupe. Some will do data reduction on their flash tier but not on their disk tier. Some will reduce simple patterns like zero blocks, but not more complex patterns (i.e. real deduplication). Pure Storage delivers always-on deduplication, compression, pattern removal, and thin provisioning.
- Inline vs. post-process. Some vendors will reduce data as it is coming into the array, others will first land data to disk and/or flash before processing for data reduction. In the legacy disk world, post-process was the norm because data reduction just took too long. In the flash world, inline deduplication becomes possible due to the performance of flash and new architectures designed for it. It is important to understand that inline affords two key advantages in the flash world: cost reduction via increasing the effective capacity of the flash, and write avoidance (and thus flash life extension) by removing write IOs which would have otherwise hit the flash. Pure Storage’s data reduction is an inline technology: IOs are committed to the DRAM in the array’s controllers (and mirrored to non-volatile memory to protect against complete array power loss) so they can be acknowledged back to the host immediately, and data reduction processing then happens before the IOs are ever written to the flash, avoiding writes and improving capacity.
- Reduction “chunk” size, granularity, and alignment. Data reduction technologies vary in the granularity with which they analyze data. Granularity is a bit of a trade-off: the smaller the chunks you use to analyze data, the more redundancy and benefit you’ll find, but the more metadata the process creates that must be managed. Granularity also has a direct tie to alignment. Anyone who has ever dealt with mis-aligned VMs in traditional storage knows exactly what I’m talking about: if all the layers (storage, hypervisor, VM, FS, application) don’t align on the right geometry, finding duplicates is all the harder. Most data reduction and thin provisioning technologies today operate on chunk sizes of 4K or larger. Pure Storage data reduction services are unique in that they analyze data down to a 512-byte granularity (8x smaller than 4K), but they are variable chunk in nature, meaning that if we find a larger duplicate chunk we store a metadata reference to the larger chunk to drive metadata efficiency. 512-byte granularity also has the very nice benefit of essentially auto-aligning to all of the larger geometries above it.
- Performance impact. This one is a biggie: why hasn’t deduplication taken the primary storage world by storm like it has the backup world? Simple: in traditional disk storage, dedupe is really slow and just can’t be run in any performance environment. In fact, we’ve seen the same challenges in poorly designed or retrofit data reduction implementations in the flash world as well, where enabling data reduction can come with a 50%+ performance impact. If a vendor is telling you about their data reduction, the performance impact that comes along with it should be your very first question. Pure Storage is different: the FlashArray was designed from the ground up for data reduction, the data reduction is always on, and all the performance specs you see from us are with data reduction turned on.
- Global vs. local. Many of the existing implementations of data reduction are confined to a portion of the storage array to keep their performance and metadata manageable, such as only within a volume, LUN, FS, shelf, etc. NetApp’s data reduction, for example, operates within the Volume. The downside here is obvious – the more you divide data reduction into pools or zones, the more you duplicate data between these pools and reduce overall effectiveness. Pure Storage data reduction is global across the entire array: all volumes.
- Purpose-built vs. retrofit. We found at Pure Storage that to enable data reduction technologies to work at flash speed and global scale, we had to design our entire array from the ground up to be optimized for data reduction. As other vendors have realized that data reduction is a “must have” feature to make MLC flash arrays work (both from a cost point and a write avoidance perspective), we’re seeing a wave of vendors OEM or acquire 3rd-party technologies and retrofit them to their storage arrays, often as a post-process step. Pure Storage data reduction technologies were all developed from scratch by Pure, which is key to our differentiation in the marketplace.
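The granularity and alignment point from the list above is easy to see in a toy experiment. The sketch below is a simplified fixed-size-chunk deduplicator, not Pure Storage's variable-chunk implementation. The function names and the synthetic "volume" are assumptions made up for illustration: two identical 64 KiB payloads separated by a 512-byte header, so the second copy sits off the 4K grid but on a 512-byte boundary.

```python
import hashlib
import random


def stored_bytes(data: bytes, chunk_size: int) -> int:
    """Bytes kept after deduplicating fixed-size chunks cut on a chunk_size grid."""
    unique = {}
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        # Identical chunks hash to the same key, so each is counted once.
        unique[hashlib.sha256(chunk).digest()] = len(chunk)
    return sum(unique.values())


def dedupe_ratio(data: bytes, chunk_size: int) -> float:
    """Logical size divided by the size left after chunk-level dedupe."""
    return len(data) / stored_bytes(data, chunk_size)


rng = random.Random(0)  # fixed seed so the run is repeatable
payload = bytes(rng.getrandbits(8) for _ in range(64 * 1024))  # incompressible data
header = bytes(rng.getrandbits(8) for _ in range(512))         # shifts the second copy

# Hypothetical volume: the second payload starts 512 bytes off the 4K grid.
volume = payload + header + payload

for size in (4096, 512):
    print(f"{size:>4}-byte chunks: {dedupe_ratio(volume, size):.2f}x")
```

With this data the 4096-byte pass finds no duplicates at all (1.00x), because every 4K chunk of the second copy straddles different bytes than the first, while the 512-byte pass recognizes the whole second copy and lands at roughly 1.99x. The cost, as noted above, is eight times as many chunk entries to track, which is the metadata side of the trade-off.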