
We’ve been asked this question many times by customers and the press, and in fact this morning Chris Mellor at The Register published an article that calls us out specifically on this point at the end: Big boy Fujitsu crushes stroppy upstarts in storage boffins’ trials.

So why doesn’t Pure Storage publish an SPC benchmark? Simple: we aren’t allowed to.

We aren’t allowed to publish an SPC benchmark result.

The rules from SPC today explicitly prevent vendors from running and publishing results on arrays that have deduplication, compression, and/or thin provisioning enabled. Here’s a statement provided to us by Walter Baker, the SPC administrator:

“Data reduction is currently not allowed in any audited SPC measurements to become new SPC Results. […] We are currently developing specifications for compressible content for the next iteration of the SPC benchmark tools and specifications. That activity will be followed by investigating how to incorporate deduplication within the benchmark specifications.”

To be clear, the specific rules say you can enable compression and thin provisioning if you pre-fill the array with incompressible data, which negates their value completely…but you can’t enable deduplication under any configuration.
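To see why incompressible fill data defeats compression, here’s a minimal sketch (our illustration, not part of the SPC specification) comparing how a standard compressor handles random versus repetitive blocks:

```python
import os
import zlib

# Illustrative only: random bytes have no redundancy for a compressor to
# remove, while repetitive data (typical of real datasets) shrinks a lot.
BLOCK = 1024 * 1024  # 1 MiB sample block

random_block = os.urandom(BLOCK)               # stands in for "incompressible" pre-fill
repetitive_block = b"ABCD1234" * (BLOCK // 8)  # stands in for redundant real-world data

for name, data in [("random", random_block), ("repetitive", repetitive_block)]:
    compressed = zlib.compress(data)
    print(f"{name}: {len(data)} -> {len(compressed)} bytes "
          f"({len(compressed) / len(data):.1%} of original)")
```

Random data comes out essentially the same size (or slightly larger), so an array pre-filled with it has nothing left for compression to reclaim.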

This obviously puts Pure Storage in a tough spot, as the Pure Storage FlashArray has “always on” inline deduplication, compression, and thin provisioning…we can’t turn them off. This is, in fact, a strong differentiator for Pure Storage: most of our competitors who offer these services as “optional” features suffer massive performance impacts when they are enabled. In the Pure Storage FlashArray these services (plus encryption) are so deeply integrated into the Purity Operating Environment that we can’t even turn them off, and all of our performance numbers and specs include them.

Why does the SPC benchmark exclude deduplicating arrays?

That’s a good question for them, but I’d suspect it’s because their benchmark likely uses highly dedupable data (i.e., it writes the same data repeatedly), and most deduplicating arrays can process duplicate data faster than unique data, since it ultimately doesn’t need to be written to back-end storage. Allowing vendors to “compete” in SPC with deduplication enabled would therefore be comparing apples to oranges, and we agree that wouldn’t be fair. For SPC to return a fair result across both legacy disk and modern deduplicating arrays, the workload must evolve to write more real-world data that exercises both architectures fairly.
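As a rough sketch of why repeated data is nearly free for a deduplicating array (our simplified model, not Purity’s actual implementation), consider an inline dedupe engine that fingerprints each incoming block and only writes fingerprints it hasn’t seen before:

```python
import hashlib

# Simplified inline-deduplication model (hypothetical, for illustration):
# never-before-seen blocks generate back-end writes; duplicates only
# update metadata and never touch the back-end media.
class DedupStore:
    def __init__(self):
        self.index = {}          # fingerprint -> back-end segment (hypothetical)
        self.lba_map = {}        # logical address -> fingerprint
        self.backend_writes = 0

    def write_block(self, lba, data):
        fp = hashlib.sha256(data).hexdigest()
        if fp not in self.index:
            self.index[fp] = f"segment-{self.backend_writes}"
            self.backend_writes += 1   # only unique data hits the media
        self.lba_map[lba] = fp         # duplicates are a metadata-only update

store = DedupStore()
block = b"\xab" * 4096
for lba in range(1000):               # benchmark-style repeated writes
    store.write_block(lba, block)

print(f"host writes: 1000, back-end writes: {store.backend_writes}")  # -> 1
```

A benchmark that writes the same pattern over and over would mostly measure this metadata path, not the array’s real write throughput.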

So why is this a problem for SPC?

Ultimately, the enterprise storage world is rapidly switching from disk to flash, and the new all-flash array products in the market all incorporate data reduction technologies of some form to make flash cost-effective. You essentially can’t make an all-flash array cost-competitive in the market today without data reduction technologies. So…if SPC wants to avoid fading away as a disk-era benchmark and modernize for the flash era, incorporating deduplication, compression, and thin provisioning is key. We’re eager for SPC to do this, and we’d be excited to run SPC when we’re ultimately allowed to.

The good news is that SPC already markets flash quite effectively.

Despite SPC not being suitable for modern all-flash arrays, in an odd twist of fate we find that it actually does a great job today of convincing customers to go all-flash for any performance-centric workload. Just perusing the recent SPC results gives great insight into the lengths (and costs) that legacy disk architectures have to go to in order to deliver what is, in most cases, still sub-flash performance.

Results vary greatly, but what you’ll typically find are arrays with hundreds of disk spindles, spanning 2-3 racks, costing $500K-$1M+: generally $5-10/GB raw but >$20/GB usable (since mirroring and wide striping are common tactics for getting good performance results).
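To make the raw-versus-usable gap concrete, here’s a back-of-the-envelope calculation with hypothetical numbers (our assumptions, not taken from any specific SPC result): mirroring stores every byte twice, and performance-tuned benchmark configurations often deliberately use only a fraction of each spindle.

```python
# Hypothetical back-of-the-envelope math (illustrative numbers only):
system_cost = 1_000_000        # $1M list price
raw_capacity_gb = 200_000      # ~200 TB raw across hundreds of spindles

mirrored_fraction = 0.5        # mirroring halves effective capacity
utilized_fraction = 0.4        # assumed: only part of each spindle is used
                               # (short-stroking) to keep latency low

usable_gb = raw_capacity_gb * mirrored_fraction * utilized_fraction

print(f"raw:    ${system_cost / raw_capacity_gb:,.2f}/GB")   # $5.00/GB raw
print(f"usable: ${system_cost / usable_gb:,.2f}/GB")         # $25.00/GB usable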

Compare this to a modern all-flash array like the Pure Storage FlashArray, which delivers better performance on 100% flash, is microwave-sized instead of multi-refrigerator-sized, draws a fraction of the power, and is dramatically simpler to operate, yet delivers the same enterprise-class features and reliability. You make the call.
