Today EMC announced the long-expected GA of XtremIO. As we’ve discussed earlier, overall we view this as a great move for the industry. We’re in a rapid transformation from legacy disk architectures to all-flash architectures for performance storage, and the 800-lb gorilla in the storage industry endorsing and accelerating this transformation will only help move the entire industry forward to the bright flash future. Hugely gratifying for those of us who have been pioneering this transformation.
I’ll also give EMC props for toning down their usual launch fanfare in favor of a clean, clear, straightforward presentation of what XtremIO is all about, one that made clear both what they perceive to be their technical advantages and their “1.0” product limitations (many of which they were surprisingly open about in the live chat correspondence at the event).
In case you missed it, we had a little fun with a pre-game show, inviting 49ers all-star and Super Bowl champion Harris Barton to try to guess with us what the launch was all about. Take a quick watch, and then you can see how accurate our pre-game guesses were! (And check back on Monday, when we’ll publish our post-game show featuring 49ers great Ronnie Lott!)
In a nutshell, if I had to summarize what I took away from the event, EMC tried to make the following core points about XtremIO and the competition:
1) XtremIO is GA and ready for “prime time” production deployments
2) XtremIO has a unique architecture, which differentiates it in the market:
Let’s look at these claims one-by-one and provide the Pure perspective.
With any product, certain compromises have to be made to ship the first GA, and it is clear after today’s launch that XtremIO is no different. Let’s just look at what these compromises entailed (presented in EMC’s words, from the launch chat window).
XtremIO GA includes:
Features marketed in the launch, but not in GA:
(These feature limitations were discussed by EMC representatives in the live chat window during the launch event. Copyright law prevents me from reproducing the actual quotes, but the launch event is available for replay if you’d like to get more context/details.)
Finally, there were certain “GA-readiness” questions that were left unanswered by the release, including:
OK, on to the architectural stuff….
EMC made four primary claims, which we’ll discuss one by one:
I believe this is true: XtremIO implements a simple 4K, SHA-1 content-hash-based deduplication scheme, where all deduplication is done inline (in fact it has to be done inline, as the hash is how the array determines where to store data). In stressing this point, EMC is poking at competitors who don’t do deduplication, and at Pure Storage, which leverages a variety of data reduction algorithms, most of which are inline and some of which add additional value post-process.
From my POV, what EMC highlights as a strength is one of their biggest weaknesses: their product only does 4K fixed-block deduplication, and this is likely deeply tied to their 4K in-memory metadata model.
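To make the scheme concrete, here is a minimal sketch of fixed-block, content-hash deduplication as described above: data is split into 4K blocks, each block is addressed by its SHA-1 digest, and a repeated digest stores nothing new. This is an illustrative toy, not XtremIO's actual implementation; all class and method names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # fixed 4K chunks, as in the scheme described above

class FixedBlockDedupStore:
    """Toy content-addressed store: the SHA-1 digest of each 4K block is
    both the dedup key and the 'address' used to locate the data."""

    def __init__(self):
        self.blocks = {}   # sha1 digest -> raw 4K block (stored once)
        self.volumes = {}  # volume name -> ordered list of digests

    def write(self, volume, data):
        digests = []
        for off in range(0, len(data), BLOCK_SIZE):
            block = data[off:off + BLOCK_SIZE].ljust(BLOCK_SIZE, b"\0")
            digest = hashlib.sha1(block).digest()
            # Inline dedup: an already-seen digest stores no new data.
            self.blocks.setdefault(digest, block)
            digests.append(digest)
        self.volumes[volume] = digests

    def unique_bytes(self):
        # Physical capacity consumed after deduplication.
        return len(self.blocks) * BLOCK_SIZE
```

Note the weakness discussed above: because the chunk boundary is fixed at 4K, inserting a single byte at the front of a stream shifts every subsequent block and defeats deduplication entirely, which is exactly what variable-chunk schemes avoid.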
Let’s contrast this with Pure Storage. Pure implements highly adaptive data reduction: in fact we use five independent data reduction technologies within our array, and we continually add to and improve them to deliver better data reduction over time.
Our data reduction technologies are:
As far as what is inline vs. post-process: #1-#4 are always on and inline. The relative aggression and priority of #2 and #3 can be turned up or down (automatically, by the array) to maintain consistent performance. In practice, turning these down is rare, but it can happen under the heaviest write workloads to protect the performance of the array. And if any potential data reduction is missed inline, the array achieves FULL data reduction via #5, which happens within hours.
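The inline/post-process split described above can be sketched as follows: under heavy write load the inline compression step is skipped to protect latency, and a background pass later recovers the deferred reduction. This is an assumed, simplified model for illustration only (not Pure's actual algorithms); the class and method names are hypothetical, and `zlib` stands in for whatever compressor a real array would use.

```python
import zlib

class AdaptiveReducer:
    """Toy model of an inline/post-process data reduction split:
    inline compression may be deferred under heavy write load, and a
    background pass later compresses anything that was skipped."""

    def __init__(self):
        self.store = {}     # key -> (stored bytes, compressed flag)
        self.deferred = []  # keys the inline path skipped

    def write(self, key, data, under_heavy_load=False):
        if under_heavy_load:
            # Dial down inline compression to protect write latency.
            self.store[key] = (data, False)
            self.deferred.append(key)
        else:
            self.store[key] = (zlib.compress(data), True)

    def background_pass(self):
        # Post-process: achieve the full reduction the inline path skipped.
        for key in self.deferred:
            raw, done = self.store[key]
            if not done:
                self.store[key] = (zlib.compress(raw), True)
        self.deferred.clear()
```

The design point this illustrates: deferring reduction is a latency decision, not a capacity decision; the array still converges to full reduction once the background pass runs.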
But in our minds, the real story here is the lack of compression. We’ve never shared this data before, but Pure averaged data reduction rates across our entire customer base of hundreds of arrays, and found that on average, compression delivers MORE data reduction than deduplication. This is especially true for database and mixed VM workloads (don’t forget, there are applications inside those VMs!). You’ll also see via the bell curve where most of the arrays fall in the distribution of data reduction delivered, and compression dominates in that sweet spot. Here’s the data:
Let’s say that more simply: if you don’t deliver compression, you lose out on more than 50% of the potential data reduction in mixed workloads, thus making a solution 2x more expensive on a like-for-like hardware basis.
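To make the arithmetic behind that claim concrete, here is a toy calculation with hypothetical ratios (these are illustrative numbers, not the measured figures from the chart above). Reduction ratios compose multiplicatively, so dropping compression roughly doubles the physical capacity required:

```python
# Hypothetical illustrative ratios (not measured customer data):
dedup_ratio = 1.5        # deduplication alone reduces data 1.5:1
compression_ratio = 2.0  # compression alone reduces data 2:1

# Independent reduction stages compose multiplicatively.
combined = dedup_ratio * compression_ratio   # 3.0:1 with both

# Physical capacity needed per unit of logical data:
with_both = 1.0 / combined        # ~0.33 with dedup + compression
dedup_only = 1.0 / dedup_ratio    # ~0.67 with dedup alone

hardware_multiplier = dedup_only / with_both  # 2.0
```

Under these assumed ratios, a dedup-only array needs 2x the flash for the same logical data, which is the "2x more expensive on a like-for-like hardware basis" point above.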
So in the end, the “100% inline” debate is the sideshow, the real story is the level of data reduction delivered and the use cases that enables, which we believe Pure wins in hands-down.
EMC spent a good deal of time talking about XtremIO’s in-memory metadata model, where metadata is persisted and lives in the DRAM of the controllers vs. being persisted to flash. EMC is right in highlighting metadata – it is the lifeblood of advanced services. Deduplication, compression, thin provisioning – they are all powered by metadata, and in fact I’m guessing that XtremIO’s simple 4K metadata model is a big reason that the platform doesn’t deliver compression or fine-grained, variable-chunk-size deduplication (see above).
By way of comparison, Pure Storage also makes heavy use of metadata, but our model for it is very different. Pure commits and persists 100% of metadata to flash itself (it is actually stored and protected in the same data structures as user data), and then caches the most heavily-used metadata in DRAM to speed operations.
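The model described above, persist everything, cache only the hottest entries in DRAM, can be sketched as a persistent store fronted by a bounded LRU cache. This is an assumed simplification for illustration (a dict stands in for flash-resident structures); the names are hypothetical.

```python
from collections import OrderedDict

class MetadataStore:
    """Sketch of a persist-to-flash metadata model: every entry is
    committed to the backing medium, and a small bounded DRAM cache
    holds only the most heavily-used entries."""

    def __init__(self, cache_entries=1024):
        self.flash = {}             # stand-in for flash-persisted metadata
        self.cache = OrderedDict()  # bounded LRU cache (stand-in for DRAM)
        self.cache_entries = cache_entries

    def put(self, key, value):
        self.flash[key] = value     # always persisted, never DRAM-only
        self._cache(key, value)

    def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)  # refresh LRU position on hit
            return self.cache[key]
        value = self.flash[key]          # miss: read the persisted copy
        self._cache(key, value)
        return value

    def _cache(self, key, value):
        self.cache[key] = value
        self.cache.move_to_end(key)
        if len(self.cache) > self.cache_entries:
            self.cache.popitem(last=False)  # evict least-recently used
```

The key contrast with a DRAM-resident model falls out of the structure: total metadata is bounded by the flash capacity rather than by controller DRAM, and a controller restart loses only cache warmth, not metadata.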
Ultimately, when analyzing metadata models, there are three things that are important:
This is one of the odder claims from EMC, and honestly I’m not sure who it is directed at. What EMC has claimed here is that:
To respond to each of these in kind: EMC leverages expensive eMLC SSDs, which perform garbage collection internally. We’re not privy to exactly which eMLC drive EMC uses, but these devices typically contain 30-50% extra flash to manage garbage collection within the drive. Let me repeat that: the eMLC SSDs contain about 30-50% more flash than advertised, and they use this space to do garbage collection. Folks, there is no free lunch in flash garbage collection: it’s a property of the medium and it has to be done.
Pure Storage uses consumer-grade MLC SSDs, with typically 5-7% over-provisioning, and instead we reserve (and hide from the user) 20% of the raw flash to manage garbage collection. We also reserve dedicated CPU time for this garbage collection, which is an ongoing background process. What does this mean? Garbage collection has no impact on the performance of the array, and Pure Storage guarantees full read and write performance up to 100% full. If you are running a Pure Storage array and see performance loss at any percentage full, please open a support ticket; we treat that as a bug.
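The "no free lunch" arithmetic above is worth working through. Using hypothetical but representative over-provisioning figures, both approaches end up exposing a similar fraction of the total physical flash; the difference is where the spare capacity lives and what each raw gigabyte costs:

```python
def usable_fraction(internal_op, reserved_fraction=0.0):
    """Fraction of total physical flash exposed as usable capacity.

    internal_op: over-provisioning hidden inside the SSD
                 (e.g. 0.40 means the drive holds 40% more flash
                 than its advertised capacity).
    reserved_fraction: share of advertised capacity the array
                       itself reserves for garbage collection.
    """
    advertised = 1.0 / (1.0 + internal_op)   # advertised / physical
    return advertised * (1.0 - reserved_fraction)

# eMLC drive with ~40% internal OP, no array-level reserve:
emlc = usable_fraction(0.40)        # ~0.71 of physical flash is usable

# Consumer MLC with ~6% internal OP plus a 20% array-level reserve:
mlc = usable_fraction(0.06, 0.20)   # ~0.75 of physical flash is usable
```

Under these assumed numbers the spare-flash budgets are comparable; the medium demands roughly the same garbage-collection headroom either way, so the economic question becomes the price per raw gigabyte of eMLC vs. consumer MLC.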
Finally, it’s worth noting that implementing compression and variable-size deduplication likely requires array-level garbage collection (once data is compressed or variably chunked, overwrites no longer fit neatly back into fixed 4K back-end slots). Any vendor serious about shipping these features will have to solve delivering garbage collection without performance impact, as Pure has.
Much has been said in other forums about the scale-out vs. scale-up debate, so I’m not going to go very deep into that one today….but a few things to understand:
From the Pure Storage point of view, we’ve designed our architecture from day one to support scale-out when we need it (which is why we chose InfiniBand as our cluster interconnect). What we’ve seen in practice, though, is that our current approach, which focuses first on maximizing scale-up, meets the vast majority of customer use cases. Via scale-up we deliver capacity expansion from a single set of controllers (customers can deploy FlashArrays ranging from 10TB to >150TB usable, with much more to come next year), and every year, thanks to Moore’s Law, we deliver faster controllers that are easy, non-disruptive, low-cost upgrades to existing arrays. Customers can choose to expand capacity and performance independently, and both types of expansion are non-disruptive and lower-cost than the scale-out model.
Do we eventually plan on supporting scale-out? Absolutely, when (and if) the market asks us for it. But for now, our customers are pushing us harder on more aggressive feature, manageability, and ecosystem integration support (an area where we are already market-leading in the all-flash array space). We’re also, as a generalization, more focused on reducing the effective cost of flash than on breaking the 1M IOPS barrier. We’re delivering full-featured, no-compromise solutions today at $3-4/GB usable and many hundreds of thousands of IOPS with consistent <1ms latency. And if I had to push our engineering team harder on one dimension of improvement, higher performance via scale-out wouldn’t be it.
In closing, this has been a great week for the free flash world. With EMC XtremIO finally wading into the ring, more and more customers will come to the realization that the time is right to get serious about replacing spinning disk for performance workloads with all-flash solutions – enterprise flash has indeed become mainstream.
The preceding 2,500+ words were analysis, but what I urge you to do is raise your head above the technical arm-wrestling and judge for yourself. Get out of the land of theory, bring both solutions into your datacenter, and try them out; we believe the difference between the technologies will become obvious. Pure Storage is dedicated to delivering a better storage experience using flash as a catalyst, we’re here to stay, and we’re going to win in the market one happy customer at a time.