Storage “Intelligence”: Real Value, or Illusory Superiority?

When it comes to describing how “smart” someone or something is, studies show that many of us suffer from what is known as Illusory Superiority: most of us think that we’re smarter than we really are. A particularly interesting aspect of illusory superiority is the Downing Effect, named after researcher C. L. Downing. His work showed that people with lower IQs tend to overestimate their own intelligence, while people with higher IQs tend to underestimate theirs.

At Pure Storage, we believe that perhaps the best indicator of the value of intelligence in storage is how simple and effortless it is to handle every necessary or valuable aspect of operating the array.

Effectively, what exposes the value (or lack thereof) of intelligence in a storage array is: How much of what a smart person would want to do manually is instead done for you, automatically, by the array itself?

Questions to ask any vendor to get an idea of how “smart” an array truly is include:

  • How easy is it to install?
  • How quickly can I be fully operational after beginning my installation?
  • How much performance management do I need to do to ensure that it meets or exceeds my service level requirements?
  • Are data services such as compression, deduplication, and encryption automatic, or do I need to manually configure and manage them?
  • How easily can I integrate Active/Active clustering?
  • Will the knowledge learned from everyone running a similar array to mine be constantly applied to resolve potential problems on my array (without the need for my involvement)?

At Pure Storage, our FlashArray was designed from the beginning to deliver a far more effortless customer experience than anything we saw in the market before it. We don’t usually talk about it in terms of being “smarter” or more “intelligent” than other storage systems, but rather as simply being better at giving customers what they want. This blog post from early 2017 describes those concepts, which are still just as valid today.

In contrast, some other storage vendors have made statements that call out what they believe is their unique level of “intelligence”.

With the recent announcement of Dell’s PowerMax, the terms “Machine Learning”, “Autonomous Storage”, “Artificial Intelligence”, and “Higher IQ” are being used by Dell to describe their new array. Dell claims that “PowerMaxOS brings autonomous storage to life with a built-in machine learning engine that leverages predictive analytics and pattern recognition to analyze and forecast over 40 million data sets in real-time, driving 6 billion decisions per day to maximize performance with no overhead.”

That sure sounds impressive! But what exactly are these data sets that are being analyzed? What actual decisions are being made because of them? And what customer value do they really provide?

A quick history of some key milestones in storage intelligence

When it comes to intelligence for data storage, there is a long history of innovation by many vendors in the areas of performance, availability, efficiency, and simplicity. As examples, storage historians have credited Digital Equipment Corp. for introducing the first RAID feature (VMS Volume Shadowing, 1989), StorageTek for integrated data compression (Iceberg, 1994), 3Par for thin provisioning (InForm O/S, 2003), Hitachi for storage that could virtualize other storage (TagmaStore, 2004), and Compellent for automated storage tiering (Storage Center, 2004).

From its introduction in 1990, the EMC Symmetrix architecture had the ability to intelligently determine what data it should pre-fetch from slow disks into much faster front-end solid-state cache memory, and then to intelligently manage what data stays in cache, and for how long. The core premise of the architecture was that although the vast majority of the data would reside on slow, inexpensive disk drives, for many applications most I/O (including all writes) would be served at the speed of the fast front-end cache. This was innovative at the time, as was the redundant design of all major system components to improve availability.
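
To make the caching concept concrete, here is a minimal, purely illustrative sketch of a read cache with sequential prefetch and LRU eviction; the cache size, prefetch depth, and policy are our own assumptions and have nothing to do with Symmetrix internals.

```python
from collections import OrderedDict

# Illustrative only: a toy read cache with sequential prefetch and LRU eviction.
# Cache size, prefetch depth, and policy are assumptions, not Symmetrix internals.

CACHE_SLOTS = 4        # number of blocks the cache can hold
PREFETCH_DEPTH = 2     # how many sequential blocks to stage ahead on a miss

backing_store = {blk: f"data-{blk}" for blk in range(100)}   # stand-in for slow disks
cache = OrderedDict()                                         # block id -> data, in LRU order

def read_block(blk: int) -> str:
    """Serve a read from cache if possible; otherwise stage it plus a prefetch window."""
    if blk in cache:
        cache.move_to_end(blk)              # recently used blocks stay in cache longer
        return cache[blk]
    # Cache miss: fetch the requested block and speculatively prefetch its neighbors.
    for candidate in range(blk, blk + 1 + PREFETCH_DEPTH):
        if candidate in backing_store and candidate not in cache:
            cache[candidate] = backing_store[candidate]
            if len(cache) > CACHE_SLOTS:
                cache.popitem(last=False)   # evict the least recently used block
    return cache[blk]

if __name__ == "__main__":
    read_block(10)         # miss: blocks 10-12 staged into cache
    print(11 in cache)     # True - the sequential read that follows is now a cache hit
```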

Pure Storage recognizes EMC’s historical storage leadership, including technical innovations that benefited the industry over past decades. But Pure Storage’s FlashArray was designed decades later, in the age of flash rather than disk, and because it was architected to be all-flash it could do some things differently. In an all-flash array, there is no need to separately stage data being read into a read cache, so Pure Storage never had to write the complex read-cache management “intelligence” that the all-flash architecture simply does not require.

In 2009 EMC introduced Fully Automated Storage Tiering (FAST) for full volumes, and subsequently Fully Automated Storage Tiering Virtual Provisioning (FAST-VP), which could analyze physical locations on disk at a very granular level. And what was being analyzed? Usage! Based upon that usage, decisions were made every ten minutes about which data to move to which tier of storage within a single VMAX.

The FAST-VP software optimized for both cost and performance on hybrid storage arrays by minimizing the amount of fast but expensive Flash capacity required and maximizing the amount of slower but cheaper SATA capacity. FAST-VP proved itself to be effective for optimizing hybrid VMAX arrays that contained a mix of different types of disks.
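
As a rough sketch of the idea (not EMC’s actual implementation), a tiering engine of this kind periodically scores fine-grained extents by recent usage and promotes or demotes them between tiers; the tier names, thresholds, and extent statistics below are illustrative assumptions.

```python
# Illustrative sketch of usage-based tiering, in the spirit of FAST-VP.
# Tier names, thresholds, and the 10-minute cycle are assumptions for illustration.

TIERS = ["flash", "fc", "sata"]          # fastest to slowest
PROMOTE_IOPS = 100                       # extents hotter than this move up a tier
DEMOTE_IOPS = 10                         # extents colder than this move down a tier

# extent id -> {"tier": current tier, "iops": recent access rate}
extents = {
    1: {"tier": "sata", "iops": 450},    # hot extent sitting on slow disk
    2: {"tier": "flash", "iops": 2},     # cold extent wasting expensive flash
    3: {"tier": "fc", "iops": 50},       # lukewarm, stays put
}

def decision_cycle(extents):
    """One tiering pass (conceptually run every ~10 minutes): decide where each extent belongs."""
    moves = []
    for ext_id, stats in extents.items():
        idx = TIERS.index(stats["tier"])
        if stats["iops"] > PROMOTE_IOPS and idx > 0:
            moves.append((ext_id, stats["tier"], TIERS[idx - 1]))   # promote to a faster tier
        elif stats["iops"] < DEMOTE_IOPS and idx < len(TIERS) - 1:
            moves.append((ext_id, stats["tier"], TIERS[idx + 1]))   # demote to a cheaper tier
    return moves

if __name__ == "__main__":
    for ext_id, src, dst in decision_cycle(extents):
        print(f"extent {ext_id}: {src} -> {dst}")
    # With only one tier available (all NAND SSDs), the same loop produces no useful moves.
```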

Intelligence: Use it or lose it!

Fast forward to 2018: While it is possible that Dell considers the historical cache management, or perhaps something else entirely, to be part of those “6 billion decisions”, that doesn’t appear to be what they mean, based upon these statements: “In our Marketing announce literature we reference three hero numbers for Service Levels, how are these numbers derived?


We make reference to the following:

  • Analyzes and forecasts 40 million data sets in real time: This is the number of data units managed in a typical VMAX deployment of 200TBe. The granularity of data monitoring and movement is a 5.25MB data object (42 128K tracks). Also note that for a fully configured PowerMax array (4PBe) we are continuously managing 800 million data sets. Our advanced machine learning algorithms enable PowerMax to maximize performance and increase efficiency for our customers
  • 6 billion decisions per day: In a typical PowerMax deployment, the array will be analyzing and making physical adjustments to the backend placement of each of the 40 million datasets every 10 minutes or so (the decision cycle). That is 150 times a day x 40 million data sets = 6 billion.

NOTE: the system is not moving 40 million data sets every 10 minutes, only data sets that need adjusting per their service level requirements.”
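
Taking Dell’s figures at face value, the arithmetic behind the two headline numbers is straightforward. The short sketch below simply reproduces it; the capacity, extent size, and decision cycle are the ones quoted above, and nothing else is assumed.

```python
# Reproducing the headline numbers from the figures quoted above.

TBe = 10**12                                     # one terabyte (decimal) of effective capacity
extent_size = 42 * 128 * 1024                    # 42 x 128K tracks = a 5.25MB data object

typical_capacity = 200 * TBe                     # "typical VMAX deployment of 200TBe"
data_sets = typical_capacity / extent_size
print(f"data sets analyzed: {data_sets:,.0f}")   # ~36 million, roughly the quoted 40 million

cycles_per_day = (24 * 60) / 10                  # ~144 ten-minute cycles per day (Dell rounds to 150)
print(f"decisions per day (quoted): {150 * 40_000_000:,}")              # 6,000,000,000
print(f"decisions per day (computed): {cycles_per_day * data_sets:,.0f}")
```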

It appears that most of the analysis, and most of what comprises those 6 billion decisions being made by the new PowerMax, is simply based upon the legacy FAST-VP software, but with some notable twists. On VMAX, FAST-VP would move data between different storage tiers based upon real-time usage. But in the new PowerMax there is currently only a single class of storage: NAND SSDs. The software can still analyze which data on those SSDs is hot and which is cold, and then make smart decisions (up to possibly 6 billion per day) about where to move it.

However, today on PowerMax that data simply has nowhere to move that is any different from where it already sits: it’s all NAND SSDs, so the data probably doesn’t move. If data isn’t actually moving as a result of all those decisions, you might want to ask what value, if any, is being delivered. Regardless, Dell still talks about this feature in connection with Storage Class Memory (SCM), and how you will be able to intelligently move data between SCM and NAND SSD tiers on PowerMax; however, Dell has stated that it does not expect to provide SCM until 2019, so you simply cannot do any of this today. Dell has also stated “we expect initial SCM pricing to (be) around 5-10X per GB compared to NVMe NAND drives”, so you might want to ask yourself whether it will be worth it even when it does arrive.

So perhaps in 2019 this legacy software will find new life on PowerMax and provide some of its original intended value, moving data between what might become two tiers: one NAND-based, the other SCM-based at 5-10X the cost. But until that happens, what is the value of deciding that data should be moved and then not actually moving it? In contrast, the design of Pure’s FlashArray simply doesn’t need to worry about tiering and never has. FlashArray delivers consistent high performance and lower cost without the trade-offs or complexity of tiering, through a modern all-flash architecture.

To Data Reduce or not to Data Reduce? That is the question!

Another legacy intelligence feature carried over from the VMAX to PowerMax is that it will identify up to the top 20% of the most “active” data and make a decision based upon that activity. And that decision is: should data reduction be turned off for this “active” data? Even if you have determined that you want the cost, space, and power benefits of data reduction for all your data (who doesn’t?), and even if you have manually enabled data reduction, PowerMax will override your decision for the most active data, whether you like it or not.
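
Based purely on how Dell describes the behavior, the decision amounts to something like the sketch below; only the “top 20%” figure comes from their description, while the activity metric, names, and structure are our own illustrative assumptions.

```python
# Illustrative sketch of activity-based data-reduction bypass, as we read Dell's description.
# The activity metric and structure are assumptions; only the "top ~20%" figure is Dell's.

def reduction_decisions(extents, user_enabled_reduction=True, skip_fraction=0.20):
    """Return, per extent, whether data reduction actually applies."""
    if not user_enabled_reduction:
        return {ext: False for ext in extents}                 # nothing to override if it was never on
    ranked = sorted(extents, key=extents.get, reverse=True)    # hottest first, by access count
    hot = set(ranked[: int(len(ranked) * skip_fraction)])      # up to the top ~20% most active
    # The array's decision overrides the user's: hot data skips reduction to protect performance.
    return {ext: ext not in hot for ext in extents}

if __name__ == "__main__":
    activity = {"ext-a": 900, "ext-b": 120, "ext-c": 40, "ext-d": 12, "ext-e": 3}
    print(reduction_decisions(activity))   # ext-a (the hottest ~20%) ends up unreduced
```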

You might want to ask yourself why such a feature exists. If data reduction is your priority, why don’t YOU get to choose which way you want it? Or better yet, wouldn’t it be best to never have to trade off between performance and lower cost at all?

You might say that this is analogous to the famous scene in the movie 2001: A Space Odyssey where astronaut Dave requests that the HAL 9000 computer open the pod bay doors, to which HAL calmly replies “I’m sorry Dave, I’m afraid I can’t do that”. There is no “intelligence” to turn data reduction on if it was manually turned off on PowerMax, just the other way around.

FlashArray’s data reduction is designed to be always-on. You don’t manually turn it on or off, or have to worry about its impact on performance. You can read more about how FlashArray delivers leadership data reduction through five separate techniques here. Since up to 20% of your data could be affected on PowerMax, this likely results in a larger and more expensive PowerMax configuration than if data reduction were “always-on”.

Quality of Service (QoS) is intelligence too!

Another area of “intelligence” for PowerMax involves performance service levels, usually known as Quality of Service (QoS). In 2014 EMC introduced SLP (Service Level Provisioning) software to further assist FAST-VP in optimizing the placement of data across a mix of flash SSDs, FC disks, and SATA disks to achieve targeted response-time performance for different workloads. Specifically, five individual performance classes were made available in addition to a default mode. Like FAST-VP, this software was designed for a legacy mixed-disk hybrid storage platform, so now that it has been carried forward to an all-flash platform, its implications are a bit different.

On PowerMax, as noted above, there is only a single class of storage today: NAND SSDs. So this software can only be used to force lower-priority workloads to run slower than one might normally expect from an all-flash storage array. Four of the five selectable service levels in the new PowerMax implementation have target response times of 1ms or slower, and can be set to target performance as slow as 6ms. Per Dell: “The specific response times themselves, will not be user configurable.” Service level features on PowerMax are manually enabled and configured.
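
Mechanically, a response-time service level on an all-flash array boils down to delaying lower-priority I/O until it finishes no sooner than its target. The sketch below is a simplified illustration of that idea, not PowerMax’s actual algorithm; the level names and exact targets are assumptions, chosen only to reflect the published range (four of five levels at 1ms or slower, up to 6ms).

```python
import time

# Simplified illustration of response-time service levels (not PowerMax's actual algorithm).
# Level names and exact targets are assumptions; only the 1ms-to-6ms range comes from the text.
SERVICE_LEVEL_TARGET_MS = {
    "level-1": 0.5,   # the one level faster than 1ms
    "level-2": 1.0,
    "level-3": 2.0,
    "level-4": 4.0,
    "level-5": 6.0,   # slowest selectable target
}

def complete_io(native_latency_ms: float, service_level: str) -> float:
    """Pad a fast I/O with artificial delay so it finishes no sooner than its target."""
    target = SERVICE_LEVEL_TARGET_MS[service_level]
    delay_ms = max(0.0, target - native_latency_ms)   # only ever slow I/O down, never speed it up
    time.sleep(delay_ms / 1000.0)
    return native_latency_ms + delay_ms

if __name__ == "__main__":
    # An I/O that would natively finish in ~0.3 ms is held back to ~6 ms under the lowest level.
    print(f"observed latency: {complete_io(0.3, 'level-5'):.1f} ms")
```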

FlashArray has included QoS features since 2016. We see three primary use cases for QoS, and you can read more about FlashArray QoS in this blog:

  1. Intelligently protect against “noisy neighbors”. Don’t allow individual applications to inappropriately hog too many system resources at the expense of other applications, and throttle back their usage as needed. This aspect of QoS is “always-on” with FlashArray and does not require any manual management, configuration, or attention (a simplified sketch of this throttling idea follows this list).
  2. Assure performance for performance-critical applications. We believe that FlashArray provides similar QoS benefits here to PowerMax, but while having five different performance classes may have made sense for a legacy hybrid storage architecture like VMAX, with its SATA disks, FC disks, and SSDs, we have not seen demand for that degree of complexity with all-flash arrays. FlashArray offers three performance class levels, which has proven to be sufficient for our customers.
  3. Performance limits, to ensure consistent performance for different tenants. We believe that this feature on FlashArray also delivers similar benefits to the service level features on PowerMax.
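
To show what “always-on” noisy-neighbor throttling means in practice, here is a minimal token-bucket sketch; the rates and bucket sizes are illustrative assumptions, not FlashArray’s actual parameters or implementation.

```python
import time

# Minimal token-bucket sketch of "noisy neighbor" throttling.
# Rates and bucket sizes are illustrative assumptions, not FlashArray's actual parameters.

class TokenBucket:
    def __init__(self, rate_iops: float, burst: float):
        self.rate = rate_iops          # tokens (I/Os) added per second
        self.capacity = burst          # maximum burst size
        self.tokens = burst
        self.last = time.monotonic()

    def allow(self) -> bool:
        """Admit one I/O if the workload still has budget; otherwise it is throttled."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False                   # the noisy neighbor waits; other workloads are unaffected

if __name__ == "__main__":
    noisy = TokenBucket(rate_iops=1000, burst=100)
    admitted = sum(noisy.allow() for _ in range(10_000))
    print(f"admitted {admitted} of 10,000 back-to-back I/Os")   # roughly the burst, plus refill
```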

In summary: what does Dell’s storage intelligence do for you today?

To be clear: beyond the basic intelligence features carried over from every generation of the Symmetrix architecture (Symmetrix, DMX, VMAX, PowerMax), it appears that two major “machine learning” decisions account for most of those “6 billion decisions” PowerMax makes today, and in our opinion the software behind them appears to be a legacy carry-over of the FAST/SLP software from VMAX:

  1. Should it turn off data reduction for some data to try to maintain performance, impacting the overall amount of data reduction, regardless of what the customer manually requested? This feature will engage if data reduction has been enabled, and if it hasn’t, no additional action is taken. With Pure’s FlashArray, data reduction is always-on, without compromise.
  2. Service Levels (QoS). Should you define performance classes and limits for different applications and tenants? This feature will only engage if the software has been manually configured and enabled, and if it hasn’t, no action is taken. With Pure’s FlashArray, “noisy neighbor” protection is always-on automatically, and we believe our other QoS features provide similar benefits to those on PowerMax.

Some of the aspects of the self-proclaimed intelligence found on the PowerMax don’t really apply to the Pure Storage FlashArray//X. Since all Pure FlashArrays, from the very first generation, were architected for all-flash and nothing but flash, there has not been a need for complex tiering software. Contrary to the old saying, sometimes one size really does fit all, or at least most, with excellent performance across multiple Tier-1 and Tier-2 workloads. Since the Pure FlashArray//X provides all data services “always-on”, there is no need to compromise and incur trade-offs between superior data reduction and superior all-flash performance.

We need to be transparent too: at Pure Storage we also use the words “machine learning” and “artificial intelligence” to describe certain aspects of our offering, but what we’re referring to is Pure1® Meta, our cloud-based predictive analytics. We’ll also throw out a big number: we analyze over one trillion data points per day from our entire installed base, and to date we have built an analysis data lake that is over 7PB, still growing, and continuously learning. Our primary use cases today are:

  1. Workload analysis for performance and capacity planning. This helps our customers to optimize their configurations and deliver consistent service levels as things change.
  2. Predictive and automatic problem identification and resolution. This allows us to identify and fix potential issues before they ever present themselves as actual problems.

You can read more about Pure1 Meta in this blog.
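
As a toy illustration of the first use case, forecasting capacity from telemetry can be as simple as fitting a trend to recent utilization samples and projecting when the array will fill; the sketch below is our own simplified example and does not reflect how Pure1 Meta is actually implemented.

```python
# Toy capacity-forecast example: fit a linear trend to daily utilization samples and
# estimate when the array would fill up. Purely illustrative; not how Pure1 Meta works.

def forecast_full(daily_used_tb, capacity_tb):
    """Return the estimated number of days until capacity is reached, using a least-squares trend."""
    n = len(daily_used_tb)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(daily_used_tb) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, daily_used_tb)) / \
            sum((x - mean_x) ** 2 for x in xs)
    if slope <= 0:
        return None                                    # usage flat or shrinking: no fill date
    return (capacity_tb - daily_used_tb[-1]) / slope

if __name__ == "__main__":
    samples = [40.0, 41.2, 42.1, 43.5, 44.2, 45.8, 46.9]   # TB used, one sample per day
    print(f"~{forecast_full(samples, capacity_tb=90):.0f} days until full")
```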

Is FlashArray “intelligent”? We believe so, but it’s only the opinion of our customers that matters! But regardless, we expect that you will hold us accountable for how Pure can help to make your storage effortless, and we’d love to learn more about YOUR requirements. Please contact us here so that we can help you on a more effortless path. And we encourage you to run a real-world Proof of Concept (POC) and see the actual difference for yourself!