The Three Rs of Data Storage: Resilience, Redundancy, and Rebuilds

Everyone cares about the durability of their data. The good news is that Pure Storage has made this a non-issue for our customers because of how we design our systems.


At our Pure//Accelerate® conference in June, Pure Storage showed the world some incredible technology with two big announcements. The first was that we would be introducing a 75TB DirectFlash® Module (DFM) this year, with 150TB and 300TB DFMs to follow in the next few years.

The second was what I would call a pretty hot take (as far as hot takes go in the storage industry): With the rapid pace of flash innovation, we predicted that there would be no place for hard drives in the data center by 2028.

Some of the questions I’ve gotten a lot since those announcements are: Isn’t Pure worried about making modules that big? What happens when one of them fails? How long would a rebuild take? Does it affect system performance? 

These questions don’t have a simple answer, and that’s because data resilience at scale is not simple. But the good news is that we’ve made this a non-issue for customers. 


How Pure Storage Does Resilience at Scale

As with all things at Pure Storage, the answer lies in how we’ve designed our systems: capable and sophisticated on the inside while remaining simple to manage and consume on the outside.

Those of us who have been in the storage industry for a while remember the days when hard disk drives were doubling in capacity every year or two. But once drives reached a certain size without getting any faster, the industry started moving toward heavier redundancy schemes to protect data in the case of failures, such as dual parity (like RAID 6) or even triple parity. But why was this necessary?

As I talked about in my blog earlier this year, one of the hidden efficiency taxes that hard drives have to pay is that because they’re so slow, their rebuild times get very long. If a drive gets twice as big but its performance doesn’t increase at all, its worst-case rebuild time doubles. And because the rebuild period leaves you vulnerable to data loss if an additional drive fails, at some point it becomes a statistical imperative to increase redundancy, which in turn lowers your effective usable capacity.
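To make that arithmetic concrete, here’s a minimal sketch in Python. The capacities, rebuild throughput, annual failure rate, and group size below are illustrative assumptions, not measurements or vendor specifications:

```python
# Illustrative sketch: how rebuild time and the exposure window scale with
# drive capacity. All figures below are assumptions chosen for illustration.

def rebuild_hours(capacity_tb: float, rebuild_mb_per_s: float) -> float:
    """Worst-case rebuild time: the whole drive's capacity divided by the
    effective rebuild throughput."""
    capacity_mb = capacity_tb * 1_000_000  # TB -> MB (decimal units)
    return capacity_mb / rebuild_mb_per_s / 3600


def second_failure_risk(rebuild_h: float, surviving_drives: int,
                        annual_failure_rate: float = 0.01) -> float:
    """Rough probability that any surviving drive also fails during the
    rebuild window, assuming independent failures at a constant rate."""
    hourly_rate = annual_failure_rate / (365 * 24)
    return 1 - (1 - hourly_rate) ** (surviving_drives * rebuild_h)


for capacity_tb in (10, 20):  # a drive, then its double-capacity successor
    hours = rebuild_hours(capacity_tb, rebuild_mb_per_s=150)  # HDD-class speed
    risk = second_failure_risk(hours, surviving_drives=23)    # 24-drive group
    print(f"{capacity_tb}TB drive: ~{hours:.0f}h rebuild, "
          f"~{risk:.2%} chance of a second failure during it")
```

Doubling the capacity doubles both the rebuild time and the window in which a second failure can strike, which is why the industry moved to dual and triple parity even though each additional parity stripe shaves off more usable capacity.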


Pure’s upcoming 75TB DFM

Flash flipped most of this on its head: Because performance was so much faster than disk (and capacities so much lower in the early days of flash), rebuild times dropped by more than an order of magnitude. While the largest hard drives might have rebuild times on the order of days or weeks, flash drives are generally on the order of hours or minutes. But now that Pure Storage has been shipping DFMs more than twice the size of the largest hard drive for over three years, the question becomes: Are the drives getting faster at the same rate as their capacities are growing? And if not, when will flash drives hit the same wall as hard drives?

It’s this concern that leads people to ask questions like: “How long will my DFM take to rebuild?” “How many drives can I lose?” These questions have complicated answers because, as with most engineering questions, the honest answer is “it depends.” But they also miss the bigger, more important question: “How durable is my data?”

Many customers have been trained by their legacy vendors to worry about things like drive rebuild times, parity schemes, stripe sizes, disk groups, and so on. That’s because traditional systems were built up over many years, accumulating features and complexity that were, in general, dumped on the storage administrator to figure out and manage. We believe that is a fundamentally broken approach to system design.

Pure Storage has been shipping resilient, durable storage systems for more than a decade. In that time, we’ve seen the sizes of individual drives increase by almost 200 times. But not once have we required a customer to configure anything about the underlying data layout or distribution during setup. We don’t ask the user about disk group sizes, or stripe widths, or what have you. Our systems are built without these kinds of hard restrictions that would limit user choice in the future.

Our systems are designed around an availability goal and a durability goal. Availability means “Is my data accessible?” as opposed to durability, which means “Is my data safe?” If your system goes offline but your data is safe, then it’s unavailable but still durable. Our systems are designed for availability of greater than six nines (99.9999%) and for durability that’s higher still: a mean time to data loss (MTTDL) of millions of years.¹ And our systems work to achieve these goals automatically and autonomously. As they grow in size, they automatically adjust their resiliency strategies to maintain that target level of durability and availability.
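For a sense of what those figures mean in everyday terms, here’s a quick back-of-the-envelope conversion in Python (a sketch; the six-nines and million-year figures are simply the design goals quoted above):

```python
# Converting availability "nines" and MTTDL into everyday units.
SECONDS_PER_YEAR = 365 * 24 * 3600

availability = 0.999999                    # "six nines"
downtime_s = (1 - availability) * SECONDS_PER_YEAR
print(f"99.9999% availability allows ~{downtime_s:.0f} seconds of downtime per year")

mttdl_years = 1_000_000                    # conservative end of "millions of years"
annual_loss_probability = 1 / mttdl_years  # rough, assuming a constant loss rate
print(f"An MTTDL of {mttdl_years:,} years means roughly a "
      f"{annual_loss_probability:.0e} chance of a data-loss event in any given year")
```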

Our systems have to take multiple factors into account: not only system size and configuration but also ongoing state and performance. As the number of drives increases, we automatically adjust resiliency in balance with space efficiency, and we take the size and speed of those drives into account as well. We may choose to keep a rebuild process in the background to maintain high performance during heavy load, or push it to the foreground if the system judges that the risk is too high to keep it on the back burner. All these calculations change over time as the system fills up, our software improves, and our drives increase in size, performance, or both. And with Pure’s DFMs, because we write directly to flash rather than going through a flash translation layer (FTL), our rebuilds are even faster than they would be with traditional SSDs.
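As a toy illustration of the kind of trade-off such a scheduler weighs (the inputs, thresholds, and policy below are invented for the example and are not Pure’s actual logic), a risk-based rebuild decision might look something like this:

```python
# Toy illustration of a risk-based rebuild scheduler. The inputs, thresholds,
# and policy here are invented for the example; they are not Pure's algorithm.

def rebuild_priority(failures_tolerated: int, est_rebuild_hours: float,
                     system_load: float) -> str:
    """Decide whether a rebuild should run in the foreground or background.

    failures_tolerated: how many further drive losses the data can survive
    est_rebuild_hours:  current estimate of time to restore full redundancy
    system_load:        front-end load from 0.0 (idle) to 1.0 (saturated)
    """
    if failures_tolerated == 0:
        return "foreground"              # no margin left: restore redundancy ASAP
    # A longer exposure window and a thinner margin both raise the urgency.
    urgency = est_rebuild_hours / (failures_tolerated * 24)
    if urgency > 1.0:
        return "foreground"              # risk outweighs the performance impact
    if system_load > 0.8:
        return "background (throttled)"  # busy system, comfortable margin: go gently
    return "background"                  # idle enough to rebuild at full speed

# Plenty of margin and a busy front end -> protect performance, rebuild quietly.
print(rebuild_priority(failures_tolerated=2, est_rebuild_hours=6, system_load=0.9))
```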

This is all just math, but it’s math that most customers don’t have the time or energy to do. Graphing it all out would produce a multi-dimensional curve that any human would struggle to visualize. But it’s this flexibility that achieves the maximum amount of efficiency and performance in our systems. And in the spirit of our Evergreen philosophy, they can improve over time as we adopt more advanced and efficient erasure codes and data reduction technology.

Ask the Right Questions, Get the Right Answers

So, it’s frustrating sometimes when people ask me “How long does it take to rebuild a DFM?” and I have to answer, “It depends.” Because as the saying goes, you can’t get the right answer by asking the wrong question. What people should be asking—and justifiably so—is this: 

“Is my data safe on Pure Storage?” 

Hopefully, with this insight into how our systems work, you’ll agree that the answer to that question is a resounding “yes”—both now, and in the future.

¹Note that the way cloud providers calculate durability in terms of “nines” is different from how a physical array would calculate MTTDL. In general, cloud providers talk about the durability of an individual object (KB to MB in size), whereas we calculate it for the whole array. An individual appliance is designed to have an MTTDL in the range of 1M-1B years, defined at the byte level. Adjusting for a typical object size to align with the cloud definition of durability, this corresponds to a rough equivalent of over 12 nines.
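For readers who like to see the arithmetic, here is a rough sketch of how the two ways of counting relate. It assumes loss events arrive at a constant rate and that a single MB-scale object represents only a tiny slice of the array, so the numbers are illustrative rather than a formal equivalence:

```python
# Rough translation between MTTDL and cloud-style durability "nines."
# A sketch only: assumes a constant loss rate; illustrative, not a proof.
import math

def durability_nines(annual_loss_probability: float) -> float:
    """Cloud-style 'nines' of durability for a given annual loss probability."""
    return -math.log10(annual_loss_probability)

# Array level: an MTTDL of 1e6 to 1e9 years puts the chance of *any*
# data-loss event in a given year at roughly 1e-6 to 1e-9.
for mttdl_years in (1e6, 1e9):
    print(f"MTTDL of {mttdl_years:.0e} years -> "
          f"~{durability_nines(1 / mttdl_years):.0f} array-level nines per year")

# Per object: a single MB-scale object is only a tiny fraction of the bytes
# a loss event could touch, so its annual loss probability is smaller still,
# which is how a byte-level MTTDL maps to ">12 nines" per object.
print(f"{durability_nines(1e-12):.0f} nines of durability = a 1-in-a-trillion "
      "chance of losing a given object in a given year")
```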