You may have noticed the phrase “Flash changes everything” on the Pure Storage home page. Well, not quite everything. In fact, part of the raison d’être for companies like Pure Storage is to deliver flash in form factors that make it very easy for end users to adopt. But flash really does necessitate dramatic changes inside the storage systems that purport to offer its benefits.
Flash is simply a profoundly different medium from mechanical disk. Before the industry shortened its name, we talked about flash memory, with good reason: flash behaves far more like a persistent version of DRAM, that is, as a true random-access device with no position dependence or mechanical components.
Our claim is that no matter how flash is packaged for incorporation in the data center (whether as a PCIe server add-on, as a shared appliance, or within an enterprise array), the internal systems software that manages the reading and writing of that flash should be radically different from the software that looks after rotating disk. Here, for your consideration, are the first five of our top ten reasons why flash really does change everything within the storage software managing that flash:
Disks are serial devices in that you read and write one bit at a time, and parallelization is achieved by pooling disks together. Algorithms for optimizing disk performance, then, strive to balance contiguity (within disks) and parallelization (across disks). Flash memory affords a much higher degree of parallelization within a device: the flash equivalent of a single serial disk contains more than 100 dies that can work in parallel. For flash, then, the trick is to maximize parallelization at all levels.
For disk, the goal is to read or write a significant amount of data when the head reaches the right spot on the platter, typically 16KB to 128KB. The aim is to maximize the productive transfer time (time spent reading or writing) relative to the wasted seek time (moving the head) and rotational latency (spinning the platter). Since seek times are typically measured in milliseconds and transfer times in microseconds, performance is optimized by contiguous reads/writes (think larger stripe sizes), and by sophisticated queuing and scheduling to minimize seek and rotational latency. For flash, all these optimizations add unnecessary overhead and complexity, since sequential and random access perform the same.
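To put rough numbers on that trade-off, here is a minimal sketch of the arithmetic in Python. The seek time, rotational latency, and transfer rate are illustrative assumptions on our part, not measurements of any particular drive, and the function name is hypothetical.

```python
def disk_transfer_efficiency(io_size_kb, seek_ms=4.0, rotational_ms=2.0,
                             transfer_mb_per_s=150.0):
    """Fraction of one I/O's service time spent actually moving data.

    All default values are assumed, round numbers chosen for illustration.
    """
    # Time to stream io_size_kb off the platter once the head is in place.
    transfer_ms = io_size_kb / 1024.0 / transfer_mb_per_s * 1000.0
    # Positioning time (seek + rotation) is pure overhead for the request.
    return transfer_ms / (seek_ms + rotational_ms + transfer_ms)


for size_kb in (4, 16, 128, 1024):
    efficiency = disk_transfer_efficiency(size_kb)
    print(f"{size_kb:>5} KB request: {efficiency:.1%} of time spent transferring")
```

Under these assumptions a 4KB request spends well under 1% of its service time transferring data, while a 1MB request spends roughly half. That gap is why disk software works so hard to make I/O large and contiguous, and why the same effort buys little on a device with no positioning cost.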
For traditional disk storage, the systems software and the storage administrator generally need to carefully keep track of how logical volumes are mapped to physical disks. The issue, of course, is that a physical disk can only do one thing at a time, and so the way to minimize contention (spikes in latency, drops in throughput) is to isolate workloads as much as possible across different disks. With flash memory, all volumes are fast. After all, do you care to which addresses your application is mapped in DRAM?
Of course, flash still requires that additional parity be maintained across devices to ensure data is not lost in the face of hardware failures. In traditional disk RAID, there is an essential trade-off between performance, cost, and reliability. Wide RAID geometries leveraging several spindles and dual parity are the “safest.” They reduce the chance of data loss with low space overhead, but performance suffers because of the extra parity calculations. Higher-performance schemes involve narrower stripes and more mirroring, but achieve this performance at the cost of more wasted space. Flash’s performance enables the best of both worlds: ultra-wide striping for protection with multiple parity levels at very low overhead, while maintaining the highest levels of performance. And, of course, rebuild times are dramatically faster when there is a failure.
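The space side of that trade-off is simple arithmetic, sketched below. The three layouts are hypothetical examples picked to span the range from mirroring to a wide dual-parity stripe; they are not a description of any product's actual geometry.

```python
def redundancy_overhead(data_devices, redundancy_devices):
    """Fraction of raw capacity consumed by parity or mirror copies."""
    return redundancy_devices / (data_devices + redundancy_devices)


# (data devices, redundancy devices) per stripe; names are illustrative only.
layouts = {
    "2-way mirror": (1, 1),
    "narrow stripe, single parity (4+1)": (4, 1),
    "wide stripe, dual parity (20+2)": (20, 2),
}

for name, (data, redundancy) in layouts.items():
    overhead = redundancy_overhead(data, redundancy)
    print(f"{name}: {overhead:.1%} of raw capacity spent on protection")
```

The mirror gives up half of its raw capacity, the narrow stripe a fifth, and the wide dual-parity stripe about 9% while tolerating two device failures. On disk, the wide layout pays for that efficiency in performance; the argument above is that flash removes most of that penalty.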
Traditional storage solutions often strive to keep data in the same place; then there’s no need to update the map defining the location of that data, and it makes for easier management of concurrent reads and writes. For flash, such a strategy is the worst possible, because a dataset’s natural hot spots would burn out the associated flash as the data is repeatedly rewritten. Since any flash solution will need to move data around for such wear leveling, why not take advantage of this within the storage software stack to more broadly amortize the write workload across the available flash?
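As a toy illustration of the idea, the sketch below redirects every write to the least-worn free block and updates a logical-to-physical map, rather than overwriting in place. It is a deliberately simplified model under our own assumptions (one block per logical address, an erase counted whenever a stale block is reclaimed, a hypothetical class name), not how any real flash translation layer or array is implemented.

```python
class WearLevelingStore:
    """Toy model: each write lands on the least-worn free block."""

    def __init__(self, num_blocks):
        self.erase_counts = [0] * num_blocks
        self.mapping = {}                  # logical address -> physical block
        self.free = set(range(num_blocks))

    def write(self, logical_addr):
        # Pick the free block with the fewest erases (ties go to the lowest index).
        target = min(self.free, key=lambda b: (self.erase_counts[b], b))
        self.free.remove(target)
        old = self.mapping.get(logical_addr)
        if old is not None:
            # The stale copy must be erased before its block can be reused.
            self.erase_counts[old] += 1
            self.free.add(old)
        self.mapping[logical_addr] = target
        return target


# One very hot logical address, rewritten 1,000 times across 8 blocks.
store = WearLevelingStore(num_blocks=8)
for _ in range(1000):
    store.write(logical_addr=0)

print("erases per block:", store.erase_counts)
```

Updating in place would put all 999 erases on a single block. Here they are spread almost evenly across the eight blocks, at the cost of maintaining the map, which is exactly the bookkeeping that disk-oriented designs try to avoid.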
OK, we’ll stop there for now. With any luck, we’re starting to shed some light on the fact that simply dropping SSDs into a storage solution designed around the idiosyncrasies of mechanical disk cannot deliver the full capabilities of flash. Instead, we believe flash is deserving of the same consideration that disk has enjoyed for decades – that the systems software managing the reading and writing of flash be optimized for flash’s own idiosyncrasies. Please check out Part II of this post where we consider the rest of the top ten reasons that flash really does change everything.