Non-Volatile Memory Express (NVMe) is becoming the primary protocol for interconnecting modern storage technologies both within storage arrays and between storage arrays, storage networks, and...
Flash storage has to balance two competing goals:
Most storage systems solve this problem with some form of non-volatile memory, or NVRAM. The array immediately writes incoming data to the NVRAM and acknowledges the IO back to the initiator, since the data is securely persisted. The data is only freed from NVRAM after it has been written to flash, which happens once the array has had time to compress, deduplicate and batch the data up.
One common approach is using battery-backed DRAM in the storage controller, or by using a UPS to provide backup power to the entire storage controller. At Pure, we used a slightly different approach, with flash-based NVRAM modules located in the storage shelf and shared between the redundant controllers. This allows for Pure’s “stateless controller” architecture. If a controller is replaced, there is no data in NVRAM or anywhere else to be destaged or mirrored.
Replacing controllers is quick and easy with FlashArray//m, so we knew stateless controllers become even more important. If you can slide a new controller into the chassis in seconds, you don’t want a complex data migration process to get in the way. At the same time, we knew that as we continued to improve controller performance, the SAS connection and flash backend of our FA-400 NVRAMs would become a bottleneck.
Enter the FlashArray//m NVRAM, custom hardware designed by Pure and the world’s first dual-connected PCI Express storage backed by DRAM:
With dedicated slots in the FlashArray//m chassis, each NVRAM is simultaneously connected to both controllers via PCI Express, with extremely low latency and more than 3 GB/sec of throughput. And the FlashArray//m chassis can hold four of these modules!
Inside the NVRAM, data is written to DDR3 DRAM, which means we have unlimited endurance as well as the performance to keep up with our PCIe interface. Of course, we need the data to stick around, so we use supercapacitors to store energy to write the RAM contents into flash if the power fails. Since we only write to the flash on power failure, endurance is not a concern. Everything, including the DRAM, flash and supercapacitors is included in the NVRAM module, so there are no battery packs, cables, UPSes or any other complexity to deal with.
Finally, the controllers talk to the FlashArray//m NVRAM using NVM Express (or NVMe for short), a standard interface for PCIe-attached storage. NVMe was designed for high performance as well as scalability on multicore systems, and both the NVMe protocol and software stack are slimmer than traditional SCSI, leading to lower latency. (Currently, we only use NVMe for the NVRAM modules, but the FlashArray//m midplane is wired for PCIe, so in the future we’ll be able to get the same advantages of NVMe for our flash modules as NVMe SSDs become available.)
All of this sounds great, but how does the FlashArray//m NVRAM module actually perform? I grabbed some real hardware and compared the new NVRAM module to the SAS NVRAM used in FA-400. I used a synthetic load of 100% 24 KB writes (which is representative of the workload of a Pure NVRAM, although these numbers come from driving the NVRAM harder than a full system does):
We can be very confident that NVRAM performance won’t be a bottleneck for FlashArray//m for a long time.