Pure Storage Architecture 101: Bringing the Best of Both Worlds

FlashArray was designed from the start to deliver an always-available, simple, reliable, fast platform to build your business on. Explore the design decisions behind it.

The world of technology has changed dramatically. IT organizations now face, more than ever, intense scrutiny of how they deliver technology services to the business. Raw performance must be available at all times, and resources must be delivered to IT consumers who span the globe. “Planned downtime” is almost a misnomer, as any downtime or slowdown, planned or unplanned, results in the same thing: lost productivity and lost revenue.

Traditional Storage Array Approaches

Storage arrays primarily take one of two approaches to address this challenge: active/active or scale-out. Each has its advantages and disadvantages.

Active/Active: Advantages and Disadvantages

Unlike active/passive, in which only one controller accepts IO, active/active allows both controllers to serve IO to hosts simultaneously. This delivers high levels of performance because both controllers’ CPU and memory are actively in use. However, it also means that during a planned or unplanned outage of a single controller, the array can lose up to 50% of its total performance. Storage administrators must either perform maintenance during off-peak hours, when a drop in performance is less impactful, or closely monitor controller resources so that the combined utilization of the two controllers never exceeds the capabilities of a single controller, which ensures that no performance is lost when one controller is offline.
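The headroom rule described above can be expressed as a quick check. This is a minimal sketch, assuming two identical controllers and utilization measured as a fraction of each controller's own capacity; the function name is hypothetical and not part of any vendor tool.

```python
def survives_single_controller_loss(util_a: float, util_b: float) -> bool:
    """Return True if one controller could absorb both workloads.

    util_a and util_b are each controller's utilization as a fraction
    (0.0 to 1.0) of its own capacity. Identical controllers are assumed.
    """
    return util_a + util_b <= 1.0


# Combined load of 75% fits on one controller; 115% does not.
print(survives_single_controller_loss(0.35, 0.40))  # True
print(survives_single_controller_loss(0.60, 0.55))  # False
```

In practice this means each controller in an active/active pair has to be kept below roughly half of its capacity if maintenance is to be invisible to hosts.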

Most companies prefer that this work be performed during off-peak hours; however, as businesses scale, the window for off-peak work continually erodes. This means employees are burdened with performing delicate maintenance tasks in the evening, overnight, on the weekend, or even on holidays.

Another consideration with active/active controllers is volume “ownership” by one controller at a time, often because each controller has its own independent write cache. Storage administrators must track which controller owns which volumes, and hosts need additional multipathing software or configuration because paths to the owning controller are seen as “active” or “optimized” while paths to the other controller are “standby” or “unoptimized.” During controller maintenance, the host must fail over paths for any volume “owned” by the controller going offline to the surviving controller. Administrators must also think carefully about which volumes belong to which controller in order to balance workloads across them.

Scale-Out Storage: Advantages and Disadvantages

Scale-out architectures typically rely on software-defined storage (SDS) to separate storage software from storage hardware. The software then acts as the controller across a cluster of nodes, an approach most commonly associated with scale-out network-attached storage (NAS).

One of the most significant benefits of scale-out storage is its ability to scale seamlessly. Unlike scale-up storage, which requires adding new shelves and controllers as capacity needs grow, scale-out storage allows organizations to expand their storage capacity incrementally. By adding nodes to a dynamic cluster, businesses can effortlessly increase storage resources without creating isolated data silos or complex configurations.

Scale-out storage consolidates multiple nodes into a single, manageable resource pool. This unified approach simplifies storage management by allowing administrators to monitor and control the entire system from a single pane of glass. This centralization reduces the administrative burden and helps in efficiently allocating resources and managing workloads.

Each node in a scale-out architecture comes equipped with its own CPU, memory, and networking resources, which contribute to the overall performance of the system. This distributed architecture helps in balancing the load and optimizing data access speeds, ensuring that performance remains consistent even as capacity grows. However, although scale-out storage systems are designed to handle high-performance workloads, achieving optimal performance may require fine-tuning and ongoing monitoring. Administrators need to ensure that nodes are balanced and that the system is properly configured to handle peak loads.

Scale-out storage provides a cost-effective solution for expanding storage capacity. Instead of investing in large, expensive systems upfront, organizations can add nodes as needed. This incremental approach helps avoid overprovisioning and reduces the upfront capital expenditure, making it easier to align spending with actual growth.

Scale-out architectures often support non-disruptive upgrades, allowing organizations to replace or upgrade nodes without impacting system availability. This feature is crucial for maintaining continuous operations and reducing downtime during hardware upgrades. FlashBlade//S is a scale-out solution that is built to improve over time thanks to the Pure Evergreen™ portfolio.

Pure Storage’s Controller Approach

When we built our first product, FlashArray™, we determined that neither of these approaches was satisfactory and set out to build something new.

Pure Storage® FlashArray™, released in 2012, was designed to deliver 100% performance and uptime and, more importantly in today’s world, access to IT resources that users can create dynamically and automatically through multiple avenues: API calls, scripts, automation tools, and plugins.

Achieving these goals required a new, two-part design philosophy when it came to data storage:

  • Component failure cannot and should not compromise performance or access to provisioning. 
  • Data protection should be built in a way that provides maximum resilience but not at the cost of speed of data or access to it, even during data re-protection. 

The design had to address data availability, protection, and performance while also ensuring users can continue to consume or alter resources being served by the platform. Maintenance and even hardware failures should have no noticeable impact on performance and API availability.

To tackle this, we designed our Purity OS as an abstraction layer that decouples hardware from software. The identity of the array (IP addresses, WWPNs, array configuration, etc.) is defined in software and is portable. This means a Fibre Channel (FC) WWPN exists as a virtual address rather than a hardcoded address tied to a physical port. Even if hardware is changed out multiple times, the FC WWPNs can persist in perpetuity. Should a controller fail, its FC WWPNs can move to the surviving controller, ensuring clients do not lose paths.
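The idea of a software-defined, portable identity can be illustrated with a toy model. This is only a sketch of the concept, not how Purity implements it: the class names, the controller labels `ct0` and `ct1`, and the WWPN placeholders `wwpn-a` and `wwpn-b` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Controller:
    name: str
    online: bool = True


@dataclass
class Array:
    """Toy model: port identities live in software, not on the hardware."""

    controllers: Dict[str, Controller]
    # Maps each virtual WWPN to the controller currently hosting it.
    wwpn_owner: Dict[str, str] = field(default_factory=dict)

    def fail_controller(self, name: str) -> None:
        """Mark a controller offline and move its WWPNs to a survivor."""
        self.controllers[name].online = False
        survivors = [c.name for c in self.controllers.values() if c.online]
        if not survivors:
            raise RuntimeError("no surviving controller")
        for wwpn, owner in self.wwpn_owner.items():
            if owner == name:
                # The address itself never changes; only its host does.
                self.wwpn_owner[wwpn] = survivors[0]


array = Array(
    controllers={"ct0": Controller("ct0"), "ct1": Controller("ct1")},
    wwpn_owner={"wwpn-a": "ct0", "wwpn-b": "ct1"},
)
array.fail_controller("ct0")
print(array.wwpn_owner)  # {'wwpn-a': 'ct1', 'wwpn-b': 'ct1'}
```

Because hosts address the WWPN rather than the physical port, the set of addresses they see is the same before and after the failure.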

Pure: The Last Scale-Out Solution You’ll Ever Need

If you opt for a scale-out approach, consider leveraging Pure Storage® FlashBlade® for the most robust and flexible storage solution available. FlashBlade provides unified fast file and object (UFFO) storage and stands as the most advanced all-flash solution for integrating high-speed file and object data.

FlashBlade delivers:

  • Exceptional Performance: Exceeding the capabilities of traditional scale-out NAS, FlashBlade offers extraordinary throughput and parallel processing with consistently high multidimensional performance. Scaling capacity and performance is as simple as adding more blades.
  • Flexible Scale-Out Architecture: With its advanced metadata architecture, FlashBlade efficiently manages tens of billions of files and objects, ensuring top-tier performance and comprehensive data services.
  • Streamlined Workload Management: Featuring AI-driven storage management, FlashBlade simplifies updates and operations through automated APIs, making workload consolidation and management straightforward and efficient.

Pure Storage’s Data Protection Approach

At the end of the day, if your data isn’t protected, always available, and free of corruption, your business is in trouble. 

We knew legacy RAID schemes or mirroring weren’t going to cut it with flash. The more you perform a state change on a cell of NAND (altering its voltage, like changing a 1 to a 0), the faster you wear it out. Imagine what an entire RAID 5 parity rebuild would do to a flash device. Not only would there be excessive wear on the NAND, but the rebuild would also drag down system performance.

Addressing the unique challenges of flash itself enabled our engineers to approach data protection in a number of new ways.

First, the Purity OS handles wear leveling and garbage collection (cleaning up deleted or overwritten data in the flash) at a global level rather than in individual drive firmware. It has direct line of sight to the NAND itself, which is why we created our own flash media, DirectFlash® modules. We can perform these operations with the context of the entire pool of flash, which enhances its longevity. It also means those processes can run at all times rather than being performed ad hoc by device firmware. Since a cell of flash can be either read from or written to, but not both, at any given time, we can’t have device firmware deciding to perform garbage collection on data we need to access, as this would increase latency for clients.

To overcome this, read requests on a FlashArray fall into two buckets: user reads and system reads. Because our DirectFlash modules don’t have firmware acting as a gatekeeper to the NAND, as an off-the-shelf drive would, and because Purity itself performs system tasks like garbage collection, we can identify which reads are for host access to data and prioritize them accordingly. This means the processes necessary for managing flash, such as garbage collection, won’t interfere with access to the data.
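The two-bucket idea can be sketched with a simple priority queue. This is an illustration only: it assumes strict priority of user reads over system reads, which is a simplification chosen for the example, and the class and request names are hypothetical rather than anything from Purity.

```python
import heapq
import itertools

USER, SYSTEM = 0, 1  # lower number is served first


class ReadScheduler:
    """Serve user reads before system reads, FIFO within each bucket."""

    def __init__(self) -> None:
        self._heap = []
        # Monotonic counter breaks ties so order within a bucket is preserved.
        self._seq = itertools.count()

    def submit(self, kind: int, request: str) -> None:
        heapq.heappush(self._heap, (kind, next(self._seq), request))

    def next_read(self) -> str:
        return heapq.heappop(self._heap)[2]


sched = ReadScheduler()
sched.submit(SYSTEM, "gc-read-1")
sched.submit(USER, "host-read-1")
sched.submit(SYSTEM, "gc-read-2")
sched.submit(USER, "host-read-2")

order = [sched.next_read() for _ in range(4)]
print(order)  # ['host-read-1', 'host-read-2', 'gc-read-1', 'gc-read-2']
```

The host reads are served first even though a garbage collection read arrived before them, which is the behavior drive-level firmware cannot offer because it has no way to tell the two kinds of read apart.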

To protect against drive failures, our RAID-3D provides N+2 drive protection across each chassis or DirectFlash shelf. Not only can the system continue operating at 100% performance when drives are lost, but RAID-3D was also designed to continuously rebuild parity with free space available in the system using background processes already running on the array. This means no dedicated hot spares (wasted drives), no performance degradation while parity rebuilds, and no racing to the data center on a weekend to replace a failed drive. The system will self-heal back to N+2 protection on its own.
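The self-healing behavior can be pictured with a toy accounting model. This sketch is not how RAID-3D is implemented; it only tracks how many further drive failures an N+2 group could tolerate and assumes, for illustration, that re-protecting one failed drive consumes a fixed amount of free space. All names and numbers are hypothetical.

```python
class ProtectionGroup:
    """Toy N+2 accounting model; not the actual RAID-3D implementation."""

    def __init__(self, free_space_tb: float, parity: int = 2) -> None:
        self.parity = parity
        self.free_space_tb = free_space_tb
        self.unrebuilt_failures = 0

    @property
    def tolerable_failures(self) -> int:
        """How many more drives can fail before data is at risk."""
        return self.parity - self.unrebuilt_failures

    def drive_failed(self) -> None:
        if self.tolerable_failures == 0:
            raise RuntimeError("failure exceeds parity protection")
        self.unrebuilt_failures += 1

    def rebuild(self, data_per_drive_tb: float) -> None:
        """Re-protect failed drives using free space, with no hot spare."""
        while self.unrebuilt_failures and self.free_space_tb >= data_per_drive_tb:
            self.free_space_tb -= data_per_drive_tb
            self.unrebuilt_failures -= 1


group = ProtectionGroup(free_space_tb=10.0)
group.drive_failed()
print(group.tolerable_failures)  # 1
group.rebuild(data_per_drive_tb=4.0)
print(group.tolerable_failures, group.free_space_tb)  # 2 6.0
```

The point of the model is that protection returns to N+2 without a replacement drive, as long as enough free space exists to hold the rebuilt data.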

On top of that, FlashArray always runs AES-256 data-at-rest encryption (DARE). Again, this isn’t running in the drive’s firmware (no reliance on third-party self-encrypting drives) but globally at the Purity OS level. Keys cycle on their own, so there’s no need to configure your own KMIP server.

All of this was built with data protection in mind but without creating additional administrative headaches. Customers should not need to make sacrifices to data protection in the name of performance or vice versa. The system should be protected, resilient, and performant at all times without anyone needing to tune anything or make any decisions about how to configure it. As such, all of our data protection, reduction, and encryption services are enabled right out of the box with no need or ability to disable or tune them.

Why This Matters

These design decisions drive right to the core of our vision for data storage in the modern data center: an always-available, simple, reliable, fast platform to build your business on.

When facing competition like the public cloud, modern IT organizations should find ways to provide services in a cloud-like fashion. If you store any of your own personal data in the cloud, like photos, can you imagine how many times your data has moved to different hardware? Did you notice? Do you care? You want your data readily available and fast, no matter what technology challenges the company that houses that data may face. 

That’s the experience Pure Storage FlashArray delivers.