A modern data platform shouldn’t just underpin a customer’s cloud operations; it should also integrate with the public cloud, enabling customers to take advantage of multi-cloud architectures for application development, deployment, data protection, and disaster recovery (DR). And that integration should be native, enabling full access to the public cloud’s PaaS services, for the ultimate in multi-cloud agility. That’s our vision, and today we’re announcing the first step in delivering on that vision: Purity CloudSnap.
Cloud-Era Flash: “The Year of Software” Launch Blog Series
From the early days, Purity Snapshot has been one of the (many) built-from-scratch-for-flash data services that has differentiated Pure. Purity is a metadata monster (by the very nature of dealing with data reduction and flash management), and ultra-efficient low-overhead snapshots fall logically out of the architecture. Snapshots are used widely for data protection, test/dev, and VM cloning workflows, and our asynchronous replication solution leverages them as well.
Customer use cases for snapshots have been evolving, especially in the world of data protection and DR. In the 2000s we saw tape/truck protection workflows move to disk-to-disk-to-tape. Now, with the public cloud providing access to low-cost and offsite storage for retention, as well as on-demand compute for DR, we see more customers wanting to move to flash-to-flash-to-cloud as their protection and DR strategy.
There’s as much change and desire for flexibility in application development. Legacy disk infrastructure required purchasing multiple arrays, so that copies of “prod” could be moved over to “dev,” to allow developers to work without impacting the live application. Pure’s AFA performance and zero-overhead snaps enabled prod and dev to consolidate, delivering snaps to developers from the same array. But now with the public cloud, further flexibility is desired: the freedom to develop in the cloud and deploy on-prem, or to deploy on-prem but have burst capabilities to the cloud. How can we easily facilitate this two-way data movement?
In Purity//FA 5.0, Snaps get a major upgrade, in particular the ability to move “off the box”. Moving snaps sounds easy, and the data movement part is, but we wanted a solution that allowed for the portability of snaps not only between our arrays (FlashArray and FlashBlade™), but onto 3rd-party storage arrays as well as the cloud.
Moving full copies of data is one thing, but we wanted a solution that enabled us to preserve data reduction within a snapshot, and to only send delta changes between snaps, where versions of snaps might be flowing frequently. The solution was to introduce Portable Snapshots, a version of our snapshot which encapsulates its metadata. It can then be truly portable – recoverable into any Pure array, but also recoverable outside of Pure with a recovery VM.
The snapshot differencing engine runs within the Purity operating environment on a FlashArray. It computes the incremental changes between different snapshots of a data volume and writes out only the delta. The metadata describing the delta is packed along with the data changes, and this data blob is then written out to any third-party storage target. There are two additional advantages to this approach, beyond incrementals forever and the ability of FlashArray to talk to heterogeneous storage targets besides Pure products. First, there is no need to run any Pure software on the destination third-party storage target. Second, there is no back-and-forth network traffic to compute deltas over the network, resulting in dramatically smaller RTOs.
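To make the idea concrete, here is a minimal sketch of block-level differencing: compare two snapshots, keep only the changed blocks, and pack them behind a small metadata header so the blob can be restored without any special software on the target. This is an illustration only, not Purity’s actual on-disk format; the fixed 4 KiB block size, the JSON header, and the function names are all assumptions made for the example.

```python
import json

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def split_blocks(volume: bytes) -> list[bytes]:
    """Cut a volume image into fixed-size blocks (the last one may be short)."""
    return [volume[i:i + BLOCK_SIZE] for i in range(0, len(volume), BLOCK_SIZE)]


def compute_delta(base: bytes, new: bytes) -> dict[int, bytes]:
    """Return {block_index: data} for every block of `new` that differs from `base`."""
    base_blocks = split_blocks(base)
    delta = {}
    for index, block in enumerate(split_blocks(new)):
        if index >= len(base_blocks) or base_blocks[index] != block:
            delta[index] = block
    return delta


def pack_delta(delta: dict[int, bytes], new_size: int) -> bytes:
    """Pack metadata and changed blocks into one self-describing blob."""
    indices = sorted(delta)
    header = json.dumps(
        {"size": new_size, "block_size": BLOCK_SIZE, "blocks": indices}
    ).encode()
    payload = b"".join(delta[i] for i in indices)
    # 4-byte header length, then the header, then the changed blocks in order.
    return len(header).to_bytes(4, "big") + header + payload


def apply_delta(base: bytes, blob: bytes) -> bytes:
    """Rebuild the newer snapshot from the base snapshot plus a packed delta."""
    header_len = int.from_bytes(blob[:4], "big")
    meta = json.loads(blob[4:4 + header_len])
    payload = blob[4 + header_len:]
    size, block_size = meta["size"], meta["block_size"]
    # Start from the base, truncated or zero-padded to the new size.
    out = bytearray(base[:size].ljust(size, b"\0"))
    offset = 0
    for index in meta["blocks"]:
        start = index * block_size
        length = min(block_size, size - start)
        out[start:start + length] = payload[offset:offset + length]
        offset += length
    return bytes(out)
```

Because the header travels with the data, the target only has to store an opaque file; all of the comparison work happens on the source side, which is the property the paragraph above relies on.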
The first solution we built with Portable Snapshots is the ability to move FlashArray snapshots to FlashBlade or generic NFS devices for retention (like, say, that old NetApp filer or Data Domain appliance you may already own). We have many customers leveraging FlashBlade for high-speed backups and restores, so integrating FlashArray and FlashBlade for snapshot movement was obvious. But some customers also wanted a lower-cost, disk-based target for longer-term snapshot retention, and there’s plenty of spinning rust in the world, so we were happy to open up portability to 3rd-party disk targets.
What’s exciting about this solution is how easy it is. No third-party data protection or data movement software is required. Purity already manages the process of moving snapshots and enforcing varied retention between multiple FlashArrays with Protection Groups, so FlashBlade and generic NFS targets now simply appear as another destination within a Protection Group. And if a recovery is desired, it’s just as easy: versions can be browsed, and recovering the snapshot back to the primary FlashArray or to a different FlashArray is simple.
Incremental use of portable snapshots will preserve dedup and compression within a snapshot. This has an extremely positive side benefit. It enables customers and service providers to use a standard Linux/Unix JBOD connected to a server to substitute for an expensive deduplicating data protection target appliance. Since Pure’s efficient data reduction (dedup and compression) is preserved on snap write-out from FlashArray, a standard NFS server can now store more logical data at a much cheaper price point for backup & archive retention purposes.
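A toy model shows why this matters for a plain NFS target. In the sketch below, each unique block is stored once, in compressed form, and a snapshot becomes a “recipe” of block hashes; the target only ever sees ordinary files. The SHA-256 hashing, zlib compression, and function names are assumptions chosen for the example and are not a description of how Purity reduces data.

```python
import hashlib
import zlib


def write_reduced(blocks: list[bytes], store: dict[str, bytes]) -> list[str]:
    """Store each unique block once, compressed; return the recipe of hashes."""
    recipe = []
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:
            # Only previously unseen content consumes space on the target.
            store[digest] = zlib.compress(block)
        recipe.append(digest)
    return recipe


def read_back(recipe: list[str], store: dict[str, bytes]) -> list[bytes]:
    """Rehydrate the original blocks from the recipe and the reduced store."""
    return [zlib.decompress(store[digest]) for digest in recipe]


def reduction_ratio(blocks: list[bytes], store: dict[str, bytes]) -> float:
    """Logical bytes written divided by physical bytes actually stored."""
    logical = sum(len(block) for block in blocks)
    physical = sum(len(data) for data in store.values())
    return logical / physical
```

In this model `store` stands in for a directory on the NFS server. The server needs no deduplication logic of its own, because the reduction has already happened before the data is written out.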
Additionally, Portable Snapshots, including the ability to natively archive and back up to FlashBlade and generic NFS targets, is fully integrated into Purity at no cost, as part of Pure’s ever-expanding Evergreen capabilities. Snap to FlashBlade and NFS with Portable Snapshots is currently in Beta, and is planned to go GA in Q3. But what if I want to move these snapshots to the public cloud?
With Portable Snapshots now working, the next logical extension is to enable them to be moved to and from the cloud. We’re particularly excited about this use case, as it enables a ton of multi-cloud flexibility across data protection, DR, and development use cases. A simple way to enable this movement (and what other vendors have done) would be to wrap the Purity software in a VM and make it a replication target running in the cloud itself. But there are some obvious problems with this model. First, it’s expensive. In AWS terms, keeping Purity running in EC2 and writing data to EBS would continuously burn dollars, as we would have to keep a beefy EC2 instance running just to accept a replication stream. Second, once the data got to AWS, it would be locked in Purity format on EBS, inaccessible to the other IaaS and PaaS services available in the cloud. If you want to use the cloud, you want to use it natively, with all of its services and utilities available to your data.
Today Pure is announcing Purity CloudSnap – our solution for integrating Purity with major public cloud storage providers. CloudSnap is focused on integrating natively with the public cloud, and our first focus is IaaS market leader AWS.
The first use case to be supported by CloudSnap will be data protection and DR workflows to S3 and Glacier. Purity will be able to send data directly to S3 natively, and snapshots can then be moved to Glacier for longer-term retention. Since this is native to Purity, there is no need for any additional software or a cloud gateway. CloudSnap preserves Purity’s data compression and deduplication in transit to S3/Glacier, increasing the ROI on already cheap cloud storage and reducing network bandwidth costs.
As above, S3 will appear like another destination within Purity Protection Groups, and snapshots can be sent incrementally for cloud retention, or retrieved for recovery. Our goal is to release this first version of CloudSnap focused on backup/recovery/archive workflows in Q4.
Once this workflow is completed, we intend to broaden the use cases that CloudSnap delivers. The next step will be to enable recoveries from S3-hosted snapshots into EBS so that snaps can be mounted to EC2 instances for usage in the cloud, such as in DR scenarios.
Finally, we then anticipate working more aggressively on migration and hybrid test/dev workflows, enabling replication bi-directionally to/from EBS. These additional use cases are planned for 2018, and will likely evolve in functionality as we work closely with customers on CloudSnap in 2017.
Data protection is increasingly becoming an integrated feature of the storage platform. While the Protection Groups framework provides data protection schedules and policies natively within Purity, we know that many customers will also want to leverage a best-of-breed data protection infrastructure to implement a common backup and archiving strategy across their heterogeneous environment. So in addition to continuing to enhance our own capabilities, we make a robust and open API called DeltaSnap available to 3rd parties. DeltaSnap is an open REST API that provides changed-block information across Purity snapshots, making it possible for 3rd parties to integrate with Purity and move incremental snapshots instead of a full snapshot each time. This enables ultra-efficient movement of data between Purity and 3rd-party data protection solutions, taking full advantage of the space-saving capabilities of Purity Snaps.
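As a rough sketch of what a backup tool might do with changed-block information, the example below takes a list of changed extents between two snapshots and copies only those byte ranges into an existing backup image. The `Extent` shape (offset and length in bytes) and both function names are hypothetical; the actual DeltaSnap request and response formats are not described here, and no REST calls are shown.

```python
from typing import Iterable, NamedTuple


class Extent(NamedTuple):
    """A changed byte range (hypothetical shape for changed-block info)."""
    offset: int
    length: int


def merge_extents(extents: Iterable[Extent]) -> list[Extent]:
    """Coalesce overlapping or adjacent extents to reduce the number of copies."""
    merged: list[Extent] = []
    for ext in sorted(extents):
        if merged and ext.offset <= merged[-1].offset + merged[-1].length:
            last = merged[-1]
            end = max(last.offset + last.length, ext.offset + ext.length)
            merged[-1] = Extent(last.offset, end - last.offset)
        else:
            merged.append(ext)
    return merged


def incremental_copy(snapshot: bytes, backup: bytearray,
                     extents: Iterable[Extent]) -> int:
    """Copy only the changed ranges into the backup; return bytes copied.

    Assumes `backup` holds the previous snapshot and has the same size
    as `snapshot`.
    """
    copied = 0
    for ext in merge_extents(extents):
        end = ext.offset + ext.length
        backup[ext.offset:end] = snapshot[ext.offset:end]
        copied += ext.length
    return copied
```

The bytes moved scale with the amount of change rather than with the size of the volume, which is what makes incremental-forever backup through a changed-block interface attractive.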
Pure is proud to have a rich ecosystem of partners who have either already integrated with DeltaSnap or are planning to deliver solutions with it soon.
While we’re just at the beginning of our cloud integration journey with Purity CloudSnap, we felt it was important to make our roadmap clear to customers, as we are committed to driving deeper native integration between Purity and leading public clouds.
As we deliver on this vision in 2017 and 2018, we’d love to partner with forward-thinking Pure customers who are eager to integrate more deeply with the public cloud – please drop us a line if you’d like to be part of our customer advisory council in this feature area.