Flash for backup? You may think that’s crazy. And to be honest, at first we did too when we heard about customers who were buying FlashBlade™, our high performance data hub, and using it for backup. Backup, one of the least strategic parts in the IT budget, didn’t seem to fit with the other analytics workloads – data warehouse, data lake, streaming analytics, and AI clusters – that customers were also running on FlashBlade.
But when we talked to them, not only did they dispel our notions of flash for backup being crazy, but they also convinced us that the backup market is at an inflection point where flash and cloud are playing a transformative role. Backup has changed from being only about minimizing the cost of having a good copy, and has instead become about how to make data available. Recover became create!
So, why had our customers started using flash for backup? Did they really need the performance of flash? And isn’t flash too expensive for backup? It turns out these customers weren’t really using flash for backup per se – they were using it for recovery. And they were using flash to repurpose backup data for DR, test/development and re-use in the cloud.
Over the past decade these customers have been caught in a perfect data protection storm. There has been a Cambrian explosion of data with datasets growing from terabytes to petabytes and beyond. Simultaneously, the introduction of flash has set new performance expectations for data centers. And last, data is becoming more important and a foundation of many modern businesses. These combined to drive the adoption of aggressive RTOs, which previously had been limited to the most critical workloads, to be the new standard for most production workloads.
But as a result, they weren’t meeting their backup and, more importantly, recovery SLAs. It turns out they weren’t alone – backup success rates today are between 75 and 85 percent, and even when data is successfully backed up, 20% of recoveries don’t meet the business RTO. The disk-to-disk-to-tape backup architectures our customers, and many others, were using could no longer keep up with the constant flow of data they are tasked with protecting today. And simply scaling these old models might have given a temporary reprieve, but it wasn’t solving the underlying challenges.
The backup architecture that most customers deploy includes both disk and tape. In this disk-to-disk-to-tape (or D2D2T) backup strategy, a copy of the data is stored first on a disk-based backup appliance and then also saved to tape. The disk copy provides better restore performance than tape alone can deliver. Because disk is more expensive than tape, backup appliances leverage deduplication to provide a relatively cost-effective disk-based backup solution. Even deduplicated disk is still not as cheap as tape, but it is close enough that the cost difference is more than justified by the management improvements. This disk-to-disk-to-tape approach provided quicker data restores from the backup appliance and leveraged tape for long-term retention.
D2D2T helped solve some of the management challenges of tape – you no longer have to search for tapes to do a restore or worry about whether you have a complete set. And, because backup appliances use disk to store backups, they also improve restore times – allowing terabyte-sized datasets to be recovered in hours.
Disk-to-disk-to-tape modernized backup a lot compared to previous methods, but it introduced new challenges in managing and dealing with backup appliances – they don’t scale well. When you fill up an appliance you have to buy another one and then when that fills up, another one. But each appliance you add is a new deduplication and management zone, which creates inefficiencies.
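To make the "new deduplication zone" inefficiency concrete, here is a toy sketch. It assumes a simplified model in which each appliance deduplicates fixed-size chunks by content hash and cannot see chunks held by any other appliance. The class name, chunk size, and sample data are all hypothetical and do not describe any vendor's actual implementation.

```python
import hashlib


class DedupZone:
    """Toy deduplication zone: stores each unique chunk exactly once."""

    def __init__(self) -> None:
        self._chunks: dict[str, bytes] = {}

    def write(self, data: bytes, chunk_size: int = 4) -> None:
        # Split the stream into fixed-size chunks and keep only unseen ones.
        for start in range(0, len(data), chunk_size):
            chunk = data[start:start + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self._chunks.setdefault(digest, chunk)

    def stored_bytes(self) -> int:
        return sum(len(chunk) for chunk in self._chunks.values())


# Two hypothetical backups that share two of their three chunks.
backup_a = b"AAAABBBBCCCC"
backup_b = b"AAAABBBBDDDD"

# One zone sees both backups, so shared chunks are stored once.
single = DedupZone()
single.write(backup_a)
single.write(backup_b)
single_total = single.stored_bytes()

# Two separate appliances each keep their own copy of the shared chunks.
zone1, zone2 = DedupZone(), DedupZone()
zone1.write(backup_a)
zone2.write(backup_b)
split_total = zone1.stored_bytes() + zone2.stored_bytes()

print(f"single zone: {single_total} bytes, two zones: {split_total} bytes")
```

In this toy case the shared chunks are stored twice once the backups land in different zones, which is the kind of overhead that grows as more appliances are added.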
And most importantly, while they are faster than tape, most backup appliances are inefficient at restoring data. They are designed to ingest backup data as quickly as possible, and restore performance is secondary. To restore as quickly as possible, you should be able to serve data as fast as the primary storage can consume it. But as the appliance disks fill up, the restore speed can get even slower, so it is also hard to run an efficient system.
As an aside, nostalgia and promotion in films like Guardians of the Galaxy have created a recent surge in audio tape sales, which grew 136% in 2017, but this likely doesn’t foreshadow a wave of skinny jean-wearing hipster IT admins drinking flat whites while they wait for their data to be slowly restored from tape.
Flash offers orders-of-magnitude performance increases over spinning magnetic disk. High-performance flash backups and restores can match the speed of all-flash production systems, restoring data as fast as the production systems can consume it. Flash backups can also enable more simultaneous server backups, providing better utilization at scale. And by coupling flash with data reduction, we can get great economics AND great restore performance. The future is flash-to-flash.
Pure Storage® FlashBlade™ is a next-gen flash platform architected for bandwidth, delivering unprecedented performance for a wide range of workloads, including backup and rapid restore. Unlike competitors, FlashBlade restore performance exceeds that of backup. A 75-blade FlashBlade delivers peak backup performance of 90 TB/hr and restore performance that is 3x higher at 75 GB/s. And by the way, this is in just 20 rack units.
FlashArray™ snapshots can be directly backed up to FlashBlade, where they can be persisted and used for rapid recovery. This leverages a new snapshot type called a Portable Snapshot, in which the snapshot metadata is encapsulated into the snapshot itself, allowing it to live anywhere.
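Because the backup figure is quoted in TB/hr and the restore figure in GB/s, a quick unit conversion makes the 3x comparison explicit. The sketch below assumes decimal units (1 TB = 1000 GB); the function name is only illustrative.

```python
SECONDS_PER_HOUR = 3600
GB_PER_TB = 1000  # assumption: decimal units, 1 TB = 1000 GB


def gb_per_s_to_tb_per_hr(gb_per_s: float) -> float:
    """Convert a throughput in GB/s to TB/hr."""
    return gb_per_s * SECONDS_PER_HOUR / GB_PER_TB


backup_tb_per_hr = 90.0                          # peak backup rate quoted above
restore_tb_per_hr = gb_per_s_to_tb_per_hr(75.0)  # peak restore rate quoted above
ratio = restore_tb_per_hr / backup_tb_per_hr

print(f"restore: {restore_tb_per_hr:.0f} TB/hr, {ratio:.1f}x the backup rate")
```

Under that assumption, 75 GB/s works out to 270 TB/hr, which is three times the 90 TB/hr backup rate.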
After discovering how and why our customers were using FlashBlade for data protection, we have created Rapid Restore solutions to support all of the key databases, as well as solutions with both the traditional and newer data protection vendors. Rapid Restore isn’t “A” solution – it is a collection of over 10 different backup/recovery use cases and the list keeps growing.
The Rapid Restore solutions for databases are fast. REALLY fast. A single FlashBlade can support 15TB/hr of backup rate and almost 50TB/hr of restore rate. With nearly 3:1 data reduction on an Oracle RMAN backup to FlashBlade, DBAs can complete their database restores in minutes and hours instead of days.
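As a rough back-of-the-envelope sketch, the quoted rates can be turned into time estimates for a given database. The 20 TB database size below is purely hypothetical, the rates are the figures quoted above treated as sustained throughput, and the 3:1 ratio is an approximation of "nearly 3:1"; real results will depend on the workload.

```python
BACKUP_RATE_TB_PER_HR = 15.0   # backup rate quoted above
RESTORE_RATE_TB_PER_HR = 50.0  # "almost 50TB/hr", treated here as an upper bound
DATA_REDUCTION_RATIO = 3.0     # "nearly 3:1", treated here as an approximation


def transfer_hours(dataset_tb: float, rate_tb_per_hr: float) -> float:
    """Hours needed to move dataset_tb at a sustained rate."""
    if rate_tb_per_hr <= 0:
        raise ValueError("rate must be positive")
    return dataset_tb / rate_tb_per_hr


dataset_tb = 20.0  # hypothetical database size, not from the article

backup_hours = transfer_hours(dataset_tb, BACKUP_RATE_TB_PER_HR)
restore_minutes = transfer_hours(dataset_tb, RESTORE_RATE_TB_PER_HR) * 60
stored_tb = dataset_tb / DATA_REDUCTION_RATIO

print(f"backup:  about {backup_hours:.1f} hours")
print(f"restore: about {restore_minutes:.0f} minutes")
print(f"stored:  about {stored_tb:.1f} TB after data reduction")
```

For this hypothetical 20 TB database, the restore estimate lands in the tens of minutes, consistent with the "minutes and hours instead of days" claim.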
We also have solutions for leveraging FlashBlade as a target for traditional data protection software to deliver improved performance in a smaller footprint. By complementing the built-in compression features of FlashBlade with our data protection partners’ features like deduplication, replication and cloud tiering, users can optimize their backup infrastructures for performance, capacity and resiliency.
But there’s another part of the backup problem. Even though our customers were effectively using FlashBlade to deliver Rapid Restore capabilities, they still needed to store large amounts of data offsite for retention and compliance…which they had continued to do with tape. As previously mentioned, tape is complex and slow…but the real failure of tape is that your data is locked offline somewhere – providing no value for your company.
The answer is to finally replace tape with low-cost cloud object storage, like Amazon S3. This is the new Flash to Flash to Cloud (F2F2C) backup paradigm. To help our customers accelerate their journey to this new backup paradigm, Pure Storage provides customers with a few different solutions. CloudSnap is built into Pure Storage FlashArray and provides portable snapshot capability to other FlashArrays, FlashBlade or other NFS devices, or direct to the cloud. Pure also recently acquired StorReduce™, a cloud object storage fabric that seamlessly manages large-scale unstructured data across clouds. In 2019, leveraging the cloud-first StorReduce scale-out data optimization technology, Pure intends to enable our customers to create cost-effective hybrid cloud and cloud-native recovery solutions.
StorReduce technology, currently entering beta, was designed from the ground up for the cloud, and allows for the cost effective utilization of Object Storage such as Amazon S3 for backups. It will work in conjunction with FlashBlade Rapid Restore. As backup data is written to FlashBlade it deduplicates data inline and can store to the FlashBlade for insanely fast recovery… it can also be immediately replicated to Amazon S3 for low cost retention for archival data. You won’t have to change any of your existing backup processes as it works as a target behind nearly every well known backup application simultaneously. You simply configure your existing backup target to be an all-flash recovery appliance instead of traditional backup appliances and select tiering if you also want to archive to cloud.
If you can get your backup data to the cloud, then you can start to think about how you re-use your data for migration, dev/test, analytics, and more. Since StorReduce is built to run in the cloud, you can leverage the cloud edition of your backup software to recover wholly within the cloud. You will also be able to, in the future, restore your data to Pure’s Cloud Block Store or to your favorite Amazon data service. This turns what used to be cold data sitting on a tape inside a vault someplace into business value, leveraging a wide variety of web services.
Check out how Pure Storage Cloud Data Protection Solutions can help you adopt a modern flash-to-flash-to-cloud backup strategy for cloud-economics-driven data protection with faster recovery and minimal management overhead. See how StorReduce technology, Pure Storage CloudSnap, and Cloud Block Store will work together to deliver a full F2F2C architecture for data stored on both Pure and heterogeneous arrays to give new life to backup data – retaining it safely and cost-effectively in the cloud, while opening up new avenues for using backup data.