No one at Pure Storage knows about it, but I’m starting a new company to transform the transportation system. There’s no technology like it, and it easily represents a billion-dollar market. Ready? OK, here it goes.
It’s bigger than any train or bus you’ve ever seen, and it can transport thousands of people from point A to point B. Amazing, right? But… there’s just one small catch. If you want to return from point B to A, well, you are on your own. Bicycles are provided for the few who want to return. After all, if they wanted to return, why would they have taken the trip in the first place?
For the record, the mock proposal above is purely hypothetical, meant to highlight the real point I want to make. Such an industry actually does exist, and it is indeed a billion-dollar market. In this case, it moves data, not people. IDC even has a name for it: the “Purpose-Built Backup Appliance” market.
Let me explain.
Data is often the most valuable asset in any organization. For some industries, like SaaS, data is their lifeblood. Data protection is therefore extremely important, yet today’s approach has two big flaws.
First, the backup industry is dominated by inflexible appliances built to do a single task: copy data in one direction, from source to target. Meanwhile, the rest of the computing industry is quickly moving away from inflexible technologies toward agile and elastic solutions.
More importantly, these systems are notorious for poor performance when data needs to be recovered. A blog post by an independent consulting firm, titled “If You Thought Database Restores Were Slow, Try Restoring From an EMC Data Domain”, estimated that a ~6TB database would take 25 days to fully restore. It’s an estimate because they killed the job after 24 hours with merely ~2% of the job completed.
These two flaws cause the infrastructure to be both slow and complex. What do I mean?
Unnecessarily Complex Infrastructure with Traditional Backup Appliances
While these appliances are optimized for backup, slow restore performance forces customers to deploy additional storage systems. Production database snapshots are often stored in a separate storage system, similar in cost and performance to the tier-1 operational systems yet physically separate, in order to keep operational performance fast and predictable. Database administrators (DBAs) also want the fastest possible access to live copies so they can iterate on the latest code changes. Since restoring data from the backup appliance is slow, IT often deploys yet another storage system to support the test/dev environment.
Slow restore results in more than sprawling complexity. In the unfortunate case of disaster recovery, when data needs to be restored as quickly as possible, these appliances can be the bottleneck to getting the business back on its feet.
We believe backup appliances and storage silos are artifacts of an outdated approach. What if backup appliances no longer represent an entire industry with stand-alone dedicated hardware, but are simply another application that runs on a single, powerful platform?
Modern Data Protection Made Fast and Simple: Only Two Storage Systems Needed, FlashArray and FlashBlade
FlashBlade™ is that next-gen platform. It is the industry’s first scale-out storage system architected from the ground up for unstructured data, delivering unprecedented performance for a wide range of workloads, including backup and restore. Customers deploy FlashBlade to consolidate fast recovery, test/dev, and generic backup operations (and more, as we’ll see shortly) onto a single platform.
Let’s dig into performance numbers. A 75-blade FlashBlade, announced at //Accelerate this year, delivers peak backup performance of 90 TB/hr and restore performance that is 3x higher at 270 TB/hr – in a diminutive 20 rack units. By way of comparison, Data Domain’s highest-end system, the DD9800, would need 3 full racks to reach the same backup performance, according to its datasheet (with DDBoost off). Data Domain does not publish restore performance, likely because it’s considerably lower than the backup rate. If you’re wondering about cost, we address that question later in the blog.
FlashBlade is 6x More Efficient than Competitive Systems for Backup Performance
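The "6x" figure can be sanity-checked with a little arithmetic. The sketch below uses the throughput and rack-unit numbers quoted above; the one outside assumption is that a "full rack" means a standard 42U rack, which the post does not state. The function and constant names are illustrative only.

```python
# Figures quoted in this post for a 75-blade FlashBlade.
FLASHBLADE_BACKUP_TB_PER_HR = 90
FLASHBLADE_RESTORE_TB_PER_HR = 270
FLASHBLADE_RACK_UNITS = 20

# Figure quoted in this post for matching backup throughput on DD9800.
DATA_DOMAIN_FULL_RACKS = 3

# Assumption (not from the post): a full rack is a standard 42U rack.
RACK_UNITS_PER_FULL_RACK = 42


def throughput_per_rack_unit(tb_per_hr: float, rack_units: float) -> float:
    """Return throughput density in TB/hr per rack unit."""
    if rack_units <= 0:
        raise ValueError("rack_units must be positive")
    return tb_per_hr / rack_units


def space_ratio(other_rack_units: float, own_rack_units: float) -> float:
    """Return how many times more rack space the other system occupies."""
    if own_rack_units <= 0:
        raise ValueError("own_rack_units must be positive")
    return other_rack_units / own_rack_units


if __name__ == "__main__":
    data_domain_rack_units = DATA_DOMAIN_FULL_RACKS * RACK_UNITS_PER_FULL_RACK
    ratio = space_ratio(data_domain_rack_units, FLASHBLADE_RACK_UNITS)
    backup_density = throughput_per_rack_unit(
        FLASHBLADE_BACKUP_TB_PER_HR, FLASHBLADE_RACK_UNITS
    )
    restore_density = throughput_per_rack_unit(
        FLASHBLADE_RESTORE_TB_PER_HR, FLASHBLADE_RACK_UNITS
    )
    print(f"Rack space ratio at equal backup throughput: {ratio:.1f}x")
    print(f"FlashBlade backup density: {backup_density:.1f} TB/hr per RU")
    print(f"FlashBlade restore density: {restore_density:.1f} TB/hr per RU")
```

Under the 42U assumption, 3 racks come to 126 rack units against 20, or roughly 6.3x, which is in line with the rounded "6x" in the caption. These are peak datasheet figures, so real-world ratios will depend on workload and configuration.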
The reality is that infrastructure today is even more complex than what is shown above. Many data centers have data lakes and data warehouses for analytics, running on dedicated silos of storage. IT often also supports software dev/devops teams with separate NAS or object systems.
BEFORE: Legacy Infrastructure, Filled with Storage Silos
Let’s look more closely at analytics. Typical data lake and data warehouse infrastructure brims with complexity. Each application or use case has a separate warehouse, each copying data back and forth from the data lake. For the lines of business, this complex environment is a nightmare to use, and for IT it’s a nightmare to manage.
AFTER: Modern Infrastructure, Consolidating Silos into Two Storage Systems
FlashBlade is the industry’s first data platform engineered for a wide range of workloads, from instant restore to AI to software dev and more. It is built not just for unstructured data, but for any type of unstructured data. By definition, unstructured data means unpredictable data: it can take any form, size, or shape, and can be accessed in any pattern. FlashBlade can accelerate any data, small or large, random or sequential, and recently won the coveted award for “Best Innovation in AI Hardware” at the AI Summit Conference in San Francisco.
One of the questions often asked is, “Isn’t it cheaper to store data on backup appliances compared to FlashBlade, especially with deduplication?”
We can go back to the example of ½ rack of FlashBlade replacing 3 full racks of Data Domain DD9800 to demonstrate why FlashBlade offers a compelling TCO in $/TB. In fact, in the next section, a Pure customer makes the case on our behalf. In the end, however, we believe $/TB is often the wrong question to ask. What do I mean?
For some customers, the reason why data exists is to be locked away. They have yet to build skills and tools to find business value in analyzing their data. We recognize there are some customers who fall into this camp and $/TB is absolutely the right metric for them.
In the modern, data-driven world, many enterprises actively mine their data for innovation and competitive advantage. DBAs want to improve their applications by iterating on code updates with the latest data. Data scientists look for new ways to analyze more data. For these enterprises, data does not exist to be locked away. There’s too much value in their data. The right metric for these customers is time-to-insight. The $/TB metric is too static to truly reflect their business. So the question must be asked: what really is the restore performance of conventional backup appliances (like Data Domain), and why don’t those vendors publish restore performance numbers?
One Pure Storage customer is a state transportation agency in the USA. While they’ve had FlashArray systems running tier-1 operational workloads, their infrastructure consisted of numerous storage silos.
In the old architecture, data was backed up into 1.5 racks of Data Domain. DBAs were forced to recover data directly from Data Domain, which was a painfully slow process. They also had various data warehouse silos for unstructured data.
Before and After: FlashBlade Simplifies the Infrastructure by Consolidating Silos
When the time arrived to upgrade the Data Domain system, the customer received a huge bill from the vendor for a forklift upgrade and data migration. So they looked at other options and ultimately decided to modernize and simplify their backup infrastructure with Pure’s FlashBlade.
Today, the customer’s infrastructure consists of only two systems (excluding the tape archive), FlashArray and FlashBlade. Third-party software orchestrates all backup and restore operations to FlashBlade and to tape archival. FlashBlade consolidated 1.5 racks of Data Domain, as well as the entire data lake and all data warehouse silos, in a simple and efficient 4 RU. DBAs now have instant access to their data. With the time freed up by this new, simpler infrastructure, they finally have time to kick off a long-awaited analytics project.
Best of all, the customer experienced huge savings. In their own words, “FlashBlade pays for itself in a year with floor space and power savings”.
It’s easy to get started. The industry’s leading data protection software solutions support FlashBlade. Commvault and Veeam can back up from FlashArray to FlashBlade, and from FlashBlade to any archival tier. Read Veeam’s blog on three ways Veeam and FlashBlade drive higher ROI. With Rubrik, FlashBlade becomes a massive-capacity storage tier with the ability to deliver instant restore. Read Rubrik’s blog on how their Cloud Data Management platform delivers a rapid restore experience for customers.
Industry Leading Tools Support FlashBlade as Backup Target
If you use application-native utilities, like Oracle RMAN, FlashBlade delivers unprecedented backup and restore performance. In a blog published today, we show how you can achieve 15 TB/hr backup and 11 TB/hr restore with Oracle RMAN. Over the next few days, we’ll publish technical blogs on MySQL and SQL Server as well.
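To put those RMAN rates in perspective, here is a minimal sketch that converts throughput into elapsed time. It assumes the quoted rates are sustained for the whole job, which real backups and restores rarely manage, and it borrows the ~6 TB database size from the restore example earlier in this post purely for illustration. The function name is hypothetical.

```python
# RMAN rates quoted in this post.
RMAN_BACKUP_TB_PER_HR = 15
RMAN_RESTORE_TB_PER_HR = 11

# Illustrative size only: the ~6 TB database mentioned earlier in the post.
EXAMPLE_DATABASE_TB = 6


def transfer_hours(data_tb: float, rate_tb_per_hr: float) -> float:
    """Return hours needed to move data_tb at a sustained rate_tb_per_hr."""
    if data_tb < 0:
        raise ValueError("data_tb must not be negative")
    if rate_tb_per_hr <= 0:
        raise ValueError("rate_tb_per_hr must be positive")
    return data_tb / rate_tb_per_hr


if __name__ == "__main__":
    backup_minutes = transfer_hours(EXAMPLE_DATABASE_TB, RMAN_BACKUP_TB_PER_HR) * 60
    restore_minutes = transfer_hours(EXAMPLE_DATABASE_TB, RMAN_RESTORE_TB_PER_HR) * 60
    print(f"Backup of {EXAMPLE_DATABASE_TB} TB: about {backup_minutes:.0f} minutes")
    print(f"Restore of {EXAMPLE_DATABASE_TB} TB: about {restore_minutes:.0f} minutes")
```

At these rates, a 6 TB database would back up in about 24 minutes and restore in about 33 minutes. Treat this as a best-case estimate: actual times depend on the source system, network, RMAN channel configuration, and data layout.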
It’s never too late to modernize your data protection and to eliminate infrastructure complexity and cost with a single, fast and simple data platform from Pure Storage. To learn more, please visit us here.