With the tremendous growth of data, companies are looking for the best approach to back it up. Ensuring the fast recovery of data when a disaster occurs has become more difficult. Businesses face even more significant challenges if they don’t modernize their rapid restore strategy as well. Most disk-based backup solutions can’t keep up with the demands of rapid recovery. 

Pure FlashRecover™, Powered by Cohesity®, can solve data-restore challenges while keeping downtime to a minimum. This post will demonstrate the performance characteristics of Pure FlashRecover for VMware recovery and how you can use the solution to solve the two biggest challenges of data protection: slow recovery and disaster recovery.    

Pure FlashRecover brings together Pure Storage® FlashBlade® with Cohesity diskless hardware and intelligent, extensible software. This solution enables you to spend less time worrying about retrofitting legacy solutions for future needs so you can focus on the rapid restoration of your applications and VMware environment. 

The Pure FlashRecover file system combines infinite scalability with open architecture flexibility so you can consolidate multiple business workloads on a single platform. FlashBlade provides the most robust and fully distributed unified fast file and object (UFFO) storage. Together, FlashBlade and Cohesity software deliver one of the fastest backup-and-restore solutions on the market today.

Full Recovery vs. Instant Recovery: Why Full Recovery Is Important 

Instant recovery provides immediate access to backup data but doesn’t rapidly restore a full copy of that data. It may take many hours to complete a full migration/recovery of data to the primary source. While it does solve some challenges, performing instant recovery can be cumbersome when a disaster-recovery situation arises. While you can instantly access backup data, you may experience performance issues and impact other running backup policies. Instant recovery can also overload the VMware infrastructure when thousands of virtual machines (VMs) are involved in the process. In short: Instant access may not be the best solution for full copy recovery.

On the other hand, Pure FlashRecover offers full rapid recovery. It guarantees the recovery of VMs with high performance while maintaining the defined SLAs for restore without overwhelming the backup infrastructure. It also keeps existing backup SLAs intact when restoring hundreds of VMs.

Performance Test Bed and Configuration

Pure FlashRecover is powered by Cohesity C4000 nodes, which you can’t deploy as a standalone cluster; you’ll need to configure them with Pure FlashBlade. Cohesity complements Pure FlashBlade storage with a distributed file-system software architecture designed for high availability. The nodes have a shared-nothing topology with no single point of failure or inherent bottlenecks. This disaggregation of compute and storage lets you scale compute and capacity independently and linearly. The distributed file system spans all nodes in the cluster and natively provides global deduplication, with compression and encryption powered by FlashBlade.

You can use automation to deploy Pure FlashRecover. You only need to supply the IP address of the existing or new FlashBlade in the data center. Cohesity auto-deployment software will auto-detect the FlashBlade and validate that the eight data VIPs needed for the deployment are already configured. The deployment will create:

  • 7 NFS 3.0 file systems per node
    • 6 NFS file systems for data per node
    • 1 NFS file system for indexing per node on the FlashBlade, with each file system sized at 3TB by default or at a user-specified size

For a four-node cluster, the software will create 28 file systems, mount the NFS file systems on each node of the cluster, and load balance them across the data VIPs.
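The layout arithmetic can be sketched in a few lines of Python. This is an illustration only, not Cohesity's deployment code: the function names, the file-system naming scheme, the example VIP addresses, and the round-robin assignment are all assumptions, since the post states only that mounts are load balanced across the data VIPs.

```python
from itertools import cycle

DATA_FS_PER_NODE = 6   # from the post: 6 data file systems per node
INDEX_FS_PER_NODE = 1  # from the post: 1 indexing file system per node


def plan_file_systems(node_count):
    """Return hypothetical file-system names, 7 per node."""
    names = []
    for node in range(1, node_count + 1):
        names += [f"node{node}-data{i}" for i in range(1, DATA_FS_PER_NODE + 1)]
        names += [f"node{node}-index{i}" for i in range(1, INDEX_FS_PER_NODE + 1)]
    return names


def assign_vips(file_systems, data_vips):
    """Spread mounts across data VIPs round-robin (an assumed policy)."""
    return dict(zip(file_systems, cycle(data_vips)))


# Eight made-up data VIP addresses, matching the eight VIPs mentioned above.
vips = [f"192.0.2.{n}" for n in range(1, 9)]
layout = assign_vips(plan_file_systems(4), vips)
print(len(layout))  # 28 file systems for a four-node cluster
```

With 28 file systems and eight VIPs, a round-robin policy like this one would leave each VIP serving either three or four mounts.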

Next, it will create a new set of NFS file systems on FlashBlade after the deployment. It will mount the NFS file systems over the data VIPs and balance them across all Cohesity cluster nodes. If a node fails, the software will redistribute that node’s NFS mount points across the remaining nodes, then rebalance across the cluster once the failed node is back online. If you add new nodes to the configuration, the software will create a new set of NFS file systems for data storage.
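To make the failover behavior concrete, here is a minimal sketch of redistributing and later rebalancing mount points. The post doesn't describe the actual placement algorithm, so the least-loaded placement, the even re-deal on recovery, and every name in this example are assumptions for illustration.

```python
def redistribute(mounts, failed_node):
    """Move a failed node's mounts to the surviving nodes.

    Each orphaned mount goes to whichever survivor currently holds the
    fewest mounts (an assumed policy, not the documented one).
    """
    survivors = {n: list(fs) for n, fs in mounts.items() if n != failed_node}
    if not survivors:
        raise ValueError("no surviving nodes")
    for fs in mounts.get(failed_node, []):
        target = min(survivors, key=lambda n: len(survivors[n]))
        survivors[target].append(fs)
    return survivors


def rebalance(mounts, nodes):
    """Deal every mount back out evenly across the given nodes."""
    all_fs = sorted(fs for fs_list in mounts.values() for fs in fs_list)
    result = {n: [] for n in nodes}
    for i, fs in enumerate(all_fs):
        result[nodes[i % len(nodes)]].append(fs)
    return result


# Hypothetical four-node cluster with seven mounts per node.
nodes = ["node1", "node2", "node3", "node4"]
mounts = {n: [f"{n}-fs{i}" for i in range(1, 8)] for n in nodes}

degraded = redistribute(mounts, "node3")  # node3 fails
restored = rebalance(degraded, nodes)     # node3 comes back online
```

In this sketch, all 28 mounts stay online on three nodes while `node3` is down, and each node returns to seven mounts after the rebalance.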

Test Configuration

Figure 1 illustrates the Pure FlashRecover test environment used to evaluate the performance on a four-node-cluster configuration. We designed the testbed to validate the performance and scalability of the backup and restore of the VMs with Pure FlashRecover and FlashBlade: 

  • ESX servers have VMware ESXi 6.7 installed. 
  • Servers are clustered in groups of four within vCenter, which is set up to host the VMs used for ingestion.
  • The VMs are spread across three Pure FlashArray™ devices connected to the ESX hosts over iSCSI. 
  • Datastores are created from the three FlashArray devices on which the VMDKs are created for every VM. 
  • To avoid read-throughput bottlenecks, the testbed VMs are selected equally from multiple ESX servers and three datastores, one per FlashArray device.

 

Figure 1: Diagram of the Pure FlashRecover, Powered by Cohesity, performance test environment.

VM Backup Tests

Objectives

This test case measured the overall backup throughput achieved on full backups of 8, 16, 32, and 64 VMs with the defined Pure FlashRecover architecture. We selected no more than eight VMs from any one ESX server.
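The selection rule can be expressed as a small helper. This is a sketch under stated assumptions, not the tooling we used: the inventory structure, host and VM names, and the round-robin order are hypothetical; only the cap of eight VMs per ESX server comes from the test description.

```python
from collections import defaultdict

MAX_VMS_PER_HOST = 8  # from the post: no more than eight VMs per ESX server


def select_vms(inventory, wanted):
    """Pick `wanted` VMs round-robin across hosts, capped per host.

    `inventory` maps a host name to its list of VM names (hypothetical data).
    """
    picked = defaultdict(list)
    total = 0
    for slot in range(MAX_VMS_PER_HOST):
        for host, vms in inventory.items():
            if total == wanted:
                return dict(picked)
            if slot < len(vms):
                picked[host].append(vms[slot])
                total += 1
    if total < wanted:
        raise ValueError("not enough hosts or VMs to satisfy the per-host cap")
    return dict(picked)


# Made-up inventory: eight ESX hosts with ten VMs each.
inventory = {
    f"esx{h}": [f"esx{h}-vm{v}" for v in range(1, 11)] for h in range(1, 9)
}
selection = select_vms(inventory, 64)  # eight VMs from each of eight hosts
small = select_vms(inventory, 16)      # two VMs from each host
```

Picking round-robin keeps smaller policies spread across hosts as well, so a 16-VM policy in this example draws two VMs from every host rather than eight from two.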


Virtual Machine Backup Performance Results

We performed full backups of VM backup policies with 8, 16, 32, and 64 VMs through the Cohesity interface. The Cohesity interface status log captures the backup speed. The throughput measured on a VM is based on end-to-end performance, including pre- and post-operations at the VMware level (i.e., taking a snapshot, performing the actual data backup, and consolidating the snapshot afterward). To avoid overhead from VMware snapshot consolidation, we restricted I/O operations on the VM(s) during backups.

Full Backup

With Pure FlashRecover, the initial full backup in our test achieved 3x overall data reduction from inline deduplication. With compression disabled on the Cohesity side, the software sends uncompressed, deduplicated data sets to FlashBlade, which has compression enabled by default. FlashBlade further compresses the data to achieve better overall data reduction rates. On initial full backup, the solution achieved a maximum average throughput of 3.2GB/s on tests of 64 VMs. That’s almost equivalent to 12.24TB/hour of ingestion rates on a full backup. The backup rates were linear with 8, 16, 32, and 64 VMs (Figure 2). You can increase the number of VMs in a policy and add more Cohesity C4000 nodes to get higher throughput, but we limited our environment to a four-node cluster.

Figure 2: Backup performance of Pure FlashRecover, Powered by Cohesity.

Rapid Restore on Full Copy Recovery

We performed full copy recovery of VMs for policies with 8, 16, 32, and 64 VMs through the Cohesity interface. The Cohesity interface status log captures the restore performance. The throughput measured on a VM is based on end-to-end performance, which includes pre- and post-operations at the VMware level, such as creating an object for restore, performing the actual copy of the VMs, and post-processing on VMware.

The restore tests showcased one of the biggest advantages of Pure FlashRecover: rapid data restore. The restore rates were linear with 8, 16, 32, and 64 VMs (Figure 3). And with a four-node FlashRecover cluster configuration, the restore rates were as high as 4.3GB/s, which is almost equivalent to 15.5TB/hour of restore rates on VMware.
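For readers who want to reproduce the unit conversion, the following sketch turns a sustained GB/s rate into TB/hour. It assumes decimal units (1TB = 1,000GB), which the post doesn't state explicitly; the function name is ours.

```python
def gb_per_s_to_tb_per_hour(gb_per_s):
    """Convert GB/s to TB/hour, assuming decimal units (1TB = 1,000GB)."""
    seconds_per_hour = 3600
    return gb_per_s * seconds_per_hour / 1000


restore = gb_per_s_to_tb_per_hour(4.3)  # restore rate reported above
backup = gb_per_s_to_tb_per_hour(3.2)   # average backup rate reported earlier

print(f"restore: {restore:.2f}TB/hour")
print(f"backup:  {backup:.2f}TB/hour")
```

Under this assumption, 4.3GB/s works out to about 15.5TB/hour, in line with the restore figure. The 3.2GB/s backup average works out to about 11.5TB/hour, so the 12.24TB/hour quoted for backup presumably rests on a different basis (for example, a peak rather than an average rate), which the post doesn't specify.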

Figure 3: Restore performance of Pure FlashRecover, Powered by Cohesity.

Advantages of Implementing Pure FlashRecover 

Using FlashBlade as an NFS storage target for Pure FlashRecover deduplicated data offers a storage-efficient solution for rapid restore of VMs. The exclusive integration between Cohesity and FlashBlade enables you to consolidate secondary storage workflows. This architecture provides the industry’s top compute platform and disaggregated unified flash storage with integrated backup and recovery software. The integrated solution can deliver: 

  • Up to 3x faster backup and restore throughput than disk-based alternatives
  • Recovery of thousands of virtual machines a day 
  • Disaggregated compute and storage for independent scaling of backup and recovery processes
  • Reuse of backup data on FlashBlade for modern apps