By: Alex Infanzon
Pure Storage Solutions Manager
So, you want to copy or clone your SAS data sets for data protection, test/dev, or analytical modeling. Many SAS environments hold anywhere from hundreds of gigabytes to a few petabytes of SAS files, and it is fairly common practice to clone these files to meet SAS users' needs. In this blog, I highlight why you should consider Pure Storage FlashArray snapshots when cloning your SAS data sets. Snapshots are a fast, simple, and economical way to protect your data sets and even replicate them to a secondary site for disaster recovery.
Snapshots alone are not a substitute for a full disaster recovery strategy in a multi-tier SAS infrastructure, that is, backup and restore of SAS binaries, the operating system, SAS applications, and all the pertinent data those applications require. Refer to Backing Up SAS Content In Your SAS®9 Enterprise Intelligence Platform to develop such a comprehensive content backup strategy. However, Pure FlashArray Flashrecover features can be leveraged in creating and deploying such a plan.
Now that we have established the scope of this blog, let’s jump into describing the key components needed for protecting your SAS data using Purity Flashrecover snapshots.
Snapshots are an intrinsic part of the Purity Operating Environment:
Additionally, the Purity Operating Environment data structures allow snapshots to preserve the granular data reduction efficiencies of volumes through global de-duplication and compression, so volume snapshots require minimal physical capacity on flash drives.
Key benefits for using snapshots are:
The Flashrecover Snapshots white paper provides further details, including replication.
In directory-based operating environments, a SAS data library is a group of SAS files in the same directory. If the directory resides on a FlashArray volume, you can create a snapshot of all the data in that volume almost instantaneously. Other SAS data files, such as the ones stored in the SAS WORK and UTILLOC file systems, do not need to be backed up because they are temporary. You can create a separate volume for those work areas on the FlashArray.
By using snapshots, you do not need to determine which SAS data files are unique, meaning those files that cannot be re-created from external databases and therefore need to be backed up. With snapshots, you can back up all the data files with zero space overhead.
Zero complexity is another advantage of snapshots. With other data protection approaches, you need to consider the source of the SAS data files. For example, if the source is a corporate data warehouse in an external database that is already protected, do you need to back up the SAS version of this data? Probably not.
On the other hand, if there are newly created SAS data files, or many files that have been around for several weeks or months, those files need to be backed up. With Pure, you can simplify your strategy and create a snapshot of the volume with zero complexity, zero data duplication, and zero space overhead. There is no need to be concerned about storage space.
Furthermore, you can easily automate multi-volume snapshot creation and retention through flexible policy-based management.
How often you make a backup depends on your recovery point objective (RPO). With Pure you can create thousands of snapshots, giving you as many recovery points as you require.
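To make the relationship between snapshot frequency and RPO concrete, here is a minimal sketch of the arithmetic. It assumes snapshots are taken at a fixed interval and kept for a fixed retention window; the function names and the 4-hour/7-day figures are illustrative and do not come from any Pure tool.

```python
from datetime import timedelta


def snapshots_retained(interval: timedelta, retention: timedelta) -> int:
    """Number of snapshots held at steady state for a fixed schedule."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    # Dividing one timedelta by another with // yields a whole number.
    return retention // interval


def worst_case_data_loss(interval: timedelta) -> timedelta:
    """With periodic snapshots, at most one interval of changes can be lost."""
    return interval


if __name__ == "__main__":
    interval = timedelta(hours=4)   # illustrative schedule
    retention = timedelta(days=7)   # illustrative retention window
    print(snapshots_retained(interval, retention))  # 42
    print(worst_case_data_loss(interval))           # 4:00:00
```

In other words, a tighter RPO means a shorter interval and more retained snapshots, which is practical only because each snapshot consumes so little physical capacity.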
However, due to how SAS works, you need to create the snapshots at scheduled times during the day. This is because the files associated with SAS must be in an unlocked (closed) state before you back them up, especially the data files associated with SAS in-memory applications (e.g., CAS, LASR, Viya architecture). If you just copy or back up the files without closing them, these files, when restored, will present themselves as corrupt or incomplete. Since snapshots are almost instantaneous, your downtime is minimal.
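The ordering described above (close the SAS files, take the snapshot, reopen) can be sketched as a small wrapper. This is only an illustration under stated assumptions: the three callables are hypothetical placeholders for whatever you use to stop your SAS servers, trigger the snapshot, and restart the servers; none of them is a SAS or Purity function.

```python
from typing import Callable


def snapshot_while_quiesced(
    quiesce: Callable[[], None],
    take_snapshot: Callable[[], str],
    resume: Callable[[], None],
) -> str:
    """Close SAS files, snapshot, then always resume, even if the snapshot fails."""
    quiesce()                   # hypothetical: stop SAS servers / unload in-memory tables
    try:
        return take_snapshot()  # hypothetical: returns the new snapshot's name
    finally:
        resume()                # runs on success and on error, so SAS is never left stopped


if __name__ == "__main__":
    events = []
    name = snapshot_while_quiesced(
        quiesce=lambda: events.append("quiesce"),
        take_snapshot=lambda: events.append("snapshot") or "sasdata.nightly",
        resume=lambda: events.append("resume"),
    )
    print(events, name)
```

The try/finally is the point of the design: if the snapshot call raises, the SAS services are still brought back up, and the error propagates to whatever scheduler invoked the job.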
The Purity snapshot management interface is designed to provide flexibility, scale, and ease of use. The interface allows users to select one or multiple volumes simultaneously in order to create a consistent point-in-time snapshot of all the selected volumes. Leveraging FlashArray’s flexible protection policy management, you can automate the creation and retention of snapshots for local data protection and recovery.
To protect your data on a remote site, you can use the synchronous or asynchronous replication capabilities built into the FlashArray (e.g., Purity ActiveCluster).
You could also automate the process using Base SAS procedures and/or SAS macros. Base SAS allows you to communicate with RESTful web services using the SAS DATA step and the HTTP procedure (REST at Ease with SAS®: How to Use SAS to Get Your REST), which is how you interoperate with the Purity RESTful API.
Snapshots are a fast, simple, and economical way to protect your data sets and even replicate them to a secondary site for disaster recovery.
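As a rough illustration of what such an automated call might look like, the sketch below builds (but does not send) an HTTP request for a multi-volume snapshot. It is written in Python for brevity; the same request could be issued from PROC HTTP. Everything specific here is an assumption to verify against the REST API reference for your Purity version: the endpoint path, the API version string, the payload fields `snap`, `source`, and `suffix`, and the host and volume names. Authentication is deliberately left out.

```python
import json
import urllib.request


def build_snapshot_request(array_host, api_version, volumes, suffix):
    """Build a POST request asking the array to snapshot several volumes together.

    The URL layout and payload field names are assumptions for illustration,
    not a confirmed description of the Purity REST API.
    """
    url = f"https://{array_host}/api/{api_version}/volume"
    payload = {"snap": True, "source": list(volumes), "suffix": suffix}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical host and volume names.
    req = build_snapshot_request(
        "flasharray.example.com", "1.19", ["sasdata01", "sasdata02"], "nightly"
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Listing all volumes in a single request reflects the point made earlier about selecting multiple volumes to get one consistent point-in-time snapshot, rather than snapshotting them one by one.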
Pure snapshots deliver superior space efficiency, high scalability, and simplicity of volume snapshot management. When protecting your SAS data sets with Purity Flashrecover Snapshots you get: zero performance penalties, zero recovery restrictions, zero data duplication and zero space overhead.
Pure FlashArray Flashrecover features can be leveraged in creating and deploying a comprehensive disaster recovery plan for SAS environments that takes into consideration other factors, such as your recovery time and recovery point objectives.
A final reminder: before you start your backups, all the files associated with SAS must be in an unlocked (closed) state, especially the data files associated with SAS’ in-memory servers.