The Pure Storage FlashArray’s ability to rapidly clone and refresh SQL Server databases has proven to be incredibly popular in the SQL Server community. This post introduces a means that makes performing this task simpler than ever before.
It All Starts With The REST API
The REST API, an integral part of the Pure Operating Environment, is the corner stone of FlashArray integration. Every operation available through the Purity graphical user interface (GUI) has an associated REST API call. New features surfaced through the Purity GUI are always immediately available through the API. The REST API shares the same endpoint as the GUI, is always on and requires no gateways or middle-ware in order for it to be used.
All great software is built on solid foundations. The software development kits (SDKs) that Pure ships provide the simplest means by which the REST API can be consumed within PowerShell and Python scripts or programs.
Introducing PureStorageDbaTools
This PowerShell module is freely available from the PowerShell gallery and is free to use under the Apache 2.0 license agreement. At the time of writing the module provides three functions:
- Invoke-PfaDbRefresh
- New-PfaDbSnapshot
- Enable-DataMasks
For security purposes the functions in the module that require usernames and passwords use PowerShell credentials objects. An article detailing how to store FlashArray login credentials in a secure file can be found here.
Diving Into The Module
Lets take a look at the first function. This example illustrates how to refresh two databases from a single source database:
1 2 3 4 5 6 7 8 |
$Pwd = Get–Content ‘C:\users\devops\Documents\Secure–Credentials.txt’ | ConvertTo–SecureString $Creds = New–Object System.Management.Automation.PSCredential (“pureuser”, $pwd) $Targets = @(‘devopsdev1’, ‘devopsdev2’) Invoke–PfaDbRefresh –RefreshDatabase tpch–no–compression ` –RefreshSource devopsprd ` –DestSqlInstance $Targets ` –PfaEndpoint 10.223.115.10 ` –PfaCredentials $Creds |
All the functions in the module are self documented. Running:
1 |
Get–Help [function name] [–Full|–Detailed|–Examples] |
will provide the documentation that accompanies each function. As per the examples provided for Invoke-PfaDbRefresh, two other options exist for providing the source of the database refresh:
- Specifying an explicit snapshot name as the source:
1 2 3 4 5 6 |
Invoke–PfaDbRefresh –RefreshDatabase tpch–no–compression ` –RefreshSource source–snap ` –DestSqlInstance $Targets ` –PfaEndpoint 10.223.115.10 ` –PfaCredentials $Creds ` –RefreshFromSnapshot |
2. Selecting the snapshot from a list that already exist for the database volume:
1 2 3 4 5 6 |
Invoke–PfaDbRefresh –RefreshDatabase tpch–no–compression ` –RefreshSource devopsprd ` –DestSqlInstance devopstst ` –PfaEndpoint 10.223.115.10 ` –PfaCredentials $Creds ` –PromptForSnapshot |
Data Obfuscation
The other regular request that Pure Storage receives regards data obfuscation for test and development environments refreshed from production. And for organisations that wish to obfuscate data at the client level, SQL Server has a great answer for this in the form of “Dynamic data masking”. Dynamic data masking is available with SQL Server from version 2016 in Enterprise edition and in 2016 service pack 1 onwards in standard edition. This feature provides the solution to the problem of masking data from non privileged users. Non privileged users can query masked data which always appear in result sets with the data mask applied. This toolkit facilitates this in two ways:
- Specifying the ApplyDataMasks flag in Invoke-PfaDbRefresh function calls:
1 2 3 4 5 6 7 |
Invoke–PfaDbRefresh –RefreshDatabase tpch–no–compression ` –RefreshSource source–snap ` –DestSqlInstance $Targets ` –PfaEndpoint 10.225.112.10 ` –PfaCredentials $Creds ` –RefreshFromSnapshot ` –ApplyDataMasks |
2. Call the Enable-DataMasks function:
1 2 |
Enable–DataMasks –SqlInstance DEVOPSDEV ` –Database tpch–no–compression |
The data masking functionality determines what columns to apply what data masks to via extended properties:
1 2 3 4 5 6 7 8 9 |
exec sp_addextendedproperty @name = N‘DATAMASK’ ,@value = N‘(FUNCTION = ‘partial(0, “XX”, 20)” ,@level0type = N‘Schema’ ,@level0name = ‘dbo’ ,@level1type = N‘Table’ ,@level1name = ‘customer’ ,@level2type = N‘Column’ ,@level2name = ‘c_address’ GO |
In this example the dynamic data mask (FUNCTION = ‘partial(0, “XX”, 20)’ is set as the datamask value for the c_address column on the customer table. Note that the FUNCTION syntax follows exactly that outlined in Microsoft’s SQL Server dynamic data masking documentation.
Infrastructure-As-Code Tool Interoperability
The use of tools to deploy infrastructure-as-code is increasing and becoming the new normal:
Ansible provides a simple means for installing the PureStorageDbaTools PowerShell module on target servers via this task:
1 2 3 4 5 |
# Setup PowerShell module – name: Add Pure Storage Dba Tools PowerShell module win_psmodule: name: PureStorageDbaTools state: present |
Below is an example playbook to perform a SQL Server database refresh:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 |
####################################################### # Windows Powershell MS SQL Database refresh – name: PowerShell MS SQL Database refresh hosts: z–stn–win2016–a gather_facts: no vars_files: – vars/ps.yaml tasks: # Include Pure Storage PowerShell DBA tools – include: tasks/setupPS.yaml # PowerShell database refresh – name: PowerShell Database Snapshot win_shell: | $Pwd = Get–Content ‘C:\Temp\Secure–Credentials.txt’ | ConvertTo–SecureString $Creds = New–Object System.Management.Automation.PSCredential (“pureuser”, $pwd) New–PfaDbSnapshot –Database tpch–no–compression ` –SqlInstance {{ dbSource }} ` –PfaEndpoint {{ pEndpoint }} ` –PfaCredentials $Creds args: chg_dir: “{{ tgt_dir }}” register: snapshot # – debug: # msg: “Refreshing database using snapshot name {{ snapshot }}” – set_fact: sName: “{{snapshot.stdout_lines | select(‘match’,’name.*’) | list | first}}” – set_fact: snapName: “{{ sName[10:] }}” # – debug: # msg: “Refreshing database using snapshot name {{ snapName }}” – name: PowerShell Database Refesh win_shell: | $Pwd = Get–Content ‘C:\Temp\Secure–Credentials.txt’ | ConvertTo–SecureString $Creds = New–Object System.Management.Automation.PSCredential (“pureuser”, $pwd) Invoke–PfaDbRefresh –RefreshDatabase tpch–no–compression ` –RefreshSource {{ snapName }} ` –DestSqlInstance {{ dbTarget }} ` –PfaEndpoint {{ pEndpoint }} ` –PfaCredentials $Creds ` –RefreshFromSnapshot ` –NoPsRemoting register: refresh – debug: msg=“{{ refresh.stdout_lines }}” |
Ansible uses windows remoting to instigate tasks on windows servers. Windows remoting can use CredSSP or Kerberos as authentication mechanisms. Of the two authentication mechanisms, Kerberos is the most secure. But Kerberos does not allow “Double-hop” authentication, and luckily there is a simple solution to this:
- Use a task to create a snapshot of the FlashArray volume which the database resides on using New-PfaDbSnapshot
- Obtain the snapshot name from its output
- Use the name of the newly created snapshot in a call to Invoke-PfaDbRefresh with the NoPsRemoting flag specified
Call To Action
The new PureStorageDbaTools module is the solution to refreshing SQL Server databases in the fastest, simplest and most secure programmatic way possible. We encourage you to download and install PureStorageDbaTools from the PowerShell gallery and contribute towards the PureStorageDbaTools open source project on GitHub. Playbooks which illustrate how to integrate the module into Ansible workflows can also be found on GitHub here.
Summary Of Useful Resources
Pure Storage DBA Tools module Powershell Gallery page
Pure Storage DBA Tools GitHub Repository
Storing FlashArray login credentials in a secure file for use with PowerShell
Ansible Playbook for refreshing SQL Server databases