Leveraging FlashArray in a Continuous Integration Build Pipeline

In a previous blog post I outlined what continuous integration is and how FlashArray’s snapshot capability can be integrated into a continuous integration build pipeline. This post builds upon that example by covering a full walk-through using a code sample publicly available from code.purestorage.com. To recap, this is what a basic build pipeline typically looks like:

“As Code” is the New Normal

Among the many recent trends in information technology has been a shift towards provisioning compute, network and storage resources via infrastructure specified as code. Many of the popular continuous integration build engines have followed this trend by allowing build pipelines to be specified as code, including:

  • Travis CI support for build definitions specified in YAML
  • Jenkins 2.0 support for build definitions specified in Groovy script
  • Visual Studio Team Services and Team Foundation Server 2018 support for build definitions specified in YAML

One of the primary drivers behind specifying build pipelines as code is that it allows the pipeline to be stored under source code control alongside the application code itself. This post will focus on Jenkins 2.0 and Team Foundation Server 2018.

Jenkins 2.0

Jenkins originated from a build engine called Hudson, developed at Sun Microsystems by Kohsuke Kawaguchi. When Sun Microsystems was acquired by Oracle, a new fork of the Hudson code base was created and Jenkins was born. Jenkins is underpinned by a rich ecosystem of third-party plugins, it is incredibly portable by virtue of the fact that it runs on Java, it is free, and a commercially supported distribution is provided by CloudBees. Jenkins 2.0 introduced “pipeline as code”, originally in the scripted (“Jenkins script”) dialect; the declarative pipeline syntax followed in February 2017. The declarative syntax is opinionated by nature, and whilst it does not give end users the same flexibility and power as the scripted dialect, it also does not come with the same Groovy learning curve. Jenkins pipeline as code comes with some particularly powerful features:

  • Multibranch build pipelines
    By specifying the pipeline code in a Jenkinsfile at the root of the source code repository, the build engine will create a build pipeline for each branch of code.
  • Shared Libraries
    The ability to encapsulate code used across pipelines in shared libraries in order to promote build pipeline code reuse.
  • Docker pipeline
    This allows containers to be manipulated directly in the pipeline code without making call outs to external scripts, shells or other scripting languages, all made possible by extensions to the Jenkins Groovy script DSL (domain specific language).

The Pure Storage engineering team themselves use Jenkins, as described in this blog post on build pipelines and streaming data analytics:

The Twelve Factor App And Testing Using Production Realistic Data

Heroku, founded in 2007 and now owned by Salesforce.com, was a pioneer in the cloud platform-as-a-service space. “The Twelve Factor App” concept, developed by Heroku, lays out twelve principles which should be followed when developing cloud native applications. Factor ten states that production and development environments should be kept as closely aligned as possible. What if, as part of the build pipeline, the test or staging environment could be refreshed from production? In this age of regulation around personally identifiable data and the GDPR, we may also want to encrypt or obfuscate some of the data by adding the appropriate actions into the build pipeline itself.

Creating An Actual Build Pipeline

Now that we have covered why build-pipeline-as-code is popular and why test environments should mirror their production counterparts as closely as possible, let’s create a build pipeline that achieves both of these goals using the following steps:

  1. Check a Visual Studio SQL Server Data Tools (SSDT) project out from source code control (‘git checkout’ stage).
  2. Compile the code into an entity known as a DACPAC (‘Build Dacpac from SQLProj’ stage).
  3. Refresh a test database from production (‘Refresh test from production’ stage).
  4. Deploy the DACPAC to the test database (‘Deploy Dacpac to SQL Server’ stage).

The pipeline code is stored in the Jenkinsfile.simple at the root of this GitHub repository. Refreshing the test database will be facilitated by the Refresh-Dev-PsFunc PowerShell function which can be found on GitHub here.
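
As an illustration, this is roughly what invoking the refresh function from a PowerShell session might look like. The parameter names and values below are assumptions chosen to mirror the pipeline parameters described later (database, sourceinstance, destinationinstance and pfaendpoint); consult the function on GitHub for its exact signature.

    # Hypothetical invocation of Refresh-Dev-PsFunc; the parameter names are
    # illustrative and the instance/array values are placeholders.
    $pfaCred = Get-Credential    # account that can authenticate to the FlashArray

    Refresh-Dev-PsFunc -Database            "MyAppDb" `
                       -SourceInstance      "PROD-SQL01" `
                       -DestinationInstance "TEST-SQL01" `
                       -PfaEndpoint         "10.0.0.10" `
                       -PfaCredentials      $pfaCred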

Jenkins 2.0 Example Pre-Requisites

  • Jenkins 2.0
    Assuming that Jenkins is installed on Windows, the service account that Jenkins runs under will require permission to perform a checkpoint on the source database and to online/offline the target database. Executing the PowerShell function will also require privileges to online and offline the Windows logical disk that the target database resides on.
  • Visual Studio 2017
    The Community edition will suffice; it needs to be installed with the “Data storage and processing” workload.
  • ‘Source’ SQL Server database with all data files and the transaction log residing on the same volume
  • ‘Destination’ SQL Server database with all data files and the transaction log residing on the same volume
  • GIT for Windows
  • PowerShellGet module
  • Pure Storage PowerShell SDK module
  • dbatools.io PowerShell module
  • Pure Storage refresh dev SQL Server database from ‘Production’ PowerShell function

Steps For Setting-Up The Build Pipeline

  1. Download and install Git.
  2. Download and install Visual Studio 2017 Community edition; ensure that this is installed with the “Data storage and processing” workload.
  3. Download and install Jenkins; for the purposes of expediency, accept the default plugin suggestions.
  4. Whilst logged into Jenkins, go to Manage Jenkins -> Manage Plugins and install the MSBuild plugin; after this has installed you will need to restart Jenkins.
  5. Go to Manage Jenkins -> Global Tool Configuration -> Add MSBuild and enter the path to the MSBuild executable (installed as part of Visual Studio); this should be C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\MSBuild\15.0\Bin\msbuild.exe. When configured correctly, this is what the MSBuild installation should look like under global tools:
  6. Install the PowerShellGet module, followed by the Pure Storage PowerShell SDK and dbatools modules from the PowerShell Gallery:
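    A minimal sketch of the commands for this step, assuming the module names as published on the PowerShell Gallery:

      # Run from an elevated PowerShell session (see the note below)
      Install-Module -Name PowerShellGet -Force
      Install-Module -Name PureStoragePowerShellSDK -Force
      Install-Module -Name dbatools -Force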

    Note: Install-Module needs to be run from within a session with administrator privileges.
  7. Download the Refresh-Dev-PsFunc PowerShell function script from this GitHub repository; in this example the Refresh-Dev-PsFunc.ps1 script will be placed in C:\scripts\PowerShell\autoload.
  8. Check whether or not a PowerShell profile has been created by running the following command from within a PowerShell session:
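    For example, by testing for the existence of the file that the $profile automatic variable points to:

      Test-Path $profile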
  9. If the Test-Path command returns false, execute the New-Item command below in order to create a PowerShell profile. Note: there are a total of six different PowerShell profiles; if the build fails because it cannot find the Refresh-Dev-PsFunc function, the most likely reason is that the function is in the wrong profile, i.e. not the one associated with the Windows account running the Jenkins service.
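    For example:

      # Create the profile file, together with any missing parent directories
      New-Item -ItemType File -Path $profile -Force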
  10. Open the $profile for editing via the following command:
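    For example, using Notepad:

      notepad $profile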
  11. Add the following lines to the file and then save it:
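    The exact lines are not reproduced here; a reasonable sketch, given that step 7 placed the script in C:\scripts\PowerShell\autoload, is to dot-source the function so that it is loaded into every session:

      # Load the refresh function into the session
      . C:\scripts\PowerShell\autoload\Refresh-Dev-PsFunc.ps1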
  12. The Windows account that the Jenkins service runs under will require access to both the source and destination databases; by default Jenkins will use the built-in account “NT AUTHORITY\SYSTEM”. Whilst the use of this account is not recommended for production purposes, it can be used to try this exercise out as a proof of concept. In this case the account can be given sysadmin rights on the instance that the source and destination databases reside on as follows:
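    One way to do this is with the dbatools module that is already listed in the prerequisites; the instance name below is a placeholder:

      # Proof-of-concept only: grant NT AUTHORITY\SYSTEM sysadmin rights on the
      # instance hosting the source and destination databases.
      $query = "IF NOT EXISTS (SELECT 1 FROM sys.server_principals WHERE name = N'NT AUTHORITY\SYSTEM') CREATE LOGIN [NT AUTHORITY\SYSTEM] FROM WINDOWS; ALTER SERVER ROLE [sysadmin] ADD MEMBER [NT AUTHORITY\SYSTEM];"
      Invoke-DbaQuery -SqlInstance "PROD-SQL01" -Query $query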

    Note: For production purposes, the creation of a dedicated service account is recommended: an account with a password that does not expire and which does not require a change on first logon.
  13. Obtain something to build! This is where we will obtain the SQL Server Data Tools project which our build pipeline will build into a DACPAC and deploy:
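    For example, by cloning the repository that contains the SSDT project; the URL below is a placeholder, so substitute the repository containing the project you wish to build:

      git clone https://github.com/<account>/<ssdt-project>.git C:\source\ssdt-project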
  14. At the top level of the Jenkins console (http://localhost:8080 by default) go to New Item -> give the item a name (“Jenkins FA Snapshot CI Pipeline”, for example) -> Pipeline and then hit OK.
  15. Take the contents of the file from this link and paste this into the script box in the pipeline section, as highlighted by the red rectangle below:
  16. Hit Apply and then Save.
  17. At the top level of the Jenkins console, go to Credentials -> System -> Global credentials (unrestricted) and hit the “adding some credentials” link.
  18. Select “Username with password” as the ‘Kind’ and provide the username and password for an account that can access the array, for the purposes of this exercise the ID must be ‘FA-Snapshot-CI-Array’:
  19. Let’s now perform a build! The very first time we perform a build, the icon to do this will be displayed with the text “Build now”; after the initial build this will change to “Build with parameters”:
  20. For the initial build to succeed, the default parameters in the script, i.e. database, sourceinstance, destinationinstance and pfaendpoint, need to be changed to values that reflect a database which exists on real source and destination instances, along with a FlashArray endpoint that exists:
  21. Once the build has completed successfully, the stage view for the build should look like this:

Et voilà! Our walk-through for creating a build pipeline which leverages the FlashArray REST API via PowerShell to refresh a test database from production, prior to deploying an artefact to it, is now complete. If there is a requirement to repeat this process for a second build pipeline, only the steps from 12 onwards need to be followed.