This post is the first of a two-part series focusing on Pure’s storage solutions for SQL Server 2019 big data clusters. Each post can be read independently or as a whole.
Microsoft SQL Server has served the IT community admirably when it comes to processing high-value relational data. However, times have moved on since SQL Server was first developed. The growth of data has been relentless, as has the growth of interest in data science. Historically, SQL Server has never been a scale-out analytics platform for processing both the read and write elements of workloads. Many of the tools favored by data scientists are more at home on Linux than on Windows. SQL Server 2017 took SQL Server to places it had never gone before by allowing it to run on Linux. This move opened a new world of open-source software and Linux-based data science tools to SQL Server. Because Linux is more of a first-class citizen in the world of containers than Windows, SQL Server’s ability to run on Linux also broadened its horizons for containerized workloads.
A container engine in isolation can run on a laptop, server, PC, or the public cloud. However, container engines in isolation lack:
The solution to these shortcomings is a container orchestration platform, and Kubernetes is currently the de facto industry standard for container orchestration. These threads culminated in Microsoft releasing SQL Server 2019 big data clusters, a scale-out platform that runs on Kubernetes for processing both high-value relational data and unstructured data.
The Role of Storage
High-value data requires storage that is reliable, durable, highly available, secure, and consistent in terms of performance. Given that by design, one of Kubernetes’ primary aims is to abstract infrastructure away from users of the platform, how is storage-as-a-service delivered in this brave new world?
Enter Pure Service Orchestrator
Pure’s storage platforms are trusted by numerous organizations when it comes to processing their most mission-critical SQL Server workloads. But how is this storage consumed in the new world of containers? Enter Pure Service Orchestrator™, Pure’s storage plugin for containerized workloads:
Three key areas set Pure Service Orchestrator apart from a plugin that only provides persistence for containers:
Installing and Configuring Pure Service Orchestrator
Once a Kubernetes or OpenShift Container Platform cluster is up and running, Pure Service Orchestrator is installed and configured by using the following three simple steps:
1) Clone the Pure Storage GitHub repo that contains Pure Service Orchestrator:
2) Specify the details of the arrays used by Pure Service Orchestrator in a values.yaml file. A values.yaml file template can be found here. Below is an example of the contents of a values.yaml file:
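As a minimal sketch of step 2, the script below writes a values.yaml that lists one FlashArray and one FlashBlade. The key names (arrays, FlashArrays, FlashBlades, MgmtEndPoint, APIToken, NFSEndPoint) are assumed to match the layout of the PSO template and should be checked against it. Every address and token is a placeholder, and the VALUES_FILE variable is introduced here only for illustration.

```shell
#!/bin/sh
# Sketch only: writes a hypothetical values.yaml for Pure Service Orchestrator.
# Key names are assumed to match the PSO values.yaml template; verify them there.
# All IP addresses (documentation range 192.0.2.0/24) and tokens are placeholders.
VALUES_FILE="${VALUES_FILE:-values.yaml}"

cat > "$VALUES_FILE" <<'EOF'
arrays:
  FlashArrays:
    - MgmtEndPoint: "192.0.2.10"          # placeholder management address
      APIToken: "<flasharray-api-token>"  # placeholder API token
  FlashBlades:
    - MgmtEndPoint: "192.0.2.20"          # placeholder management address
      APIToken: "<flashblade-api-token>"  # placeholder API token
      NFSEndPoint: "192.0.2.21"           # placeholder NFS data address
EOF

echo "wrote $VALUES_FILE"
```

Keeping the array details in one file means arrays can be added later by editing this list rather than touching individual workloads.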
3) Run the install script to set up the PSO-operator.
install.sh --image=<image> --namespace=<namespace> \
--orchestrator=<orchestrator> -f <values.yaml>
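Install flags pasted from a web page often arrive with typographic dashes that the shell will not parse as options. One way to avoid this is to assemble the command from variables and review it before running. The sketch below only prints the command. The helper name pso_install_cmd, the ./ prefix, and the sample argument values in the usage line are hypothetical; in particular, k8s as an orchestrator value is an assumption to confirm against the PSO documentation.

```shell
#!/bin/sh
# Sketch only: prints the install command so it can be reviewed before it is run.
# pso_install_cmd is a hypothetical helper, not part of Pure Service Orchestrator.
# Arguments: $1 = operator image, $2 = namespace, $3 = orchestrator, $4 = values file
pso_install_cmd() {
  if [ "$#" -ne 4 ]; then
    echo "usage: pso_install_cmd <image> <namespace> <orchestrator> <values.yaml>" >&2
    return 1
  fi
  # Plain ASCII double hyphens; the flags mirror the install line in step 3.
  printf './install.sh --image=%s --namespace=%s --orchestrator=%s -f %s\n' \
    "$1" "$2" "$3" "$4"
}

# Example with placeholder values (image and namespace are made up):
pso_install_cmd "registry.example.com/pso-operator:tag" "pso-operator" "k8s" "values.yaml"
```

Once the printed line looks right, it can be run from the directory that contains install.sh.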
Configure the Storage for the Big Data Cluster
SQL Server 2019 big data cluster storage is specified in a ‘Configuration.’ The following instructions assume:
Pure has always provided a storage experience that the SQL Server community loves. That trend will continue with SQL Server 2019 big data clusters, which get a genuine storage-as-a-service experience on Kubernetes that provides elastic scaling, fault tolerance, and smart provisioning.