Summary
This article shows how to copy a SQL database that is in an availability group into Kubernetes and use Portworx to snapshot the databases in the pod to provide multiple copies for testing and development.
This article on SQL Server distributed availability groups initially appeared on Andrew Pruski’s blog. It has been republished with the author’s credit and consent.
I was talking with a colleague last week and they asked, “How can we copy a SQL database that is already in an availability group into Kubernetes and then provide copies of that database for development or testing?”
One of the great things about Kubernetes is the ability to quickly and easily spin up SQL instances, but getting databases, say from a production environment, into Kubernetes is an entirely different challenge.
If the production SQL instance is already in an availability group (which is fairly standard practice), then we can provision a distributed availability group to seed databases from the Windows AG into a pod in Kubernetes. And then from there, we can use Portworx® to snapshot the databases in the pod to provide multiple copies for testing and development.
So let’s run through the process!
Here is the existing Windows availability group:
Just a standard, two-node AG with one database already synchronized across the nodes. It’s that database we’re going to seed over to the pod running on the Kubernetes cluster using a distributed availability group.
Here is the Kubernetes cluster:
Four nodes, one control plane node, and three worker nodes.
To provide persistent storage for the SQL instance in the Kubernetes cluster, Portworx has been installed:
I won’t go through the installation of Portworx, but you can view the documentation here.
After Portworx is installed, a storage class can be created to provision persistent volumes dynamically for our SQL instance.
Here is the manifest for the storage class:
Once we have the storage class, we can now deploy a statefulset running one SQL Server pod:
This is pretty stripped down. No resources limits, tolerations, etc. It has one persistent volume for the system databases from the px-mssql-sc storage class.
One thing to note:
Here an entry in the pod’s hosts file is being created for the listener of the Windows availability group.
We can also view the persistent volume created in Portworx:
This is the volume that we will snapshot to provide clones of the database once it has been seeded over into the Kubernetes cluster using a distributed availability group.
The next thing to do is deploy two services: one so that we can connect to the SQL instance (on port 1433) and one for the AG (port 5022):
Note: We could use just one service with multiple ports configured, but I’m keeping them separate here to try and keep things as clear as possible.
Check that everything looks OK:
Now, we need to create a master key, login, and user in all instances:
Then, create a certificate in the SQL instance in the pod:
Back up that certificate:
Copy the certificate locally:
And then copy the files to the Windows boxes:
Once the files are on the Windows boxes, we can create the certificate in each Windows SQL instance:
OK, great! Now we need to create a mirroring endpoint in the SQL instance in the pod and grant connect to the login created earlier:
There are already endpoints in the Windows instances, but we need to update them to use the certificate for authentication:
Now we can create a one-node clusterless availability group in the SQL instance in the pod:
No listener here; we’re going to use the mssql-ha-service as the endpoint for the distributed availability group.
OK, so on the primary node of the Windows availability group, we can create the distributed availability group:
We could use a host file entry for the URL in AG2, but here, we’ll just use the IP address of the mssql-ha-service.
OK, nearly there! We now have to join the availability group in the SQL instance in the pod:
And that should be it! If we now connect to the SQL instance in the pod…the database is there!
There it is! OK, one thing I haven’t gone through here is how to get auto-seeding working from Windows into a Linux SQL instance. As long as the database data and log files are located under the Windows SQL instance’s default data and log path, they will auto-seed to the Linux SQL instance’s default data and log paths.
Now that the database has been seeded over to the SQL instance running in the pod, we can use Portworx volume snapshots to provide multiple copies!
The first thing to set up is a VolumeSnapshotClass:
This will allow us to take a snapshot of the persistent volume on which the SQL databases are hosted.
Once the VolumeSnapshotClass is created, we can take the snapshot using:
Make sure that the persistentVolumeClaimName matches the persistent volume claim that was created for the statefulset the database was seeded into.
To check that Portworx created the snapshot:
And there we have the snapshot of the persistent volume that the database is on. Now we can create a new persistent volume claim from that snapshot and deploy another instance of SQL Server in a statefulset referencing that claim.
So, let’s create the persistent volume claim from the snapshot:
This references the stork-snapshot-sc storageclass that is installed with Portworx and provides the snapshotting functionality.
Once that has been run, we can check the PVCs in the cluster:
And there it is! We can reference it when deploying another SQL Server statefulset.
The manifest for the new statefulset is:
Confirm the new statefulset has deployed successfully:
To connect to the new SQL instance, we need a service:
And confirm the new service:
The final thing to do is to connect to the new instance of SQL in SSMS. As we cloned the volume with the system databases on, the instance thinks it is in the distributed availability group. So we need to delete that and the corresponding availability group:
And now we have a copy of the database, online, in the new SQL instance:
If we need to provision further copies and don’t want to have to remove the AG configuration, we can now snapshot this SQL instance!
So that’s how to seed a database from a Windows-based SQL instance in an existing availability group to Kubernetes and provision clones of that database using Portworx.
Try Portworx
Test drive the cloud-native storage software platform for Kubernetes.