Containers, popularized by Docker, and Kubernetes, which was born out of Google, are causing a seismic shift in the way applications are packaged, deployed, and executed.
The traction behind containers and Kubernetes has not gone unnoticed by Microsoft.
The messaging from key members of Microsoft’s engineering team to the data platform community is container-and-Kubernetes-centric.
Relational databases are perceived as technologies belonging to the “Legacy stack”. Containers and Kubernetes are technologies which belong very much to the “New stack”. The reality is that the legacy and new stacks are converging.
When people first dip their toes in the water with containers, there is a tendency to compare containers to virtual machines. To dispel this notion, Docker’s blog post, “Containers are not VMs”, is recommended reading. The primary concerns addressed by containers are application packaging and delivery, whereas the focus of virtual machines is infrastructure virtualization. The Docker container engine runs on most of the popular operating systems and clouds, and Docker Community Edition can be downloaded for free. However, there are some crucial things required to run enterprise-grade containerized applications that standalone container engines do not provide:
And this is where a container orchestration framework comes into play.
Google had a vision of giving developers outside of Google the same development experience that developers inside Google enjoyed, a vision that Joe Beda articulated in an interview at KubeCon 2017. Towards this end, Google took the engineering know-how that went into Borg, Google’s own in-house container orchestration platform, and developed Kubernetes. Kubernetes was made open source in 2014 under the Apache 2.0 license and was subsequently donated to the Cloud Native Computing Foundation.
Before developing Kubernetes, Google had two choices: give the developer community something like Borg, their in-house orchestration platform, or have them use virtual machines. Because Google was not enamored with the experience virtual machines gave developers, they went down the Borg route. For this very reason, a lot of the conventional thinking around virtual machines needs to be replaced with a fresh mindset. The key concepts behind Kubernetes include:
A Kubernetes cluster comprises two different types of node: master nodes, which host the control plane, and worker nodes, which run the application workloads.
The control plane must always run on Linux; the worker nodes, however, can run on Linux or Windows. Cluster state is stored in a simple key-value store called etcd, which originates from CoreOS. A Kubelet, an entity that is essentially a Kubernetes agent, runs on each worker node. Containers run inside pods. A pod is the unit of scheduling on a Kubernetes cluster, and containers in the same pod always run on the same node.
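To make the idea of a pod as the unit of scheduling concrete, here is a minimal sketch of a pod that holds two containers. The pod name, container names, images, and command are hypothetical and chosen purely for illustration; the point is that both containers are placed on a node together, never separately.

```yaml
# Hypothetical example: all names and images below are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: web-with-sidecar
spec:
  containers:
  # Both containers belong to one pod, so the scheduler places them
  # on the same node as a single unit.
  - name: web
    image: nginx
    ports:
    - containerPort: 80
  - name: sidecar
    image: busybox
    # Keep the sidecar container alive; a real sidecar would do useful work here.
    command: ["sh", "-c", "while true; do sleep 3600; done"]
```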
Master node(s) also furnish the cluster API server, a RESTful endpoint through which objects can be created, managed, and interrogated.
Applications that run on Kubernetes can be stateless or stateful. Stateless applications use stateless containers, the classic example being something such as Nginx. Stateful applications use containers that persist and manage state: things such as MongoDB, Redis, or any kind of database, NoSQL or otherwise. The fact that a stateless container needs no persistent storage at all is in stark contrast to a virtual machine, which always requires at least one disk.
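The touch point for application storage in a Kubernetes cluster is the volume, and volumes are always associated with pods. To consume persistent storage and make it available to a pod, persistent volume claims are specified. Two options exist for how space is allocated on actual physical storage devices: static provisioning, in which an administrator creates persistent volumes ahead of time, and dynamic provisioning, in which persistent volumes are created on demand through a storage class.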
Continuously destroying and re-creating storage is not a good practice. Therefore, when a stateful pod is re-scheduled to run on a different node, the Kubelet unmounts any volumes associated with the pod prior to its move, and once the pod has moved, the Kubelet on the destination node re-mounts the volumes.
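As an illustration of how a claim is matched to storage, the following sketch shows a persistent volume created ahead of time by an administrator, together with a persistent volume claim that binds to it. The object names, the `manual` storage class name, the 10Gi size, and the `hostPath` location are all assumptions made for the example; `hostPath` is only suitable for single-node test clusters, not for production storage.

```yaml
# Hypothetical example: names, sizes, and paths are illustrative assumptions.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  storageClassName: manual      # the claim below requests the same class
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: /mnt/data             # test-only backing store on the node
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-claim
spec:
  storageClassName: manual
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi             # must fit within the volume's capacity
```

A pod then refers to the claim by name in its `volumes` section, and because the claim and the underlying volume outlive the pod, they can be re-mounted wherever the pod is next scheduled.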
To illustrate some of the concepts introduced in this blog, let’s look at the YAML required to create a SQL Server instance. In order for this example to work, an object of type secret needs to be created first via this kubectl command:
kubectl create secret generic mssql --from-literal=SA_PASSWORD="MySuperSecretP@ssw0rd"
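For those who prefer to keep everything declarative, the secret can also be expressed as a manifest. This is a sketch of what should be an equivalent object, assuming the same secret name and key as the command above; `stringData` lets the value be written as plain text, which Kubernetes encodes on creation.

```yaml
# Sketch of a declarative equivalent of the kubectl command above.
apiVersion: v1
kind: Secret
metadata:
  name: mssql
type: Opaque                    # the type produced by "create secret generic"
stringData:
  SA_PASSWORD: "MySuperSecretP@ssw0rd"
```

Keeping a password in a file carries the usual risks, so a manifest like this should not be committed to source control as-is.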
The following YAML can then be placed in a file, deployment.yaml for example, and applied as follows:
kubectl apply -f deployment.yaml
- name: mssql
- containerPort: 1433
- name: MSSQL_SA_PASSWORD
- name: mssqldb
- name: mssqldb
- protocol: TCP
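The lines above are the list entries of the manifest. A minimal sketch of a complete deployment.yaml that is consistent with those entries might look like the following. Treat it as an illustration rather than the exact file: the object names other than mssql and mssqldb, the image tag, the claim size, the `pure-block` storage class, the `fsGroup` setting, and the LoadBalancer service type are assumptions.

```yaml
# Sketch only: values marked "assumption" are not taken from the original manifest.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mssql-data              # assumption
spec:
  storageClassName: pure-block  # assumption: a storage class provided by the storage platform
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi              # assumption
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mssql-deployment        # assumption
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mssql
  template:
    metadata:
      labels:
        app: mssql
    spec:
      securityContext:
        fsGroup: 10001          # assumption: lets the non-root mssql user write to the volume
      containers:
      - name: mssql
        image: mcr.microsoft.com/mssql/server:2019-latest   # assumption
        ports:
        - containerPort: 1433
        env:
        - name: ACCEPT_EULA
          value: "Y"
        - name: MSSQL_SA_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mssql       # the secret created earlier
              key: SA_PASSWORD
        volumeMounts:
        - name: mssqldb
          mountPath: /var/opt/mssql
      volumes:
      - name: mssqldb
        persistentVolumeClaim:
          claimName: mssql-data
---
apiVersion: v1
kind: Service
metadata:
  name: mssql-service           # assumption
spec:
  type: LoadBalancer            # assumption
  selector:
    app: mssql
  ports:
  - protocol: TCP
    port: 1433
    targetPort: 1433
```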
The above excerpt creates three objects:
Kubernetes integration into Pure Storage platforms is provided by the Pure Service Orchestrator:
In keeping with the Pure mantra of simplicity, the installation and configuration of Pure Service Orchestrator requires just two simple steps:
With Pure Service Orchestrator, a Kubernetes cluster’s storage can be scaled out to the order of petabytes across the industry’s leading block and object/file storage platforms. Not only that, but Pure Service Orchestrator intelligently determines where best to create persistent volumes based on a number of factors, including the free capacity, performance, and health of each storage device.
Of all the applications that run against FlashArray, SQL Server is one of the most popular. The storage consumed by mission-critical SQL Server databases on FlashArray™ systems across the world is of the order of petabytes. Pure brings the FlashArray qualities and features enjoyed by SQL Server running on bare metal and virtual machines to the world of containers and Kubernetes, including:
Pure is here to help with your SQL Server journey into the brave new world of containers and Kubernetes. The platforms on which SQL Server runs may change and evolve, but the qualities that Pure customers love and enjoy will steadfastly remain the same.