Persistent Data for Stateful Applications in Kubernetes on Flash

Kubernetes has taken off as a mainstream platform for apps that run as microservices in containers. Konvoy is a packaged set of integrated operational services that simplifies the installation, configuration, and setup of the basic tools needed to manage and update a Kubernetes cluster.


Apart from open-source Kubernetes, there are various distributions that use the same upstream base and provide a lot of added benefits. D2iQ’s Konvoy is one of the Kubernetes distributions that we will discuss in this post. Before getting into Konvoy, it’s worth discussing some of the pertinent business challenges driving the adoption of Kubernetes. 

Most traditional midsize to large enterprise organizations are focused on transforming their monolithic data centers into more cloud-like environments. The full list of challenges is long, but the following are some key ones that organizations run into as their business processes transform.

  • Transitioning from monolithic applications to microservices: Designing, migrating, implementing, piloting, and going live in a new Kubernetes environment is a journey. The goal is a standard platform on which application workflows and pipelines are automated, easy to use and manage, and make optimal use of resources.
  • New versions of Kubernetes and the growing ecosystem around it can lead to complexity. Teams vary widely in the technical skills needed to set up, support, and manage business-critical environments in Kubernetes. In addition to system platforms, provisioning and managing data platforms can be a tough task.
  • Accessing, managing, and protecting data at scale is an even bigger challenge than abstracting the infrastructure using Kubernetes. Heterogeneous data stores and platforms lead to silos that result in process friction and manageability overhead. (Managing applications and infrastructure becomes more challenging when the volume, variety, and velocity of data grow exponentially.)

In this post, we’ll focus on the second and third items, with Konvoy and Pure Storage®, respectively. 

Konvoy is a packaged set of integrated operational services for Kubernetes. It provides a simple way to install, configure, and set up the basic tools needed to manage and update a Kubernetes cluster. In the following table, two files (cluster.yaml and inventory.yaml) are responsible for the complete install and configuration of Konvoy. Admins or architects can run a pre-install check before the final install to identify and fix any configuration errors.

  • The inventory.yaml file contains the list of master and worker nodes along with the private SSH key.
  • The cluster.yaml file defines all the core Kubernetes components along with the list of add-on tools for monitoring, logging, visualization, load balancing, etc.

All these tools are part of the Konvoy “ops-portal,” which brings the dashboards for each tool together in a single management pane.

Many pods running applications and tools are deployed in the Kubernetes cluster to perform various business functions. A pod is the smallest deployment unit in a Kubernetes cluster. A pod can consist of one or more containers inside it. Modern commercial and custom applications running in the Kubernetes cluster require persistent storage to store, reuse, manage, and protect the data. 

Kubernetes primarily manages CPU and memory resources. Local storage that is available on the servers or VMs hosting the Kubernetes cluster can be provisioned for pods. However, using local storage on Kubernetes hosts has its own challenges.

  • Consuming local storage is a manual process and is not scalable. 
  • Adding more CPU and memory to the Kubernetes cluster would end up adding more storage that may or may not be of use, thereby over-provisioning storage.
  • Application pods are tied statically to the local storage of the node that they are running on.
  • Crossing data boundaries across servers is difficult and limited for applications that require sharing and collaboration.
  • Recovery from failures is time-consuming and not transparent.

Pure Storage Pure Service Orchestrator™ (PSO) provides a Pure provisioner that dynamically provisions storage for stateful applications on demand. PSO also provides seamless scaling across arrays. No manual effort is required from the admin or the architect to provision storage and scale it with the number of pods in the Konvoy cluster. PSO also provides high availability with automatic failover without disrupting the services running in the Konvoy cluster.

The PSO CSI Kubernetes driver runs natively inside the Konvoy Kubernetes cluster as a first-class citizen. Setting up PSO in the Konvoy cluster is a one-time effort. The values.yaml file for PSO requires the storage endpoint information, and PSO itself is installed using the Helm chart for pure-csi.
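
To make the "storage endpoint information" concrete, here is a minimal sketch of what the FlashBlade portion of a PSO values.yaml might look like. The key names are recalled from the chart's sample values and should be treated as assumptions (the spelling of the NFS endpoint key in particular has varied between chart versions), and the addresses and token are placeholders. Check the values.yaml shipped with the chart version you install.

```yaml
# values.yaml (sketch, not the author's original file)
# Key names are assumptions recalled from the pure-csi chart's sample values.
arrays:
  FlashBlades:
    - MgmtEndPoint: "10.0.0.10"            # placeholder: FlashBlade management address
      APIToken: "REPLACE_WITH_API_TOKEN"   # placeholder: API token generated on the array
      NFSEndPoint: "10.0.0.11"             # placeholder: data (NFS) address; key spelling may differ by version
```

A FlashArray or Cloud Block Store backend would be listed in the same file under its own section, which is what lets a single PSO install serve both storage classes.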

PSO has two storage classes: pure-block, which provisions storage from FlashArray™ and Cloud Block Store™, and pure-file, which provisions storage from FlashBlade™. In the table above, FlashBlade information is configured to represent storage class pure-file.

Konvoy can specify a default storage class for applications that require persistent storage. In the above example, Konvoy configures pure-file as the default storage class.

This is a one-time setting that you can make when configuring PSO in the Konvoy cluster. Subsequent requests for persistent storage to the Pure provisioner will allow the PersistentVolumeClaim (PVC) to be bound to a PersistentVolume (PV) on FlashBlade using storage class pure-file by default.
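
In plain Kubernetes terms, a default storage class is simply a StorageClass that carries the standard is-default-class annotation. The sketch below shows that annotation on pure-file. PSO normally creates its storage classes itself, so this is only meant to illustrate what "default" means; the provisioner name is an assumption, and any backend-specific parameters are omitted.

```yaml
# Sketch: how a StorageClass is marked as the cluster default.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: pure-file
  annotations:
    # Standard Kubernetes annotation; PVCs without a storageClassName use this class.
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: pure-csi   # assumption: the provisioner name registered by PSO
```

Only one storage class in a cluster should carry this annotation; a PVC that names a class explicitly is not affected by it.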

The following example illustrates how a PVC named pure-claim, which requests 10Gi of persistent storage, is created using the demo-pvc.yaml file and then mounted into a pod called nginx.
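
A minimal sketch of what these two manifests might look like is shown here. The names pure-claim, nginx, and pure-file and the 10Gi size come from this post; the access mode, container image, volume name, and mount path are assumptions chosen for illustration.

```yaml
# demo-pvc.yaml (sketch): request 10Gi from the pure-file storage class
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pure-claim
spec:
  accessModes:
    - ReadWriteMany            # assumption: shared file access on FlashBlade
  resources:
    requests:
      storage: 10Gi
  storageClassName: pure-file  # can be omitted when pure-file is the default class
---
# Pod manifest (sketch): mount the claim into an nginx container
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
    - name: nginx
      image: nginx             # assumption: stock nginx image
      volumeMounts:
        - name: pure-vol       # hypothetical volume name
          mountPath: /data     # hypothetical mount path
  volumes:
    - name: pure-vol
      persistentVolumeClaim:
        claimName: pure-claim  # binds the pod to the PVC above
```

Both documents can be applied with kubectl apply -f, after which kubectl get pvc pure-claim should report the claim as Bound once the volume has been provisioned.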


 

After creating and applying both YAML samples from the above table in the Konvoy cluster, a 10Gi volume is automatically provisioned on the FlashBlade using the storage class pure-file.



Kubernetes offers different access modes that govern how a PV can be mounted through a PVC: ReadWriteOnce (RWO), ReadOnlyMany (ROX), and ReadWriteMany (RWX). FlashBlade supports RWO, ROX, and RWX, while FlashArray and Pure Cloud Block Store support the RWO access mode. With all three modes supported by Pure Storage, it is possible for various workloads like databases, analytics, AI/ML, and CI/CD to run on Pure Storage on-premises as well as in the cloud.
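
The access mode is requested in the PVC itself. As a contrast to a shared file claim, here is a sketch of a block-backed claim that a single-writer workload such as a database might use. The claim name and size are hypothetical; pure-block is the storage class described earlier.

```yaml
# Sketch: single-writer block volume from FlashArray or Cloud Block Store
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pure-block-claim       # hypothetical claim name
spec:
  accessModes:
    - ReadWriteOnce            # volume is mounted read-write by a single node
  resources:
    requests:
      storage: 10Gi            # hypothetical size
  storageClassName: pure-block
```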

It’s important to understand the simplicity that Konvoy brings to installing, configuring, and setting up Kubernetes as well as the relevant ecosystem of tools, as needed. Integrating Konvoy with Pure Storage using PSO provides the ability to run latency-sensitive applications like databases, as well as IOPS- and bandwidth-driven workloads like analytics, AI/ML, and CI/CD, with zero storage touch. PSO allows compute to be disaggregated from storage, so scaling the performance and capacity of compute and storage independently becomes easy and cost-effective.