VMware has re-architected vSphere around native Kubernetes as part of its new Tanzu product portfolio. Depending on which analyst report you read, VMware owns anywhere from 50% to 85% of the virtualization market¹. Most enterprises have already built their private clouds on VMware, and as they adopt Kubernetes for modern applications, they are looking at moving their workloads to Tanzu. While Tanzu is set to transform the way modern private clouds are architected, enterprise applications are mostly stateful, which brings storage challenges of its own. This blog focuses on those challenges and the solutions for them.

Tanzu Support in Portworx

Portworx® is the most widely used enterprise software-defined storage platform for containers and microservices. Portworx was designed to manage stateful applications directly via Kubernetes, with a single control plane for both applications and data. It can take advantage of commodity servers with local drives, existing SAN (storage area network) arrays, cloud storage, or virtual storage, and it provides container-granular, application-aware services. In the public cloud, and in any environment that provides a cloud-like experience, Portworx manages the backend physical storage through a module called “Cloud Drives”. Cloud Drives decouples compute from storage and manages storage on public clouds such as AWS, Microsoft Azure, and Google Cloud Platform, as well as on vSphere in on-prem installations.

For example, in AWS, Portworx provisions and manages the underlying EBS drives that host the data for the virtual volumes exported to applications.

To support Tanzu, Portworx extended the Cloud Drives module to provision and manage the backing infrastructure via the Container Storage Interface (CSI).

What Is CSI?

CSI is a standard for exposing arbitrary block and file storage systems to containerized workloads on Container Orchestration Systems (COs) like Kubernetes. Using CSI, VMware can support storage vendors in a common way, with a common set of storage capabilities. A CSI driver is the interface between Kubernetes and a storage provider: it exports the provider’s physical storage as container-consumable volumes known as PVs (Persistent Volumes).

The Functioning of the Portworx CSI driver for Tanzu

When Portworx is deployed within Tanzu, the Cloud Drives (CD) module does the following:

  • Initializes a Portworx node, which involves joining the Tanzu cluster as a storage-capable node and adding to the global capacity of the cluster.
  • Executes a sequence of calls to check the following:
    • If a set of drives already exists for the node, the CD module uses it. If those drives are not yet attached to the node, the CD module attaches them.
    • If no drives are attached to or available for that node, the CD module creates and attaches them.
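
The checks above can be sketched as a small decision function. This is only an illustration of the order in which the cases are evaluated, not Portworx's implementation; the `DriveSet` type, its fields, and the action names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class DriveSet:
    """Hypothetical model of a cloud drive set tracked by the CD module."""
    set_id: str
    attached_to: Optional[str] = None  # node name, or None when detached


def plan_drive_action(node: str, drive_sets: List[DriveSet]) -> Tuple[str, Optional[str]]:
    """Return (action, drive set id) for a node that is initializing."""
    # Case 1: a drive set is already attached to this node, so use it as-is.
    for ds in drive_sets:
        if ds.attached_to == node:
            return ("use", ds.set_id)
    # Case 2: a drive set exists but is detached, so attach it to this node.
    for ds in drive_sets:
        if ds.attached_to is None:
            return ("attach", ds.set_id)
    # Case 3: nothing is attached or available, so create and attach new drives.
    return ("create_and_attach", None)
```

For example, a node with no drives in a cluster where every existing drive set is attached elsewhere falls through to the third case.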

The CSI implementation in Cloud Drives makes calls to the Kubernetes API instead of talking to the underlying storage API directly. Now is a good time to get into some details to understand how Portworx manages drives in the CSI implementation of the Cloud Drives component.

Drive Management in CSI Implementation

During installation, Portworx users provide a drive spec that defines the desired drive size. The internal CSI provider (the part of Cloud Drives that implements CSI) parses the spec and prepares the objects to be created.
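
As a rough sketch of what parsing such a spec could look like, the function below turns a comma-separated `key=value` string into a Kubernetes-style storage request. The spec format (`size=150`, in GiB) and the function name are assumptions made for illustration; the exact spec syntax Portworx accepts for Tanzu is not described here.

```python
def parse_drive_spec(spec: str) -> dict:
    """Parse a hypothetical drive spec such as 'size=150' (size assumed in GiB)."""
    fields = {}
    for part in spec.split(","):
        key, sep, value = part.partition("=")
        if not sep or not key.strip() or not value.strip():
            raise ValueError(f"malformed drive spec field: {part!r}")
        fields[key.strip()] = value.strip()

    if "size" not in fields:
        raise ValueError("drive spec must define a size")

    size_gib = int(fields["size"])
    if size_gib <= 0:
        raise ValueError("drive size must be positive")

    # The storage request is what would later go into the PVC's resource requests.
    return {"size_gib": size_gib, "storage_request": f"{size_gib}Gi"}
```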

Figure 1: Schema demonstrating how Portworx operates the cloud drives.

To create and attach a drive, Portworx creates a PVC (Persistent Volume Claim). The PVC references a StorageClass whose provisioner is set to the storage provider’s CSI driver name. If you don’t pass a StorageClass as a parameter, Portworx uses the cluster’s default StorageClass.
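
To make this concrete, here is a minimal sketch of the kind of PVC manifest involved, built as a plain Python dictionary. The object name, namespace, and `ReadWriteOnce` access mode are assumptions for illustration rather than the values Portworx actually uses. Leaving `storageClassName` unset is what lets Kubernetes fall back to the default StorageClass.

```python
from typing import Optional


def build_drive_pvc(name: str, namespace: str, size_gib: int,
                    storage_class: Optional[str] = None) -> dict:
    """Build a PVC manifest for one backing drive (illustrative values only)."""
    spec = {
        "accessModes": ["ReadWriteOnce"],  # assumption: one node per drive
        "resources": {"requests": {"storage": f"{size_gib}Gi"}},
    }
    # Only set storageClassName when one was passed in; omitting the field
    # lets the cluster's default StorageClass apply.
    if storage_class is not None:
        spec["storageClassName"] = storage_class

    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }
```

A manifest like this could be submitted with any Kubernetes client; the dictionary maps one-to-one onto the YAML form.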

Then, Portworx uses the ID of the resulting PV to create a VolumeAttachment object (a built-in Kubernetes storage API resource). The CSI driver attaches drives through a component called the external-attacher, which runs as a sidecar and watches for newly created VolumeAttachment objects. When it sees one, it attaches the physical drive to the node.
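
A VolumeAttachment ties together three things: the CSI driver that should perform the attach, the target node, and the PV. The sketch below builds such a manifest as a Python dictionary. The naming scheme is hypothetical, and the attacher value used in the example (`csi.vsphere.vmware.com`, the vSphere CSI driver name) is an assumption about the environment.

```python
def build_volume_attachment(pv_name: str, node_name: str, attacher: str) -> dict:
    """Build a VolumeAttachment manifest asking `attacher` to attach a PV to a node."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "VolumeAttachment",
        # VolumeAttachment is cluster-scoped, so there is no namespace.
        "metadata": {"name": f"px-attach-{pv_name}-{node_name}"},  # hypothetical name
        "spec": {
            "attacher": attacher,    # CSI driver that must handle the request
            "nodeName": node_name,   # node the drive should be attached to
            "source": {"persistentVolumeName": pv_name},
        },
    }


attachment = build_volume_attachment("pvc-1234", "tanzu-worker-1", "csi.vsphere.vmware.com")
```

Once an object like this exists, the external-attacher sidecar for the named driver picks it up and reports the result in the object's status.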

Drive Capacity Management

Another component of the CSI driver, the external-resizer, is also deployed as a sidecar container. It watches the Kubernetes API for PersistentVolumeClaim edits, issues the ControllerExpandVolume RPC call against the CSI endpoint, and updates the PersistentVolume (PV) object to reflect the new size.

When actual storage usage grows and the cluster capacity needs to increase, the Cloud Drives module modifies the PVC to expand its requested storage size. The external-resizer then triggers an expansion of the volume associated with the PVC in vSphere Cloud Native Storage, and the new size is finally reflected in the capacity of the corresponding PV object.
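
The PVC edit that starts this chain is a change to `spec.resources.requests.storage`. The sketch below builds such a patch and rejects shrink requests, since Kubernetes only allows a PVC's requested size to grow. The helper names are hypothetical and, for brevity, only whole `Gi` quantities are handled. Expansion also requires the StorageClass to set `allowVolumeExpansion: true`.

```python
def parse_gib(quantity: str) -> int:
    """Parse a quantity such as '150Gi' (only whole Gi values in this sketch)."""
    if not quantity.endswith("Gi"):
        raise ValueError(f"unsupported quantity: {quantity!r}")
    return int(quantity[:-2])


def build_expand_patch(pvc: dict, new_size_gib: int) -> dict:
    """Return a merge patch that grows the PVC's requested storage."""
    current_gib = parse_gib(pvc["spec"]["resources"]["requests"]["storage"])
    if new_size_gib <= current_gib:
        raise ValueError(
            f"new size {new_size_gib}Gi must be larger than current {current_gib}Gi"
        )
    # Only the requested size changes; the external-resizer reacts to this edit.
    return {"spec": {"resources": {"requests": {"storage": f"{new_size_gib}Gi"}}}}
```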

Node Failure Handling

At a high level, when Portworx initializes on a node, it looks for existing cloud drives that are available to use before creating new ones. Creating a new cloud drive initializes a brand-new storage node in the cluster. If a node attaches an existing cloud drive set instead, it reuses that set’s identity and disks. This recovers the node to which that cloud drive set was previously attached.

The Portworx node recovery mechanism is essential in the following scenarios:

  • A Portworx storage node is terminated and replaced. This could happen when you’re upgrading the OS or its packages, upgrading Kubernetes, or updating the node’s specs. In this scenario, when the new node comes up, it looks for available cloud drive sets and finds that the drive set of the terminated storage node is available (because it’s no longer attached). It attaches that drive set and assumes the identity of the removed node.
  • A Portworx storage node is terminated permanently, or Portworx on the storage node fails and doesn’t come back up. In this scenario, if a storage-less Portworx node is present, it detects that the storage node has gone down. The storage-less node finds that the cloud drive set of the terminated node is available and attaches it, taking over the identity of the terminated storage node.
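
The second scenario can be illustrated with a toy model in which each drive set carries the identity of the storage node it belongs to. The types and the first-come assignment order below are assumptions for illustration; the actual coordination between Portworx nodes is not described in this post.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class OrphanableDriveSet:
    """Hypothetical drive set that remembers its storage node's identity."""
    node_identity: str
    attached_to: Optional[str]  # host currently holding the drives, or None


def take_over_orphans(drive_sets: List[OrphanableDriveSet],
                      storageless_hosts: List[str]) -> Dict[str, str]:
    """Let storage-less hosts attach detached drive sets and assume their identity."""
    orphaned = [ds for ds in drive_sets if ds.attached_to is None]
    assumed = {}
    # Pair hosts with orphaned sets in order; extra hosts stay storage-less.
    for host, ds in zip(storageless_hosts, orphaned):
        ds.attached_to = host
        assumed[host] = ds.node_identity
    return assumed
```

If the host behind one storage node is terminated, its drive set becomes detached, and the first storage-less host in the list attaches it and continues as that storage node.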

Solution Benefits

Tanzu support in Portworx through the CSI driver delivers the following benefits: 

  • Portworx seamlessly manages backing storage without admin intervention.
  • Integrating Portworx using CSI is more secure since you don’t need to configure Portworx with Cloud API access.
  • You can use Portworx out of the box on any cloud provider with a CSI driver. 

  1. https://www.networkworld.com/article/3340259/vmware-s-transformation-takes-hold.html, https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/products/vmware-idc-virtual-machine-market-shares-2017.pdf