This post was originally published on this siteAt Pure//Accelerate 2018 Cody Hosterman and I delivered a session on “Moving Data Between Cloud and On-Premises Virtualized Environments“....
Today we launched Purity Run, a safe, dedicated, compute-and-memory-isolated execution environment within FlashArray. With Purity Run, our customers can run custom code within the FlashArray via a VM, a container or plain executable. Extensibility and programmability are two core tenets of Pure’s Data Platform. Purity Run allows customers to extend Pure platforms to match their unique needs. Use Cases for Purity Run could be as simple as consolidation of a few servers in a remote office to as demanding as co-locating data intensive compute jobs closer to data they need to churn on, or even implementing an application-specific custom protocol for your applications to talk to FlashArray.
Curious for more? If so, read on below and also check our other blogs of this launch series.
Cloud-Era Flash: “The Year of Software” Launch Blog Series
In the “analytics-first” world organizations are often challenged to quickly get value out of their data. This often requires customers to set up analytics compute farms and siphon data from storage arrays into data lakes to kick start their data analytics journey. We at Pure are in the business of “simple” and “faster time-to-value”. FlashArray has negligible internal I/O latency which lends itself well to data intensive workloads. By embedding a simple, isolated compute framework in FlashArray we enable customers to quickly program compute jobs right on the array to get productive quickly. Why? Because that’s where your data is stored in the first place! Rather than shuttle it all the way to a compute host or restore it into a data-lake (especially if there is massive quantities of it), just simply schedule a container/VM/compute job right on the array to quickly generate results.
Purity Run comes pre-installed with a Docker Community Edition (CE) in a Ubuntu Linux VM. This is ideal for developers to start experimenting with their data immediately. Purity Run, presently carves 8 vCPUs and 16GB of memory for its Linux VM in which to schedule and run Docker powered containers. These resources are fully isolated from FlashArray. So if a container goes rogue, there is no impact on the array or its performance. Worst case, you can kill the VM and all its hosted containers without any performance or availability impact to your FlashArray. When Purity Run is not allocated or running, the CPU and memory resources are relinquished back to the FlashArray.
Bringing compute closer to data is not a new concept, in fact it is quite popular in the analytics world especially with data intensive applications; but in a DIY environment, it is not simple to execute. So far, customer use cases have varied from scheduling elastic-search containers directly on FlashArray to exploit massive I/O throughput enabled by Pure’s NVMe backend to extending the FlashArray with additional protocols like file/object, or even custom applications-specific interfaces.
Object stores, file protocols??? But it is a block storage array, what in the world do you mean? While extending the FlashArray with compute capability allows for faster time-to-results in certain use cases as mentioned above, the same compute framework allows for protocol extensibility. FlashArray speaks block (Fiber Channel & iSCSI) out of the box. There are use cases where the same block stream data is needed to be accessed via a different protocol, such as S3 compatible object interface or even NFS in some cases. This typically happens when a developer wants to develop a new application and additionally wants to use historical data for testing or context. Rather than migrating your data into a new object store by procuring new servers and storage, simply install and run your favorite object store binary directly in Purity Run, take a clone of you existing data, mount it into Purity Run (it will show up as a block device in the runtime environment which your installed VM can mount) and you are off to the races. You can call it simple and productive; we at Pure call it flexible and extensible.
Another example of protocol extension is files (SMB and/or NFS). In an storage environment which is block dominated, if there is a need for file access (for home directories or test/dev), a file server can instantiated within Purity Run within minutes which is fully configured for failover and high availability. You can read all about it here. Again, flexible, extensible with a touch of simple.
Following in the footprints of Windows file services in Purity Run, we see the potential of a rich ecosystem of partner applications. This approach provides our customers and partners with an opportunity to create vertically integrated solutions which are ready for operation right out of the box. Examples range from Catalogic Software for copy data management to Splunk collectors & indexers running on the FlashArray for massively parallel I/O. Over the coming months we will roll out details about our partner app ecosystem.
We are excited to put this platform in your hands. We can’t wait to see what you will do with it!