This post was originally published on this site. At Pure//Accelerate 2018, Cody Hosterman and I delivered a session on “Moving Data Between Cloud and On-Premises Virtualized Environments”....
The world is witnessing a massive data explosion, with a multitude of devices generating tens of zettabytes of data (a zettabyte is 10^21 bytes, that is, 1 followed by 21 zeros). An IDC study on the data explosion projects that 50 zettabytes of data will be generated annually by 2020. The main contributors to this data growth include new IoT devices, wearable devices, self-driving cars, satellite data, casino cameras and machines, building management data, transportation logistics data, manufacturing factory data, billboard data, credit card transaction data, surveillance cameras, smart meters, and so on [see picture below]. In case you are wondering why a Pure Storage FlashArray is in the mix (see picture below), we have several thousand arrays in deployment, and they call home periodically. The arrays have generated several petabytes of log data that we use for predictive support.
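To put the projected 50 zettabytes per year into perspective, the short calculation below converts it into a sustained rate. It is only a back-of-the-envelope sketch: it assumes decimal units (1 PB = 10^15 bytes) and a 365-day year, neither of which is stated in the IDC figure.

```python
# Back-of-the-envelope conversion of 50 ZB/year into a per-second rate.
ZETTABYTE = 10**21                  # bytes (decimal units, an assumption)
PETABYTE = 10**15                   # bytes (decimal units, an assumption)
SECONDS_PER_YEAR = 365 * 24 * 3600  # non-leap year, an assumption

annual_bytes = 50 * ZETTABYTE
rate_pb_per_second = annual_bytes / SECONDS_PER_YEAR / PETABYTE

print(f"{rate_pb_per_second:.2f} PB generated per second")
```

Under those assumptions the projection works out to roughly 1.6 petabytes of new data every second.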
Dealing with massive unstructured data poses a huge challenge.
Instead of exposing data as blocks and files, Object Storage introduces the concept of objects, or blobs of data, stored in containers called buckets. Object Storage was designed for scale and for large unstructured datasets, i.e., billions of objects.
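To make the bucket-and-object model concrete, here is a minimal in-memory sketch. It is a toy for illustration only, not how FlashBlade or any real object store is implemented; the class name `ToyObjectStore`, the bucket name, and the keys are all hypothetical. The point it shows is that an object key is just a string: a "/" in a key carries no special meaning, and folder-like listings are emulated by filtering on a key prefix.

```python
from dataclasses import dataclass, field


@dataclass
class ToyObjectStore:
    """In-memory stand-in for an object store (illustration only)."""

    # bucket name -> {object key -> object bytes}
    buckets: dict = field(default_factory=dict)

    def create_bucket(self, bucket: str) -> None:
        self.buckets.setdefault(bucket, {})

    def put_object(self, bucket: str, key: str, body: bytes) -> None:
        # The key is an opaque string; no directories are created.
        self.buckets[bucket][key] = body

    def get_object(self, bucket: str, key: str) -> bytes:
        return self.buckets[bucket][key]

    def list_objects(self, bucket: str, prefix: str = "") -> list:
        # "Directories" are emulated by matching keys on a shared prefix.
        return sorted(k for k in self.buckets[bucket] if k.startswith(prefix))


store = ToyObjectStore()
store.create_bucket("array-logs")
store.put_object("array-logs", "2017/06/array-001.log", b"phone-home record")
store.put_object("array-logs", "2017/06/array-002.log", b"phone-home record")
store.put_object("array-logs", "2017/07/array-001.log", b"phone-home record")

june_keys = store.list_objects("array-logs", prefix="2017/06/")
print(june_keys)
```

Listing with the prefix `2017/06/` returns only the two June keys, even though the store never created a directory named `2017` or `06`.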
Object Storage has no dependency on the operating system and, unlike a hierarchical filesystem, provides a flat namespace. There is no need to provision LUNs, create or mount filesystems on a server, and expose them to user applications. The administrator creates an access key and secret key for each user and sets user- and bucket-level policies; users can then create buckets and access the objects in them. Users access objects from their applications over a simple web interface using RESTful HTTP APIs. The most popular API models are Amazon Simple Storage Service™ (S3) and OpenStack® Swift, with the Amazon S3 APIs being the most widely adopted. Many cloud-native applications have made Object Storage and S3 their primary model for data consumption. Because there is little or no manageability overhead, most Object Storage systems are very simple to manage. Scalability, on the other hand, can get out of hand when you are dealing with traditional spinning media.
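As a rough sketch of what "RESTful HTTP APIs" means in the S3 model, the snippet below builds (but does not send) the HTTP requests behind a few common operations, using path-style URLs. The endpoint `https://flashblade.example.com`, the bucket name, and the object key are hypothetical, and the helper `object_url` is made up for this example. The requests are deliberately unsigned: a real S3-compatible endpoint expects an Authorization header derived from the access key and secret key, which the SDKs normally compute for you. Whether a particular operation is supported by a given Purity//FB release is not something this sketch asserts.

```python
from typing import Optional
from urllib.parse import quote, urlencode
from urllib.request import Request

ENDPOINT = "https://flashblade.example.com"  # hypothetical data endpoint


def object_url(bucket: str, key: str = "", query: Optional[dict] = None) -> str:
    """Build a path-style URL: <endpoint>/<bucket>/<key>?<query>."""
    url = f"{ENDPOINT}/{quote(bucket)}"
    if key:
        url += "/" + quote(key)  # quote() leaves "/" in the key untouched
    if query:
        url += "?" + urlencode(query)
    return url


# Each operation is an HTTP verb applied to a bucket or object URL.
# The Request objects are only constructed here, never sent.
examples = {
    "create bucket": Request(object_url("demo-bucket"), method="PUT"),
    "put object": Request(
        object_url("demo-bucket", "logs/app.log"), data=b"hello", method="PUT"
    ),
    "get object": Request(object_url("demo-bucket", "logs/app.log"), method="GET"),
    "list objects": Request(
        object_url("demo-bucket", query={"list-type": "2", "prefix": "logs/"}),
        method="GET",
    ),
    "delete object": Request(
        object_url("demo-bucket", "logs/app.log"), method="DELETE"
    ),
}

for name, req in examples.items():
    print(f"{name:14} {req.get_method():6} {req.full_url}")
```

In practice most applications never build these requests by hand. An S3 SDK takes the endpoint, access key, and secret key, then handles URL construction and request signing.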
During //Accelerate 2017, the Pure Storage annual technology event held in June 2017, we announced the biggest software launch in Pure’s history. A detailed blog on all of the 25+ features can be found here. The software release for both FlashArray and FlashBlade is summarized below.
From the FlashBlade Purity//FB perspective, Object Storage with S3-compatible APIs was one of the most notable features. Here are the key points of the Object Storage implementation:
Purity//FB version 2.0.5 supports the following S3 APIs:
We have tested with various AWS S3 SDKs, including Java, Python, Go, C#, and .NET. In future blogs we will talk in detail about how to use the FlashBlade S3-compatible APIs.
We will also explore the inner workings of the Object Storage implementation on FlashBlade and provide example code demonstrating the supported APIs in future blogs.