Traditionally, cloud-native applications use object storage in the cloud or on-premises for archival and disaster recovery. Cloud object store targets serve as inexpensive secondary storage for large volumes of data. However, more and more commercial off-the-shelf (COTS) and custom applications now support S3-compliant object storage for highly parallel and distributed access to large data sets in production.
Modern application design is rapidly moving toward AWS S3-compliant object storage that can scale in capacity independently of the compute layer. However, performance has been the biggest inhibitor for applications that attempt to use object storage in production environments: read and write operations often fail to scale in throughput and in the speed of access to the data.
Some of the common use cases that need S3-compliant fast-object storage include:
- High density of data backup with faster restore times
- High volume of unstructured cold data tiering for analytics
- Log data from custom applications
- Large repositories for Dev/ML/Ops tools
Decoupling the compute from the storage allows users to size the data for specific use cases and provide higher performance at scale.
JFrog Artifactory is one of the most common universal binary package management tools. It is tied to Continuous Integration/Continuous Delivery (CI/CD), analytics, and AI/ML workflow pipelines that make up one or more assembly lines. End users such as software developers and data scientists write new source code, or modify existing code, for various business functions. The source code is then compiled, using a compiler and other dependent software artifacts, to generate a binary or executable file.
The source code compile stages are called the build process. The build artifacts used in the software builds, along with the final binary files, are stored and managed by JFrog Artifactory. Artifactory acts as a version control system for all build artifacts and binaries, providing end-to-end automation and managing the binaries and build artifacts used in various workflow pipelines.
Because Artifactory sits at the crossroads feeding into other workflow pipelines, it can store a high volume of binaries and artifacts in various package formats. Artifactory consists of a database and a filestore. The path to each artifact and binary package in the filestore is part of the metadata stored in the database. JFrog Artifactory Enterprise supports HA and S3-compatible object stores.
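The split between database and filestore can be pictured with a small sketch. This is a toy model, not Artifactory's implementation: the `ArtifactStore` class, its method names, and the choice to key the filestore by a SHA-256 content hash are all assumptions made for illustration. It only shows the idea that the database holds the path metadata while the filestore holds the bytes.

```python
import hashlib


class ArtifactStore:
    """Toy model (hypothetical): metadata database plus a separate filestore."""

    def __init__(self):
        # "Database": maps a repository path to the key of the stored binary.
        self.database = {}
        # "Filestore": maps a key to the binary content. In the architecture
        # described in this post, this role is played by an S3-compatible bucket.
        self.filestore = {}

    def upload(self, repo_path: str, data: bytes) -> str:
        # Assumption for this sketch: the filestore key is the content hash,
        # so identical binaries uploaded under different paths are stored once.
        key = hashlib.sha256(data).hexdigest()
        self.filestore.setdefault(key, data)
        self.database[repo_path] = key
        return key

    def download(self, repo_path: str) -> bytes:
        # Look up the metadata first, then fetch the bytes from the filestore.
        key = self.database[repo_path]
        return self.filestore[key]


store = ArtifactStore()
store.upload("libs-release/app/1.0/app.jar", b"binary-content")
store.upload("libs-snapshot/app/1.0/app.jar", b"binary-content")
print(len(store.database), "paths,", len(store.filestore), "stored object(s)")
```

Because the two layers are separate, the filestore can grow (or move to object storage) without changing how paths are resolved in the database.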
In most customer production environments, JFrog Artifactory is set up on servers with local storage that is managed by Artifactory’s sharded file system. However, the filestore capacity keeps increasing as the number of binaries in various package formats grows. The natural instinct is to add more servers with local storage to accommodate capacity scaling. High data growth in Artifactory environments presents the following challenges:
- Adding more servers for capacity scaling leads to server sprawl and high manageability overhead. There is no disaggregation of compute and storage capacity.
- Data is stored in silos. Data has to move between workflow pipelines and teams in order to have data continuity for the applications.
- Read and write performance for artifacts and binaries does not scale as the number of uploads to and downloads from Artifactory increases.
- Artifactory backup inflates the local storage space without any storage efficiency. Artifactory does not compress the data.
The Pure Storage® FlashBlade™ unified data platform is designed for files and S3-compliant object stores and scales performance and capacity independently of the compute. Most of the object store offerings in the market are retrofits: a thin object translation layer stitched on top of a legacy NFS software stack and not designed for performance. They are also complex to deploy.
FlashBlade is the first high-performance, cloud-optimized file and object storage for modern data: effectively, a new category of storage. FlashBlade delivers multi-dimensional file performance via a highly parallelized architecture and uniquely extends that capability to Fast Object, addressing modern data challenges with the seamlessness and simplicity of the public cloud.
FlashBlade is the first scale-out storage solution to intersect on all three dimensions of big, fast, and simple. It is tuned to deliver multi-dimensional performance for any data size, structure, or access pattern, delivering 10x or greater savings in power, space, and cooling costs. The above diagram illustrates that various workflow segments can coexist on a standard data platform like FlashBlade, which handles different file and object protocols and heterogeneous workloads to provide multi-dimensional performance.
For the purpose of this post, Artifactory is configured in a Kubernetes cluster and uses a hybrid architecture on FlashBlade. The database is configured over NFS using Pure Service Orchestrator™, along with the Artifactory home directory that contains the read/write cache. The Artifactory filestore is configured on Fast Object. The following diagram shows the hybrid architecture on FlashBlade.
Performance tests with various file sizes were run on the Artifactory hybrid architecture on FlashBlade, and the results were compared with those from a popular public cloud object store offering. The following figure shows that FlashBlade Fast Object demonstrated a 62% download performance improvement compared to the public cloud offering.
The Artifactory hybrid architecture on FlashBlade using Fast Object also provided some additional benefits.
- Reduced server sprawl, since the capacity and performance can scale independently of the compute.
- Array-level snapshots for the database and bucket versions for the filestore provide data protection and storage efficiency.
- Data reduction of 2:1 on FlashBlade allows optimization of the cache space, thereby enabling more data for faster download.
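To make the 2:1 data reduction figure concrete, the arithmetic can be sketched as below. The function names and the 10 TB example value are hypothetical and chosen only for illustration; the actual ratio achieved depends on the data being stored.

```python
def effective_capacity_tb(physical_tb: float, reduction_ratio: float = 2.0) -> float:
    """Logical data that fits in a given physical capacity at N:1 reduction."""
    if physical_tb < 0 or reduction_ratio <= 0:
        raise ValueError("capacity must be >= 0 and ratio must be > 0")
    return physical_tb * reduction_ratio


def physical_needed_tb(logical_tb: float, reduction_ratio: float = 2.0) -> float:
    """Physical capacity consumed by a given amount of logical data."""
    if logical_tb < 0 or reduction_ratio <= 0:
        raise ValueError("capacity must be >= 0 and ratio must be > 0")
    return logical_tb / reduction_ratio


# Hypothetical example: 10 TB of physical space at the 2:1 ratio cited above.
print(effective_capacity_tb(10.0))  # logical TB that can be held
print(physical_needed_tb(10.0))     # physical TB consumed by 10 TB of data
```

At 2:1, the same physical space holds twice the logical data, which is why more artifacts can stay available for fast download.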
Our latest white paper, “Modernizing JFrog Artifactory on S3-compatible FlashBlade,” dives deeper into the development and validation of our joint offering.
More and more modern cloud-based applications like JFrog Artifactory are using S3-compliant object stores on-premises and in the cloud. FlashBlade™ Fast Object not only makes it simple to create fast object stores but also provides speed and storage efficiency at scale.
To learn more about JFrog Artifactory with Pure Storage, register for our upcoming webinar on Wednesday, May 20th.