Accelerate DevOps with JFrog Artifactory Direct Cloud Storage
Using JFrog Artifactory with FlashBlade instead of a public cloud object store provides improved download speeds that can accelerate the software delivery pipeline.
JFrog Artifactory is one of the most commonly used universal artifact repository management tools in software development. To achieve faster time to market, developers are challenged with accelerating the software delivery pipeline either in the cloud or with on-premises data center resources. Application development environments have shown exponential data growth from highly iterative, parallel build processes that generate binary packages in the CI/CD pipeline, creating a major bottleneck in software development.
With Pure Storage® FlashBlade//S®, a scale-out, unified fast file and object platform, this bottleneck can be turned into a competitive advantage with:
Four times faster binary download speeds vs. cloud
75% CPU utilization reduction on Artifactory servers
Minimal usage of read cache
Reduced management overhead from additional HA servers and Nginx performance tuning
Many enterprises are challenged with the download speed of large binary packages required during the development process from the Artifactory filestore. JFrog has documented various ways to set up and configure repositories in the filestore. JFrog has also provided best practices for optimizing Artifactory to handle heavy package downloads in on-premises implementations. However, external storage plays a major role in handling heavy loads in local data centers.
In a previous blog post, I illustrated the use of FlashBlade® unified fast file and object (UFFO) as an accelerated data storage platform for JFrog Artifactory workflows. The blog post provided ways to configure the Artifactory database and read cache (cache-fs) over NFS and the filestore on S3-compatible storage on FlashBlade, as shown in Figure 1 below. Optimizing the Nginx load balancer in an Artifactory-HA setup helped to speed up binary package downloads. Internal tests indicated up to a 65% download speed improvement over a cloud-provided S3 bucket.
Figure 1: JFrog Artifactory architecture on FlashBlade using a unified fast file and object.
The Artifactory filestore is created as an S3 bucket named “filestore” on the FlashBlade system, as shown in the screenshot below. Enabling versioning on the S3 bucket on FlashBlade is recommended. With versioning enabled, the administrator can set up bucket policies and lifecycle rules for Artifactory objects and object versions in the bucket.
Figure 2: Artifactory filestore configured on FlashBlade as an S3 bucket.
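For readers who prefer to script the versioning step rather than use the FlashBlade UI, the following is a minimal sketch, not the configuration used in this post. It assumes an S3 client object that exposes the standard S3 API calls put_bucket_versioning and get_bucket_versioning (a boto3 S3 client does) and that the FlashBlade release in use accepts them. The helper names and the 30-day value are hypothetical.

```python
def enable_bucket_versioning(s3_client, bucket="filestore"):
    """Enable versioning on the bucket and return the status the server reports.

    `s3_client` is assumed to behave like a boto3 S3 client pointed at the
    FlashBlade data endpoint; the bucket name "filestore" comes from the post.
    """
    s3_client.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={"Status": "Enabled"},
    )
    return s3_client.get_bucket_versioning(Bucket=bucket).get("Status")


def noncurrent_expiry_rule(days, rule_id="expire-old-versions"):
    """Build an S3-style lifecycle rule that expires noncurrent object versions.

    The rule shape follows the standard S3 lifecycle schema. Whether a given
    FlashBlade release accepts it through the S3 API is an assumption to verify.
    """
    return {
        "ID": rule_id,
        "Status": "Enabled",
        "Filter": {"Prefix": ""},
        "NoncurrentVersionExpiration": {"NoncurrentDays": days},
    }
```

With boto3 installed, a client could be created with boto3.client("s3", endpoint_url=...) using the FlashBlade data address and an object-store access key, then passed to enable_bucket_versioning. The rule returned by noncurrent_expiry_rule(30) would go in the Rules list of a put_bucket_lifecycle_configuration call, if the target supports it.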
In the above architecture, Artifactory uses an Enterprise X license to upload and download the artifacts and binary packages from the FlashBlade S3 bucket hosted on premises.
[root@sn1-r620-a04-03 artifactory]# time curl -v -L -u $SOURCE_ADMIN:$SOURCE_PASSWORD https://10.21.152.63:8082/artifactory/springboot/springboot-master/latest/sha256__f3b70a40dcef5dc5b602ecbe1e66cbba32baa0504dc9ffc40d9391baf92495c2 > /tmp/art-test3
The Artifactory server returns an HTTP/1.1 200 OK response, and the requested binaries and artifacts are downloaded to the user through the Artifactory stack and the Nginx load balancer. While download speeds were proven to be faster with a FlashBlade S3 bucket than with cloud-provided S3 buckets, there are still some challenges with this architecture:
The Nginx load balancer and the Artifactory application stack impact the speed when developers download a large number of binary packages and artifacts at scale.
The CPU utilization on the Artifactory servers is also an impediment at scale. Additional Artifactory servers can be added to the HA configuration to mitigate the CPU utilization. However, that leads to server sprawl that is costly and adds management overhead.
The read cache (cache-fs) size in Artifactory helps to cache the frequently downloaded binary packages and artifacts to improve the download speed. Ideally, the read cache is configured on the local server storage. In enterprise environments with a large number of developers, the read cache size is often increased beyond its default size of 5GB to cache the binary packages. This has some challenges:
Growing and shrinking the read cache on a single server or on multiple servers configured in HA is not dynamic.
There is no data reduction for the cached items on the local server.
To address these issues, JFrog Artifactory has included the Direct Cloud Storage Download feature in v7.23.7 and later for on-premises implementation. This feature was earlier introduced in Artifactory v6.5 as a SaaS offering for all major public cloud providers to directly download the binary packages from S3/Blob storage buckets. An Enterprise Plus license is required to enable this feature in Artifactory.
The direct cloud storage download feature still uses a similar architecture on FlashBlade using an S3 bucket except that the download request is redirected by the Artifactory server to the FlashBlade S3 bucket as shown in Figure 3 below. FlashBlade is a fast object platform that provides high download speed for Artifactory instances configured on premises compared to the S3/Blob storage in the public cloud.
Figure 3: JFrog Artifactory using Direct Cloud Storage Download feature with FlashBlade//S.
In addition to the Artifactory license requirement, some additional configurations are needed in Artifactory to enable the Direct Cloud Storage Download feature:
1. Installing/upgrading to Artifactory v7.41.7 or later is recommended.
2. Update/configure the $ARTIFACTORY_HOME/var/etc/artifactory/binarystore.xml with a new setting <enableSignedUrlRedirect>true</enableSignedUrlRedirect>.
3. Update the $ARTIFACTORY_HOME/var/etc/system.yaml file in Artifactory to enable Mission Control.
[root@sn1-r620-a04-03 etc]# cat system.yaml
artifactory:
  metrics:
    enabled: true
shared:
  user: artifactory
  group: artifactory
## Mission control template
mc:
  #port: 8080
  enabled: "true"
[root@sn1-r620-a04-03 etc]#
4. Reboot/restart the Artifactory service.
5. From the Artifactory UI, the Direct Cloud Storage Download feature can be enabled selectively on one or more repositories in the Artifactory filestore. In the following example, the feature is enabled for the generic-local repository.
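Separately from the UI step, the binarystore.xml change in step 2 could be scripted. The sketch below is illustrative only. It assumes the file contains one or more provider elements whose type attribute contains "s3", and the sample document, including the s3-storage-v3 template name, is a hypothetical layout rather than the configuration used in this post. The endpoint and bucket name in the sample are the ones that appear elsewhere in this post.

```python
import xml.etree.ElementTree as ET


def enable_signed_url_redirect(xml_text):
    """Set <enableSignedUrlRedirect>true</enableSignedUrlRedirect> on S3 providers.

    Returns the updated XML text and the number of providers touched.
    Note: ElementTree drops comments and the XML declaration on round trip,
    so keep a backup of the original file before writing the result.
    """
    root = ET.fromstring(xml_text)
    changed = 0
    for provider in root.iter("provider"):
        # Assumption: S3-backed providers carry "s3" in their type attribute.
        if "s3" not in provider.get("type", ""):
            continue
        node = provider.find("enableSignedUrlRedirect")
        if node is None:
            node = ET.SubElement(provider, "enableSignedUrlRedirect")
        node.text = "true"
        changed += 1
    return ET.tostring(root, encoding="unicode"), changed


# Hypothetical sample; a real binarystore.xml will have more settings.
SAMPLE = """<config version="2">
    <chain template="s3-storage-v3"/>
    <provider id="s3-storage-v3" type="s3-storage-v3">
        <endpoint>10.21.236.202</endpoint>
        <bucketName>filestore</bucketName>
    </provider>
</config>"""

updated_xml, touched = enable_signed_url_redirect(SAMPLE)
```

The function is idempotent: running it on an already updated file leaves a single enableSignedUrlRedirect element in place. The Artifactory service still needs the restart described in step 4 for the change to take effect.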
Requests for artifacts and binary packages from the generic-local repository are redirected to the S3 bucket on the FlashBlade system for a direct download to the user platform. In the following example, a 10GB text file is downloaded directly from the FlashBlade S3 bucket filestore in less than three seconds.
[root@sn1-r620-a04-03 artifactory]# time curl -v -L -u $SOURCE_ADMIN:$SOURCE_PASSWORD https://10.21.152.63:8082/artifactory/generic-local/1000000-2k > /tmp/art-test2
* Issue another request to this URL: 'https://10.21.236.202/filestore/filestore/25/25807374671e2a1653b531ff6e8ee9f85f2aa11b?X-Artifactory-username=admin&X-Artifactory-repositoryKey=generic-local&X-Artifactory-artifactPath=1000000-2k&X-Artifactory-projectKey=default&x-jf-traceId=1009205830dcb6a8&response-content-disposition=attachment%3Bfilename%3D%221000000-2k%22&response-content-type=application%2Foctet-stream&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20220908T202735Z&X-Amz-SignedHeaders=host&X-Amz-Expires=29&X-Amz-Credential=PSFBSAZRKBLLIHIAJDNPGIPBOKFPJMNADFHOCBPPCB%2F20220908%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Signature=c33ff86fd17569a855ed113b03dab23ad7cbef8e6a7d63b3ceec65fea8d62973'
* About to connect() to 10.21.236.202 port 80 (#1)
* Trying 10.21.236.202...
* Connected to 10.21.236.202 (10.21.236.202) port 80 (#1)
> GET /filestore/filestore/25/25807374671e2a1653b531ff6e8ee9f85f2aa11b?X-Artifactory-username=admin&X-Artifactory-repositoryKey=generic-local&X-Artifactory-artifactPath=1000000-2k&X-Artifactory-projectKey=default&x-jf-traceId=1009205830dcb6a8&response-content-disposition=attachment%3Bfilename%3D%221000000-2k%22&response-content-type=application%2Foctet-stream&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20220908T202735Z&X-Amz-SignedHeaders=host&X-Amz-Expires=29&X-Amz-Credential=PSFBSAZRKBLLIHIAJDNPGIPBOKFPJMNADFHOCBPPCB%2F20220908%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Signature=c33ff86fd17569a855ed113b03dab23ad7cbef8e6a7d63b3ceec65fea8d62973 HTTP/1.1
As shown in the output above, the user request to download an object from the Artifactory filestore is redirected with an HTTP/1.1 302 Found response to the filestore bucket on FlashBlade. The requested object is then downloaded directly from the FlashBlade filestore bucket to the user platform.
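The redirect target in the output above is a signed S3 URL, and its query string shows where the request goes and how long the link stays valid. The following sketch uses only the Python standard library to pull out those fields. The function name and the returned dictionary keys are hypothetical, and the example URL is a shortened copy of the one in the output, keeping only the parameters the function reads.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit


def describe_signed_url(url):
    """Summarize the target and validity window of a SigV4 presigned URL."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    # X-Amz-Date is the UTC signing time; X-Amz-Expires is the lifetime in seconds.
    signed_at = datetime.strptime(
        query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ"
    ).replace(tzinfo=timezone.utc)
    lifetime = timedelta(seconds=int(query["X-Amz-Expires"][0]))
    return {
        "host": parts.hostname,
        "object_path": parts.path,
        "repository": query.get("X-Artifactory-repositoryKey", [None])[0],
        "artifact": query.get("X-Artifactory-artifactPath", [None])[0],
        "signed_at": signed_at,
        "expires_at": signed_at + lifetime,
    }


# Shortened copy of the redirect URL from the curl output above.
EXAMPLE_URL = (
    "https://10.21.236.202/filestore/filestore/25/"
    "25807374671e2a1653b531ff6e8ee9f85f2aa11b"
    "?X-Artifactory-repositoryKey=generic-local"
    "&X-Artifactory-artifactPath=1000000-2k"
    "&X-Amz-Date=20220908T202735Z&X-Amz-Expires=29"
)

info = describe_signed_url(EXAMPLE_URL)
```

For this URL, the host is the FlashBlade address rather than the Artifactory server, and the link expires 29 seconds after it was signed, so the client has to start the download within that window.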
A download test was performed to download text files with sizes 1K, 10K, 100K, 1M, 10M, 100M, 1G, and 10G. The files were downloaded eight times in three batches for a total of 24 downloads. An identical test approach documented in this white paper was used to test Artifactory download speed for different configurations:
Artifactory filestore in a major cloud provider S3 bucket
Artifactory database and read cache on local storage and filestore on a FlashBlade S3 bucket
Artifactory database and read cache over NFS and filestore on a FlashBlade S3 bucket
Artifactory database and read cache over NFS and filestore on a FlashBlade S3 bucket using the Direct Cloud Storage download feature
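The test procedure described above can be sketched as a small timing harness. This is not the tooling used for the published results. It assumes that the 24 downloads are the eight file sizes fetched once per batch across three batches, and it takes the actual download step as a caller-supplied function (fetch is a hypothetical callable, for example a wrapper around the curl command shown earlier).

```python
import statistics
import time

# File sizes listed in the test description.
SIZES = ["1K", "10K", "100K", "1M", "10M", "100M", "1G", "10G"]


def run_download_test(fetch, sizes=SIZES, batches=3, clock=time.perf_counter):
    """Time fetch(size) for every size in every batch.

    Returns {size: [elapsed_seconds_per_batch, ...]}.
    """
    timings = {size: [] for size in sizes}
    for _ in range(batches):
        for size in sizes:
            start = clock()
            fetch(size)  # caller-supplied download of the file with this size
            timings[size].append(clock() - start)
    return timings


def mean_seconds(timings):
    """Average the per-batch times for each file size."""
    return {size: statistics.mean(values) for size, values in timings.items()}


def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds
```

Running the harness once per configuration in the list above and comparing the averages with speedup gives a like-for-like comparison. A speedup of 4.0 corresponds to the candidate taking one quarter of the baseline time.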
The test results indicate that the Direct Cloud Storage Download feature to download Artifactory objects directly from the FlashBlade S3 filestore bucket to the user platforms is the fastest among all the test scenarios. The following are the observations from the tests:
The download speed of Artifactory objects from FlashBlade using the Direct Cloud Storage Download feature is four times faster than with an S3 bucket in the cloud.
The direct download capability from the FlashBlade S3 bucket demonstrates a 75% CPU utilization reduction on the Artifactory server(s).
With direct download from FlashBlade, there’s minimal usage of the read cache (cache-fs) and reduced management overhead from additional HA servers and Nginx performance tuning.
The Direct Cloud Storage Download feature from JFrog and FlashBlade, a fast file and object platform, provides improved download speeds that can accelerate the software delivery pipeline compared to the use of a public cloud object store. The ability to configure one or more repositories using the Direct Cloud Storage Download feature in Artifactory enables the administrator to manage the Artifactory resources effectively and also provides faster download speed for users in the software development environment.
Learn more about Pure Storage solutions with JFrog Artifactory: