Using Oracle’s vdbench tool to test the data-reducing capabilities of your storage array

When evaluating an all-flash storage array, qualities such as availability, scalability, and affordability should be assessed before performance is tested.

Here we suggest a simple set of tests that can be included in a Proof-of-Concept (PoC) test plan to help you compare the data-reducing capabilities of different data-reduction storage arrays.

Note: It is assumed that these scripts are run as a step in a PoC where no production data is stored on the array, and you can safely zero-out the array.

Simply repeat the following steps for each set of reducible datastreams you wish to test:

1. Zero out the storage array
2. Create a single 1 TiB volume
3. Present the volume as a raw device to a physical or virtual host initiator
4. Logically fill the 1 TiB LUN with reducible datastreams of your choosing
5. Record the reported data-reduction ratio
6. Discuss with your vendor how this ratio was calculated
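As a sanity check on steps 5 and 6, the reported ratio should roughly equal the logical bytes written divided by the physical capacity consumed. The sketch below illustrates the arithmetic; the 109 GiB physical-consumption figure is a hypothetical example, not a measured result:

```python
# Sanity check for a reported data-reduction ratio: logical bytes written
# divided by physical bytes consumed. Numbers below are hypothetical.

TIB = 1024 ** 4
GIB = 1024 ** 3

def data_reduction_ratio(logical_bytes: int, physical_bytes: int) -> float:
    """Return the data-reduction ratio as logical / physical."""
    return logical_bytes / physical_bytes

# Example: a fully filled 1 TiB LUN that consumes ~109 GiB on the array
logical = 1 * TIB
physical = 109 * GIB
print(f"{data_reduction_ratio(logical, physical):.1f} to 1")  # prints "9.4 to 1"
```

Vendors may or may not include thin-provisioning savings in the reported number, which is why step 6 matters.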

Of course, it would be preferable to use a 1 TiB sample of your real data. If that is not possible, filling the 1 TiB LUN with reducible data generated by Oracle’s vdbench tool is a good, simple alternative.
The important thing is to send, as nearly as possible with your chosen load-generation tool, the exact same datastreams to each empty data-reduction storage array that you are testing in your PoC. Construct datastreams that you feel are defensibly representative of common datastreams that your data-reduction storage array will experience in your production environment.
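A vdbench parameter file for such a fill might look like the following. This is a hedged sketch, not the exact script we used: the device path /dev/sdX, thread count, transfer size, and the compratio/dedupratio targets are illustrative assumptions you should tune to match your own representative datastreams.

```
# Hypothetical vdbench parameter file: sequential fill of a raw 1 TiB LUN
# with reducible data. Device path and ratio targets are assumptions.
compratio=3          # generate data that compresses roughly 3:1
dedupratio=3         # generate duplicate blocks that dedupe roughly 3:1
dedupunit=4k         # dedup granularity, aligned to 4K boundaries
sd=sd1,lun=/dev/sdX,openflags=o_direct,threads=8
wd=wd1,sd=sd1,xfersize=128k,seekpct=eof,rdpct=0   # pure sequential writes
rd=rd1,wd=wd1,iorate=max,elapsed=24h,interval=10  # seekpct=eof ends the run at end of LUN
```

Running the same parameter file against each empty array in the PoC keeps the datastreams identical across vendors.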

We call our data-reduction feature FlashReduce. FlashReduce employs five forms of high-performance, inline, adaptive data-reduction techniques, and it constantly and intelligently works to improve the data-reduction ratio on the FlashArray in a “flash-friendly” manner.

We followed the methodology above on a FA-405 running Purity 4.1.2 and obtained the results below.

Please consider adding tests similar to this to your data-reduction storage array PoC test plan.

Test 1 using Oracle’s vdbench tool

During the fill, the FA-405 running Purity 4.1.2 reported a data-reduction ratio for this set of datastreams of between “7.1 to 1” and “8.4 to 1.” After FlashReduce’s deep-reduction algorithms had a chance to work for a while, the data-reduction ratio increased to “9.4 to 1.”

Test 2 using Oracle’s vdbench tool

Notice that here we reduce “dedupunit” to “1K” to test the data-reduction storage array’s ability to reduce datastreams that do not fall on nice even 4K boundaries. During this fill, the FA-405 running Purity 4.1.2 reported a data-reduction ratio for this set of datastreams of between “3.8 to 1” and “4.2 to 1.” After FlashReduce’s deep-reduction algorithms had a chance to work for a while, the data-reduction ratio increased to “4.6 to 1.”
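The dedupunit change for this test could be expressed in the parameter file as shown below. As before, this is a sketch with assumed values (device path, thread count, and ratio targets), not the exact script behind the results above.

```
# Hypothetical variant for Test 2: dedupunit lowered to 1k so duplicate
# blocks no longer land on even 4K boundaries. Other values are assumptions.
compratio=3
dedupratio=3
dedupunit=1k         # sub-4K dedup granularity stresses boundary handling
sd=sd1,lun=/dev/sdX,openflags=o_direct,threads=8
wd=wd1,sd=sd1,xfersize=128k,seekpct=eof,rdpct=0
rd=rd1,wd=wd1,iorate=max,elapsed=24h,interval=10
```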

Test 3 using Oracle’s vdbench tool

During this fill, the FA-405 running Purity 4.1.2 reported a data-reduction ratio for this set of datastreams of between “4.0 to 1” and “4.7 to 1.” After FlashReduce’s deep-reduction algorithms had a chance to work for a while, the data-reduction ratio increased to “5.3 to 1.”

The screenshots and scripts can be found HERE.

In future blog posts we will demonstrate how to use different generic load-generation tools to test the data-reducing capabilities of a data-reduction storage array using this same basic methodology.

Summary

Most modern storage arrays are not just “storage arrays”; they are more accurately classified as “data-reduction storage arrays.”

Almost all real-world datastreams have both compressible and dedupable components. Thus, testing a data-reduction storage array with non-reducible datastreams is not a defensible storage performance testing methodology.

When conducting a PoC, before testing performance, the data-reducing capabilities of the data-reduction storage array must be tested. The best way to do this is with a sufficiently large amount of your real-world data. However, if this is not feasible, use generic load-generation tools like Oracle’s vdbench tool to send datastreams that are defensibly representative of common datastreams that your array will experience in a production environment.