Using fio to perform a plumbing test

Because so many factors can affect storage performance, it’s always a good idea, before running any storage performance test (real-world or synthetic), to ascertain the maximum data throughput your test environment can deliver as it is currently configured.

Plumbing tests like these make it easier to discover problems or misconfigurations in your testing environment that should be addressed before conducting a storage performance test.

There are many ways to do this, but one way is to:

  • Create a number of thin-provisioned volumes on the storage array.
    • Leave the volumes empty, so that all of the logical blocks in the volumes remain unmapped to physical blocks.
      • This helps the array serve reads as fast as possible.
  • Map the storage volumes to raw LUNs on a number of host initiators.
  • Use a load-generation tool to run a series of 100% read tests to discover the maximum unidirectional data throughput of your test environment.
  • Then run a series of 100% write tests to see whether the maximum data throughput of your test environment differs for writes.
    • Create highly reducible datastreams to increase the front-end write capabilities of your data-reduction storage array.
  • Follow this up with a series of read/write tests to discover the maximum bi-directional throughput of your test environment.

Below is a fio job file that can serve as a template to construct your own plumbing sequential read ramp test:
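(The original job file is not reproduced here; what follows is a minimal sketch of what such a sequential read ramp job could look like. The device path `/dev/sdX`, block size, queue depth, and runtimes are placeholder assumptions to tune for your own environment.)

```ini
; Hypothetical sequential read ramp plumbing test (sketch).
; /dev/sdX is a placeholder for a raw LUN mapped to this host.
[global]
ioengine=libaio
direct=1            ; bypass the page cache so we measure the array, not host RAM
rw=read
bs=1M               ; large sequential blocks to chase maximum throughput
iodepth=32
time_based=1
runtime=60
ramp_time=10        ; let throughput stabilize before measurement begins
group_reporting=1

[seq-read]
filename=/dev/sdX
numjobs=4
```

To produce the “ramp,” you might run this job file repeatedly, increasing numjobs and/or iodepth between runs until throughput stops scaling.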

The fio sequential write ramp plumbing test is exactly the same as the sequential read ramp test above, except that we use the “rw=write” parameter.
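As noted, only the data direction changes. A sketch of that delta is below; the buffer_compress_percentage value is an assumption, chosen to produce a highly reducible datastream for a data-reduction array:

```ini
rw=write
; Assumption: generate ~75% compressible buffers so a data-reduction
; array can reduce the stream and absorb front-end writes quickly.
buffer_compress_percentage=75
refill_buffers=1
```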

For a bi-directional read/write plumbing test, you can do something like this:
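One possible shape for the bi-directional job, again a sketch with placeholder values rather than the author’s original file:

```ini
; Hypothetical bi-directional read/write ramp plumbing test (sketch).
[global]
ioengine=libaio
direct=1
rw=rw               ; mixed sequential reads and writes
rwmixread=50        ; 50/50 read/write split; adjust as needed
bs=1M
iodepth=32
time_based=1
runtime=60
ramp_time=10
group_reporting=1

[seq-mixed]
filename=/dev/sdX   ; placeholder raw LUN
numjobs=4
```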

A much more in-depth explanation of these tests, and the fio job files, can be found HERE.

Please keep in mind that using 100% unmapped, or partially unmapped, LUNs is a trick that some use to artificially inflate their results.

While this is fine for a “plumbing” test, it is completely useless for a proper storage performance test.

If anyone tries to convince you that it is OK to run a performance test with datasets and datastreams that are not a defensible approximation of your real-world environment, they either don’t understand proper storage performance testing principles (especially for a data-reduction storage array) or they are trying to sneak something past you.

It is just as useless to use 100% non-reducible data in your datasets and datastreams as it is to use unmapped luns, NUL writes, single-character writes, or even writes consisting of a small repeating set of characters for a storage performance test on a data-reduction storage array.
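fio itself can generate datastreams with a controlled degree of reducibility, which is one way to move a synthetic test toward a defensible approximation. The percentages below are illustrative assumptions, not recommendations; ideally you would measure the compression and deduplication ratios of your real data and match them:

```ini
; Illustrative values only: match these to measurements of your real data.
buffer_compress_percentage=40   ; assumed ~40% compressible
dedupe_percentage=20            ; assumed ~20% of blocks are duplicates
refill_buffers=1                ; regenerate buffers so the pattern holds across passes
```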

Insist upon datasets that are filled and aged in such a way that they defensibly approximate the various datasets and datastreams in your environment.

Anything short of this is a flawed synthetic storage performance test.