IO Plumbing Tests with FIO

Learn how to perform IO plumbing tests with FIO, a powerful benchmarking tool, including updated best practices, modern configurations for NVMe and persistent memory, and tips for simulating real-world workloads.

IO Plumbing Tests

Flexible IO Tester, or FIO, is an open-source synthetic benchmark tool originally developed by Jens Axboe and now maintained by a community of developers. FIO can generate many types of IO workloads, be it sequential reads or random writes, synchronous or asynchronous, based on the options provided by the user. FIO exposes a rich set of global options through which different types of workloads can be generated, making it one of the easiest and most versatile tools for quickly performing IO plumbing tests on a storage system.

FIO makes it easy to generate sequential or random IO workloads with a varying number of threads, read/write mix, and block size to mimic real-world workloads. FIO can also produce very detailed output; by default it reports key metrics such as IOPS, latency, and throughput.

Updated Best Practices for I/O Testing with FIO

Since the original publication of this blog, the landscape of storage technologies and performance testing tools like FIO has evolved significantly. To ensure meaningful and accurate I/O performance evaluations in modern environments, administrators should adopt the following updated best practices:


1. Utilize the Latest Version of FIO

FIO has seen numerous updates since 2016, including new options, bug fixes, and enhancements. Always download and use the latest version from the FIO GitHub repository to access the newest features and optimizations.

Newer FIO Features:
  • io_uring Support: For Linux environments, FIO now supports io_uring, which provides high-performance asynchronous I/O (see the example after this list).
  • JSON Output Enhancements: Improved JSON support allows seamless integration with visualization tools and CI/CD pipelines.
  • Custom Workloads: Expanded capabilities for scripting more complex I/O patterns.
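
As a rough illustration, an io_uring-based random read job might look like the following; the device path, queue depth, job count, and runtime are placeholders to adjust for your environment, and the io_uring engine requires a reasonably recent kernel and FIO build:

    fio --name=iouring-randread --ioengine=io_uring --direct=1 --rw=randread \
        --bs=4k --iodepth=64 --numjobs=4 --runtime=120 --time_based \
        --filename=/dev/nvme0n1 --group_reporting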

2. Benchmark for Modern Storage Architectures

The rise of NVMe, NVMe-over-Fabrics (NVMe-oF), and all-flash storage arrays has dramatically altered the I/O performance landscape. Testing methodologies should reflect these advancements:

  • NVMe-Specific Workloads: Use FIO options like direct=1 and iodepth to simulate the high I/O concurrency and low latency typical of NVMe devices (a sketch follows this list).
  • NVMe-oF: Include network latency in tests by using FIO over RDMA or TCP for NVMe-oF deployments.
  • Persistent Memory (PMEM): Benchmark storage-class memory using workloads that mimic in-memory performance demands.
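
For example, a high-concurrency random read job along these lines can approximate NVMe-style parallelism; the device path, queue depth, and job count are illustrative assumptions, and a read-only workload is used so the device's data is not modified:

    fio --name=nvme-randread --filename=/dev/nvme0n1 --direct=1 --ioengine=libaio \
        --rw=randread --bs=4k --iodepth=32 --numjobs=8 --runtime=120 --time_based \
        --group_reporting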

3. Simulate Real-World Workloads

Modern applications generate diverse I/O patterns. Create FIO jobs that closely mimic production workloads to obtain actionable insights:

  • Database Workloads: Use a mix of random reads and writes to simulate transactional databases.
  • Virtualized Environments: Incorporate mixed I/O operations to test hypervisor-based scenarios.
  • Object Storage: Configure sequential I/O patterns for testing write-heavy object storage systems.

Example: Simulating a Mixed Workload
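
One way to express such a mix is a small FIO job file; the 70/30 read/write split, block size, queue depth, job count, and file size below are illustrative values, not a prescription:

    [global]
    ioengine=libaio
    direct=1
    time_based
    runtime=300
    group_reporting

    [oltp-mix]
    rw=randrw
    rwmixread=70
    bs=8k
    iodepth=16
    numjobs=4
    size=1g

Save this as, for example, mixed.fio and run it with fio mixed.fio.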

4. Automate and Integrate

Incorporate FIO into your automation workflows for consistent and repeatable testing:

  • CI/CD Pipelines: Use FIO as part of automated storage performance validations.
  • Visualization Tools: Export JSON results to tools like Grafana or Prometheus for easy analysis (see the example below).
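
For instance, a validation step can emit JSON for downstream tooling to parse; the job parameters and output file name here are placeholders:

    fio --name=ci-check --rw=randread --bs=4k --size=256m --direct=1 \
        --ioengine=libaio --output-format=json --output=fio-results.json

The resulting fio-results.json can then be checked against IOPS and latency thresholds in a pipeline gate, or pushed to a metrics store.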

5. Monitor Hardware Metrics

During FIO tests, monitor hardware-level metrics for a complete picture of performance. Use tools like:

  • iostat and blktrace for Linux environments (examples below).
  • NVMe CLI to capture detailed stats from NVMe devices.
  • Vendor-Specific Tools (e.g., Pure Storage’s Pure1 or management APIs) for insights into array-level performance.
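
For example, during a run you might keep commands such as these going in separate terminals (device names are placeholders):

    iostat -x 1                  # per-device utilization, queue sizes, and latency
    nvme smart-log /dev/nvme0    # controller-level stats from an NVMe device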

6. Optimize Test Configurations

Modern storage systems benefit from fine-tuning FIO configurations:

  • Queue Depth (iodepth): Adjust based on the storage system’s capabilities.
  • Block Size (bs): Test various sizes to determine optimal performance for your workload (a sample sweep follows this list).
  • Direct I/O (direct=1): Bypass caching layers to benchmark raw storage performance.
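
A simple way to explore this is to sweep a few block sizes in a shell loop; the sizes, queue depth, file size, and runtime here are arbitrary examples:

    for bs in 4k 8k 16k 64k; do
        fio --name=sweep-$bs --rw=randread --bs=$bs --direct=1 --ioengine=libaio \
            --iodepth=16 --size=1g --runtime=60 --time_based --group_reporting
    done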

By incorporating these updated practices into your FIO workflows, you can effectively benchmark modern storage systems, generate actionable insights, and ensure that your infrastructure meets the demands of today’s high-performance applications.

IO Things to consider

To keep I/Os from being served out of the host system cache, use the direct=1 option, which reads from and writes to the disk directly. Use Linux native asynchronous IO by setting the ioengine option to libaio.

When FIO is invoked, it creates a file with the name provided in --name, of the size given in --size, using the block size given in --bs. If --numjobs is provided, it creates one file per job in the format name.n.0, where n ranges from 0 to numjobs-1.

--numjobs = The more jobs, the higher the potential performance (based on resource availability). If your server is limited on resources (TCP or FC bandwidth, for example), run FIO across multiple servers to push more workload to the storage subsystem.

--time_based = FIO will run for the full duration specified by --runtime, even if the files have already been completely read or written.

Software

At the time of writing, I am using FIO on RHEL 7.0. You should be able to find the relevant RPM in your distribution's package repositories, or build FIO from source.

FIO Cheat sheet

1. Sequential Reads – Async mode – 8K block size – Direct IO – 100% Reads
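
A command along these lines fits this pattern; the job name, file size, job count, and runtime are illustrative:

    fio --name=seqread --rw=read --direct=1 --ioengine=libaio --bs=8k \
        --numjobs=4 --size=1g --runtime=600 --group_reporting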

2. Sequential Writes – Async mode – 32K block size – Direct IO – 100% Writes
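
For example, with the same caveats on job name, size, and job count:

    fio --name=seqwrite --rw=write --direct=1 --ioengine=libaio --bs=32k \
        --numjobs=4 --size=1g --runtime=600 --group_reporting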

3. Random Reads – Async mode – 8K block size – Direct IO – 100% Reads
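
An illustrative command for this pattern:

    fio --name=randread --rw=randread --direct=1 --ioengine=libaio --bs=8k \
        --numjobs=4 --size=1g --runtime=600 --group_reporting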

4. Random Writes – Async mode – 64K block size – Direct IO – 100% Writes
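
Again with illustrative values:

    fio --name=randwrite --rw=randwrite --direct=1 --ioengine=libaio --bs=64k \
        --numjobs=4 --size=1g --runtime=600 --group_reporting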

5. Random Read/Writes – Async mode – 16K block size – Direct IO – 90% Reads/10% Writes
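
And a sketch of the mixed case, with the read percentage set via rwmixread:

    fio --name=randrw --rw=randrw --rwmixread=90 --direct=1 --ioengine=libaio \
        --bs=16k --numjobs=4 --size=1g --runtime=600 --group_reporting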

Sample IO output

The following command creates 8 files (numjobs=8), each 512MB in size (size=512m), and performs random reads and writes (rw=randrw) at a 64K block size (bs=64k) with a mixed workload of 70% reads and 30% writes. The job runs for the full 5 minutes (runtime=300 and time_based) even after the files have been completely created and read/written.
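
A command matching that description would look roughly like the following; direct I/O and libaio are included per the guidance above, and the job name is arbitrary:

    fio --name=fiotest --ioengine=libaio --direct=1 --rw=randrw --rwmixread=70 \
        --bs=64k --size=512m --numjobs=8 --runtime=300 --time_based --group_reporting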

Conclusion

The FIO tool remains an indispensable resource for testing and benchmarking storage systems, thanks to its flexibility and ability to simulate diverse I/O workloads. As highlighted in this blog, FIO enables precise control over I/O parameters, such as block size, concurrency, and workload mix, making it a reliable choice for assessing storage performance.

With advancements in storage technologies like NVMe and persistent memory, and updates to FIO itself, administrators can further enhance their testing methodologies by leveraging modern features like io_uring and JSON output integration. By combining FIO with contemporary tools and best practices, users can effectively evaluate performance, optimize configurations, and ensure that their storage infrastructure meets the demands of today’s high-performance applications.

Written By: