Administrators constantly battle with database performance. Deduplication and compression are data reduction techniques that reduce the amount of storage space your data needs. Deduplication removes duplicate copies of data, while compression encodes the data that remains in fewer bytes. Both are great for saving resources, but administrators are often concerned about data loss, as they should be.
In most environments, backup data is sent to a resource on the network that deduplicates and compresses it. Administrators often disagree on whether both deduplication and compression are necessary or whether only one should be enabled during backups. Both processes slow down backups, so some administrators prefer to skip them when they aren't needed.
Compression and deduplication can reduce file storage size to a fraction of the original size. This article aims to help administrators decide what type of backups to use and whether the performance loss is worth the reduction in storage requirements. We’ll also discuss how Pure Storage® FlashArray™ can speed up your backups and reduce performance degradation during deduplication and compression.
Can Deduplication and Compression Cause Data Loss?
Pure Storage deduplication (or dedupe, for short) and compression algorithms are lossless, which means that the original data can be perfectly reconstructed, every single time, from its compressed form. What SQL Server (through Windows) writes to the FlashArray system is exactly what SQL Server will read from the FlashArray system. There is no compromise in data integrity from having dedupe and compression enabled in FlashArray. It’s along the lines of how row and page compression and columnstore indexes in SQL Server compress your data without compromising its integrity.
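To make "lossless" concrete, here is a minimal Python sketch of the general idea. It is not how FlashArray implements data reduction: the fixed 4 KB block size, the SHA-256 fingerprints, the zlib compressor, and the function names `reduce_data` and `restore_data` are all assumptions chosen for illustration.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # hypothetical fixed block size, for illustration only


def reduce_data(data: bytes):
    """Deduplicate fixed-size blocks, then compress each unique block."""
    store = {}   # block fingerprint -> compressed copy of the unique block
    recipe = []  # ordered fingerprints needed to rebuild the original data
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:  # store each distinct block only once
            store[digest] = zlib.compress(block)
        recipe.append(digest)
    return store, recipe


def restore_data(store, recipe) -> bytes:
    """Rebuild the original bytes by decompressing blocks in recipe order."""
    return b"".join(zlib.decompress(store[digest]) for digest in recipe)


# Eight identical "pages" plus one short unique block.
page = b"SQL Server data page ".ljust(BLOCK_SIZE, b"\x00")
original = page * 8 + b"a unique tail block"

store, recipe = reduce_data(original)
restored = restore_data(store, recipe)
stored_bytes = sum(len(chunk) for chunk in store.values())

print("round trip is exact:", restored == original)
print("blocks written:", len(recipe), "unique blocks stored:", len(store))
print("original bytes:", len(original), "stored bytes:", stored_bytes)
```

The point of the sketch is the first printed line: what comes back from the reduced form is byte-for-byte identical to what was written, even though far fewer bytes were stored.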
Furthermore, when you issue a write to the array, we won’t acknowledge that write back to the OS until it has been written to two separate super-fast NVRAM devices. We will then do more things with that data, like moving it to the SSDs, but at that point, the data has been made safely redundant.
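The acknowledgment rule described above can be modeled with a toy Python sketch. Everything here is hypothetical: the `NvramDevice` class, its `healthy` flag, and the `write` function are stand-ins for illustration, not Purity code, and what a real array does when a device is unavailable is not covered by this article.

```python
class NvramDevice:
    """Hypothetical stand-in for one NVRAM device."""

    def __init__(self, name: str):
        self.name = name
        self.log = []        # payloads persisted on this device
        self.healthy = True  # toggled below to simulate a failure

    def persist(self, payload: bytes) -> bool:
        """Store the payload and report whether the write succeeded."""
        if not self.healthy:
            return False
        self.log.append(payload)
        return True


def write(payload: bytes, first: NvramDevice, second: NvramDevice) -> bool:
    """Acknowledge only after both devices hold a copy of the payload."""
    results = [first.persist(payload), second.persist(payload)]
    return all(results)


nvram_a, nvram_b = NvramDevice("a"), NvramDevice("b")
acked = write(b"page 1", nvram_a, nvram_b)      # both copies land

nvram_b.healthy = False                         # simulate a device failure
not_acked = write(b"page 2", nvram_a, nvram_b)  # only one copy lands

print(acked, not_acked)
```

In this model the caller never sees a success for a write that exists in only one place, which is the property the paragraph above describes.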
Can I Turn Deduplication and Compression Off?
No, you cannot turn dedupe and compression off.
Both are always on so that we can minimize write amplification on our SSDs and extend the life of your investment in Pure Storage.
How Does It Work?
As Arthur C. Clarke observed, “Any sufficiently advanced technology is indistinguishable from magic.”
That’s how I felt when I saw FlashArray perform for the first time at my previous job, where I was a database engineer. Inline dedupe and compression. Terabytes of data reduced by a factor of 3.5:1. Already-compressed data being compressed even further. Handling an 11TB OLTP database with ease. Magic, indeed!
Today, there are many hundreds of SQL Server instances and thousands of databases running happily on Pure Storage. DBAs sleep better because they don’t have to worry about storage. Businesses can do more because they’ve freed themselves from the chains of slow IO. There are many, many happy customers.
And if you don’t think FlashArray can handle your mission-critical SQL Server databases, think again: That’s exactly what it did at my previous job.
Flash-enabled Magic
You probably think that it’s not a very good idea to have a disk-based device perform dedupe and compression on 100% of the writes to the storage that hosts your SQL Server databases. And I would agree with you on that. What allows Pure Storage to do this in our arrays is flash, which can perform operations in microseconds and even in parallel. The Purity operating environment architecture enables FlashArray to leverage that extremely low latency and parallelism to handle incoming and outgoing data extremely fast and with high redundancy.
Learn More about How FlashArray Works
For a deeper dive into FlashArray and the Purity operating environment, watch this whiteboard session with a Pure Storage principal architect. The session is packed with tons of details on how we handle data inside our arrays.
Read Part 2 in this series to see how and why dedupe and compression in FlashArray can further reduce the already-compressed heaps and indexes (i.e., tables) in your SQL Server databases.