A version of this post originally appeared on IT Ops Times.
Data isn’t just growing—it’s exploding. In 2010, the world created 1.2 zettabytes of brand-new data, more than had been created in all of prior human history. This year, that number is expected to climb to 33 zettabytes, and by 2025 it’s expected to reach 175 zettabytes.
That’s not just exponential growth—it marks the emergence of a completely different kind of asset. This new reality demands a shift in mindset about how we collect, store, and analyze the information already at our disposal. Brand-new industries are being built around it, while others that have existed for decades struggle to adjust to the new normal.
Considering what might be coming down the pike, we haven’t even scratched the surface of what data can do for business—or of its potential to doom organizations and industries that aren’t prepared.
In the automotive space, for example, car manufacturers are focused on self-driving vehicles. It’s an endeavor that requires an entirely new model for collecting, managing, and processing data. Suddenly, companies are generating data at the 100-petabyte level as they race to build the best product in this new market.
In telecommunications, 5G has brought wireless speeds and latency much closer to those of wired high-speed networking, opening up new services at the network edge. Building that out requires an enormous investment in infrastructure.
In healthcare, imaging, genome sequencing, and drug discovery have created an explosion of data and an opportunity to revolutionize the way we practice medicine. This work cannot and should not be held back by legacy technology that’s not up to the task. Future success depends on a fresh approach to the infrastructure layer.
Modern Data Calls for Modern Solutions
Modern applications and modern data, unsurprisingly, come equipped with a constantly evolving set of challenges. Those challenges break down into workloads, a series of tasks that infrastructure must perform to successfully run an application and glean value from the results.
At the core of any organization’s infrastructure is storage. Modern data needs to be stored in a way that is accessible and can be turned into value. It also needs to retain its integrity to supply historical insight and future predictions in a way that accounts for both expected and unexpected scale. The ability to consolidate these workloads into a single platform is critical in today’s fast-paced business world.
In the past, storage systems were built specifically for either file storage or object storage. File storage was built around hierarchical directory structures, while object storage was designed to hold data at massive scale at the cost of slower access methods. Today, neither option alone will suffice.
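The distinction can be made concrete with a small sketch: file storage exposes a hierarchical namespace with in-place, byte-range access, while object storage exposes a flat key space where whole objects are written and read along with their metadata. Below is a minimal illustration in Python—the `ObjectStore` class is a toy stand-in for an S3-style interface, not any vendor’s real API:

```python
import os
import tempfile

# --- File semantics: hierarchical paths, in-place byte-range access ---
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "projects", "demo"), exist_ok=True)
path = os.path.join(root, "projects", "demo", "report.txt")
with open(path, "wb") as f:
    f.write(b"hello file world")
with open(path, "rb") as f:
    f.seek(6)          # jump into the middle of the file...
    chunk = f.read(4)  # ...and read just 4 bytes: b"file"

# --- Object semantics: flat keys, whole-object put/get plus metadata ---
class ObjectStore:
    """Toy stand-in for an S3-style object store (illustrative only)."""
    def __init__(self):
        self._objects = {}

    def put(self, key, data, metadata=None):
        # Objects are written whole; "folders" are just key prefixes.
        self._objects[key] = (bytes(data), dict(metadata or {}))

    def get(self, key):
        # Objects are retrieved whole, together with their metadata.
        return self._objects[key]

store = ObjectStore()
store.put("projects/demo/report.txt", b"hello object world",
          metadata={"content-type": "text/plain", "owner": "analytics"})
data, meta = store.get("projects/demo/report.txt")
print(chunk, data, meta["owner"])
```

The file side supports seeking and partial updates; the object side trades that for a flat namespace and rich per-object metadata—exactly the two access models a unified platform has to reconcile.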
To handle the massive volume and complexity of modern data, storage platforms need to deliver rich metadata capabilities and layer both file and object semantics on top of a single underlying platform. This “under the covers” approach to storage design has never been done before, yet demand for it is growing exponentially. Organizations are hard-pressed to find solutions organically engineered to meet the requirements of modern data. And as they continue a now-accelerated journey to digital transformation and begin to realize their strategic goals, it quickly becomes clear that performance and simplicity at scale require a unified approach.
Data Protection for a Secure Start
Before you can think about how you’ll extract value from today’s most valuable industrial currency—data—you need to consider how you’ll protect its integrity and viability. Data protection is a team effort between file and object storage. Pure Storage® has partnered with Commvault to build a standard architecture that relies on a combination of file and object storage for deployments. While parts of the architecture need a file API, the bulk data is object. Unified fast file and object (UFFO) storage allows both of these tasks to run on a single platform. It simplifies management, reduces cost, and puts fewer vendors between you and your data.
Then there’s the other side of this same coin: You need to keep data safe from malicious actors, ransomware attacks, and theft. Retailers, for example, have to go to great lengths to ensure security amid the explosive growth of telemetry data. Apps deployed for the business and the shift from physical to online sales leave modern organizations more vulnerable.
Digital transformation has exponentially expanded the cyberattack surface. This problem requires you to prepare for the worst-case scenario and protect what you can’t necessarily even see. Prevent future attacks by identifying and analyzing attacks that have already happened. Naturally, this entails collecting enormous amounts of telemetry log data and forensic information.
For example, consider a financial organization that specializes in fraud detection. Every year, attacks and fraud become more sophisticated; every time the company shuts one door, a new one opens. It leverages a unified file and object storage platform to facilitate the machine learning behind state-of-the-art fraud-detection models. The process looks a little like software development, which requires an object store, while the output of machine learning tools mainly runs on file. Even at companies of massive scale, teams prefer file because researchers tend to prototype on their laptops and want the same familiar abstractions at scale.
There is also a critical need emerging for a unified approach to file and object storage in the area of quantitative finance. The dominant workload is a virtual firehose of unstructured data: market tick data, stock feeds, options feeds—anything that can be processed to help a financial model make more accurate predictions. IoT companies face the same problem with the continued explosion of connected devices, but the finance world has been dealing with it for decades, and it tends to rely on datastores that need file storage under the hood.
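To make the firehose workload concrete, here is a minimal sketch of the kind of computation such feeds drive—a running volume-weighted average price (VWAP) per symbol over a stream of ticks. The `Tick` format is a hypothetical simplification for illustration; real feeds carry far more fields and arrive at far higher rates:

```python
from dataclasses import dataclass
from typing import Dict, Iterable

@dataclass
class Tick:
    """One trade event from a market data feed (simplified)."""
    symbol: str
    price: float
    size: int

def vwap(ticks: Iterable[Tick]) -> Dict[str, float]:
    """Volume-weighted average price per symbol over a tick stream."""
    notional: Dict[str, float] = {}  # sum of price * size per symbol
    volume: Dict[str, int] = {}      # sum of size per symbol
    for t in ticks:
        notional[t.symbol] = notional.get(t.symbol, 0.0) + t.price * t.size
        volume[t.symbol] = volume.get(t.symbol, 0) + t.size
    return {s: notional[s] / volume[s] for s in notional}

# A tiny sample stream; production systems ingest millions of ticks per second.
stream = [
    Tick("ACME", 10.00, 100),
    Tick("ACME", 10.10, 300),
    Tick("XYZ", 55.00, 50),
]
print(vwap(stream))  # ACME: (1000 + 3030) / 400 = 10.075
```

The computation itself is trivial; the hard part is the storage layer underneath, which must absorb and replay years of this stream fast enough for backtesting—hence the pull toward file-backed datastores described above.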
You need to establish a top-down approach if you want to build an optimal data platform—one that enables the analysis and curation of data from which downstream consumers can gain value. In analytics, that might mean leveraging applied AI to build systems that automate quality checks and curation. In general, researchers and developers want to know how quickly they can go from concept to deployment. One thing is certain: seek out a vendor whose technology can evolve with your needs.