There’s a knock-on effect on data storage when cyberattackers step up their activity. The more attempts they make to breach networks or pressure organizations into paying a ransom, the more events SIEM systems have to manage. That means much more data and, therefore, a need for much more storage capacity.

However, SIEM systems can’t run on just any storage, something I discussed in a recent webinar, “How the Right Storage Can Improve SIEM Operations.” As my fellow panelists explained, even the best SIEMs, with all their out-of-the-box capabilities, don’t always give you the visibility you need in your environment.

Why is visibility so key? As the saying goes, you can’t protect what you can’t see. You need to be able to correlate anomalies across the network, the endpoints, and the end users so you can quickly identify a potential threat and put together a targeted response, supported by modern data protection and a resiliency architecture. And when your pool of data is too small, too incomplete, or too slow to access, you’re likely not going to find the adversaries in time.

Big Data Problems Need Big Data Solutions

“This is really a big data analytics problem,” said Eric Burgener, Research Vice President at IDC, during our session. “You can collect data, and then you can run analyses to determine when anomalous behavior occurs and what type of behavior it is.”

Eric made a great point. Piles of data don’t mean much for SIEM solutions if they can’t be accessed and analyzed at high speed. I brought up a metric I share often called “breakout time.” That’s the time between an attacker logging in to an organization as a regular user and elevating their privileges to become an administrator or domain admin in the environment.

We used to measure breakout time in hours; now it’s down to about 90 minutes. When you’re trying to detect the anomalies in security alerts, you’re trying to catch them in that first 90-minute window, before the attacker can elevate from a standard user to a privileged user in the environment.
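To make the metric concrete, here is a minimal sketch of how breakout time and the 90-minute detection window could be expressed. The function names and the sample timestamps are hypothetical and only for illustration; the 90-minute figure is the one cited above, and no particular SIEM product's API is implied.

```python
from datetime import datetime, timedelta

# The roughly 90-minute breakout window mentioned in the text.
BREAKOUT_WINDOW = timedelta(minutes=90)


def breakout_time(initial_login: datetime, privilege_escalation: datetime) -> timedelta:
    """Time between the attacker's first login and their privilege escalation."""
    return privilege_escalation - initial_login


def detected_in_window(initial_login: datetime, detected_at: datetime,
                       window: timedelta = BREAKOUT_WINDOW) -> bool:
    """True if the anomaly was detected before the breakout window elapsed."""
    return detected_at - initial_login <= window


# Hypothetical timeline for a single intrusion.
login = datetime(2024, 1, 1, 9, 0)
escalation = datetime(2024, 1, 1, 10, 25)
alert = datetime(2024, 1, 1, 9, 40)

print(breakout_time(login, escalation))   # time the attacker needed to escalate
print(detected_in_window(login, alert))   # whether the alert beat the window
```

If the data needed to raise that alert is still waiting to be ingested or correlated when the window closes, the check above fails no matter how good the detection logic is.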

As Eric said, making data work hard enough and fast enough requires collecting data from a lot of different locations, analyzing it, and turning it into actionable insights. 

“That’s where the right storage architecture becomes very important,” he explained. “Big data analytics are highly scalable systems that can support massive amounts of data against which they can run the analytics and then also the ability to support high degrees of concurrency.”

The most worrisome challenge with SIEM solutions is that they need to ingest large amounts of data quickly while simultaneously correlating that data to identify anomalies that can be routed to cyber threat hunters. When the storage is slow, the platforms can generally either ingest fast or process fast, but not both. For this reason, many organizations have to compromise on the number of sources they ingest data from.

Realize the Potential of Data

Michelle Abraham, Research Director at IDC, told us that without the storage architecture that Eric described, you’re not getting ROI out of your SIEM investment. 

“Organizations need to not just have a SIEM in place, but be able to use it to its fullest capabilities,” she said. “If they’re just using the SIEM to collect data they might need for compliance purposes, that’s not the full potential.”

The growing number of open jobs in the cybersecurity field also speaks to the need for SIEM solutions to fill in for the humans, where and how they can. To do this, SIEM products need AI and lots of data. “In many cases, SIEMs today are using user and entity behavioral analysis to look for those anomalies—and those machine learning algorithms need the data in order to be trained,” Michelle said.

Data Keeps Coming

Let’s face it, the rush of cybersecurity data isn’t going to slow down—it’s going to speed up. As Eric explained during our session, most enterprises will see a growth in data volume of 30% to 40% per year over at least the next five to six years. It’s likely many of these organizations will outgrow the data architectures they have.
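To put those growth rates in perspective, here is a quick back-of-the-envelope sketch of how they compound. The 100 TB starting volume is a hypothetical figure chosen only to make the arithmetic easy to read; the 30% to 40% rates and the five-to-six-year horizon are the ones quoted above.

```python
def projected_volume(initial_tb: float, annual_growth: float, years: int) -> float:
    """Data volume after compounding annual growth for a number of years."""
    return initial_tb * (1 + annual_growth) ** years


START_TB = 100.0  # hypothetical starting volume, for illustration only

for rate in (0.30, 0.40):
    for years in (5, 6):
        volume = projected_volume(START_TB, rate, years)
        print(f"{rate:.0%} growth over {years} years: {volume:.0f} TB "
              f"({volume / START_TB:.1f}x)")
```

Under these assumptions, the same environment ends up holding roughly 3.7 times its current data at the low end (30% for five years) and roughly 7.5 times at the high end (40% for six years).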

“This really points to the need for having scale-out solutions that can meet this particular requirement,” Eric explained. “This is not something that’s just specific to SIEMs. Because of this huge need for data to drive analytics, you’ll be collecting more and more data over time—and the more data you collect, the better results you get.”

And with SIEM, we need more than just scalability. We need solutions for building tiers of data, like hot, warm, and cold. Pure Storage® FlashBlade® is a platform that lets you do that non-disruptively, without ever taking downtime.
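As a rough illustration of what tiering means in practice, the sketch below assigns an event to a hot, warm, or cold tier based on its age. The 7-day and 90-day thresholds and the function name are assumptions made for this example; they are not taken from any specific SIEM or from FlashBlade, and real retention policies will vary by organization.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical age thresholds; tune these to your own retention policy.
HOT_WINDOW = timedelta(days=7)
WARM_WINDOW = timedelta(days=90)


def tier_for(event_time: datetime, now: datetime) -> str:
    """Return the storage tier for an event based on how old it is."""
    age = now - event_time
    if age <= HOT_WINDOW:
        return "hot"    # actively searched and correlated
    if age <= WARM_WINDOW:
        return "warm"   # queried occasionally, e.g. during investigations
    return "cold"       # retained mainly for compliance and forensics


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
for days_old in (1, 30, 365):
    print(days_old, tier_for(now - timedelta(days=days_old), now))
```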