Driving Scientific Discovery with Big Data

Learn how the Nuffield Department of Population Health at the University of Oxford handles expansive data sets, rich media, and a variety of projects in its quest for medical breakthroughs.

Medical research dominated headlines during the COVID-19 pandemic, and the Nuffield Department of Population Health (NDPH) was one of the first institutions to give the world some hope. Working alongside the Nuffield Department of Medicine, NDPH’s RECOVERY Trial provided an early breakthrough in the COVID-19 response: the discovery that an inexpensive steroid, dexamethasone, could save many severely ill patients. That’s just one example of the critical health research NDPH conducts.

Created in 2013, the department is a leading light in exploring how healthcare is delivered in the UK and around the world. And research on this global scale means data in remarkable volumes—something that Dave Ewart, NDPH’s IT infrastructure manager, knows only too well.

World-class Research and Worldwide Data

Ewart’s team runs all the crucial IT infrastructure that underpins the department’s work. His staff manages all servers, storage, and networking. As a result, the team is also ultimately responsible for the safety and security of the remarkable data stored within those systems.

Hundreds of researchers and postgraduate students constantly gather, generate, record, and analyse vast amounts of data as part of NDPH’s ongoing medical studies. Some projects involve data from hundreds of thousands of people, producing numerous complex databases drawn from a multitude of sources.

While the data that NDPH gathers and handles is complex, Ewart sees his team’s goal as quite simple: to provide a solid, reliable foundation for NDPH staff to do their work.

This staff spans a wide variety of roles and technology needs. Statisticians, medics, laboratory-based scientists, data analysts, HR administrators, and finance managers all rely upon technology that can handle their various project requirements. But back in 2019, the storage that underpinned those projects was starting to show its age.

Better Backups and Powerful Performance

When NDPH formed, it consolidated the IT storage infrastructure from its various research groups into two systems: one for core storage and the other for DMZ storage. This separation boosted internal data protection. To improve reliability, the department ran multiple instances of each system.

Fast-forward a few years, and Ewart wanted to introduce real-time replication to protect data and minimise risk. But this wasn’t possible on the existing hard disk arrays. Instead, replication happened on a timed basis, so any changes made since the last scheduled copy could be lost if something went wrong. That posed a risk to both resilience and data recovery.

There were performance issues too. Standard maintenance processes on the department’s clusters, like migration and snapshots, were delayed. And compute demands from end users often taxed the existing infrastructure. Ewart’s team decided to move away from hard disks and began looking for a high-performance, flash-based system.

All-flash, All Secure

After speaking with IT consultancy COOLSPIRiT, Ewart chose FlashArray//C™ from Pure Storage to replace all existing NDPH instances. Using Purity ActiveCluster™, the arrays now run real-time, fully symmetric bidirectional replication, improving the resilience of the department’s core storage and DMZ, and minimising risk. 

As well as helping to protect against unforeseen events, this setup lets Ewart and his team run maintenance on parts of the system without any downtime. This means that NDPH scientists, researchers, and department staff can continue their work without interruption.

Given the scope and scale of the health information NDPH handles, data protection is vital for governance and auditing. 

“Pure was the first major system we’d used that was encrypted at rest,” says Ewart. “We knew this would mean we could effectively protect the department’s data at every stage of its lifecycle. And now we have peace of mind, knowing that our scientists and researchers can reliably access the one-of-a-kind data sets they’ve gathered for their work.” 

Scalable, Flexible, and Affordable for Years to Come

With varied work and inevitable data growth, Ewart knew he needed storage that was designed to evolve and expand as NDPH’s needs changed. It also meant keeping operating costs under control. NDPH is already saving money on energy bills, as its Pure FlashArray systems use considerably less power than the previous spinning disk storage. The arrays also take up less rack space and generate much less heat, dramatically reducing the money spent on air conditioning. What’s more, the team knows their backup power supply will last longer if it’s ever needed.

“NDPH is a world-class research facility, making remarkable strides in the fields of medical and healthcare research,” says Ewart. “Given the breadth, variety, and volume of data we handle, we knew we’d need scalable storage we could depend on now and for years to come. With all our FlashArrays, we now have the performance, security, and capacity that the department needs as a foundation for its vital work.”