Last January, we offered some broader predictions for the enterprise technology market. Turns out those predictions are still looking pretty good (I was never very good at market timing ;-)). This year, our prognostications will be storage specific. There are three independent disruptions coming together to reshape the storage landscape in 2014 and beyond: flash memory, hyper-convergence (of compute and storage), and software defined storage (SDS).

Readers of this blog know we’ve been talking about the impact of flash for years (see, for example, Inflection Point for Enterprise Storage), but we have devoted far less virtual ink to hyper-convergence and SDS.

Hyper-converged storage is the combining of server and storage functionality within a single chassis. Google is widely regarded as the founder of the hyper-converged model, having realized that server farms could easily overwhelm the I/O capabilities of disk-based shared storage. Enterprise datacenters have since evolved from individual servers to cluster architectures, like those one finds with virtual servers. Recently a number of vendors have begun packaging hyper-converged storage for the enterprise (think VMware vSAN, EMC ScaleIO, or the appliances from Nutanix or SimpliVity) on the basic premise that a large volume of low-cost disk is the best way to address the performance and capacity needs of the modern datacenter. (Wikibon has instead taken to calling this architecture ServerSAN.)

Software-defined storage is the abstraction of storage provisioning, management, and operations (the control plane) from the underlying infrastructure that stores, protects, snapshots, replicates, and otherwise services the data (the data services). The promise of SDS is to deliver IT agility via policy-based automation for storage, most often through frameworks like OpenStack, VMware vCloud Automation Center, and EMC ViPR. This definition of SDS directly parallels that for Software Defined Networking (SDN), and SDS is a critical component in the move towards a software-defined datacenter (SDDC). Under these definitions, one would expect all hyper-converged storage to be software-defined as well, since it needs that to be manageable at scale, but dedicated storage arrays (which by definition are not hyper-converged) can support SDS too.

With those preliminaries out of the way, here are our datacenter storage predictions for 2014 and beyond:

(1)  All enterprise data that humans and applications are waiting on starts moving from hard drives to flash

Servers and networks have gotten ~1000-fold faster over the past twenty years thanks to Moore’s Law. As a result, mechanical disk is now hopelessly outclassed by the rest of the datacenter, which spends a preponderance of its time and energy waiting for hard drives to seek and rotate (which represents about 97% of what the disk is doing for virtualized and database workloads).
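To get a feel for why seek and rotation dominate, here is a rough back-of-the-envelope sketch. Every input (average seek time, spindle speed, I/O size, streaming rate) is an illustrative assumption for a hypothetical drive, not a measurement from this post, and the function name is ours. With these inputs the mechanical share of each random I/O lands in the mid-90s percent, the same ballpark as the figure above.

```python
def mechanical_fraction(avg_seek_ms, rpm, io_bytes, transfer_mb_per_s):
    """Fraction of one random I/O's service time spent seeking and rotating."""
    # On average the platter turns half a revolution before the sector arrives.
    rotational_ms = 0.5 * 60_000.0 / rpm
    # Time spent actually moving data once the head is in position.
    transfer_ms = io_bytes / (transfer_mb_per_s * 1_000_000.0) * 1_000.0
    mechanical_ms = avg_seek_ms + rotational_ms
    return mechanical_ms / (mechanical_ms + transfer_ms)


# Hypothetical 7,200 RPM drive serving 64 KiB random reads (assumed values).
fraction = mechanical_fraction(
    avg_seek_ms=8.0, rpm=7200, io_bytes=64 * 1024, transfer_mb_per_s=150.0
)
print(f"seek + rotate share of service time: {fraction:.1%}")
```

Smaller I/O sizes push the share even higher, because the transfer term shrinks while the mechanical term stays fixed.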

Consumers are now very used to the performance of flash, since it’s the technology underpinning their smartphones and tablets, as well as driving much of their user experience at consumer websites like Google, Facebook, Apple, LinkedIn, etc. Pure Storage catalyzed the enterprise and service provider transition to all-flash by pioneering a recipe that delivers 100% flash storage below the price point of SAN/NAS disk.

Exceptions should still be made for workloads that are gated on sequential bandwidth like video (disk is still pretty good at streaming), but otherwise hard drives are destined for backups, archiving, and other second tier use cases.

(2)  Hybrids of flash and disk are recognized to perform like disk, not like flash

The hyper-converged storage offerings available to the enterprise today are, like their SAN/NAS disk-centric counterparts, designed to blend flash and disk. These hybrids are all marketed as offering the performance of flash at the price of disk. However, as we have remarked on this blog, the reality falls far short of that promise. Rather, hybrids of flash and disk perform like disk: the disparity between solid-state flash and mechanical hard drives is so great that missing the flash cache just 5% of the time (95% cache efficiency) reduces your performance fivefold! For real-world applications, hybrids are incrementally faster disk systems. The same performance gap ultimately doomed disk/tape hybrid appliances; remember hierarchical storage management and virtual tape libraries? (More on inherent performance challenges with hybrids can be found here.)
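The cache-miss arithmetic is just a weighted average of the two latencies, which a minimal sketch makes concrete. The latencies below (0.1 ms for flash, 8.1 ms for a random disk access) are assumptions chosen for illustration, not numbers from this post. With them, a 95% hit ratio works out to an average of 0.5 ms, five times the all-flash latency.

```python
def effective_latency_ms(hit_ratio, flash_ms, disk_ms):
    """Average latency of a hybrid: hits served from flash, misses from disk."""
    return hit_ratio * flash_ms + (1.0 - hit_ratio) * disk_ms


# Assumed, illustrative latencies (not measurements).
FLASH_MS = 0.1
DISK_MS = 8.1

for hit_ratio in (1.0, 0.99, 0.95, 0.90):
    avg_ms = effective_latency_ms(hit_ratio, FLASH_MS, DISK_MS)
    slowdown = avg_ms / FLASH_MS
    print(f"hit ratio {hit_ratio:.0%}: {avg_ms:.3f} ms average, "
          f"{slowdown:.1f}x the all-flash latency")
```

Because the disk term is so much larger than the flash term, even a small miss rate dominates the average, which is why hybrids end up behaving like somewhat faster disk.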

The only way to close the 1000-fold performance gap with compute and networks is to go all-flash. Notably, the leading purveyors of disk have tacitly acknowledged this by developing all-flash product lines—see EMC XtremIO and NetApp EF-x50 & future FlashRay.

(3)  Software defined storage will be gated by hybrid storage complexity, with real progress hinging on the transition to fully virtualized and ultimately all-flash storage

If abstracting the control plane for networking is difficult, then doing the same for today’s disk-centric storage may prove impossible. First, much of legacy SAN storage is as yet unvirtualized: consider the issues storage admins face day to day that map to physical hardware, such as RAID configuration, drive rebuilds & sparing, block & host alignment, and performance contention & rebalancing. Even for virtualized storage (hyper-converged storage is generally virtualized), there are often still the complexities of caching policies, volume & sub-volume tiering, pool management, trading off performance vs. RAID/mirroring schemes, encryption, over-provisioning & reclamation, and so on. Looking at the management interface for legacy SAN/NAS storage systems, one thinks of an airplane cockpit with its hundreds of esoteric widgets. Now consider that the entire command-line interface for a next-generation all-flash array (AFA) like that from Pure Storage fits on a couple of business cards rather than a floor-to-desktop stack of manuals. With AFAs, all I/O and data management is logical rather than physical. By shedding two orders of magnitude of management complexity, we can catch storage up to servers and networks, and thereby enable cloud automation.

Afraid that is all for now, although definitely expect more on hyper-converged storage and SDS from Pure soon. Happy new year from Team Puritan.