For the tech industry, these are heady times indeed: The transformation in the enterprise technology landscape over the next ten years will rival that of the mid-to-late nineties, which was driven by the advent of the World Wide Web and the proliferation of the Internet. (I’d love to say I saw this coming, but the reality is that if you’d told me a few years ago we would be reinventing the data center as well as data and application platforms, I would have scoffed.)

The themes driving this transformation are already well underway:

  • Cloud computing – Virtualized, elastic (a.k.a. linearly scalable/scale out), software-defined, multi-tenant computing is a fundamental sea change for server software as well as tech infrastructure.
  • XaaS (Everything as a Service) – Getting the “cloud” right is hard, so it generally makes sense to let someone else take care of your business’s software needs, other than those that are a core competency.
  • Big Data – I like the IBM 3 V’s characterization: semi- and unstructured (variety) data is being generated at such an increasing rate (volume and velocity) that only cloud platforms are economical for mining the value therein.
  • Moore’s Law trumps Newton’s Law – The performance of consumer websites (think Google instant search) is raising the bar for enterprise applications. As a result, DRAM and flash memory will continue to push mechanical disk out of the execution path for online and performance applications.
  • Mobile trumps the PC – Desktops and laptops have been superseded by smartphones and tablets for users other than information workers (programmers and content authors), and even they will increasingly use smartphones and tablets as their primary device for consumption.

These themes are not controversial, but hopefully our predictions spurred by them will prove more so. For regular readers of this blog, I’ve elected to place the flash- and storage-related predictions at the bottom, as those will be less surprising. Without further ado, here are our top ten enterprise technology predictions for 2013:

Big and Traditional Data

(#10) Putting SQL back into NoSQL – The Apache Hadoop family is the dominant Big Data technology growing out of the NoSQL movement, and yet SQL’s use as an API for Big Data is on the rise, with Cloudera’s Impala accelerating that trend (full disclosure: I’m an outside director at Cloudera). No surprise, as NoSQL was always less about SQL per se and more about weaknesses in existing database architectures: lack of elasticity/cloud scalability, the overhead of joins/transactional semantics, cost, etc. But it is ironic that this will be the year of SQL for NoSQL.

(#9) Big Data goes on-line – Currently, the majority of Hadoop workloads are batch-oriented analytics. However, HBase, which provides more real-time database functionality on top of the Hadoop file system, is poised to take off in 2013. And with truly elastic data stores like Hadoop, analytic workloads can more feasibly be blended with OLTP without compromising performance.

(#8) Memory technologies continue to shake up the relational database landscape – Expect to see efforts to get the most performance-critical data entirely into solid-state (DRAM and flash). This will drive the growth of alternative “in memory” database architectures, as well as the redeployment of traditional SQL databases on all-flash storage. And the performance pressure is not just for OLTP: at Pure, our #1 use case is accelerating the performance of database analytics. (With SQL encroaching on NoSQL, expect traditional RDBMS technology to get a boost for analytics on structured data.) Typically, SQL performance is 10X better on all-flash vs. disk: for example, batch runtimes for analytic workloads drop from 22 hours down to 2 hours, opening the door for moving analytic apps on-line or developing 10X richer versions that still complete before the next business day. Longer term, expect the relational database to shed some of its disk-centric baggage (e.g., the OLTP transaction log designed to sequentialize random I/O), and expect cloud elasticity and solid-state storage to drive convergence between OLTP and analytic database architectures.

(#7) But Big Data stays on (slow) disk for now – All storage vendors want to get on the Big Data bandwagon, but the trouble is most of today’s Big Data workloads are large, sequential batch jobs. This leaves little role for performance storage—15K spindles are just too expensive relative to SATA, and flash is only 2-3X faster than disk at sequential reads (but more like 100X faster on random reads). Also, Big Data is typically compressed by the server software, meaning there’s less opportunity for the deduplication and compression that makes flash price competitive with disk. No doubt, as more Big Data goes on-line, we will see increased demand for getting at least the meta-data/indices into flash (which makes great sense if they are partitioned from the rest of the dataset), but traditional databases, server applications, and VDI are better fits for flash today (more on this below).
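
To make the sequential-versus-random point concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a job that is entirely I/O-bound (no CPU or network time), takes 2.5X as the midpoint of the 2-3X sequential figure above, and uses the 100X random-read figure as is. The function name and the workload mixes are hypothetical, chosen only for illustration.

```python
def blended_speedup(random_fraction, seq_speedup=2.5, rand_speedup=100.0):
    """Estimate overall speedup from moving an I/O-bound job from disk to flash.

    random_fraction is the share of the job's disk I/O time spent on random
    reads (0.0 to 1.0); the remainder is assumed to be sequential reads.
    The default speedups are assumptions taken from the figures quoted above.
    """
    if not 0.0 <= random_fraction <= 1.0:
        raise ValueError("random_fraction must be between 0 and 1")
    seq_fraction = 1.0 - random_fraction
    # Each portion of the original runtime shrinks by its own speedup factor.
    new_time = seq_fraction / seq_speedup + random_fraction / rand_speedup
    return 1.0 / new_time


# Hypothetical workload mixes: pure batch scan, half random, mostly random.
for share in (0.0, 0.5, 0.9):
    print(f"{share:.0%} random I/O -> {blended_speedup(share):.1f}x faster on flash")
```

Under these assumptions, a purely sequential batch job sees only the sequential speedup, and the gain climbs steeply only once random reads dominate, which is why indices and metadata are the more natural candidates for flash.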


(#6) VDI takes off – As we’ve remarked elsewhere, Virtual Desktop Infrastructure (VDI) got to critical mass in 2012 thanks to the appeal of Bring Your Own Device (BYOD) and the realization that VDI can actually improve the user experience on legacy PCs. Indeed, all-flash storage with inline deduplication has proven to be perfect for VDI: far lower cost than disk (due to >10:1 data reduction) and faster performance than a laptop’s local SSD! (Not so surprising when you consider a network round trip is a fraction of the cost of reading or writing the flash, and so if you can do a better job managing the flash on the other side of the network, it’s a performance win for the client.) For more on this, please check out Pure’s newly-released all-flash reference architectures for VDI.
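
The cost argument comes down to simple arithmetic: data reduction divides the raw price of flash by the reduction ratio. The Python sketch below shows the calculation. The per-GB prices are made-up placeholders, not actual vendor pricing, and the function names are hypothetical; only the 10:1 ratio comes from the text above.

```python
def effective_cost_per_gb(raw_cost_per_gb, reduction_ratio):
    """Cost per GB of data actually stored, after deduplication and compression."""
    if raw_cost_per_gb < 0:
        raise ValueError("raw_cost_per_gb must be non-negative")
    if reduction_ratio < 1:
        raise ValueError("reduction_ratio must be at least 1 (1 means no reduction)")
    return raw_cost_per_gb / reduction_ratio


def break_even_ratio(flash_raw_cost, disk_raw_cost):
    """Reduction ratio at which flash matches unreduced disk on cost per stored GB."""
    if disk_raw_cost <= 0:
        raise ValueError("disk_raw_cost must be positive")
    return flash_raw_cost / disk_raw_cost


# Placeholder prices in dollars per raw GB; substitute real quotes before relying on this.
FLASH_RAW = 10.0
DISK_RAW = 2.0

flash_effective = effective_cost_per_gb(FLASH_RAW, reduction_ratio=10)
print(f"Flash at 10:1 reduction: ${flash_effective:.2f} per stored GB")
print(f"Disk with no reduction:  ${DISK_RAW:.2f} per stored GB")
print(f"Break-even reduction ratio: {break_even_ratio(FLASH_RAW, DISK_RAW):.1f}:1")
```

The comparison assumes the disk system gets no data reduction at all; if it does, the break-even ratio rises accordingly.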

That’s it for Part 1. If you’re curious about the top five, I’m afraid you’re just going to have to wait for Part 2, coming in short order.

Thanks for listening, and Happy New Year from all of us at Pure Storage.