This article on low-latency analytics initially appeared on Kirk Borne’s LinkedIn. It has been republished with the author’s credit and consent.
On-prem data sources have the powerful advantage (for design, development, and deployment of enterprise analytics applications) of low-latency data delivery. Low-latency data delivery is a system-level requirement that is tied to a critical business user requirement: low-latency analytics product delivery! You cannot have the second without the first.
In my early days as a data systems manager at NASA, I learned this distinction: Business requirements specify what must be delivered to provide value to end users; system requirements specify how the proposed system will accomplish the business requirements. In my early years as a scientist (doing my own research on my own computer), I cared little about the “system” and more about the end results. As I progressed in my career into management roles for enterprise data systems, I gained a deeper understanding and appreciation of the synergies and interdependencies between system and user requirements.
The criticality of these synergies becomes obvious when we recognize analytics as the products (the outputs and deliverables) of the data science and machine learning activities that are applied to enterprise data (the inputs). Low-latency data access and data delivery (system requirements) are necessary for low-latency delivery of analytics products (business user requirement).
Here are some examples of specific analytics products: integrated and enriched data sets (including integrations with third-party data), curated feature stores, dashboards, predictions, prescriptions, alerts (for anomaly detection), alarms (for prescriptive maintenance), models, apps, data science notebooks, APIs, and other analytics application endpoints for internal enterprise users and external customers. See also some special categories of analytics products in this article: “Three Emerging Analytics Products Derived from Value-driven Data Innovation and Insights Discovery in the Enterprise.”
Data Infrastructure: An Essential System Requirement
Analytics products represent the user-facing and client-facing derived value that is extracted from an organization’s data stores. Consequently, low-latency data infrastructure is an essential system requirement for delivering low-latency analytics products that empower internal business users and external customer users to get their tasks done efficiently (as quickly as possible) and effectively (as completely as possible).
In addition to low latency, there are also other system features (i.e., potential requirements) that enable the enterprise data infrastructure to contribute significantly to the efficiency and effectiveness of enterprise analytics activities, applications, and products. These additional system features include parallelism and non-disruptive infrastructure upgrades:
- Data parallelism allows many different enterprise users, business use cases, and analytics applications to stream data in and out of the storage system simultaneously, while also speeding up single applications that require access to multiple distributed data sources.
- Non-disruptive infrastructure upgrades allow for storage device firmware updates, enterprise storage expansion, and reliable seamless failover (from unexpected infrastructure incidents), while keeping the data pipelines and low-latency analytics product applications running smoothly and continuously.
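The data-parallelism point above can be sketched in a few lines of Python: independent workers stream separate sources concurrently, so one slow source does not serialize the others. This is an illustrative sketch only; the sources here are in-memory buffers standing in for distributed storage volumes, and names like `read_source` are assumptions for the example, not part of any storage vendor’s API.

```python
# Minimal sketch of data parallelism: fan out reads so each source
# is streamed by its own worker rather than one after another.
from concurrent.futures import ThreadPoolExecutor
import io

def read_source(source: io.BytesIO, chunk_size: int = 4) -> bytes:
    """Stream a source to completion in fixed-size chunks."""
    out = bytearray()
    while chunk := source.read(chunk_size):
        out.extend(chunk)
    return bytes(out)

# Three independent sources, e.g. shards of one dataset spread
# across a storage system (simulated here with in-memory buffers).
sources = [io.BytesIO(f"shard-{i}-data".encode()) for i in range(3)]

# Fan out: one worker per source streams the data concurrently.
with ThreadPoolExecutor(max_workers=len(sources)) as pool:
    shards = list(pool.map(read_source, sources))

print(shards)  # [b'shard-0-data', b'shard-1-data', b'shard-2-data']
```

The same fan-out pattern applies whether the "sources" are object-store prefixes, database partitions, or sensor feeds: the aggregate throughput scales with the number of concurrent streams the storage layer can sustain.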
All of these system capabilities, which are made available through Pure Storage, have arrived just in time. When data science was in its “early days” within businesses, data scientists mostly worked offline with static sources (like databases, notebooks, open data repositories, or web-based reports) to build and test analytics models for potential deployment in the enterprise. As the number of sensors in business and industry environments began to increase dramatically, including ubiquitous internet of things (IoT) devices and data sourcing through APIs, so too has the number of analytics applications multiplied and become embedded within more enterprise business processes.
Accompanying these developments (and perhaps as a consequence of them), and much like my own career progression, data scientists began to develop that deeper appreciation of the importance of system requirements. The corresponding growth of system-wide analytics product development and deployment has made the enterprise data infrastructure a significant variable in the equation of business analytics success. Pure Storage analytics solutions can boost business performance and competitive advantage all across that data environment.
Easy, low-latency access to multiple diverse on-prem data sources is therefore essential in today’s high-stakes business landscape. Pure Storage offers on-prem data solutions that deliver on the essential combination of system and business user requirements, thus providing a resilient and robust data infrastructure for the delivery of low-latency analytics products that propel business innovation, success, and growth.
Read more about these solutions in the following articles:
- FlashBlade: Storage for Modern, Data-centric Organizations – “Designed to enable parallelism.”
- What Is a Non-Disruptive Upgrade (NDU)? – “Baked into the architecture of FlashArray™.”
- Why FlashBlade Is Truly Evergreen – “Non-disruptive upgrades, so data remains in place.”
- FlashArray//E Extends the Pure//E™ Family, Spelling the End for Hard Drives – “With capacity starting at 1PB, FlashArray//E™ broadens customers’ options to tackle data growth without needing to expand aging, highly inefficient, and expensive-to-run disk systems.”
Read our two related articles in this three-part series focused on enterprise analytics innovation:
- Solving the Data Daze: Analytics at the Speed of Business Questions
- The Data Space-time Continuum for Analytics Innovation and Business Growth