Each new SAP HANA support package stack (SPS) introduces features and functionality that extend what customers can do with the database. One area that has seen increased attention recently is data temperature management.
SAP HANA can be a costly database to run over time, especially as data grows. To keep the cost of operating the in-memory database manageable, a method of separating data into tiers was created. Each tier is based on the locality of data and balances capacity against performance requirements. The first tier in the data temperature model is for hot data, which is frequently accessed and has the highest performance requirements. The second tier is for warm data, which is infrequently accessed and has lower performance requirements than hot data, but needs to reside as a core part of the database for operations to continue. The third and final tier is for cold data, which is accessed sporadically and has low performance requirements.
Implementing a multi-temperature data strategy for SAP HANA can help reduce total cost of ownership (TCO) while preserving an organization’s long-term data growth strategy.
Hot data for SAP HANA resides in-memory. As data in this tier grows, more physical memory is required, which in turn uses more power and cooling and increases the cost of operating the database. However, if some of the data in the hot tier is not used frequently, moving it to the warm tier becomes an opportunity for cost savings.
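The tier definitions above can be sketched as a simple classification rule. This is a minimal illustration only: the thresholds, the `TableStats` type, and the table names are hypothetical and do not come from SAP HANA or from any test described here. Real cut-offs depend on the workload and on which data must stay online.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration.
HOT_MIN_READS_PER_DAY = 100.0
WARM_MIN_READS_PER_DAY = 1.0


@dataclass
class TableStats:
    """Access statistics for one table (hypothetical structure)."""
    name: str
    reads_per_day: float


def classify_temperature(stats: TableStats) -> str:
    """Map access frequency to a data temperature tier."""
    if stats.reads_per_day >= HOT_MIN_READS_PER_DAY:
        return "hot"    # frequently accessed: keep in memory
    if stats.reads_per_day >= WARM_MIN_READS_PER_DAY:
        return "warm"   # infrequently accessed: candidate for the warm tier
    return "cold"       # sporadically accessed: candidate for the cold tier


tables = [
    TableStats("SALES_CURRENT", 5000.0),
    TableStats("SALES_2017", 12.0),
    TableStats("AUDIT_2012", 0.1),
]
for table in tables:
    print(f"{table.name}: {classify_temperature(table)}")
```

In practice the decision would also account for whether the data must remain part of the database for operations to continue, which is what separates warm from cold data.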
The warm tier consists of the following data temperature management techniques:
- Paged Attributes: Only applicable to S/4HANA and SAP Business Suite powered by SAP HANA. This technique allows the columns of a column table to be loaded and unloaded between memory and persistent storage on an as-needed basis. It is implemented through the Data Aging Framework provided by SAP NetWeaver ABAP.
- Dynamic Tiering: A native SAP HANA technology that allows for tables to be located in extended storage, increasing warm data volume capacity.
- Extension Nodes: Based on the SAP HANA scale-out architecture, a node in a distributed SAP HANA system can be designated for warm data management. In this scenario, data that needs to move from the hot tier to the warm tier is redistributed to the extension node. An extension node can store more data (100% of DRAM size) than the normal SAP HANA sizing requirements allow (50% of DRAM size). This warm data method can be implemented natively or with SAP BW/4HANA and SAP Business Warehouse on SAP HANA.
- Native Storage Extension (NSE): A native, built-in warm data store that manages less frequently accessed data by reading it from disk rather than keeping it in memory.
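The extension node sizing figures in the list above can be worked through as simple arithmetic. The sketch below assumes the two ratios quoted there (50% of DRAM for a standard node, 100% for an extension node); the function name and the 2,048 GB example node are hypothetical.

```python
# Data-to-DRAM ratios taken from the extension node description above.
STANDARD_NODE_RATIO = 0.5    # normal SAP HANA sizing: data up to 50% of DRAM
EXTENSION_NODE_RATIO = 1.0   # extension node: data up to 100% of DRAM


def max_data_gb(dram_gb: float, extension_node: bool = False) -> float:
    """Return the data volume a node can hold under the quoted sizing ratio."""
    ratio = EXTENSION_NODE_RATIO if extension_node else STANDARD_NODE_RATIO
    return dram_gb * ratio


dram_gb = 2048.0  # hypothetical node with 2 TB of DRAM
print("standard node: ", max_data_gb(dram_gb), "GB")
print("extension node:", max_data_gb(dram_gb, extension_node=True), "GB")
```

Under these ratios, the same hardware holds twice as much data when it is used as an extension node.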
When SAP HANA SPS04 was released earlier in 2019, I blogged about the performance advantages of having a FlashArray™ as the persistent storage for multiple data temperature deployments. With the release of DirectMemory cache for FlashArray//X, query performance of data in NSE can be increased by up to 15% when compared to FlashArray//X without DirectMemory cache.
To prove the benefits of adding DirectMemory to SAP HANA deployments implementing a warm data tier, Pure was the first SAP partner to do in-depth testing and analysis on the performance of online analytical processing (OLAP) operations for a range of scale-up scenarios:
- A 2TB database with all tables in-memory.
- A 2TB database with 75% of tables specified to use NSE, with FlashArray//X and DirectMemory cache.
- A 2TB database with 75% of tables specified to use NSE, with FlashArray//X.
- A 2TB database with 75% of tables specified to use NSE, with direct-attached storage.
The metric collected for comparison is OLAP queries completed per second, where each OLAP query requests data only from tables specified to be accessed using NSE. The comparable in-memory test used the same tables and data as all of the other tests, except that all data was in the hot tier.
The results in Figure 1 (comparison of data locality) highlight the following:
- The best performance is achieved using only in-memory data. This is also the option with the highest TCO.
- FlashArray//X with DirectMemory cache has only a slight degradation in performance (10%) compared to in-memory data but saves up to 60% of the cost.
- FlashArray//X is only 15-20% slower than in-memory data while offering up to a 75% reduction in cost.
- Direct-attached storage is 40% slower than in-memory data, with the largest physical footprint and no data services.
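To make the trade-off in the list above concrete, the following sketch converts the quoted percentages into relative performance per unit of cost, with the in-memory configuration as the baseline (1.0 for both). It assumes the worst-case slowdown (20%) for FlashArray//X and treats the "up to" cost savings as fully realized, so the ratios are upper bounds rather than measured values. Direct-attached storage is omitted because no cost figure is given for it. The function and dictionary names are hypothetical.

```python
def performance_per_cost(relative_performance: float, relative_cost: float) -> float:
    """Performance delivered per unit of cost, relative to the in-memory baseline."""
    return relative_performance / relative_cost


# (relative performance, relative cost) derived from the percentages quoted above.
scenarios = {
    "in-memory": (1.00, 1.00),
    "FlashArray//X + DirectMemory cache": (0.90, 0.40),  # 10% slower, up to 60% cheaper
    "FlashArray//X": (0.80, 0.25),                       # up to 20% slower, up to 75% cheaper
}

for name, (perf, cost) in scenarios.items():
    print(f"{name}: {performance_per_cost(perf, cost):.2f}x performance per unit cost")
```

A higher ratio does not by itself make a configuration the right choice: a deployment that needs query throughput close to in-memory levels may still favor the DirectMemory cache option despite its lower ratio.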
Each deployment will have different requirements, and the way in which data is used will dictate how effective warm data management is. Even so, these results show that FlashArray with DirectMemory cache offers a significant benefit for SAP HANA environments.
See the Accelerating Warm Data Management with Pure Storage DirectMemory Cache solution brief for further details.