Security is consistently listed as a top concern when enterprises begin to evaluate cloud computing and cloud storage, or an on-premises storage model with consumption-based chargeback for capacity and performance.

The first line of defence in data security is encryption, one of the most popular and effective data security methods used by organisations. Its purpose is to protect the confidentiality of digital data while it is stored on storage systems and transmitted over the network.

The type of data encryption used has consequences for storage capacity and performance sizing. Businesses must weigh all of these considerations to make an informed decision about which type of encryption to use.

Encryption methods

Data at Rest Encryption

Pure Storage makes every effort to keep data stored in its systems both available to authorised users and secure against electronic intrusion and physical misappropriation.

Pure Storage data-at-rest encryption is always on and requires no configuration (including no external key management). The encryption mechanism used is AES-256, and the FlashArray is FIPS 140-2 compliant.

Encryption is software-based, which means it cannot be broken by modifying the drive firmware. By default, all keys are generated and managed internally by the array.

Our key management is sophisticated, including automatic key rotation, periodic key regeneration, and unreadable partitioned keys that are spread over FlashArray flash modules. Keys are changed when a drive is removed from the array in addition to every 24 hours. This ensures that flash modules that are failed or proactively replaced are secure and cannot be read.

More information is available at:

https://www.purestorage.com/content/dam/purestorage/pdf/whitepapers/FlashArray-Data-Security-and-Compliance.pdf

Host-based Transparent Encryption

Pure Storage partners with Vormetric to offer optional external key management with Vormetric DSM. For customers that require host-based encryption, we have jointly announced with Thales the integration between FlashArray and Vormetric Transparent Encryption (VTE).

More information is available at:

https://blog.thalesesecurity.com/2019/03/13/guest-blog-end-to-end-data-encryption-with-data-reduction-from-thales-pure-storage/

SAP HANA data volume encryption

SAP HANA features two data encryption services: data encryption in the persistence layer and an internal encryption service available to applications requiring data encryption.

The SAP HANA database encryption in the persistence layer is post-processing:

https://help.sap.com/viewer/102d9916bf77407ea3942fef93a47da8/1.0.11/en-US/dc01f36fbb5710148b668201a6e95cf2.html

If data volumes are encrypted, all pages that reside in the data area on disk are encrypted using the AES-256-CBC algorithm. Pages are transparently decrypted as part of the load process into memory. When changes to data are persisted to disk, the relevant pages are automatically encrypted as part of the write operation.

Compression techniques

A compressed SAP HANA database will still show compression efficiencies on persistence, because the database and the storage array use different compression algorithms.

Database compression

The SAP HANA database uses several data compression techniques in memory: standard dictionary compression, and advanced compression types such as prefix encoding, run-length encoding, cluster encoding, indirect encoding, and sparse encoding.
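Two of the techniques named above, dictionary compression and run-length encoding, can be illustrated with a minimal sketch. This is a generic textbook version written for this article, not SAP HANA's internal implementation; the function names and the sample column are hypothetical.

```python
from itertools import groupby

def dictionary_encode(column):
    """Replace each value with an integer ID into a sorted list of distinct values."""
    dictionary = sorted(set(column))
    index = {value: i for i, value in enumerate(dictionary)}
    return dictionary, [index[value] for value in column]

def run_length_encode(ids):
    """Collapse runs of identical IDs into (id, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(ids)]

def decode(dictionary, runs):
    """Invert both steps to recover the original column."""
    return [dictionary[value] for value, count in runs for _ in range(count)]

# Hypothetical column with few distinct values and long runs
column = ["DE", "DE", "DE", "US", "US", "FR", "FR", "FR", "FR"]
dictionary, ids = dictionary_encode(column)
runs = run_length_encode(ids)
print(dictionary)  # ['DE', 'FR', 'US']
print(runs)        # [(0, 3), (2, 2), (1, 4)]
```

Columns with few distinct values and long runs benefit most, which is why these encodings suit columnar, in-memory data.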

Storage compression

Pure Storage FlashArray uses five different data reduction technologies: pattern removal (looking for simple repeated patterns and zeroed data), 512B aligned variable dedupe, inline compression, deep reduction, and copy reduction.
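To make pattern removal and deduplication more concrete, the sketch below drops all-zero blocks and stores each distinct block only once. It is a deliberately simplified illustration: it uses fixed 512-byte blocks and SHA-256 fingerprints, both of which are assumptions made for this example, whereas the array's deduplication is variable-length and its internals are not described here.

```python
import hashlib

BLOCK = 512  # bytes per block; an assumption made for this illustration

def reduce_blocks(data: bytes):
    """Return (store, layout) after zero-pattern removal and block-level dedupe."""
    store = {}    # fingerprint -> block contents, kept once
    layout = []   # one entry per block: None for a zero block, else a fingerprint
    for offset in range(0, len(data), BLOCK):
        block = data[offset:offset + BLOCK]
        if not any(block):
            layout.append(None)  # pattern removal: nothing is stored
            continue
        fingerprint = hashlib.sha256(block).digest()
        store.setdefault(fingerprint, block)  # dedupe: first copy wins
        layout.append(fingerprint)
    return store, layout

def restore(store, layout) -> bytes:
    """Rebuild the data; assumes the original length was a multiple of BLOCK."""
    return b"".join(bytes(BLOCK) if ref is None else store[ref] for ref in layout)

# Hypothetical data: 4 zeroed blocks, 3 identical blocks, 1 distinct block
data = bytes(BLOCK) * 4 + b"A" * BLOCK * 3 + b"B" * BLOCK
store, layout = reduce_blocks(data)
stored_bytes = sum(len(block) for block in store.values())
print(f"{len(data)} logical bytes -> {stored_bytes} stored bytes")
```

In this sample, eight logical blocks shrink to two stored blocks before any compression is applied.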

Pure Storage compression is implemented as a two-stage process: The first pass performs what is called “medium weight” compression on new data after deduplication. Data that is committed to the back-end storage is then subsequently post-processed to perform a more intensive compression process, called Deep Reduction. The Purity operating system for FlashArray uses a combination of a lightweight LZO algorithm for first pass compression and a deep reduction version of Huffman coding for the post-processing. Pure has decided that the two-stage process offers a better balance between performance and space savings. In addition, during periods of heavy system load, certain optimization processes can be curtailed to ensure a consistent level of performance.
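The two-stage idea can be sketched with standard-library codecs. Python does not ship LZO or a standalone Huffman coder, so this example substitutes zlib at its fastest level for the inline pass and LZMA for the deep pass. These are stand-ins chosen only to show the trade-off between a cheap first pass and a more intensive background pass; they are not the algorithms Purity uses, and the sample data is hypothetical.

```python
import lzma
import zlib

def inline_pass(data: bytes) -> bytes:
    """Fast first pass on the write path (zlib level 1 as a stand-in)."""
    return zlib.compress(data, 1)

def deep_pass(inline_blob: bytes) -> bytes:
    """Background pass: unpack the fast encoding and recompress more intensively."""
    original = zlib.decompress(inline_blob)
    return lzma.compress(original, preset=9)

def read_back(deep_blob: bytes) -> bytes:
    """Recover the original data from the deep-reduced form."""
    return lzma.decompress(deep_blob)

# Hypothetical, highly repetitive sample data
sample = b"".join(b"row-%06d;status=OK;region=EMEA\n" % i for i in range(20000))
fast = inline_pass(sample)
deep = deep_pass(fast)
print(f"original={len(sample)} inline={len(fast)} deep={len(deep)}")
```

The first pass keeps write latency low, while the second pass recovers additional space later, when the system has spare cycles.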

Capacity efficiencies

Depending on the encryption type used, storage capacity sizing will be based either on raw capacity (i.e. excluding storage capacity savings) or on efficient capacity (i.e. including storage capacity savings).

FlashArray Data at Rest encryption

When using FlashArray Data-at-Rest encryption or Vormetric Host-Based Transparent Encryption, the expected storage data reduction ratios are as follows:

  • Storage capacity is expected to show data reduction efficiencies (Deduplication, Compression) between 1.6:1 and 2.2:1 for physical HANA database instances (dependent on scale-up/out architecture) when using Data at Rest Encryption.
  • Storage capacity is expected to show data reduction efficiencies (Deduplication, Compression) between 3.5:1 and 4.5:1 for virtual HANA database instances (dependent on scale-up/out architecture) when using Data at Rest Encryption.

Pure Storage calculates capacity efficiencies for storage sizing purposes with an average Data Reduction Ratio of 1.9:1 for physical HANA, and 3.5:1 for virtual HANA.
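As a worked example of how these averages feed into sizing, the sketch below divides a logical data size by the Data Reduction Ratio to estimate the physical capacity consumed. The 10 TiB database size and the function name are hypothetical; only the 1.9:1 and 3.5:1 ratios come from the text above.

```python
def physical_capacity_tib(logical_tib: float, data_reduction_ratio: float) -> float:
    """Estimate physical capacity consumed for a given logical size and ratio."""
    if data_reduction_ratio <= 0:
        raise ValueError("data reduction ratio must be positive")
    return logical_tib / data_reduction_ratio

logical_tib = 10.0  # hypothetical HANA persistence size

physical_hana = physical_capacity_tib(logical_tib, 1.9)  # physical HANA average
virtual_hana = physical_capacity_tib(logical_tib, 3.5)   # virtual HANA average

print(f"physical HANA: {physical_hana:.2f} TiB")  # 5.26 TiB
print(f"virtual HANA:  {virtual_hana:.2f} TiB")   # 2.86 TiB
```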

Vormetric Host-Based Transparent Encryption

Using the integration of Pure Storage FlashArray and Vormetric Transparent Encryption (VTE) allows you to preserve full data reduction benefits on the array.

Database encryption

When data is encrypted at the HANA database level, all data that is written to persistence is unique and will therefore negate any data reduction efficiencies like compression and/or deduplication, resulting in increased storage capacity requirements.

Whether the database is physical or virtual, storage capacity efficiencies will be reduced to a 1.0:1-1.5:1 Data Reduction Ratio.
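The effect of encrypting before the data reaches the array can be reproduced with a small experiment. The cipher below is a toy XOR keystream built from SHA-256, used only because the Python standard library does not include AES. It is an illustrative stand-in, not AES-256-CBC and not suitable for real encryption; the sample page, key and nonce are hypothetical.

```python
import hashlib
import os
import zlib

def toy_encrypt(plaintext: bytes, key: bytes, nonce: bytes) -> bytes:
    """XOR the input with a SHA-256 counter keystream (NOT AES, illustration only)."""
    out = bytearray()
    for offset in range(0, len(plaintext), 32):
        pad = hashlib.sha256(key + nonce + offset.to_bytes(8, "big")).digest()
        chunk = plaintext[offset:offset + 32]
        out.extend(a ^ b for a, b in zip(chunk, pad))
    return bytes(out)

def reduction_ratio(data: bytes) -> float:
    """Logical size divided by compressed size, using zlib as a generic compressor."""
    return len(data) / len(zlib.compress(data, 6))

page = b"customer=ACME;country=DE;status=OPEN;" * 2000  # hypothetical page content
key, nonce = os.urandom(32), os.urandom(16)
encrypted = toy_encrypt(page, key, nonce)

print(f"plaintext ratio: {reduction_ratio(page):.1f}:1")
print(f"encrypted ratio: {reduction_ratio(encrypted):.2f}:1")
```

The plaintext compresses many times over, while the encrypted copy stays at roughly 1:1, because well-encrypted output looks random and contains no repeated patterns for the array to exploit.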

The example below shows the data reduction efficiency results (Deduplication, Compression) for the encrypted virtual HANA databases (Start 2019-06-22 01:57, End 2019-06-22 02:21), versus the non-encrypted virtual HANA databases (Start 2019-06-22 06:03, End 2019-06-22 06:38):

[Figure Encr3: data reduction efficiency for the encrypted versus non-encrypted virtual HANA databases]

Performance impact

With SAP HANA data volume encryption, reads and writes to persistence carry a performance penalty: data must be decrypted when it is read from disk and encrypted when it is written to disk.

Data in memory is always decrypted and therefore there is no performance penalty associated with access to normal in-memory database operations.

Scenarios that involve access to data volumes and therefore have a performance impact are: column loads, writing savepoints, creating database snapshots and replication, creating data backups, creating system copies or refreshes, and importing hybrid LOBs (loaded in memory vs. remaining on persistence). These scenarios are dominated by storage I/O write operations and will cause encryption-related CPU overhead.

When the SAP HANA persistence layer is encrypted, data needs to be decrypted if read from disk and pulled into memory. The impact of the SAP HANA persistence layer decryption process on the database and storage performance is negligible.

The example below shows the performance penalties for the encrypted virtual HANA databases versus the non-encrypted virtual HANA databases:

The Pure Storage FlashArray write bandwidth shows 3.7 GB/s encrypted vs 4.3 GB/s unencrypted for sustained incompressible writes, with CPU utilisation not going above 50-60% in either case.

[Figures Encr1 and Encr2: performance of the encrypted versus non-encrypted virtual HANA databases]
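The write bandwidth figures quoted above can be turned into a relative penalty with simple arithmetic. The helper name is hypothetical; the 3.7 GB/s and 4.3 GB/s values are the ones reported in this test.

```python
def relative_penalty(unencrypted_gbps: float, encrypted_gbps: float) -> float:
    """Fraction of write bandwidth lost when encryption is enabled."""
    return (unencrypted_gbps - encrypted_gbps) / unencrypted_gbps

penalty = relative_penalty(4.3, 3.7)
print(f"write bandwidth penalty: {penalty:.1%}")  # 14.0%
```

In this test, database-level encryption therefore reduced sustained write bandwidth by roughly 14%.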