Monitoring Snapshot Space Consumption with FlashArray™

In this use case, we look at how FlashArray monitors snapshot space consumption, enables usage alerts, and offers predictive planning.

DAS


Snapshots are incremental data copies, which means that only the blocks on the device that have changed after your most recent snapshot are saved. This minimizes the time required to create the snapshot and saves on storage costs by not duplicating data.

Snapshots are designed to serve as recovery points during changes to an underlying volume. Most often they are deleted on a 7-day retention schedule. It is important to understand the risk of keeping them longer: not only do they consume space, but they also tend to cause a performance hit if they are not designed properly.

Snapshots are not intended to be used as backup copies, since they do not include a file-system-level or database-level consistency check.

Coordinated snapshots can be taken across multiple storage volumes, giving you exact point-in-time, data-coordinated, crash-consistent snapshots of all of those volumes.

Snapshot Space Consumption

As with conventional disks, a volume’s storage capacity is presented as a set of consecutively numbered 512-byte sectors into which data can be written and from which it can be read. Hosts read and write data in blocks, which are represented as consecutively-numbered sequences of sectors. FlashArrays do not allocate physical storage for volume sectors that no host has ever written, or for trimmed (expressly deallocated by host or array administrator command) sector addresses. FlashArrays allocate only the exact amount of physical storage required by each host-written block after reduction. In FlashArrays, there is no concept of allocating storage in “chunks” of some fixed size.

The Purity//FA storage software applies pattern elimination, duplicate elimination, and compression techniques to data as it enters an array, as well as throughout the data’s lifetime. FlashArray snapshots occupy physical storage only in proportion to the number of sectors of their source volumes that are overwritten by hosts. The net physical storage consumed by host-written data is a fraction of the number of unique volume sector addresses to which hosts have written data.
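The three reduction techniques named above can be illustrated with a toy sketch. This is not the Purity//FA implementation: the function name `reduced_size`, the use of SHA-256 for duplicate detection, and zlib for compression are all assumptions chosen only to make the idea concrete on 512-byte sectors.

```python
import hashlib
import zlib

SECTOR = 512  # bytes, matching the sector size described above


def reduced_size(data: bytes) -> int:
    """Return the bytes a toy array would store for `data` after
    pattern elimination, duplicate elimination, and compression.
    Hypothetical helper for illustration only."""
    seen = set()   # digests of sectors already stored
    stored = 0
    for off in range(0, len(data), SECTOR):
        block = data[off:off + SECTOR]
        # Pattern elimination: a sector made of one repeated byte
        # is treated as needing no data storage at all.
        if block == block[:1] * len(block):
            continue
        # Duplicate elimination: store each distinct sector once.
        digest = hashlib.sha256(block).digest()
        if digest in seen:
            continue
        seen.add(digest)
        # Compression: count only the compressed size.
        stored += len(zlib.compress(block))
    return stored


if __name__ == "__main__":
    text = (b"snapshot space " * 40)[:SECTOR]
    host_written = b"\x00" * (4 * SECTOR) + text * 3
    print(len(host_written), "bytes written by the host")
    print(reduced_size(host_written), "bytes stored after reduction")
```

In this toy model, the four zero-filled sectors cost nothing, the three identical text sectors are stored once, and that one sector is stored compressed, so the stored figure is a small fraction of what the host wrote.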

Snapshot Space Reporting

Snapshot Space is defined as the amount of space that would be freed by deleting snapshots. The figure below is a screenshot of the space used by the vSHNData 01-03 volumes and their snapshots. The space usage of a volume/snapshot as reported in the GUI is defined as follows:

1. Space attributed to the vSHNData_01 volume:

The amount of space that is unique to the volume (i.e. which is not duplicated on another volume or snapshot).

2. Space attributed to a snapshot of the vSHNData_01 volume:

The amount of space that would be freed if that snapshot were deleted. This is the amount of space in the volume that has been overwritten on the volume (or on a newer snapshot) but not deduplicated since the snapshot was taken. If there have been no writes to the volume since the snapshot was taken, the size of the snapshot will be 0.

Space1

Note that space reporting does not report what the differences between the snapshots and volumes are, only the amount of space that could be reclaimed by deleting the volume or the maximum that could be reclaimed by deleting a particular snapshot.
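The reporting rules above can be made concrete with a small model. The sketch below is a simplification built on assumptions: the class `ToyVolume` and its methods are hypothetical, space is counted in whole blocks, and data reduction is ignored. It only shows why a fresh snapshot reports 0 and why its reported space grows as the volume is overwritten.

```python
from collections import Counter
from itertools import count


class ToyVolume:
    """Toy model: each write creates a new block; a snapshot freezes
    the current sector-to-block map. Hypothetical, for illustration."""

    def __init__(self):
        self.live = {}        # sector -> block id (current volume contents)
        self.snapshots = {}   # snapshot name -> frozen sector -> block id map
        self._ids = count()   # source of fresh block ids

    def write(self, sector):
        self.live[sector] = next(self._ids)

    def snapshot(self, name):
        self.snapshots[name] = dict(self.live)

    def _owner_counts(self):
        # How many of (volume, snapshots) reference each block.
        owners = Counter(self.live.values())
        for frozen in self.snapshots.values():
            owners.update(frozen.values())
        return owners

    def snapshot_space(self, name):
        # Blocks that would be freed by deleting only this snapshot.
        owners = self._owner_counts()
        return sum(1 for b in self.snapshots[name].values() if owners[b] == 1)

    def volume_space(self):
        # Blocks unique to the volume (not shared with any snapshot).
        owners = self._owner_counts()
        return sum(1 for b in self.live.values() if owners[b] == 1)


if __name__ == "__main__":
    vol = ToyVolume()
    for sector in range(4):
        vol.write(sector)
    vol.snapshot("s1")
    print(vol.snapshot_space("s1"))  # 0: no writes since the snapshot
    vol.write(0)
    vol.write(1)
    print(vol.snapshot_space("s1"))  # 2: two sectors overwritten
    print(vol.volume_space())        # 2: the two new blocks are unique
```

Note that the two unchanged sectors are attributed to neither the volume nor the snapshot in this model, because both still reference them. This mirrors the point that the reported figures describe reclaimable space, not the differences between a snapshot and its volume.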

Snapshot Space Monitoring/Alerting

As a storage administrator, you should set usage alerts and forward them to the people who need to see them. A production storage array should never consume more than 80% of its capacity or performance (CPU). Set a yellow alert at 75% of storage system space consumption and a red alert at 95%. It is important to address storage consumption issues before they turn into major problems. If you use snapshots properly, these alerts will hopefully never fire.
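As a minimal sketch of the yellow and red levels described above, the function below classifies a capacity reading. The function name, parameter names, and the "ok" label are assumptions for illustration; this is not a FlashArray API, and the readings would come from whatever monitoring tool you use.

```python
def capacity_alert_level(used_bytes, usable_bytes, yellow=0.75, red=0.95):
    """Classify space consumption as 'ok', 'yellow', or 'red'.
    Hypothetical helper; thresholds default to the 75% / 95% levels above."""
    if usable_bytes <= 0:
        raise ValueError("usable_bytes must be positive")
    fraction = used_bytes / usable_bytes
    if fraction >= red:
        return "red"
    if fraction >= yellow:
        return "yellow"
    return "ok"


if __name__ == "__main__":
    usable = 100 * 2**40  # 100 TiB, an example figure
    for used_tib in (60, 78, 96):
        print(used_tib, "TiB used:", capacity_alert_level(used_tib * 2**40, usable))
```

Keeping the thresholds as parameters makes it easy to apply a stricter policy to production arrays than to test systems.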

To manage snapshot capacity consumption, quotas can be defined as a tuneable parameter based on snapshot size per volume or per system. In a space crunch, this makes it possible to limit the snapshot space in the system.

Because data stored in a FlashArray is virtualised, thin-provisioned, and reduced, volume storage is monitored, managed, and displayed from two viewpoints:

1. Host view: Displays the virtual storage capacity (size) and consumption as seen by the host storage administration tools.

2. Array view: Displays the physical storage capacity occupied by data and the metadata that describes and protects it.

Capacity on the array is monitored as per the following example:

Space2

The following alerts are particularly useful to monitor:

1. Alert 50 – Volume (or Protection Group) Snapshots are at X% of Snapshot Limit:

Snapshot creation is quick and easy. This alert informs you that the number of snapshots in a Protection Group is approaching, or has reached, the maximum number of snapshots that can be created. Once the limit is reached, you will not be able to create any more snapshots within the Protection Group.

2. Alert 142 – Volumes are at X% of Limit:

This alert informs you that the number of volumes on an array is approaching, or has reached, the maximum number of volumes that can be created. Once the limit is reached, you will not be able to create any more volumes.

3. Alert 25 – Storage Consumption has Reached X% of Usable Capacity:

This alert fires at three thresholds: 80%, 90%, and 100%. These alerts warn the FlashArray administrator about an upward trend in space consumption. If the array runs out of space without corrective action, you may suffer an outage. If phone home is enabled, the alert at the 100% threshold is also sent to Pure Storage support so they can investigate the issue and reach out to you with any recommended actions.

Example:

Space3
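If you forward capacity readings into your own tooling, a stepped alert like Alert 25 can be mirrored by reporting only the thresholds crossed since the previous reading, so that a steady 85% does not raise the 80% warning again on every poll. The sketch below assumes percentages are supplied by your monitoring system; `newly_crossed` and `ALERT_25_THRESHOLDS` are hypothetical names, not part of any Pure Storage interface.

```python
ALERT_25_THRESHOLDS = (80, 90, 100)  # percent of usable capacity, from the text above


def newly_crossed(previous_pct, current_pct, thresholds=ALERT_25_THRESHOLDS):
    """Return the thresholds crossed upward between two readings.
    Hypothetical helper for illustration."""
    return [t for t in thresholds if previous_pct < t <= current_pct]


if __name__ == "__main__":
    readings = [72, 79, 81, 85, 92, 100]
    for before, after in zip(readings, readings[1:]):
        for threshold in newly_crossed(before, after):
            print(f"Consumption rose from {before}% to {after}%: "
                  f"crossed the {threshold}% threshold")
```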

Predictive Planning/Alerting

Pure1® Meta™ applies predictive analytics to a massive collection of storage array capacity and performance data to enable both a white glove customer support experience and breakthrough capabilities like accurate forecasting.

With Pure1® Analyze™, you get a wealth of simple-to-understand metrics on the health, capacity and performance of your infrastructure.

You can see all your snapshot information in one place, as per the following example:

Space4

You can predict performance and capacity requirements through Pure1 Meta workload analysis, which uses trending and forecasting to produce accurate estimates of load and capacity growth:

Space5
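To give a feel for what capacity trending means in its simplest form, the sketch below fits a straight line to daily usage samples and estimates the number of days until usable capacity is reached. This is a plain least-squares trend under stated assumptions (one sample per day, linear growth) and is not how Pure1 Meta produces its forecasts; the function name `days_until_full` and the sample figures are made up for illustration.

```python
def days_until_full(daily_used_tib, usable_tib):
    """Estimate days from the last sample until usage reaches usable_tib,
    using an ordinary least-squares line through one-per-day samples.
    Returns None if usage is flat or shrinking. Hypothetical helper."""
    n = len(daily_used_tib)
    if n < 2:
        raise ValueError("need at least two samples")
    xs = range(n)  # day index 0 .. n-1
    mean_x = (n - 1) / 2
    mean_y = sum(daily_used_tib) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, daily_used_tib))
    slope = sxy / sxx  # TiB of growth per day
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    day_full = (usable_tib - intercept) / slope  # day index at which the line hits capacity
    return max(0.0, day_full - (n - 1))


if __name__ == "__main__":
    samples = [10.0, 11.0, 12.0, 13.0, 14.0]  # made-up daily readings in TiB
    print(days_until_full(samples, usable_tib=20.0))  # 6.0
```

A straight line is the crudest possible model; it ignores seasonality, snapshot retention cycles, and changes in data reduction, which is where an analytics service adds value.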

A dashboard shows all Pure arrays and volumes along with their growth deltas, so you can see at a glance what is consuming storage if the size of the growth is a concern:

Space7

The Pure1 Meta Workload Planner predicts both capacity and performance, and provides intelligent advice on workload deployment, interaction and optimisation:

Space8
