PowerCLI and VVols Part VII: Synchronizing a Replication Group

In this post, I will give an overview of how to synchronize a VVol-based replication group with PowerCLI. See the previous posts in this series for more context:

  • PowerCLI and VVols Part I: Assigning a SPBM Policy
  • PowerCLI and VVols Part II: Finding VVol UUIDs
  • PowerCLI and VVols Part III: Getting VVol UUIDs from the FlashArray
  • PowerCLI and VVols Part IV: […]

This post is somewhat specific to Pure Storage. The cmdlets are of course universal, but the behaviors may not match those of your storage array. So if you are using VVols on a non-Pure array, consult your vendor.

Furthermore, the commands here are specific to PowerCLI. That said, the fundamentals of how this works with Pure are common to all orchestration tools, so you should be able to apply this information to other tools, though of course the commands and syntax will be different.

 

In VVols, VMs are replicated in replication groups. These replication groups usually have some kind of replication schedule: once an hour, every 5 minutes, etc. There may come a time when you need to manually synchronize a replication group, such as before a planned failover or a test failover. So how is this managed with PowerShell?

Well there are a few options. First off you can simply use the array directly. We have a Pure Storage PowerShell SDK with a cmdlet called “New-PfaProtectionGroupSnapshot” that can do this for you. You can run that against the protection group you want to synchronize and wait for the replication to finish and move on to the VMware PowerCLI cmdlets. Nothing wrong with that.

But there is also a VMware PowerCLI cmdlet called “Sync-SpbmReplicationGroup”. Can we use that? Yes, and no. Let me explain.

In the VVol replication API, there are two operations around forcing a synchronization, one for synchronizing a replication group and one for querying for the status.

From a Pure Storage perspective, the issue with this VVol API operation (this is not specific to PowerCLI) is that VMware requires that the sync operation only be executed against the target replication site. This assumes that any asynchronous replication solution starts replication jobs from the target site, or in other words has the ability to “pull” a synchronization. The FlashArray async replication does not work this way. FlashArray asynchronous replication is a “push” type operation, not “pull”.

A synchronization is “pushed” to the target: replication point-in-times are compiled at the source FlashArray and sent to the target, so this command must be issued to the source site. See the problem here?

While we (Pure) are evaluating some changes to how this works, this is the case today (today being the start of 2019). VMware wants to only start syncs via the target site. Pure only allows syncs to be issued to the source site.

So do we even support this operation?

Yes, kinda.

It just may not behave as you’d expect. Let me walk through it. So first store a target replication group in some object. In my case I put it in $targetGroup.

For instructions on getting to this step see the blog post here.
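As a rough sketch of that step, the target group can be retrieved with Get-SpbmReplicationGroup. The vCenter address and the replication group name below are placeholders of mine, not values from my environment, so substitute your own:

```powershell
# Placeholder values: replace with your own target vCenter and group name.
$targetVC = Connect-VIServer -Server "target-vcenter.example.com"

# Store the target replication group so later commands can reference it.
$targetGroup = Get-SpbmReplicationGroup -Server $targetVC -Name "target-flasharray:vvol-pgroup"
```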

Now I can run Sync-SpbmReplicationGroup and pass in the target group. You must also pass in a name for what you want to call the point-in-time. This point-in-time name can be a string that makes sense to you.
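A minimal sketch of that call, assuming $targetGroup holds the target replication group. The point-in-time name "PrePlannedFailover" is just an example I made up:

```powershell
# $targetGroup is the target replication group stored earlier.
# The point-in-time name is arbitrary; pick something meaningful to you.
Sync-SpbmReplicationGroup -ReplicationGroup $targetGroup -PointInTimeReplicaName "PrePlannedFailover"
```

Run this way (without -RunAsync), the cmdlet blocks until a synchronization arrives at the target.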

What happens is that the cmdlet just waits and queries the array until a synchronization completes. The important point here is that this does NOT actually kick off a synchronization. The request is issued to the target FlashArray, which can do nothing but wait, and that is exactly what happens. When a sync request comes in, the target FlashArray (or more accurately, its VASA provider) notes the request and waits for a synchronization, which will arrive either according to the replication schedule or when someone kicks off a manual replication. It will not start just because you sent this request.

So however that replication synchronization comes through, once it completes, the cmdlet returns the point-in-time. You will see the point-in-time name as well.

You will notice that some point-in-times do not have names (note we might change this in the future). These are synchronizations that were not started via VMware tools such as PowerCLI. This is the reason I still recommend using VMware to synchronize before a failover or test if you want to specify a point-in-time other than the latest: it gives you more insight into that point-in-time.

So how should you do it? Well this is up to you. My recommendation is the following.

First, run Sync-SpbmReplicationGroup, BUT also add the -RunAsync parameter. The cmdlet will then return a task instead of waiting.

This will also have the VASA provider start its watchdog for a synchronization event.
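Something like the following should do it, again assuming $targetGroup holds the target replication group. The variable name $syncTask and the point-in-time name are my own choices for illustration:

```powershell
# -RunAsync returns a task object immediately instead of blocking.
$syncTask = Sync-SpbmReplicationGroup -ReplicationGroup $targetGroup -PointInTimeReplicaName "PrePlannedFailover" -RunAsync

# Display the task to check on its current state.
$syncTask
```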

Then connect to the source FlashArray with our PowerShell module:
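A sketch of the connection using the Pure Storage PowerShell SDK. The endpoint is a placeholder, and -IgnoreCertificateError is only appropriate if your array uses a self-signed certificate:

```powershell
# Load the Pure Storage PowerShell SDK (assumes it is already installed).
Import-Module PureStoragePowerShellSDK

# Placeholder endpoint: use the FQDN or IP of your SOURCE FlashArray.
$flasharray = New-PfaArray -EndPoint "source-flasharray.example.com" -Credentials (Get-Credential) -IgnoreCertificateError
```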

Then you can get the protection group name from the replication group. The simplest method is to get the source replication group using Get-SpbmReplicationPair:
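Roughly, assuming $targetGroup holds the target replication group, the pair lookup could look like this:

```powershell
# Look up the replication pair by its target group and keep the source side.
$sourceGroup = (Get-SpbmReplicationPair -Target $targetGroup).Source

# Show the source group, including its name.
$sourceGroup
```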

Then pull the pgroup name from the source replication group name:
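One way to sketch this is below. It assumes the replication group name takes the form "<array name>:<protection group name>"; check the actual name in your environment before relying on it:

```powershell
# $sourceGroup is the source replication group from the previous step.
# ASSUMPTION: its name looks like "<array name>:<protection group name>".
# Strip everything up to and including the first colon to leave the pgroup name.
$pgroupName = $sourceGroup.Name -replace '^[^:]*:', ''
```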

Now run the synchronization.
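With the SDK, that could look like the sketch below. It assumes $flasharray is the connection returned by New-PfaArray and $pgroupName is the source protection group name. Both variable names are mine:

```powershell
# -ReplicateNow pushes the new snapshot to the target array right away.
# -ApplyRetention applies the protection group's retention policy to it.
New-PfaProtectionGroupSnapshot -Array $flasharray -Protectiongroupname $pgroupName -ReplicateNow -ApplyRetention
```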

You can continue to run the sync-spbmreplicationgroup operation until the synchronization completes.

When the state is equal to success, a synchronization has completed. You can then run get-spbmpointintimereplica to see the new PiT:
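As an alternative to checking by hand, here is a sketch that uses Wait-Task. That is my substitution, not the approach described above, and it assumes $syncTask is the task returned earlier by Sync-SpbmReplicationGroup with -RunAsync:

```powershell
# Block until the asynchronous sync task finishes.
Wait-Task -Task $syncTask

# List the point-in-time replicas now available on the target group.
Get-SpbmPointInTimeReplica -ReplicationGroup $targetGroup
```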

Note we are looking into improving this behavior, so this may change at some point in the future (like removing the requirement to use our SDK to push the synchronization).

PrepareFailover

Another related operation is PrepareFailover. The PowerCLI cmdlet for this is Start-SpbmReplicationPrepareFailover. So what does this do?

Well two things:

  • It kicks off a synchronization event. This is sent to the source replication group (source FlashArray). When it is received, the FlashArray will do a replicate-now on the protection group to synchronize the replication. You would do this before a planned migration.
  • It also tells the target side to expect no more synchronizations (though it will allow more to come through if they get sent). So the target site will no longer behave as described above for Sync-SpbmReplicationGroup. It will always return “success” for that replication group until it is actually failed over.

So if I run start-spbmreplicationpreparefailover:

The cmdlet will complete when the synchronization is done. If I now run Sync-SpbmReplicationGroup against the target group, it will immediately return success:
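The two calls can be sketched as follows, assuming $sourceGroup and $targetGroup hold the source and target replication groups. The point-in-time name is an example of mine:

```powershell
# PrepareFailover is issued against the SOURCE replication group.
Start-SpbmReplicationPrepareFailover -ReplicationGroup $sourceGroup

# After PrepareFailover, a sync against the target group should return right away.
Sync-SpbmReplicationGroup -ReplicationGroup $targetGroup -PointInTimeReplicaName "PostPrepareFailover"
```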

So in short:

  • PrepareFailover should only be used when the source-side VMs have been shut down and you are ready to do a final sync and failover. Use the Sync-SpbmReplicationGroup method above otherwise.
  • It should not be used for a test failover, and it cannot be used for a disaster recovery failover (it needs to go to the source side, and the source side may not even be running in that case).