Non-disruptive DR Drills for Oracle Databases Using Pure Storage ActiveDR — Part 3 of 4

Part 3 of this four-part blog series shows how to use some simple scripting to perform on-demand DR drills without disrupting your production site.



In Part 3 of this four-part series, I’ll show some simple scripting to perform non-disruptive DR tests for Oracle databases using ActiveDR™.

Part 1: Configure ActiveDR and protect your DB volumes (why and how)
Part 2: Accessing the DB volumes at the DR site and opening the database
Part 3: Non-disruptive DR drills with some simple scripting
Part 4: Controlled and emergency failovers/failbacks

In Part 2, we completed the DR database configuration using the replicated volumes that were configured in Part 1.

Part 3 shows how we can use some simple scripting to perform on-demand DR drills without disrupting our production site.

In my opinion, this is a game-changer. Think about it: zero-risk DR testing that can be done anytime (goodbye Saturdays/Sundays stuck in the office) and as frequently as required. How many times have you released a new application into production without checking that its DR process works? Full-scale legacy DR failovers are just too disruptive to run every time we release a new application, so the DR process for a new application often goes untested until the next scheduled DR drill, which could be six months or more away. The ability to run DR drills anytime without disturbing production operations removes that uncertainty and prevents unwanted surprises in a real disaster.

Now that we have all of our ActiveDR and DB configuration in place from Parts 1 and 2, we’re ready for on-demand non-disruptive DR drills. The process is extremely simple and I can summarise it in five steps:

  1. Promote Oracle-DR pod
  2. Start DR database services and applications
  3. Complete any DR verification tests
  4. Stop DR applications and database services
  5. Demote Oracle-DR pod
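The five steps above can be sketched as a single wrapper script. This is a minimal illustration and not the scripts from the GitHub repository. It assumes the pod is promoted and demoted with the FlashArray CLI commands `purepod promote` and `purepod demote` over SSH, and that the DR database is registered so that `srvctl` can start and stop it. `ARRAY_HOST`, `ARRAY_USER`, `DB_NAME`, and `dr_verify.sh` are hypothetical placeholders. Any storage steps your host needs after promotion (for example, mounting file systems or disk groups) are left out.

```shell
#!/usr/bin/env bash
# DR drill sketch: promote pod, start DB, verify, stop DB, demote pod.
# Assumptions (not from the article): ARRAY_HOST, ARRAY_USER, DB_NAME and
# dr_verify.sh are hypothetical placeholders; srvctl manages the DR database.
set -eu

POD="${POD:-Oracle-DR}"
ARRAY_USER="${ARRAY_USER:-pureuser}"
ARRAY_HOST="${ARRAY_HOST:-dr-array.example.com}"
DB_NAME="${DB_NAME:-ORCL}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands only; 0 = execute them

# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "DRY-RUN: $*"
  else
    "$@"
  fi
}

dr_drill() {
  # 1. Promote the DR pod so its volumes become read/write.
  run ssh "${ARRAY_USER}@${ARRAY_HOST}" purepod promote "$POD"
  # 2. Start DR database services (applications would follow here).
  run srvctl start database -d "$DB_NAME"
  # 3. Run site-specific verification tests (hypothetical script).
  run ./dr_verify.sh
  # 4. Stop DR database services.
  run srvctl stop database -d "$DB_NAME"
  # 5. Demote the pod; changes made during the drill are discarded.
  run ssh "${ARRAY_USER}@${ARRAY_HOST}" purepod demote "$POD"
}

dr_drill
```

With the default `DRY_RUN=1` the script only prints the five commands in order, which is a safe way to review it before pointing it at a real array. Because of `set -e`, a failed step stops the script and leaves the pod promoted, so a hardened version would want a cleanup path that always stops the database and demotes the pod.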

This can easily be done with a few clicks in the DR FlashArray™ GUI and by manually starting the DR database at the command line. In a real-world scenario, however, I want some automation to handle this for me, so I’ve put together some very simple shell scripts that you can take a look at on GitHub.

Note: These scripts are for example purposes only. If you wish to use them, be aware that they’re not supported by Pure Storage and you should add your own error checking and hardening.

The scripts and their function are described below:

Instead of using multiple screenshots to show this DR drill process in action, it can more easily be understood with a short video.

The basic concept to understand is that the Oracle-DR pod moves through three stages during a drill, and each change is made with a single click or CLI command:

  • Demoted: The DR pod is receiving updates from the source pod via the replica link and the volumes in the pod are read-only.
  • Promoted: The replication is paused/queued and the volumes in the pod are read/write, allowing the database to be opened.
  • Demoted (again): The volumes in the pod return to read-only, they’re restored to their original state at the time of being promoted (by means of hidden snapshots), and the queued replication writes from the source are applied to resynchronise the DR volumes with the latest PROD data.
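Because the database can only be opened while the volumes are read/write, a script may want to confirm that the pod has actually reached the expected state before moving on. The sketch below shows one way to poll for that. How the status is obtained is an assumption: `POD_STATUS_CMD` is a hypothetical hook for any command that prints a single word such as `promoted` or `demoted` (for example, something that parses your array's CLI or REST output), and the default is a stub so the example runs without an array.

```shell
#!/usr/bin/env bash
# Sketch: wait until the pod reports the expected promotion status.
# POD_STATUS_CMD is a hypothetical hook; replace the stub with a command
# that prints the pod's current status as one word.
set -eu

POD_STATUS_CMD="${POD_STATUS_CMD:-echo demoted}"

# Print the current status by running the configured command.
pod_status() {
  $POD_STATUS_CMD
}

# wait_for_pod_status WANTED [ATTEMPTS] [DELAY_SECONDS]
# Returns 0 as soon as the status matches, 1 if the attempts run out.
wait_for_pod_status() {
  local wanted="$1"
  local attempts="${2:-30}"
  local delay="${3:-2}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if [ "$(pod_status)" = "$wanted" ]; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example use inside a drill script (left as a comment so this file only defines functions):
#   wait_for_pod_status promoted || { echo "pod not promoted; not starting the database" >&2; exit 1; }
```

The same function can be called with `demoted` after the drill to confirm that the pod is read-only again and replication can resume.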

The FlashArray system can have multiple pods that can be used for different purposes, so this is not an all-or-nothing approach. You could have an independent pod for each database or multiple databases in the same pod. You might have all your SQL databases in one pod and all your Oracle databases in another pod. Or, you might have some databases that do not run in a pod at all, instead choosing to replicate them with snapshots for things like downstream reporting systems that need to be up all the time. The configuration is flexible, so you can set it up to match how you want to operate. And did I mention ActiveDR is included in the price of every FlashArray? We don’t nickel and dime with additional charges for software features here at Pure.

So now that we’ve seen how to remove the pain and complexity of DR drills, we’ve got our weekends back and have reached a general state of bliss. But what if we have a real disaster (not just a test) and need a full failover? We’ll explore this in Part 4, the final blog post of this series.