Antoun Moubarak

Automate Disaster Recovery Plan Validation in OCI Using a Custom Precheck Tool

Overview

Disaster Recovery (DR) planning is a fundamental part of any high-availability architecture, ensuring business continuity when unexpected disruptions occur. In Oracle Cloud Infrastructure (OCI), Full Stack Disaster Recovery (FSDR) automates and orchestrates recovery processes across OCI regions.

At the heart of OCI’s FSDR service is the Disaster Recovery Protection Group (DRPG) — a logical collection of related OCI resources such as compute instances, databases, and more, that need to be recovered together to maintain system integrity. Within a DRPG, DR Plans define the precise sequence of actions necessary for failover, switchover, and DR drills, ensuring recovery efforts are seamless and efficient.

But how can you be certain that your DR plans are foolproof and ready to execute when disaster strikes?

To help teams proactively validate their DR plans, this CLI-based Precheck Tool verifies the configuration of all active DR plans within a given DRPG. It identifies potential misconfigurations or missing dependencies before executing failover or switchover actions — helping you catch issues early.

In this blog, we’ll walk through:

  • What the tool does
  • How it works
  • How to use it
  • How it integrates with OCI Notifications

What the Tool Does

This Python-based tool automates the validation (precheck) of all active DR plans associated with an OCI Disaster Recovery Protection Group (DRPG). It:

  • Automatically identifies whether the DRPG is primary or standby
  • Executes prechecks for each DR plan (Switchover, Failover, Start Drill, Stop Drill)
  • Waits for precheck completion and logs the results
  • Sends notifications via OCI Notification Topics (email/SMS/etc.)

This tool is especially useful for validating DR readiness as part of a scheduled job.
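The "wait for precheck completion" step can be pictured as a simple polling loop. The following is a minimal sketch, not the tool's actual code: the function name, the polling interval, the timeout, and the set of terminal states are assumptions chosen for illustration, and the mapping from plan type to precheck type is likewise an assumption based on FSDR naming.

```python
import time
from typing import Callable

# Assumed mapping from a DR plan type to the precheck execution type
# requested for it (hypothetical, not taken from the tool's source).
PRECHECK_TYPE_BY_PLAN_TYPE = {
    "SWITCHOVER": "SWITCHOVER_PRECHECK",
    "FAILOVER": "FAILOVER_PRECHECK",
    "START_DRILL": "START_DRILL_PRECHECK",
    "STOP_DRILL": "STOP_DRILL_PRECHECK",
}

# Assumed lifecycle states after which a plan execution no longer changes.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELED"}


def wait_for_precheck(
    get_state: Callable[[], str],
    poll_seconds: float = 10.0,
    timeout_seconds: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_state() until it returns a terminal state.

    Raises TimeoutError if no terminal state is seen within timeout_seconds.
    """
    waited = 0.0
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if waited >= timeout_seconds:
            raise TimeoutError(f"precheck still {state} after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds
```

With the OCI Python SDK, `get_state` could wrap a call such as `client.get_dr_plan_execution(execution_id).data.lifecycle_state`, where `client` is a Disaster Recovery client and `execution_id` is the OCID returned when the precheck was started. Injecting the callable keeps the loop easy to test without touching OCI.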

How It Works

The tool leverages:

  • OCI Python SDK for interacting with the DR services and notifications
  • Instance Principals for secure authentication inside OCI Compute
  • Structured logging to file and console
  • Email alerts via an OCI Notification Topic, if a topic OCID is supplied
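To illustrate the "structured logging to file and console" item, here is a minimal sketch using only the standard `logging` module. The function name `build_logger` and the log file name are hypothetical; the format string is chosen so that lines resemble the sample output shown later in this post.

```python
import logging
import sys
from pathlib import Path


def build_logger(name: str, log_dir: str = "logs") -> logging.Logger:
    """Return a logger that writes INFO and above to <log_dir>/<name>.log and to the console."""
    Path(log_dir).mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep messages out of the root logger

    if not logger.handlers:  # avoid attaching duplicate handlers on repeated calls
        formatter = logging.Formatter(
            "%(asctime)s %(levelname)-8s %(message)s",
            datefmt="%Y-%m-%d %H:%M:%S",
        )
        handlers = (
            logging.FileHandler(Path(log_dir) / f"{name}.log"),
            logging.StreamHandler(sys.stdout),
        )
        for handler in handlers:
            handler.setFormatter(formatter)
            logger.addHandler(handler)
    return logger
```

Calling `build_logger("precheck").info("Running precheck ...")` would then write the same line to both the console and `logs/precheck.log`.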

Here’s what a basic run looks like:

python full_stack_dr_plans_precheck.py -id ocid1.drprotectiongroup.oc1..xxxxx -nf ocid1.onstopic.oc1..yyyyy
Required:

-id DRPG_OCID, --drpg-ocid DRPG_OCID: 
OCID of the DR Protection Group (can be primary or standby)

Optional:

-nf ONS_TOPIC_OCID, --ons-topic-ocid ONS_TOPIC_OCID: 
OCID of the OCI Notification Topic for alerting
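For readers curious how such a command line might be declared, the following sketch reproduces the two documented options with `argparse`. It mirrors the flags and help text above, but the function name `build_parser` and the description string are assumptions, and the real script may define its parser differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the two options described in the usage text."""
    parser = argparse.ArgumentParser(
        prog="full_stack_dr_plans_precheck.py",
        description="Run prechecks for the DR plans of a DR Protection Group.",
    )
    parser.add_argument(
        "-id", "--drpg-ocid",
        required=True,
        metavar="DRPG_OCID",
        help="OCID of the DR Protection Group (can be primary or standby)",
    )
    parser.add_argument(
        "-nf", "--ons-topic-ocid",
        default=None,
        metavar="ONS_TOPIC_OCID",
        help="OCID of the OCI Notification Topic for alerting",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.drpg_ocid, args.ons_topic_ocid)
```

The parsed values are available as `args.drpg_ocid` and `args.ons_topic_ocid`; the latter is `None` when no topic is given, which is a convenient signal to skip notifications.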

What Gets Logged

During execution, the tool creates a .log file in the logs/ directory and writes all INFO logs to it.

If enabled, a notification is sent using the specified Notification Topic.

  • Example Output
2025-09-29 12:01:32 INFO     Standby DRPG: drpg-ashburn (ocid1.drpg...) is ACTIVE
2025-09-29 12:01:34 INFO     Running precheck for FAILOVER plan: app-dr-failover
2025-09-29 12:01:54 INFO     Precheck passed: app-dr-failover
2025-09-29 12:02:01 INFO     Running precheck for SWITCHOVER plan: app-dr-switchover
2025-09-29 12:02:19 ERROR    Precheck failed: app-dr-switchover
  • Notification Example

The tool automatically publishes an alert via OCI Notifications.

Subject:
FSDR Precheck Results for drpg-ashburn - ocid1.drprotectiongroup...

Body:
drpg-ashburn: ocid1.drprotectiongroup.oc1..aaaa...

2025-09-29 12:02:19 ERROR Precheck failed: app-dr-switchover
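A message shaped like the example above could be assembled and published roughly as follows. This is a sketch under stated assumptions: the helper names are hypothetical, the "include only ERROR lines" rule and the fallback text are guesses based on the sample message, and the publishing function assumes the OCI Python SDK is installed and the code runs on an instance with Instance Principal access.

```python
from typing import Iterable, Tuple


def build_notification(
    drpg_name: str, drpg_ocid: str, log_lines: Iterable[str]
) -> Tuple[str, str]:
    """Return (title, body) for the alert, keeping only ERROR log lines in the body."""
    title = f"FSDR Precheck Results for {drpg_name} - {drpg_ocid}"
    errors = [line.rstrip() for line in log_lines if " ERROR " in line]
    # Fallback text is hypothetical; the real tool may word this differently.
    details = "\n".join(errors) if errors else "All prechecks passed."
    body = f"{drpg_name}: {drpg_ocid}\n\n{details}"
    return title, body


def publish_notification(topic_ocid: str, title: str, body: str) -> None:
    """Publish the message to an OCI Notification Topic using Instance Principals."""
    import oci  # imported lazily so build_notification works without the SDK

    signer = oci.auth.signers.InstancePrincipalsSecurityTokenSigner()
    client = oci.ons.NotificationDataPlaneClient(config={}, signer=signer)
    message = oci.ons.models.MessageDetails(title=title, body=body)
    client.publish_message(topic_id=topic_ocid, message_details=message)
```

Keeping message construction separate from publishing means the formatting can be checked locally, while only `publish_notification` needs OCI access.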

Installation & Setup

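Ensure the script is run inside an OCI Compute instance with Instance Principal access. In practice this means the instance belongs to a dynamic group whose IAM policies allow it to work with the DR resources and, if alerts are used, to publish to the Notification Topic.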

  • Clone the repository:
git clone https://github.com/antounmoubarak/full-stack-dr-plans-precheck.git
cd full-stack-dr-plans-precheck

  • Install dependencies:
pip install -r requirements.txt
  • Run the tool:
python full_stack_dr_plans_precheck.py -id <your_drpg_ocid>
  • (Optional) Add --ons-topic-ocid to receive email alerts.

Security Notes

This script uses Instance Principal Authentication, so no API keys or secrets are stored locally.
All regional configuration is temporary and deleted after execution.

Final Thoughts

Disaster Recovery isn’t just about having a plan — it’s about knowing that plan works. Automating your prechecks gives you a safety net and builds trust in your cloud architecture.

Next Step:
Add scheduling via cron or any other scheduling tool.

To set up a cron job on Oracle Linux VMs that runs the script every day at 00:00 (midnight), follow these steps:

  • Open the Crontab Editor
crontab -e

This opens the user's crontab file in the default text editor (usually vi or nano).

  • Add the Cron Job

Assuming the Python script is located in the directory: /home/opc/full-stack-dr-plans-precheck, add this line at the bottom of the file:

0 0 * * * /home/opc/full-stack-dr-plans-precheck/full_stack_dr_plans_precheck.py -id <drpg_ocid> -nf <topic_ocid>

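0 0 * * * = Every day at 00:00
Make sure the script is executable (chmod +x) and begins with a Python shebang line, because this entry has cron invoke the file directly rather than through the python command.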

  • Save and Exit

  • In vi: Press Esc, then type :wq and press Enter

  • In nano: Press Ctrl+O, Enter, then Ctrl+X

  • Verify the Cron Job is Installed

crontab -l

This will list the current user's cron jobs and should include your new entry.

  • Check Cron Service Status

Make sure the crond service is running:

sudo systemctl status crond

If it’s not running:

sudo systemctl start crond
sudo systemctl enable crond