Mustafa ERBAY

Posted on • Originally published at mustafaerbay.com.tr

3-2-1 Backup: Automated, Encrypted, and Ransomware-Resistant

What is 3-2-1 Backup and why is it critical?

Last month, I received a "Data loss" alarm during a backup process on a production ERP; the "repository locked" line was plainly visible in the log, and the entire workflow ground to a halt for an hour. 3-2-1 backup means keeping three copies of your data, on two different types of media, with one copy stored off-site. This simple yet powerful formula greatly reduces the risk that a single failure destroys all your data. Restic offers a lightweight CLI, built-in encryption, and cloud storage backends to implement this principle, which makes it a common choice in production environments. The following commands initialize a new repository and create the first snapshot:

export RESTIC_PASSWORD='StrongPass!2026'
restic -r /mnt/backup/erp init
restic -r /mnt/backup/erp backup /opt/erp --tag production

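At the end of this step, restic stats reports the total size and file count of the backed-up data (about 12 GB here). This is still only the first copy; the other two come from the additional targets described below.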
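
To make the rule itself concrete, here is a minimal sketch of a checker for a backup inventory. The inventory format (one "name media location" line per copy, with location set to onsite or offsite) and the function name check_321 are assumptions made for illustration; neither is part of restic.

```shell
# check_321: read "name media location" lines on stdin and verify the 3-2-1 rule.
# Hypothetical inventory format: location is either "onsite" or "offsite".
check_321() (
    copies=0 media="" offsite=0
    while read -r name medium location; do
        [ -z "$name" ] && continue
        copies=$((copies + 1))
        # Remember each media type only once.
        case " $media " in
            *" $medium "*) ;;
            *) media="$media $medium" ;;
        esac
        [ "$location" = "offsite" ] && offsite=$((offsite + 1))
    done
    # Count the distinct media types via the positional parameters.
    set -- $media
    if [ "$copies" -ge 3 ] && [ "$#" -ge 2 ] && [ "$offsite" -ge 1 ]; then
        echo "OK: $copies copies, $# media types, $offsite off-site"
    else
        echo "FAIL: $copies copies, $# media types, $offsite off-site"
        exit 1   # the function body is a subshell, so this only ends the check
    fi
)
```

Feeding it three lines such as "primary disk onsite", "nas disk onsite" and "s3 object offsite" passes the check; dropping the S3 line makes it fail because no copy is off-site.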

How to set up automated backups with restic?

The heart of the automation is a service unit triggered by a systemd timer; I configured it to run every 6 hours. The service runs restic backup with credentials loaded from an environment file, and its output can be followed with journalctl:

# /etc/systemd/system/restic-backup.service
[Unit]
Description=Restic backup for ERP
Wants=network-online.target
After=network-online.target

[Service]
EnvironmentFile=/etc/restic/restic.env
ExecStart=/usr/bin/restic -r /mnt/backup/erp backup /opt/erp --tag production
StandardOutput=journal
StandardError=journal
# /etc/systemd/system/restic-backup.timer
[Unit]
Description=Run Restic backup every 6 hours

[Timer]
# Every 6 hours (00:00, 06:00, 12:00, 18:00); the format is HH:MM:SS
OnCalendar=*-*-* 00/6:00:00
Persistent=true

[Install]
WantedBy=timers.target

After setup:

systemctl daemon-reload
systemctl enable --now restic-backup.timer

Log example:

Oct 12 03:00:01 host systemd[1]: Started Restic backup for ERP.
Oct 12 03:00:02 host restic[1234]: snapshot 9f3c1c4c saved
Oct 12 03:00:02 host restic[1234]: added to the repository 12.3 GB of new data

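With this setup the backups run without manual intervention. Keep in mind that systemd does not retry a failed run on its own: unless Restart=on-failure is set in the [Service] section, the next attempt happens at the next timer tick.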
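
If retries are wanted, one option is a systemd drop-in next to the service unit. The following is a sketch under assumptions: the helper name write_retry_dropin, the file name retry.conf and the 5-minute delay are illustrative choices, not something the original setup prescribes.

```shell
# write_retry_dropin DIR: create a drop-in that asks systemd to retry failed runs.
# DIR would normally be /etc/systemd/system/restic-backup.service.d (needs root).
write_retry_dropin() (
    dir=$1
    mkdir -p "$dir" || exit 1
    cat > "$dir/retry.conf" <<'EOF'
[Service]
# Retry only when restic exits with an error, waiting 5 minutes in between.
Restart=on-failure
RestartSec=300
EOF
)
```

After writing the drop-in, systemctl daemon-reload makes systemd pick it up. A run that keeps failing is then retried every five minutes until it succeeds, so it is worth alerting on repeated failures rather than relying on retries alone.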

ℹ️ Tip

Using restic's --exclude flag to leave out temporary files can reduce backup time, by up to 30% depending on how much temporary data the backed-up tree contains.
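
When the list of patterns grows, restic's --exclude-file flag keeps them in one place. This is a small sketch, not the configuration from the setup above: the function name backup_with_excludes and the example patterns in the comment are assumptions.

```shell
# backup_with_excludes REPO SOURCE EXCLUDE_FILE
# Runs a tagged backup, skipping every pattern listed in EXCLUDE_FILE
# (one pattern per line, for example "*.tmp" or "/opt/erp/cache").
backup_with_excludes() (
    repo=$1 src=$2 exclude_file=$3
    # Fail early instead of silently backing up everything.
    [ -r "$exclude_file" ] || { echo "missing exclude file: $exclude_file" >&2; exit 1; }
    restic -r "$repo" backup "$src" --exclude-file "$exclude_file" --tag production
)
```

A call would look like backup_with_excludes /mnt/backup/erp /opt/erp /etc/restic/excludes.txt, where the excludes path is again only an example.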

How to manage encrypted storage and keys?

Encryption ensures that data is protected both in transit and at rest; restic encrypts everything it writes to the repository with AES-256 in counter mode and authenticates it with Poly1305-AES. I pull the RESTIC_PASSWORD environment variable from HashiCorp Vault, which avoids storing the password in a plain file on the backup host. The encryption phase took an average of 45 seconds for 12 GB of data:

export RESTIC_PASSWORD=$(vault kv get -field=password secret/restic)
restic -r s3:s3.amazonaws.com/erp-backup backup /opt/erp

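If server-side encryption (SSE-S3) is enabled on the bucket, the already encrypted restic data is encrypted a second time at rest, with keys that Amazon S3 manages and rotates; keys managed in AWS KMS are a different mode, SSE-KMS (header value aws:kms). Be careful with the policy itself: an Allow statement with Principal "*" would let anyone upload to the bucket. The example below instead uses a Deny statement that rejects uploads which explicitly request an encryption mode other than AES256; uploads that omit the header are still accepted and fall under the bucket's default encryption:

{
  "Version":"2012-10-17",
  "Statement":[{
    "Sid":"DenyNonSSES3Uploads",
    "Effect":"Deny",
    "Principal":"*",
    "Action":"s3:PutObject",
    "Resource":"arn:aws:s3:::erp-backup/*",
    "Condition":{
      "Null":{"s3:x-amz-server-side-encryption":"false"},
      "StringNotEquals":{"s3:x-amz-server-side-encryption":"AES256"}
    }
  }]
}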

With this structure, a leaked bucket exposes only ciphertext, and in my setup the encryption cost is only about 2% additional processing time.
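
Exporting RESTIC_PASSWORD keeps the secret in the shell's environment for as long as the session lives. As an alternative sketch, restic can run the lookup itself through the RESTIC_PASSWORD_COMMAND variable; this assumes the vault CLI is already authenticated (address and token configured) on the backup host.

```shell
# Let restic fetch the password itself on every run instead of exporting it.
# The Vault path secret/restic is the one used earlier in this article.
unset RESTIC_PASSWORD
export RESTIC_PASSWORD_COMMAND='vault kv get -field=password secret/restic'

# restic runs the command above and reads the password from its stdout:
#   restic -r s3:s3.amazonaws.com/erp-backup snapshots
```

With this approach the password never sits in a variable of its own; only the command that retrieves it does.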

How to choose a ransomware-resistant storage target?

Ransomware-resistant storage means reducing dependency on a single provider and protecting data even during physical disasters; that is why I use at least two different geographical regions and one local copy (a NAS) that is periodically exported to offline media. The table below compares annual durability, average access time, and cost for three popular options:

| Storage Type | Durability (annual) | Average Access Time | Annual Cost (USD) |
|---|---|---|---|
| S3 Standard | 99.999999999% | 50 ms | 120 |
| Wasabi Hot | 99.999999999% | 70 ms | 100 |
| Local NAS (RAID-6) | 99.999% | 5 ms (local) | 250 (hardware + maintenance) |

In light of this data, the S3 Standard + Local NAS combination pairs the eleven-nines design durability of object storage with low local latency at a reasonable cost. An important point: copying the encrypted snapshots from the NAS to a USB drive once a week, and disconnecting the drive afterwards, keeps one copy physically isolated from anything ransomware can reach.
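
The weekly offline copy can be scripted. The following is a minimal sketch: the function name offline_copy, the erp-repo subdirectory and any mount path you pass in are assumptions, and rsync is used here simply because the repository is a directory of already encrypted files.

```shell
# offline_copy REPO_DIR USB_MOUNT
# Copies the (already encrypted) restic repository to a mounted USB drive.
# Refuses to run when USB_MOUNT is not a mount point, so a missing drive
# never fills the root filesystem by accident.
offline_copy() (
    repo_dir=$1 usb_mount=$2
    mountpoint -q "$usb_mount" || { echo "not mounted: $usb_mount" >&2; exit 1; }
    # No --delete: files removed from the source stay on the offline copy.
    rsync -a "$repo_dir/" "$usb_mount/erp-repo/"
)
```

Run it while no backup is in progress, then unmount and unplug the drive; the copy only counts as offline once it is physically disconnected.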

How to test and monitor the 3-2-1 strategy?

Testing means simulating a real disaster scenario; I perform a snapshot restore test in the first week of every month. First, I get the latest snapshot ID and restore it to a local directory to verify integrity:

# Look up the short ID of the most recent snapshot
LATEST=$(restic -r /mnt/backup/erp snapshots --latest 1 --json | jq -r '.[0].short_id')
# Restore it into a scratch directory and report the restored size
restic -r /mnt/backup/erp restore "$LATEST" --target /tmp/restore-test
du -sh /tmp/restore-test

The output looks like this:

12.3G   /tmp/restore-test
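
A matching size is a weak signal on its own, so the restored tree can also be compared with the source. This sketch assumes the source directory has not changed since the snapshot was taken (otherwise the differences it reports are expected); the function name verify_restore is hypothetical.

```shell
# verify_restore SOURCE_DIR RESTORE_ROOT
# restic recreates the full original path below the --target directory,
# so /opt/erp restored to /tmp/restore-test ends up in /tmp/restore-test/opt/erp.
verify_restore() (
    source_dir=$1 restore_root=$2
    restored="$restore_root$source_dir"
    [ -d "$restored" ] || { echo "missing: $restored" >&2; exit 1; }
    # diff -r compares file contents recursively; -q prints only names that differ.
    diff -rq "$source_dir" "$restored"
)
```

For the restore above, the call would be verify_restore /opt/erp /tmp/restore-test; a zero exit status means no file differs.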

Next, I run an rsync verification against the offline copy (NAS); when nothing is missing, rsync run with --stats reports zero transferred files. Monitoring is provided via a Prometheus + Grafana dashboard, which visualizes daily backup duration, failure counts, and encryption latency. The Mermaid diagram below summarizes the automated backup, verification, and reporting workflow:

[Diagram: automated backup, verification, and reporting workflow]

This workflow is triggered every 6 hours, writes to two different targets simultaneously, and then completes the process with automated verification and visual reporting.
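
One way to get backup results into Prometheus is the node_exporter textfile collector, which reads metric files from a directory. The sketch below assumes that collector is in use; the metric names and the function name write_backup_metrics are made up for illustration and are not emitted by restic itself.

```shell
# write_backup_metrics STATUS DURATION_SECONDS OUTPUT_FILE
# Writes metrics in the Prometheus text exposition format. STATUS is the
# exit code of the backup command (0 means success).
write_backup_metrics() (
    status=$1 duration=$2 out=$3
    tmp="$out.$$"
    {
        echo "# TYPE restic_backup_last_exit_code gauge"
        echo "restic_backup_last_exit_code $status"
        echo "# TYPE restic_backup_duration_seconds gauge"
        echo "restic_backup_duration_seconds $duration"
        echo "# TYPE restic_backup_last_run_timestamp_seconds gauge"
        echo "restic_backup_last_run_timestamp_seconds $(date +%s)"
    } > "$tmp"
    # Rename atomically so the collector never reads a half-written file.
    mv "$tmp" "$out"
)
```

A wrapper around the backup would time the restic run, capture its exit code, and pass both to this function; a Grafana alert on a stale timestamp then catches backups that silently stopped running.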

Common mistakes and workarounds (War Story)

Last month, when I encountered a repository lock error, the restic backup command hung for an hour and systemd timed out; this line appeared in the log:

2026-05-28T14:03:12Z restic: repository /mnt/backup/erp is locked by another process

The source of the problem was two timers firing at the same time; one backup was still running while the other tried to start. As a fix, I added RandomizedDelaySec=300 to the [Timer] section and wrapped the backup command in flock, as in ExecStart=/usr/bin/flock -n /var/run/restic.lock /usr/bin/restic -r /mnt/backup/erp backup /opt/erp --tag production, so that a second backup cannot start while the first one holds the lock. The lock has to wrap the restic command itself: a flock call in ExecStartPre would release the lock as soon as that step exits. After the new configuration, lock errors disappeared, and the average backup duration decreased from 5 minutes to 4.3 minutes.

⚠️ Warning

Overlapping restic jobs on the same repository compete for repository locks, and a job that cannot get its lock fails or stalls; always serialize them with a lock file or a single systemd unit.
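
The locking pattern can be tried outside systemd as well. This is a minimal sketch assuming flock from util-linux is installed; the helper name run_exclusive is hypothetical.

```shell
# run_exclusive LOCKFILE COMMAND [ARGS...]
# Holds LOCKFILE for the whole lifetime of COMMAND. With -n, a second
# invocation fails immediately instead of queueing behind the first.
run_exclusive() (
    lockfile=$1
    shift
    flock -n "$lockfile" "$@"
)

# Example call, mirroring the backup command used in this article:
#   run_exclusive /var/run/restic.lock restic -r /mnt/backup/erp backup /opt/erp --tag production
```

Because the lock is tied to the wrapped process, it is released automatically when the backup exits, even if it crashes, so no stale lock file has to be cleaned up.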


Conclusion

By combining the 3-2-1 backup principle with restic, systemd, and encrypted cloud storage, we get an automated, encrypted, and ransomware-resistant backup setup. The next step could be tuning the snapshot retention policy and moving to WORM (Write-Once-Read-Many) storage for longer-term archiving. Adapted to your own environment, this approach should substantially reduce the risk of data loss.