Losing data hurts. Whether it's a corrupted disk, accidental deletion, or a bad deployment that wipes your production database, recovery without backups means starting from scratch. Automated PostgreSQL backups remove the human factor from the equation. You set them up once, and they run reliably while you focus on other things.
This guide covers practical approaches to scheduling PostgreSQL backups, from simple cron jobs to dedicated backup tools. We'll look at what actually matters for different scenarios and how to avoid common mistakes that make backups useless when you need them most.
Why automate PostgreSQL backups
Manual backups work until they don't. Someone forgets, someone's on vacation, someone assumes the other person did it. Automation eliminates these failure modes.
The cost of manual backup processes
Manual processes introduce variability. One day you run the backup at 2 AM, the next week at 6 PM. Sometimes you compress the output, sometimes you don't. The backup script lives on someone's laptop instead of version control. When disaster strikes, you discover the last backup was three weeks ago and nobody noticed.
Automated backups run consistently. Same time, same configuration, same destination. They either succeed or they alert you immediately. There's no ambiguity about whether yesterday's backup happened.
What good backup automation looks like
Reliable backup automation has a few key characteristics. It runs without intervention once configured. It stores backups in locations separate from the source database. It notifies you of failures immediately. And it maintains enough historical backups to recover from problems you discover days or weeks later.
| Characteristic | Manual process | Automated process |
|---|---|---|
| Consistency | Varies by person | Same every time |
| Coverage | Often gaps | Continuous |
| Failure detection | Often delayed | Immediate alerts |
| Documentation | Usually missing | Built into config |
Good automation also handles retention. You don't want unlimited backups consuming storage forever, but you do want enough history to recover from slow-developing problems like data corruption that goes unnoticed for a week.
Using pg_dump with cron
The simplest automation approach combines PostgreSQL's native pg_dump utility with cron scheduling. This works for small to medium databases where backup windows aren't tight.
Basic pg_dump script
Create a backup script that handles the actual dump process:
#!/bin/bash
# Fail the pipeline if pg_dump fails, not just gzip
set -o pipefail
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/var/backups/postgresql"
DATABASE="myapp_production"
BACKUP_FILE="${BACKUP_DIR}/${DATABASE}_${TIMESTAMP}.sql.gz"
# Create backup directory if it doesn't exist
mkdir -p "$BACKUP_DIR"
# Run pg_dump with compression
pg_dump -h localhost -U postgres -d "$DATABASE" | gzip > "$BACKUP_FILE"
# Check if backup succeeded
if [ $? -eq 0 ]; then
    echo "Backup completed: $BACKUP_FILE"
else
    echo "Backup failed!" >&2
    exit 1
fi
# Remove backups older than 7 days
find "$BACKUP_DIR" -name "*.sql.gz" -mtime +7 -delete
Save this as /usr/local/bin/pg-backup.sh and make it executable:
chmod +x /usr/local/bin/pg-backup.sh
The script creates timestamped, compressed backups and removes old ones automatically. For a typical plain-SQL dump, gzip reduces the file size by 80-90%.
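If you want longer history than the 7-day window, for example to catch corruption that only surfaces weeks later, a simple tiered scheme works: promote one backup per week into a directory with a longer retention period. A minimal sketch that extends the script above (the weekly directory and the 30-day window are assumptions you can adjust):
# Keep Sunday backups for 30 days in a separate directory,
# outside BACKUP_DIR so the 7-day cleanup above doesn't touch them
WEEKLY_DIR="/var/backups/postgresql-weekly"
mkdir -p "$WEEKLY_DIR"
if [ "$(date +%u)" -eq 7 ]; then
    cp "$BACKUP_FILE" "$WEEKLY_DIR/"
fi
find "$WEEKLY_DIR" -name "*.sql.gz" -mtime +30 -delete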
Setting up cron schedules
Add a cron entry to run the backup at your preferred time. Edit the crontab:
crontab -e
Add a line for daily backups at 3 AM:
0 3 * * * /usr/local/bin/pg-backup.sh >> /var/log/pg-backup.log 2>&1
For hourly backups during business hours:
0 9-18 * * 1-5 /usr/local/bin/pg-backup.sh >> /var/log/pg-backup.log 2>&1
The log redirect captures both stdout and stderr, so you can troubleshoot failures.
Handling authentication
Avoid putting passwords in scripts. Use a .pgpass file instead:
echo "localhost:5432:myapp_production:postgres:yourpassword" >> ~/.pgpass
chmod 600 ~/.pgpass
PostgreSQL reads credentials from this file automatically when the connection parameters match. The strict permissions (600) are required; PostgreSQL ignores the file if others can read it.
Cron jobs run with a minimal environment, not your full login shell, so use absolute paths and define anything the script needs directly in the crontab.
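For example, a crontab that pins PATH and points libpq at the credentials file might look like this (the paths shown are illustrative):
PATH=/usr/local/bin:/usr/bin:/bin
PGPASSFILE=/home/deploy/.pgpass
0 3 * * * /usr/local/bin/pg-backup.sh >> /var/log/pg-backup.log 2>&1
This basic approach works, but you'll want monitoring to know when backups fail.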
Adding monitoring and alerts
A backup that fails silently is worse than no backup at all. You think you're protected, but you're not. Add monitoring to catch problems early.
Email notifications
Modify the backup script to send email on failure:
#!/bin/bash
# Fail the pipeline if pg_dump fails, not just gzip
set -o pipefail
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/var/backups/postgresql"
DATABASE="myapp_production"
BACKUP_FILE="${BACKUP_DIR}/${DATABASE}_${TIMESTAMP}.sql.gz"
ADMIN_EMAIL="admin@example.com"
mkdir -p "$BACKUP_DIR"
pg_dump -h localhost -U postgres -d "$DATABASE" | gzip > "$BACKUP_FILE"
if [ $? -eq 0 ]; then
    echo "Backup completed: $BACKUP_FILE"
else
    echo "PostgreSQL backup failed at $(date)" | mail -s "ALERT: Database backup failed" "$ADMIN_EMAIL"
    exit 1
fi
find "$BACKUP_DIR" -name "*.sql.gz" -mtime +7 -delete
This sends an email when the backup pipeline exits with a non-zero status. Note that the mail command requires a working local mail setup. You might also want success notifications for critical databases, just to confirm everything's working.
Webhook integration
For team chat notifications, curl to a webhook:
send_notification() {
    local message="$1"
    local webhook_url="https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
    curl -s -X POST -H 'Content-type: application/json' \
        --data "{\"text\":\"$message\"}" \
        "$webhook_url"
}
# Call this in place of the earlier check, right after the pg_dump pipeline,
# while $? still holds its exit status
if [ $? -eq 0 ]; then
    send_notification "PostgreSQL backup completed: $DATABASE"
else
    send_notification "ALERT: PostgreSQL backup failed for $DATABASE"
    exit 1
fi
Replace the webhook URL with your Slack, Discord, or other service endpoint. Most chat platforms accept this basic JSON format.
Verifying backup integrity
A backup file existing doesn't mean it's usable. Add verification steps:
# Check file size (should be at least some minimum)
MIN_SIZE=1000
FILE_SIZE=$(stat -f%z "$BACKUP_FILE" 2>/dev/null || stat -c%s "$BACKUP_FILE")
if [ "$FILE_SIZE" -lt "$MIN_SIZE" ]; then
send_notification "WARNING: Backup file suspiciously small ($FILE_SIZE bytes)"
fi
# Verify gzip integrity
if ! gzip -t "$BACKUP_FILE" 2>/dev/null; then
send_notification "ALERT: Backup file appears corrupted"
exit 1
fi
The size check catches cases where the database connection failed but the script didn't error properly. The gzip test verifies the compression is intact.
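One more cheap check is possible because plain-format pg_dump output normally ends with a "PostgreSQL database dump complete" comment; if that marker is missing, the dump was probably cut short. A sketch, reusing the notification helper from above:
# A complete plain-format dump ends with a completion marker; its absence suggests truncation
if ! gunzip -c "$BACKUP_FILE" | tail -n 20 | grep -q "PostgreSQL database dump complete"; then
    send_notification "WARNING: Backup may be truncated (no completion marker found)"
fi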
Remote storage for backups
Backups stored on the same server as the database don't protect against disk failures, server compromises, or datacenter issues. Store copies remotely.
S3 and compatible storage
Add S3 upload to your backup script:
BUCKET="s3://my-backup-bucket/postgresql"
# Upload to S3
aws s3 cp "$BACKUP_FILE" "$BUCKET/" --storage-class STANDARD_IA
if [ $? -ne 0 ]; then
    send_notification "ALERT: S3 upload failed for $DATABASE backup"
    exit 1
fi
# Optionally remove local file after successful upload
# rm "$BACKUP_FILE"
The STANDARD_IA storage class costs less for infrequently accessed files like backups. Configure the AWS CLI with aws configure before running the script.
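Since cron doesn't load your interactive shell configuration, it also helps to pin the credentials profile and region explicitly in the backup script. A sketch, where the profile name and region are placeholders:
# One-time setup on the server (writes ~/.aws/credentials and ~/.aws/config)
aws configure --profile backups
# In the backup script: pin the profile and region so cron uses the right credentials
export AWS_PROFILE=backups
export AWS_DEFAULT_REGION=us-east-1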
For S3-compatible services like Cloudflare R2 or MinIO, add the endpoint:
aws s3 cp "$BACKUP_FILE" "$BUCKET/" --endpoint-url https://your-endpoint.com
Retention policies
Remote storage should have its own retention rules. S3 lifecycle policies can automatically expire old backups:
{
    "Rules": [
        {
            "ID": "ExpireOldBackups",
            "Status": "Enabled",
            "Filter": { "Prefix": "postgresql/" },
            "Expiration": { "Days": 30 }
        }
    ]
}
Apply with:
aws s3api put-bucket-lifecycle-configuration \
    --bucket my-backup-bucket \
    --lifecycle-configuration file://lifecycle.json
This keeps 30 days of backups automatically. Adjust based on your recovery requirements.
Automated backups with Databasus
Writing and maintaining backup scripts takes time. Monitoring, remote storage integration, retention management, and team notifications all add complexity. Databasus, a dedicated PostgreSQL backup tool, handles all of this out of the box with a web interface.
Installation
Run Databasus using Docker:
docker run -d \
    --name databasus \
    -p 4005:4005 \
    -v "$(pwd)/databasus-data:/databasus-data" \
    --restart unless-stopped \
    databasus/databasus:latest
Or with Docker Compose:
services:
  databasus:
    image: databasus/databasus:latest
    container_name: databasus
    ports:
      - "4005:4005"
    volumes:
      - databasus-data:/databasus-data
    restart: unless-stopped
volumes:
  databasus-data:
Start the service:
docker compose up -d
Configuration steps
Access the web interface at http://your-server:4005, then:
- Add your database — Click "New Database", select PostgreSQL, and enter your connection details (host, port, database name, credentials)
- Select storage — Choose where backups should go: local storage, S3, Google Drive, SFTP, or other supported destinations
- Select schedule — Pick a backup frequency: hourly, daily, weekly, monthly, or define a custom cron expression
- Click "Create backup" — Databasus validates the configuration and starts the backup schedule
Databasus handles compression automatically, supports multiple notification channels (Slack, Discord, Telegram, email), and provides a dashboard showing backup history and status. It works for both self-hosted PostgreSQL and cloud-managed databases like AWS RDS and Google Cloud SQL.
Choosing backup frequency
How often you back up depends on how much data you can afford to lose. This is your Recovery Point Objective (RPO).
Matching frequency to requirements
| Scenario | Acceptable data loss | Recommended frequency |
|---|---|---|
| Development database | Days | Weekly |
| Internal tools | Hours | Daily |
| Customer-facing app | Minutes to an hour | Hourly |
| Financial/compliance | Near zero | Continuous (WAL archiving) |
For most applications, daily backups at off-peak hours work well. Hourly backups suit applications with frequent writes where losing an hour of data would be painful.
Timing considerations
Schedule backups during low-traffic periods. pg_dump reads the database consistently but still generates load. A large dump during peak hours can slow down your application.
Consider time zones. If your users are mostly in one region, schedule backups when they're sleeping. For global applications, find the least-busy period in your analytics.
Database size matters too. A 100 GB database might take 30 minutes to dump. If you want hourly backups, you need that process to complete well within the hour.
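A simple way to keep an eye on this is to time the dump inside the backup script and warn when it runs long. A sketch that wraps the pg_dump line from earlier, with the threshold as an arbitrary example:
# Time the dump so long runs show up in the logs
START=$(date +%s)
pg_dump -h localhost -U postgres -d "$DATABASE" | gzip > "$BACKUP_FILE"
DURATION=$(( $(date +%s) - START ))
echo "Backup of $DATABASE took ${DURATION}s"
# 3000 seconds leaves headroom in an hourly schedule; pick a threshold that fits yours
if [ "$DURATION" -gt 3000 ]; then
    send_notification "WARNING: backup of $DATABASE took ${DURATION}s"
fi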
Testing your recovery process
Backups you've never tested are assumptions, not guarantees. Regular restore tests catch problems before they matter.
Restore verification steps
Create a test environment and restore periodically:
# Create a test database
createdb -h localhost -U postgres myapp_restore_test
# Restore the backup
gunzip -c /var/backups/postgresql/myapp_production_20240115_030000.sql.gz | \
psql -h localhost -U postgres -d myapp_restore_test
# Run basic validation
psql -h localhost -U postgres -d myapp_restore_test -c "SELECT count(*) FROM users;"
# Clean up
dropdb -h localhost -U postgres myapp_restore_test
Automate this as a weekly job and alert on failures. A backup that can't be restored is worthless.
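A sketch of such a job, using the same paths and localhost setup as above (the script location is an assumption; wire in the notification helper from earlier if you want alerts instead of log lines):
#!/bin/bash
# Weekly restore test: load the newest backup into a scratch database
set -o pipefail
BACKUP_DIR="/var/backups/postgresql"
LATEST=$(ls -t "$BACKUP_DIR"/*.sql.gz | head -n 1)
dropdb --if-exists -h localhost -U postgres myapp_restore_test
createdb -h localhost -U postgres myapp_restore_test
if gunzip -c "$LATEST" | psql -q -v ON_ERROR_STOP=1 -h localhost -U postgres -d myapp_restore_test > /dev/null; then
    echo "Restore test passed: $LATEST"
else
    echo "Restore test failed: $LATEST" >&2
    exit 1
fi
dropdb -h localhost -U postgres myapp_restore_test
Schedule it for Sunday mornings, for example:
0 4 * * 0 /usr/local/bin/pg-restore-test.sh >> /var/log/pg-restore-test.log 2>&1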
Documenting recovery procedures
Write down the exact steps to recover. Include:
- Where backups are stored (all locations)
- How to access storage credentials
- Commands to restore
- Expected recovery time
- Who to contact if issues arise
Test the documentation by having someone unfamiliar with the system follow it. Gaps become obvious quickly.
Common automation mistakes
Even well-intentioned backup automation fails in predictable ways.
Storage on the same disk
Backing up to the same physical disk as the database protects against accidental deletion but not hardware failure. Always include remote storage.
No retention limits
Unlimited backup retention eventually fills your storage. Set explicit retention policies and monitor disk usage.
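A crude but effective guard is to check free space at the start of each backup run. A sketch, with the 90% threshold as an arbitrary cutoff:
# Abort the backup early if the destination volume is nearly full
USAGE=$(df -P /var/backups/postgresql | awk 'NR==2 {gsub("%", "", $5); print $5}')
if [ "$USAGE" -ge 90 ]; then
    send_notification "ALERT: backup volume is ${USAGE}% full"
    exit 1
fi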
Ignoring backup duration
A backup that takes 4 hours can't run hourly. Monitor how long your backups take and adjust schedules accordingly. Alert when duration exceeds thresholds.
Hardcoded credentials
Passwords in scripts end up in version control, logs, and process listings. Use .pgpass files, environment variables, or secrets management.
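If the default ~/.pgpass location doesn't suit the user the cron job runs as, libpq's PGPASSFILE variable lets you point at a dedicated file (the path below is illustrative):
# Point libpq at a dedicated credentials file instead of hardcoding a password
export PGPASSFILE=/etc/pg-backup/.pgpass
chmod 600 /etc/pg-backup/.pgpass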
Missing failure notifications
Cron only emails a job's output, and only if the server has a working local mail setup; failures that exit silently, or servers with no mail system at all, go unnoticed. Always add explicit failure handling and notifications.
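A heartbeat ping also covers the case where the script never runs at all, because the server is down or cron is misconfigured. Services such as Healthchecks.io expect a periodic HTTP request and alert you when it stops arriving; the URL below is a placeholder:
# Ping a heartbeat monitor only after a successful backup;
# the monitoring service alerts you when the pings stop
curl -fsS -m 10 --retry 3 "https://hc-ping.com/your-uuid-here" > /dev/null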
Conclusion
Automated PostgreSQL backups prevent the kind of data loss that damages businesses and ruins weekends. Start with cron and pg_dump for simple setups, add monitoring and remote storage as your requirements grow, or use a dedicated tool like Databasus to handle the complexity. Whatever approach you choose, test your restores regularly. A backup strategy is only as good as your ability to recover from it.
