Hello, persistent friend! Yesterday we took a deep dive into security. Today we're going to talk about the database equivalent of pressing Ctrl+S in your favourite RPG — backups and recovery.
We'll break down the three fundamental backup strategies recognised by PostgreSQL: SQL dumps, file system backups, and continuous archiving (point‑in‑time recovery). We'll sprinkle in analogies and code examples so you can implement these right after reading.
💾 Why Backups Matter (a Love Letter to Future You)
Imagine writing a novel but forgetting to save halfway through. Ouch. Databases are no different: accidents, hardware failures, and the occasional DROP DATABASE happen. As the PostgreSQL docs emphasise, you need a clear understanding of the backup techniques and their assumptions, because valuable data should be backed up regularly.
🛠️ Strategy 1 – SQL Dumps (Logical Backups)
SQL dumps are like photocopying your diary. They produce a text file full of CREATE TABLE, INSERT, and other commands to recreate your database. PostgreSQL's pg_dump tool makes this simple:
```shell
# Back up a single database to a SQL file
pg_dump mydb > mydb_dump.sql

# Restore it with psql into a fresh database
createdb -T template0 mydb_restored
psql -X mydb_restored < mydb_dump.sql
```
pg_dump is just another PostgreSQL client, so you can run it remotely. Its dumps are internally consistent, don't block other readers and writers, and work across versions: a dump taken from an older server can be restored into a newer one. The downside? It's slow for very large databases, and it doesn't capture the WAL needed for point-in-time recovery.
When to use: migrating to a new server, upgrading PostgreSQL versions, or taking snapshot backups of smaller databases.
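For anything beyond small databases, pg_dump's custom format (-Fc) is worth knowing: it compresses output and lets pg_restore run in parallel. A minimal sketch, assuming a database named mydb:

```shell
# Dump in custom format (compressed, supports selective and parallel restore)
pg_dump -Fc mydb > mydb.dump

# Restore into a fresh database using 4 parallel jobs
createdb -T template0 mydb_restored
pg_restore -d mydb_restored -j 4 mydb.dump
```

The custom format also lets you restore individual tables with pg_restore's -t option, something a plain SQL dump can't do without manual editing.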
🧱 Strategy 2 – File System Level Backups
A file system backup is like cloning your hard drive. You copy the entire PostgreSQL data directory:
```shell
# Make sure the server is stopped!
sudo systemctl stop postgresql
tar -cf backup.tar /var/lib/postgresql/data
```
While simple, this method requires you to shut down PostgreSQL to get a usable backup. You also cannot back up individual tables: you must copy the entire cluster, because table files depend on commit-status data in pg_xact. You can mitigate downtime by using file system snapshots if your storage supports them: create a frozen snapshot, copy the data, then release the snapshot. Just remember to include the WAL files and run CHECKPOINT first to reduce recovery time.
When to use: offline backups of small servers or quick clones of development environments.
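The snapshot workflow above can be sketched with LVM; this is one illustrative setup, and the volume and mount-point names (/dev/vg0/pgdata, /mnt/pgsnap) are hypothetical:

```shell
# Force a checkpoint so the snapshot needs less WAL replay on restore
psql -c 'CHECKPOINT;'

# Create a frozen snapshot of the volume holding the data directory
sudo lvcreate --snapshot --size 10G --name pgsnap /dev/vg0/pgdata

# Copy the snapshot contents (including pg_wal/) to backup storage
sudo mount /dev/vg0/pgsnap /mnt/pgsnap
sudo tar -cf /var/backups/pg_snapshot.tar -C /mnt/pgsnap .
sudo umount /mnt/pgsnap

# Release the snapshot so it stops consuming copy-on-write space
sudo lvremove -y /dev/vg0/pgsnap
```

The key property is that the snapshot is atomic: the copied files look exactly like a crash at that instant, which PostgreSQL knows how to recover from on startup.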
🔄 Strategy 3 – Continuous Archiving & Point‑in‑Time Recovery (PITR)
Think of continuous archiving as recording a movie instead of just taking a single photo. PostgreSQL writes every change to the write-ahead log (WAL). By archiving these WAL files along with a base backup, you can restore to any point in time.
```ini
# In postgresql.conf
wal_level = replica        # or 'logical' for logical replication
archive_mode = on
archive_command = 'cp %p /var/backups/wal/%f'
```

```shell
# Create a base backup
pg_basebackup -D /var/backups/base -Fp -Xs -P
```
Continuous archiving lets you replay WAL segments up to the exact moment you need, enabling point-in-time recovery and warm standby servers. It doesn't require a snapshot, and log replay corrects any internal inconsistencies in your base backup. The trade-off is extra storage and operational complexity.
When to use: mission‑critical systems where downtime is unacceptable and you need the ability to recover to a precise second.
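Here's what a point-in-time restore can look like on PostgreSQL 12 or later, assuming the paths from the examples above; the recovery target timestamp is hypothetical:

```shell
# Stop the server and replace the data directory with the base backup
sudo systemctl stop postgresql
sudo rm -rf /var/lib/postgresql/data
sudo cp -a /var/backups/base /var/lib/postgresql/data

# Tell PostgreSQL how to fetch archived WAL and where to stop replaying
cat >> /var/lib/postgresql/data/postgresql.conf <<'EOF'
restore_command = 'cp /var/backups/wal/%f %p'
recovery_target_time = '2025-06-01 12:00:00'
EOF

# recovery.signal puts the server into targeted recovery mode on startup
sudo touch /var/lib/postgresql/data/recovery.signal
sudo systemctl start postgresql
```

On startup the server replays WAL until the target time, then pauses so you can verify the result before promoting it with pg_wal_replay_resume() or pg_promote().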
🧰 Bonus – Open‑Source Backup Tools in 2025
Manual backups can be tedious. Here are some tools to automate things (and keep your sanity):
- WAL‑G — the spiritual successor to WAL‑E. Built in Go, it compresses and encrypts backups, handles multiple databases and integrates seamlessly with cloud storage.
- Barman — an EDB‑maintained tool written in Python. It performs remote backups for multiple servers with centralised management.
- pgBackRest — known for reliability and flexibility, it supports full, differential and incremental backups with parallel processing, compression and encryption.
Choose based on your environment’s scale and cloud strategy.
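To make this concrete, here is roughly what the WAL-G workflow looks like; the bucket name is hypothetical, and credentials/region configuration is omitted:

```shell
# WAL-G is configured via environment variables
export WALG_S3_PREFIX='s3://my-backup-bucket/pg'
export PGDATA=/var/lib/postgresql/data

# Push a full base backup to object storage
wal-g backup-push "$PGDATA"

# In postgresql.conf, archive WAL through WAL-G instead of a plain cp:
#   archive_command = 'wal-g wal-push %p'

# Later: fetch the latest backup when restoring
wal-g backup-fetch "$PGDATA" LATEST
```

Barman and pgBackRest follow the same pattern (base backup plus archived WAL), differing mainly in configuration style and where the backup catalogue lives.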
📋 Comparing Strategies
| Method | Pros | Cons |
|---|---|---|
| SQL Dump | Version‑portable, consistent, can migrate across architectures | Slow for huge databases, no WAL for PITR |
| File System Backup | Simple, fast to copy entire data directory | Requires downtime, cannot restore individual tables |
| Continuous Archiving | Supports point‑in‑time recovery and warm standbys | Complex setup, requires extra storage |
🚀 Try It Yourself
- Take a backup of your favourite database using pg_dump.
- Create a base backup with pg_basebackup and set archive_mode = on.
- Experiment by restoring to a specific moment in time.
Share your experiences (and war stories) in the comments!
Tomorrow we’ll wrap up the series by discussing replication and high availability — because nothing says “production ready” like your database having a stunt double. See you then!