This article is part of a series on PostgreSQL WAL.
Part 1 - How PostgreSQL WAL Actually Works
Part 2 - How Debezium Uses PostgreSQL WAL for CDC
Part 3 - PostgreSQL Backups with pgBackRest and PITR
In the previous articles, we've explored how PostgreSQL records every change inside the database using Write-Ahead Logging (WAL).
We also saw how tools like Debezium can read WAL to stream database changes in real time.
But WAL has another very important role.
It enables database backups and recovery.
By combining a base backup with archived WAL segments, PostgreSQL can restore a database to any specific moment in time. This capability is known as Point-in-Time Recovery (PITR).
In this article, we’ll explore how this works and how to implement it using pgBackRest.
Why Backups Alone Are Not Enough
A common backup strategy is to take periodic full backups of a database.
For example:
02:00 → Full backup taken
14:00 → Accidental DELETE executed
If we restore the 02:00 backup, the database returns to that state.
But everything between 02:00 and 14:00 is lost.
restore backup
↓
database returns to 02:00
↓
12 hours of data lost
This is where WAL archiving becomes important.
Instead of relying only on periodic backups, PostgreSQL can archive WAL segments continuously.
This allows the database to replay changes until a specific moment in time.
How WAL Enables Point-in-Time Recovery
Point-in-Time Recovery works by combining two components:
Base Backup
+
Continuous WAL Archiving
=
Point-in-Time Recovery
The recovery flow looks like this:
Base Backup
│
▼
Restore Backup
│
▼
Replay WAL Segments
│
▼
Reach Target Time
Because WAL contains every change in chronological order, PostgreSQL can replay those changes to reconstruct the database state at any moment.
Introducing pgBackRest
pgBackRest is a popular backup and restore solution designed specifically for PostgreSQL.
It provides several capabilities including:
- managing full, differential, and incremental backups
- archiving WAL segments
- enforcing backup retention policies
- restoring databases efficiently
pgBackRest integrates directly with PostgreSQL’s WAL archiving system and stores backups in a repository.
This repository can be:
- local storage
- remote servers
- object storage
For testing in this article, we will use MinIO as an object storage backend.
Testing Architecture
For local testing, the setup consists of two containers.
PostgreSQL + pgBackRest
│
▼
MinIO
- The PostgreSQL container runs both the database and pgBackRest.
- MinIO acts as an object storage service where backups and WAL segments are stored.
This architecture makes it easy to test backup and restore workflows locally.
Enabling WAL Archiving
To enable WAL archiving in PostgreSQL, the following parameters must be configured.
archive_mode = on
archive_command = 'pgbackrest --stanza=demo archive-push %p'
Here is what happens internally:
WAL segment reaches 16MB
↓
PostgreSQL executes archive_command
↓
pgBackRest uploads WAL segment
↓
WAL stored in backup repository
This ensures WAL segments are continuously archived.
Running a Base Backup
Once pgBackRest is configured, a base backup can be taken using:
pgbackrest --stanza=demo backup --type=full
pgBackRest supports multiple backup types.
| Type | Description |
|---|---|
| Full | Complete backup of the database |
| Differential | Backup of changes since last full backup |
| Incremental | Backup of changes since last backup |
These options allow flexible backup strategies.
Simulating Data Loss
To verify the backup pipeline, a sample table was created.
CREATE TABLE users(
id serial primary key,
name text
);
Sample data was inserted:
INSERT INTO users(name)
VALUES ('Alice'), ('Bob'), ('Charlie');
After confirming the data, a full backup was taken.
Next, the table was intentionally removed to simulate a failure scenario.
DROP TABLE users;
At this point, the table and its data no longer existed.
Restoring the Database
To restore the database, PostgreSQL must first be stopped.
However, in this setup PostgreSQL and pgBackRest are running inside the same container.
If the PostgreSQL process stops, the container also stops, which means we can no longer access pgBackRest inside that container.
To work around this, a temporary container was started using the same data volume. This allowed us to run the restore operation while PostgreSQL remained stopped.
The restore process then involved the following steps:
Restore base backup
↓
Replay archived WAL segments
↓
Database reaches consistent state
Inside the temporary container, the restore command was executed:
pgbackrest --stanza=demo restore --type=immediate
After the restore completed, the original PostgreSQL container was started again.
At first glance, it seemed like the restore had worked.
However, the result was not exactly what we expected - something interesting was happening behind the scenes.
We'll see why in a moment.
A PITR Behaviour Observed During Testing
The following sequence occurred:
table created
↓
backup taken
↓
table deleted
↓
restore base backup
↓
WAL replay
↓
table still deleted
Even after restoring the backup, the table remained deleted. This happened because PostgreSQL continued replaying WAL segments during recovery. Since the delete operation was recorded in WAL, it was replayed again during restore.
Note: PostgreSQL recovery is "greedy" by default—it will try to replay every piece of WAL it can find in the archive to bring you to the most recent state possible. By using
--type=immediate, we tell pgBackRest to tell PostgreSQL: "Stop as soon as you reach a consistent state," effectively ignoring any changes (like our accidental DELETE) that happened after the backup was finalized.
To prevent this, the restore command used:
# Inside the temporary container
pgbackrest --stanza=demo restore --type=immediate
Understanding how WAL replay works is important when performing PITR.
Automating Backups with Cron
Backups are usually automated using scheduled jobs.
In this setup, a full backup is taken weekly, and differential backups are taken on the remaining days.
For example:
0 2 * * 0 pgbackrest --stanza=demo backup --type=full
0 2 * * 1-6 pgbackrest --stanza=demo backup --type=diff
This configuration creates:
- one full backup per week
- daily differential backups
This strategy balances storage efficiency and recovery speed.
Final Thoughts
Throughout this series, we’ve seen how Write-Ahead Logging (WAL) forms the foundation for several critical PostgreSQL features.
The same WAL mechanism powers:
- crash recovery
- replication
- change data capture
- backups
- point-in-time recovery
Understanding WAL makes it easier to understand how PostgreSQL maintains reliability and data safety in production environments.
Tools like pgBackRest build on top of this mechanism to provide robust backup and restore workflows.
Top comments (0)