Mohamed Hussain S

Posted on Mar 13

PostgreSQL Backups and Point-in-Time Recovery with pgBackRest

#postgres #opensource #database #devops

This article is part of a series on PostgreSQL WAL.

Part 1 - How PostgreSQL WAL Actually Works

Part 2 - How Debezium Uses PostgreSQL WAL for CDC

Part 3 - PostgreSQL Backups with pgBackRest and PITR

In the previous articles, we've explored how PostgreSQL records every change inside the database using Write-Ahead Logging (WAL).

We also saw how tools like Debezium can read WAL to stream database changes in real time.

But WAL has another very important role.

It enables database backups and recovery.

By combining a base backup with archived WAL segments, PostgreSQL can restore a database to any specific moment in time. This capability is known as Point-in-Time Recovery (PITR).

In this article, we’ll explore how this works and how to implement it using pgBackRest.

Why Backups Alone Are Not Enough

A common backup strategy is to take periodic full backups of a database.

For example:

02:00 → Full backup taken
14:00 → Accidental DELETE executed

If we restore the 02:00 backup, the database returns to that state.

But everything between 02:00 and 14:00 is lost.

                          restore backup
                                 ↓
                      database returns to 02:00
                                 ↓
                        12 hours of data lost

This is where WAL archiving becomes important.

Instead of relying only on periodic backups, PostgreSQL can archive WAL segments continuously.

This allows the database to replay changes until a specific moment in time.

How WAL Enables Point-in-Time Recovery

Point-in-Time Recovery works by combining two components:

                              Base Backup
                                   +
                        Continuous WAL Archiving
                                   =
                         Point-in-Time Recovery

The recovery flow looks like this:


                               Base Backup
                                    │
                                    ▼
                              Restore Backup
                                    │
                                    ▼
                           Replay WAL Segments
                                    │
                                    ▼
                            Reach Target Time

Because WAL contains every change in chronological order, PostgreSQL can replay those changes to reconstruct the database state at any moment.

Introducing pgBackRest

pgBackRest is a popular backup and restore solution designed specifically for PostgreSQL.

It provides several capabilities including:

managing full, differential, and incremental backups
archiving WAL segments
enforcing backup retention policies
restoring databases efficiently

pgBackRest integrates directly with PostgreSQL’s WAL archiving system and stores backups in a repository.

This repository can be:

local storage
remote servers
object storage

For testing in this article, we will use MinIO as an object storage backend.

Testing Architecture

For local testing, the setup consists of two containers.

                         PostgreSQL + pgBackRest
                                   │
                                   ▼
                                 MinIO

The PostgreSQL container runs both the database and pgBackRest.
MinIO acts as an object storage service where backups and WAL segments are stored.

This architecture makes it easy to test backup and restore workflows locally.

Enabling WAL Archiving

To enable WAL archiving in PostgreSQL, the following parameters must be configured.

archive_mode = on
archive_command = 'pgbackrest --stanza=demo archive-push %p'

Here is what happens internally:

                         WAL segment reaches 16MB
                                   ↓
                   PostgreSQL executes archive_command
                                   ↓
                      pgBackRest uploads WAL segment
                                   ↓
                      WAL stored in backup repository

This ensures WAL segments are continuously archived.

Running a Base Backup

Once pgBackRest is configured, a base backup can be taken using:

pgbackrest --stanza=demo backup --type=full

pgBackRest supports multiple backup types.

Type	Description
Full	Complete backup of the database
Differential	Backup of changes since last full backup
Incremental	Backup of changes since last backup

These options allow flexible backup strategies.

Simulating Data Loss

To verify the backup pipeline, a sample table was created.

CREATE TABLE users(
  id serial primary key,
  name text
);

Sample data was inserted:

INSERT INTO users(name)
VALUES ('Alice'), ('Bob'), ('Charlie');

After confirming the data, a full backup was taken.

Next, the table was intentionally removed to simulate a failure scenario.

DROP TABLE users;

At this point, the table and its data no longer existed.

Restoring the Database

To restore the database, PostgreSQL must first be stopped.

However, in this setup PostgreSQL and pgBackRest are running inside the same container.
If the PostgreSQL process stops, the container also stops, which means we can no longer access pgBackRest inside that container.

To work around this, a temporary container was started using the same data volume. This allowed us to run the restore operation while PostgreSQL remained stopped.

The restore process then involved the following steps:

                            Restore base backup
                                   ↓
                         Replay archived WAL segments
                                   ↓
                      Database reaches consistent state

Inside the temporary container, the restore command was executed:

pgbackrest --stanza=demo restore --type=immediate

After the restore completed, the original PostgreSQL container was started again.

At first glance, it seemed like the restore had worked.
However, the result was not exactly what we expected - something interesting was happening behind the scenes.

We'll see why in a moment.

A PITR Behaviour Observed During Testing

The following sequence occurred:

                             table created
                                   ↓
                              backup taken
                                   ↓
                             table deleted
                                   ↓
                           restore base backup
                                   ↓
                               WAL replay
                                   ↓
                           table still deleted

Even after restoring the backup, the table remained deleted. This happened because PostgreSQL continued replaying WAL segments during recovery. Since the delete operation was recorded in WAL, it was replayed again during restore.

Note: PostgreSQL recovery is "greedy" by default—it will try to replay every piece of WAL it can find in the archive to bring you to the most recent state possible. By using --type=immediate, we tell pgBackRest to tell PostgreSQL: "Stop as soon as you reach a consistent state," effectively ignoring any changes (like our accidental DELETE) that happened after the backup was finalized.

To prevent this, the restore command used:

# Inside the temporary container
pgbackrest --stanza=demo restore --type=immediate

Understanding how WAL replay works is important when performing PITR.

Automating Backups with Cron

Backups are usually automated using scheduled jobs.

In this setup, a full backup is taken weekly, and differential backups are taken on the remaining days.

For example:

0 2 * * 0 pgbackrest --stanza=demo backup --type=full
0 2 * * 1-6 pgbackrest --stanza=demo backup --type=diff

This configuration creates:

one full backup per week
daily differential backups

This strategy balances storage efficiency and recovery speed.

Final Thoughts

Throughout this series, we’ve seen how Write-Ahead Logging (WAL) forms the foundation for several critical PostgreSQL features.

The same WAL mechanism powers:

crash recovery
replication
change data capture
backups
point-in-time recovery

Understanding WAL makes it easier to understand how PostgreSQL maintains reliability and data safety in production environments.

Tools like pgBackRest build on top of this mechanism to provide robust backup and restore workflows.

DEV Community