DEV Community

Finny Collins

Databasus released physical and incremental backups with WAL streaming for PITR

Until now Databasus, an open source tool for PostgreSQL backup, supported logical backups only. That covered the majority of use cases, but larger databases and disaster recovery scenarios needed something more. This release adds physical backups and incremental backups with continuous WAL archiving, which together enable Point-in-Time Recovery (PITR). All of it is powered by a new lightweight agent that runs alongside your database.

What changed in this release

Databasus started as a tool focused on logical backups. You point it at a database over the network, it creates a dump, compresses it, encrypts it and ships it to your storage of choice. Simple and effective.

But logical backups have limits. For large databases, the dump can take a long time and put noticeable load on the server. And the recovery point is tied to how often you run backups: if you back up once a day and something breaks at 5 PM, you lose everything written since the last backup completed.

This release introduces two new backup types that address both problems. Physical backups copy the entire database cluster at the file level, which is significantly faster than a dump for large datasets. Incremental backups go a step further: they combine a physical base backup with continuous WAL (Write-Ahead Log) archiving, so you can restore your database to any second covered by the archived WAL rather than only to the moment a backup was taken.

There's a catch, though. These new backup types can't work over a simple network connection the way logical backups do. They need direct access to the database files. That's where the agent comes in.

Backup types compared

Here's how the three backup types stack up against each other.

| Feature | Logical | Physical | Incremental |
| --- | --- | --- | --- |
| How it works | Database dump in native format | File-level copy of the entire cluster | Base backup + continuous WAL archiving |
| Connection mode | Remote (over network) | Agent (runs alongside DB) | Agent (runs alongside DB) |
| Backup speed | Slower for large databases | Fast: copies files directly | Fast base + tiny WAL segments |
| Restore speed | Slower (re-imports all data) | Fast (copies files back) | Fast base + WAL replay |
| Point-in-time recovery | No | No | Yes: restore to any second |
| Best for | Small to medium databases | Large databases needing fast backup/restore | Disaster recovery and near-zero data loss |

Logical backups are still the default and still the right choice for most setups. They work over the network without any extra software, and for databases under a few gigabytes the performance difference is negligible. Physical and incremental backups are for when you need speed or granular recovery.

How the agent works

The Databasus agent is a lightweight binary written in Go. You install it on the same machine (or in the same environment) as your PostgreSQL instance. It works with both host-installed PostgreSQL and databases running in Docker containers.

Once started, the agent connects outbound to your Databasus instance. This is an important detail — the agent initiates the connection, not the other way around.

No public database exposure

With the remote connection mode, Databasus needs network access to your database. That means opening a port, configuring firewall rules, maybe setting up a VPN or SSH tunnel. For databases in private networks, this can be a real headache.

The agent flips this model. It sits next to the database and reaches out to Databasus on its own. Your database port stays closed. No firewall changes, no tunnels. The agent handles authentication with a token you configure during setup, and all communication is encrypted.

This is especially useful for databases running in private cloud VPCs, Kubernetes clusters or on-premise servers where exposing the database externally isn't an option (or isn't allowed by policy).

How WAL streaming works

For incremental backups, the agent does two things continuously. First, it takes periodic full base backups of the database cluster according to your configured schedule. Second, it watches for new WAL segments, the fixed-size files (16 MB by default) in which PostgreSQL records changes as it processes transactions, and streams them to Databasus as they appear.

Taken in order, the WAL segments record every change made to the database cluster. Together, a base backup and the WAL segments written from the start of that backup onward form a continuous chain. You can replay that chain up to any point in time, which is exactly what Point-in-Time Recovery does.
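The chain is only useful if it has no holes, and that can be checked from file names alone. The sketch below is not Databasus code. It relies on PostgreSQL's WAL naming scheme (24 hex digits: timeline, log number, segment number) and assumes the default 16 MB segment size, a single timeline, and helper names (`next_wal_name`, `find_gaps`) invented here for illustration.

```python
SEGMENTS_PER_LOG = 0x100  # 4 GiB per log number / 16 MiB default segment size


def parse_wal_name(name: str) -> tuple[int, int, int]:
    """Split a 24-hex-digit WAL file name into (timeline, log, segment)."""
    if len(name) != 24:
        raise ValueError(f"not a WAL segment name: {name!r}")
    timeline = int(name[0:8], 16)
    log = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    if seg >= SEGMENTS_PER_LOG:
        raise ValueError(f"segment number too large for 16 MiB segments: {name!r}")
    return timeline, log, seg


def next_wal_name(name: str) -> str:
    """Return the name of the segment that follows `name` on the same timeline."""
    timeline, log, seg = parse_wal_name(name)
    seg += 1
    if seg == SEGMENTS_PER_LOG:  # roll over into the next log number
        log, seg = log + 1, 0
    return f"{timeline:08X}{log:08X}{seg:08X}"


def find_gaps(names: list[str]) -> list[str]:
    """List the segment names missing from a set of archived segments."""
    ordered = sorted({n.upper() for n in names})
    for n in ordered:
        parse_wal_name(n)  # validate every name before walking the chain
    if len({n[:8] for n in ordered}) > 1:
        raise ValueError("this sketch handles a single timeline only")
    missing = []
    for current, following in zip(ordered, ordered[1:]):
        expected = next_wal_name(current)
        while expected != following:
            missing.append(expected)
            expected = next_wal_name(expected)
    return missing
```

If `find_gaps` returns an empty list, every segment between the first and last archived file is present, so replay can proceed through the whole range.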

The agent compresses everything before sending it, so bandwidth usage stays reasonable even with busy databases.
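To get a feel for why compression helps here, the sketch below compresses a fake, mostly empty segment. The article does not say which algorithm the agent uses, so gzip from the Python standard library is only a stand-in, and the sample payload is made up.

```python
import gzip


def compress_segment(raw: bytes) -> bytes:
    """Compress one WAL segment before upload (gzip is an assumed stand-in)."""
    return gzip.compress(raw, compresslevel=6)


# A made-up segment: a little repetitive data followed by unused, zero-filled space.
segment = b"INSERT row payload " * 1000 + bytes(1024 * 1024)
packed = compress_segment(segment)
ratio = len(packed) / len(segment)
print(f"{len(segment)} bytes -> {len(packed)} bytes ({ratio:.2%} of original)")
```

Real WAL is less repetitive than this sample, so actual ratios will be lower, but segments that are switched before they fill up tend to compress well.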

Point-in-time recovery explained

Regular backups give you snapshots. If you back up every 6 hours and a problem happens between backups, you lose the data written since the last one. For many applications this is fine. For others — financial systems, healthcare or anything where every transaction matters — it's not acceptable.

Point-in-time recovery (PITR) changes the equation. Instead of restoring to the last backup, you restore to a specific moment. You ask for "the database as it was at 14:32:07 today", and that is exactly what you get.

| Backup type | Recovery point objective (RPO) | What you can restore to |
| --- | --- | --- |
| Logical (daily) | Up to 24 hours of data loss | Last completed backup |
| Logical (hourly) | Up to 1 hour of data loss | Last completed backup |
| Physical | Depends on backup frequency | Last completed backup |
| Incremental with PITR | Seconds of data loss | Any point in time covered by the WAL chain |
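The "up to" figures in the table are worst cases. For a concrete incident the loss is simply the time between the last recovery point and the failure. Here is a minimal sketch of that arithmetic, assuming a fixed-interval schedule and treating each backup as instantaneous (real backups take time, which makes the loss slightly larger). The function names are illustrative.

```python
from datetime import datetime, timedelta


def last_scheduled_backup(
    failure: datetime, first_backup: datetime, interval: timedelta
) -> datetime:
    """Most recent backup at or before `failure` on a fixed-interval schedule."""
    if failure < first_backup:
        raise ValueError("failure happened before the first backup")
    completed_intervals = (failure - first_backup) // interval  # whole intervals
    return first_backup + completed_intervals * interval


def data_lost(
    failure: datetime, first_backup: datetime, interval: timedelta
) -> timedelta:
    """Span of writes that cannot be restored from scheduled backups alone."""
    return failure - last_scheduled_backup(failure, first_backup, interval)
```

With a daily backup at 02:00 and a failure at 16:59:30, `data_lost` returns 14 hours, 59 minutes and 30 seconds. With hourly backups on the hour, the same failure loses 59 minutes and 30 seconds.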

The restore process is straightforward. You pick a target timestamp, and Databasus figures out which base backup and which WAL segments are needed. The agent downloads them, places the files where PostgreSQL expects them, and PostgreSQL handles the replay automatically. When the database starts, it's in exactly the state it was at that moment.
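The selection step can be pictured as follows. This is a sketch of the general idea, not Databasus internals: the `BaseBackup` and `WalSegment` records, their fields and `plan_restore` are all hypothetical, and it assumes the catalog tracks when each backup finished and the time of the last change in each segment.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class BaseBackup:
    backup_id: str
    finished_at: datetime
    start_wal: str  # first WAL segment needed to make this backup consistent


@dataclass(frozen=True)
class WalSegment:
    name: str
    last_record_at: datetime  # time of the last change stored in this segment


def plan_restore(
    target: datetime, backups: list[BaseBackup], segments: list[WalSegment]
) -> tuple[BaseBackup, list[str]]:
    """Pick the newest base backup finished before `target` plus the WAL to replay."""
    candidates = [b for b in backups if b.finished_at <= target]
    if not candidates:
        raise LookupError("no base backup was completed before the target time")
    base = max(candidates, key=lambda b: b.finished_at)

    needed = []
    for seg in sorted(segments, key=lambda s: s.name):
        if seg.name < base.start_wal:
            continue  # older than the base backup, not needed
        needed.append(seg.name)
        if seg.last_record_at >= target:
            break  # this segment contains the target moment
    else:
        raise LookupError("archived WAL does not reach the target time")
    return base, needed
```

Choosing the newest eligible base backup keeps the amount of WAL to replay, and therefore the restore time, as small as possible.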

This makes incremental backups with PITR the right choice for disaster recovery. If a bad migration runs, if someone accidentally deletes a table, if data gets corrupted — you rewind to the moment before the problem happened.
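The rewind itself is done by PostgreSQL. From version 12 onward it is driven by a few recovery settings plus an empty `recovery.signal` file in the data directory. The sketch below renders those settings. The settings themselves (`restore_command`, `recovery_target_time`, `recovery_target_action`) are standard PostgreSQL parameters, but the paths, the `cp`-based restore command and the helper functions are assumptions made for illustration. How the Databasus agent actually writes them is not described in this post.

```python
from datetime import datetime, timezone
from pathlib import Path


def pg_quote(value: str) -> str:
    """Quote a string for a PostgreSQL config file (single quotes are doubled)."""
    return "'" + value.replace("'", "''") + "'"


def recovery_settings(wal_dir: str, target: datetime) -> str:
    """Render the settings that tell PostgreSQL where to stop replaying WAL."""
    if target.tzinfo is None:
        raise ValueError("target must be timezone-aware")
    restore_command = f"cp {wal_dir}/%f %p"  # %f = segment name, %p = destination
    stamp = target.isoformat(sep=" ")
    lines = [
        f"restore_command = {pg_quote(restore_command)}",
        f"recovery_target_time = {pg_quote(stamp)}",
        "recovery_target_action = 'promote'",  # open for writes once the target is reached
    ]
    return "\n".join(lines) + "\n"


def prepare_data_dir(data_dir: Path, settings: str) -> None:
    """Append the settings and create the marker file that triggers recovery."""
    with open(data_dir / "postgresql.auto.conf", "a", encoding="utf-8") as conf:
        conf.write(settings)
    (data_dir / "recovery.signal").touch()


# Example target; /restore/wal is a hypothetical directory of downloaded segments.
example = recovery_settings(
    "/restore/wal", datetime(2025, 1, 15, 14, 32, 7, tzinfo=timezone.utc)
)
print(example)
```

On the next start PostgreSQL sees `recovery.signal`, fetches segments through `restore_command`, replays them up to the target time and then promotes itself to normal operation.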

When to use which backup type

  • Logical backups work well for small to medium databases where backup speed isn't critical. They don't require an agent, work over the network and are the simplest to set up. If your database is under a few gigabytes, start here.
  • Physical backups make sense when you have a large database and need faster backup and restore times. They require the agent but don't add the overhead of continuous WAL archiving. Good for when you want speed but don't need second-level recovery granularity.
  • Incremental backups with PITR are for production databases where data loss must be minimized. Financial applications, SaaS platforms, e-commerce — anything where losing even an hour of transactions creates real problems. The agent continuously streams WAL segments, so your recovery point is always just seconds behind the live database.

You can also combine approaches. Run logical backups for a quick safety net and incremental backups for disaster recovery on the same database. Databasus manages both from the same dashboard.

Getting started

Setting up the agent takes a few minutes. You download the binary to the machine running PostgreSQL, configure it with your Databasus instance URL and an authentication token, and start it. Databasus provides the token and connection details through its web interface when you add a new database in agent mode.

Once the agent is running, you configure the backup schedule and retention policy the same way you would for logical backups — through the Databasus dashboard. The only difference is that you now have physical and incremental options available in the backup type selector.

For incremental backups, you also choose a schedule for base backups (for example, daily or weekly) while WAL archiving runs continuously in the background. Databasus handles retention for both base backups and WAL segments according to your configured policy.
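Retention for the two kinds of files is linked: a WAL segment is only useful while a base backup that it can be replayed onto still exists. The sketch below shows one plausible rule, keeping the newest N base backups and every segment from the oldest kept backup onward. It is an assumption about how such a policy can work, not a description of Databasus's implementation, and `apply_retention` is a made-up name.

```python
def apply_retention(
    backups: list[tuple[str, str]],
    segments: list[str],
    keep_last: int,
) -> tuple[list[tuple[str, str]], list[str]]:
    """Keep the newest `keep_last` base backups and the WAL they depend on.

    `backups` holds (backup_id, start_wal) pairs, where start_wal is the first
    segment needed to restore that backup. Segment names are assumed to be
    uppercase and on one timeline, so they sort chronologically.
    """
    if keep_last < 1:
        raise ValueError("keep_last must be at least 1")
    ordered = sorted(backups, key=lambda b: b[1])
    kept_backups = ordered[-keep_last:]
    if not kept_backups:
        # No base backup to anchor pruning on, so keep all WAL for now.
        return [], sorted(segments)
    oldest_needed = kept_backups[0][1]
    kept_segments = [s for s in sorted(segments) if s >= oldest_needed]
    return kept_backups, kept_segments
```

Anything older than the oldest kept backup's first segment can be deleted safely, because no remaining base backup could use it.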

The agent supports host-installed PostgreSQL (versions 12 through 18) and PostgreSQL running in Docker containers. It updates itself automatically, so you don't need to worry about keeping it in sync with the Databasus version.

Databasus is free, open source (Apache 2.0) and self-hosted. It is the most widely used open source tool for PostgreSQL backup. You can find the project on GitHub and install it in under two minutes.
