pg_dump is the built-in command-line utility that ships with every PostgreSQL installation. It creates logical backups — SQL statements or archive files that can recreate your database from scratch. If you run PostgreSQL, you've probably used it at least once, even if it was just a quick dump before a risky migration.
The tool is simple on the surface, but there's a lot of nuance underneath. Output formats, compression, parallel jobs, selective dumps, restoration gotchas. This guide covers all of it with practical examples you can use right away.
What pg_dump does and how it works
pg_dump connects to a running PostgreSQL instance and reads the database contents at a point in time. It produces a consistent snapshot using PostgreSQL's MVCC mechanism, meaning other connections can keep reading and writing while the dump is in progress. The backup captures schemas, table data, indexes, constraints, sequences, functions and other database objects.
The dump happens at the SQL level. It doesn't copy raw data files from disk like physical backup tools do. This makes pg_dump portable across PostgreSQL versions and even across operating systems. A dump from PostgreSQL 14 on Linux restores fine into PostgreSQL 17 on macOS. That kind of flexibility is hard to beat.
There's an important limitation though. pg_dump only backs up a single database per invocation. If you need to back up the entire cluster (all databases, roles and tablespaces), you'll want pg_dumpall instead. But for most use cases, pg_dump on a specific database is what you need.
Basic syntax and your first backup
The simplest possible pg_dump command looks like this:
pg_dump -U postgres -h localhost mydb > mydb_backup.sql
This connects to the mydb database as user postgres on localhost and writes a plain SQL file. The output is a sequence of CREATE TABLE, COPY and ALTER TABLE statements that, when replayed, reconstruct the database.
If your PostgreSQL instance requires a password, you'll be prompted for one. To avoid interactive prompts in scripts, you can use a .pgpass file or the PGPASSWORD environment variable:
PGPASSWORD=mysecretpassword pg_dump -U postgres -h localhost mydb > mydb_backup.sql
For remote servers, specify the host and port:
pg_dump -U backup_user -h db.example.com -p 5432 production > production_backup.sql
The .pgpass file approach is better for production scripts. Create ~/.pgpass with the format hostname:port:database:username:password and set permissions to 600:
echo "db.example.com:5432:production:backup_user:mysecretpassword" >> ~/.pgpass
chmod 600 ~/.pgpass
After that, pg_dump picks up credentials automatically without any environment variable tricks.
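For provisioning scripts, a small sketch of making that setup idempotent is useful; the hostname and credentials below are the same placeholders as above:

```shell
# Append the credential line only if it is not already present, and keep
# permissions at 600 -- libpq silently ignores a .pgpass that is group-
# or world-readable.
PGPASSFILE="${PGPASSFILE:-$HOME/.pgpass}"
ENTRY="db.example.com:5432:production:backup_user:mysecretpassword"
touch "$PGPASSFILE"
grep -qxF "$ENTRY" "$PGPASSFILE" || echo "$ENTRY" >> "$PGPASSFILE"
chmod 600 "$PGPASSFILE"
```

Running it twice adds the line once, so it is safe to keep in a setup script that runs on every deploy.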
Output formats
pg_dump supports four output formats. The format you choose affects file size, restoration flexibility and whether you can use parallel restore. This is one of those decisions that's easy to get wrong if you just go with the default.
| Format | Flag | Extension | Parallel restore | Selective restore | Compression |
|---|---|---|---|---|---|
| Plain SQL | `-Fp` (default) | .sql | No | No | External only (gzip pipe) |
| Custom | `-Fc` | .dump | Yes | Yes | Built-in (zlib) |
| Directory | `-Fd` | directory | Yes | Yes | Built-in (per-table files) |
| Tar | `-Ft` | .tar | No | Yes | No |
Plain SQL is the default. It produces a human-readable text file with SQL statements. You restore it with psql. The main advantage is readability — you can open it, search for specific tables, even edit it manually. The disadvantage is that you can't do selective restore or parallel restore.
Custom format is what most people should use for production backups. It produces a compressed binary file that's restored with pg_restore. You can selectively restore specific tables, schemas or objects. And you can use parallel jobs during restore, which matters a lot for large databases.
Directory format creates a directory with one file per table plus a table of contents. It also supports parallel dump (not just parallel restore), making it the fastest option for large databases. The downside is that it's a directory, not a single file, which can be awkward for transfer and storage.
Tar format exists for compatibility. It's essentially the custom format without compression, packed as a tar archive. There's rarely a reason to use it over custom format.
For almost all production use, go with custom format:
pg_dump -Fc -U postgres -h localhost mydb > mydb_backup.dump
Essential pg_dump flags
pg_dump has dozens of options. Most of them you'll never touch, but a handful come up regularly. Here's the reference for the ones that actually matter in practice.
| Flag | What it does | Example |
|---|---|---|
| `-Fc` | Custom output format (recommended) | `pg_dump -Fc mydb > backup.dump` |
| `-Fd` | Directory output format | `pg_dump -Fd mydb -f backup_dir/` |
| `-j N` | Parallel dump (directory format only) | `pg_dump -Fd -j 4 mydb -f backup_dir/` |
| `-Z 0-9` | Compression level (0=none, 9=max) | `pg_dump -Fc -Z 6 mydb > backup.dump` |
| `-t table` | Dump specific table(s) only | `pg_dump -Fc -t orders mydb > orders.dump` |
| `-n schema` | Dump specific schema only | `pg_dump -Fc -n public mydb > public.dump` |
| `-T table` | Exclude specific table(s) | `pg_dump -Fc -T audit_logs mydb > backup.dump` |
| `-N schema` | Exclude specific schema | `pg_dump -Fc -N temp_data mydb > backup.dump` |
| `--schema-only` | Dump structure without data | `pg_dump --schema-only mydb > schema.sql` |
| `--data-only` | Dump data without structure | `pg_dump --data-only mydb > data.sql` |
| `-C` | Include CREATE DATABASE in output | `pg_dump -C mydb > backup_with_create.sql` |
| `--no-owner` | Skip ownership commands | `pg_dump --no-owner mydb > backup.sql` |
| `--no-privileges` | Skip access privilege commands | `pg_dump --no-privileges mydb > backup.sql` |
| `-v` | Verbose mode (progress output) | `pg_dump -Fc -v mydb > backup.dump` |
The -j flag deserves special attention. It runs parallel workers during dump, but only works with directory format. If you have a large database and multiple CPU cores available, this can cut dump time significantly:
pg_dump -Fd -j 4 -U postgres mydb -f /backups/mydb_dir/
This spins up 4 workers that dump different tables simultaneously. On a 50GB database, going from 1 to 4 parallel workers often cuts the time by 60-70%.
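Rather than hardcoding the worker count, you can derive it from the machine. The cores-minus-one heuristic below is an assumption, not a pg_dump rule, and each worker opens its own database connection, so the server must allow that many:

```shell
# Size -j from the CPU count (GNU nproc), leaving one core of headroom.
CORES=$(nproc)
JOBS=$(( CORES > 1 ? CORES - 1 : 1 ))
echo "using $JOBS parallel workers"
# pg_dump -Fd -j "$JOBS" -U postgres mydb -f /backups/mydb_dir/
```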
Backing up specific tables and schemas
Sometimes you don't need the whole database. Maybe you're migrating a single table, or you want to back up a specific schema before a risky change. pg_dump handles this cleanly.
Back up a single table:
pg_dump -Fc -t orders -U postgres mydb > orders_backup.dump
Back up multiple tables:
pg_dump -Fc -t orders -t order_items -t customers -U postgres mydb > related_tables.dump
Back up an entire schema:
pg_dump -Fc -n analytics -U postgres mydb > analytics_schema.dump
Exclude large tables that you don't need in the backup:
pg_dump -Fc -T large_audit_log -T event_tracking -U postgres mydb > backup_without_logs.dump
The table name patterns support wildcards too. To dump all tables starting with order:
pg_dump -Fc -t 'order*' -U postgres mydb > order_tables.dump
One thing to watch out for: when you dump specific tables, foreign key relationships to tables outside your selection won't be included. The restore will work, but you'll need to handle dependencies manually if the referenced tables don't exist in the target database.
Compression
Custom format (-Fc) uses zlib compression by default at level 6. You can adjust this with the -Z flag. Level 0 means no compression, level 9 is maximum compression.
pg_dump -Fc -Z 9 -U postgres mydb > mydb_max_compress.dump
Higher compression saves storage space but takes longer to produce. For most databases, the default level 6 is a good balance. Level 9 typically only saves an extra 5-10% compared to level 6, while taking noticeably longer.
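You can get a feel for the trade-off with plain gzip on synthetic data; pg_dump's built-in zlib levels behave the same way, though the exact numbers depend entirely on your data:

```shell
# Compare fastest vs. maximum compression on the same input.
seq 1 200000 > sample.txt
ORIG=$(wc -c < sample.txt)
FAST=$(gzip -1 -c sample.txt | wc -c)
MAX=$(gzip -9 -c sample.txt | wc -c)
echo "original=$ORIG fast=$FAST max=$MAX"
```

On repetitive data like this, even level 1 shrinks the file dramatically, and level 9 only shaves off a little more, which mirrors the level 6 vs. 9 observation above.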
Starting with PostgreSQL 16, the -Z flag also accepts named algorithms:
pg_dump -Fc -Z lz4 -U postgres mydb > mydb_lz4.dump
pg_dump -Fc -Z zstd:5 -U postgres mydb > mydb_zstd.dump
LZ4 is significantly faster than zlib with slightly less compression. Zstandard (zstd) gives better compression ratios than zlib at comparable speeds. If you're on PostgreSQL 16 or newer, zstd is worth trying.
For plain SQL format, setting a nonzero -Z level makes pg_dump gzip the whole output file itself, but the more common pattern is to pipe through an external tool:
pg_dump -U postgres mydb | gzip > mydb_backup.sql.gz
Or with zstd for better performance:
pg_dump -U postgres mydb | zstd > mydb_backup.sql.zst
Restoring from pg_dump backups
How you restore depends on the output format.
Plain SQL files restore with psql:
psql -U postgres -h localhost mydb < mydb_backup.sql
If the backup includes CREATE DATABASE (dumped with -C), connect to a different database first:
psql -U postgres -h localhost postgres < mydb_backup.sql
Custom and directory format files restore with pg_restore:
pg_restore -U postgres -h localhost -d mydb mydb_backup.dump
Parallel restore uses the -j flag, which can dramatically speed up restoration of large databases:
pg_restore -U postgres -h localhost -d mydb -j 4 mydb_backup.dump
Selective restore is one of the main reasons to use custom format. Restore only specific tables:
pg_restore -U postgres -d mydb -t orders -t customers mydb_backup.dump
Restore only the schema (no data):
pg_restore -U postgres -d mydb --schema-only mydb_backup.dump
Restore only the data (schema must already exist):
pg_restore -U postgres -d mydb --data-only mydb_backup.dump
A common pattern for clean restores is to drop and recreate the database first:
dropdb -U postgres mydb
createdb -U postgres mydb
pg_restore -U postgres -d mydb -j 4 mydb_backup.dump
One gotcha: pg_restore reports errors but doesn't stop on them by default. If you want strict behavior, add --exit-on-error. For most restores, it's fine to let it continue and review errors afterward. Typical errors are things like "role does not exist" when you restore to a different server that doesn't have the same users.
Automating pg_dump with cron
Running pg_dump manually is fine for one-off backups, but production databases need automated schedules. The standard Linux approach is a cron job.
Here's a basic backup script:
#!/bin/bash
set -euo pipefail  # abort on the first failed command or unset variable

BACKUP_DIR="/var/backups/postgresql"
DB_NAME="production"
DB_USER="backup_user"
DB_HOST="localhost"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
RETENTION_DAYS=7

mkdir -p "$BACKUP_DIR"

# Custom-format dump; credentials come from ~/.pgpass
pg_dump -Fc -Z 6 -U "$DB_USER" -h "$DB_HOST" "$DB_NAME" \
  > "$BACKUP_DIR/${DB_NAME}_${TIMESTAMP}.dump"

# Prune old dumps -- thanks to set -e, this only runs if the dump succeeded
find "$BACKUP_DIR" -name "*.dump" -mtime +"$RETENTION_DAYS" -delete
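The retention line is worth understanding in isolation, since a wrong -mtime value silently deletes the wrong files. A quick dry run with fake dump files (uses GNU `touch -d` to backdate one of them):

```shell
# Create one fresh and one 10-day-old fake dump, then prune at 7 days.
# -mtime +7 matches files modified more than 7 full days ago.
DEMO_DIR=$(mktemp -d)
touch "$DEMO_DIR/fresh.dump"
touch -d "10 days ago" "$DEMO_DIR/stale.dump"
find "$DEMO_DIR" -name "*.dump" -mtime +7 -delete
ls "$DEMO_DIR"
```

Only fresh.dump survives the prune; the backdated file is deleted.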
Schedule it with cron to run daily at 3 AM:
crontab -e
# Add this line:
0 3 * * * /opt/scripts/backup_postgres.sh >> /var/log/pg_backup.log 2>&1
This works. But there are problems you'll hit sooner or later. The script doesn't verify the backup is valid. It doesn't notify you if the backup fails. It doesn't upload to remote storage. It doesn't handle encryption. And if the cron job silently breaks, you might not notice until you need to restore.
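A minimal verification step closes the most common gap: the silently empty dump. Here's a sketch you could drop into the script above; the optional pg_restore --list check reads only the archive file, no server connection needed:

```shell
# Fail loudly on a missing, empty, or unreadable backup file.
verify_backup() {
  f="$1"
  if [ ! -s "$f" ]; then
    echo "backup missing or empty: $f" >&2
    return 1
  fi
  # If pg_restore is on PATH, also read the archive's table of contents;
  # this catches truncated or corrupt custom-format files.
  if command -v pg_restore >/dev/null 2>&1; then
    pg_restore --list "$f" >/dev/null 2>&1 || {
      echo "archive unreadable: $f" >&2
      return 1
    }
  fi
}
```

Call it right after pg_dump and exit nonzero (or fire a notification) when it fails, so a broken backup never silently ages out your last good one.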
Production backup automation needs more than a shell script, which is where dedicated tools come in.
Limitations of pg_dump
pg_dump is reliable and portable, but it has real constraints that matter at scale:
- Backup time grows linearly with database size. A 100GB database can take 30+ minutes even with compression. During that time, the dump holds a snapshot that can affect VACUUM and disk usage.
- No incremental backups. Every run dumps the entire database (or the selected tables). If you have 500GB of data and only 1GB changed since yesterday, you're still dumping all 500GB.
- Single-database scope. You need separate invocations for each database, plus pg_dumpall for global objects like roles.
- No point-in-time recovery. A pg_dump backup captures one moment. If your database fails at 2:47 PM and your last dump was at 3:00 AM, you lose nearly 12 hours of data.
- No built-in remote storage. The dump goes to local disk. Uploading to S3, Google Drive or another remote location requires additional scripting.
For small to medium databases with daily backup requirements and acceptable recovery point objectives, these limitations are perfectly fine. But as your database grows or your uptime requirements get stricter, you'll need more than pg_dump alone.
Databasus — a better way to handle PostgreSQL backups
If you've gotten this far, you probably see the pattern. pg_dump does the core job well, but everything around it — scheduling, storage, retention, notifications, encryption, monitoring — you have to build yourself. That's a lot of scripts and a lot of places where things can quietly break.
Databasus is the most popular open-source tool for PostgreSQL backup. It wraps the entire backup workflow into a self-hosted application with a web UI. It works for individuals managing a single database, teams handling dozens and enterprises with strict compliance requirements.
Installing Databasus
Docker run:
docker run -d \
--name databasus \
-p 4005:4005 \
-v ./databasus-data:/databasus-data \
--restart unless-stopped \
databasus/databasus:latest
Docker Compose:
services:
  databasus:
    container_name: databasus
    image: databasus/databasus:latest
    ports:
      - "4005:4005"
    volumes:
      - ./databasus-data:/databasus-data
    restart: unless-stopped
docker compose up -d
Creating your first backup
Once Databasus is running, open http://localhost:4005 in your browser and follow these steps:
Add your database. Click "New Database" and enter your PostgreSQL connection details — host, port, database name and credentials. Databasus validates the connection before saving.
Select storage. Choose where backups should be stored. Options include local disk, Amazon S3, Cloudflare R2, Google Drive, SFTP, Dropbox and more. For remote storage, enter your credentials and Databasus handles the upload automatically.
Select schedule. Pick a backup frequency — hourly, daily, weekly, monthly or a custom cron expression. You can set the exact time so backups run during low-traffic periods.
Click "Create backup". Databasus validates everything and starts the backup schedule. You get compression, encryption, retention policies and failure notifications out of the box — without writing a single line of bash.
Databasus supports PostgreSQL versions 12 through 18, handles AES-256-GCM encryption, offers GFS retention policies and sends notifications via Slack, Telegram, Discord, email or webhooks. It's the industry standard for automated PostgreSQL backup tooling.
Getting started
For quick one-off dumps or small databases, pg_dump with custom format is all you need:
pg_dump -Fc -Z 6 -U postgres mydb > backup.dump
Restore with:
pg_restore -U postgres -d mydb -j 4 backup.dump
Learn the flags, understand the output formats and practice restoring. A backup you've never tested isn't really a backup. And when your needs outgrow shell scripts and cron — scheduled backups, cloud storage, encryption, team access, monitoring — Databasus picks up where pg_dump leaves off.
