
Grig

Top 5 Backup Formats and When to Use Them for PostgreSQL

Choosing the right backup format can mean the difference between a 10-minute recovery and an all-day ordeal. PostgreSQL's pg_dump utility offers multiple output formats, each optimized for different scenarios — from quick development snapshots to enterprise disaster recovery. Understanding these formats helps you build a backup strategy that balances storage efficiency, restoration speed and operational flexibility. This guide breaks down the five key backup formats and shows exactly when to use each one.

pg_dump formats

1. Custom Format (-F c) — The All-Around Champion

The custom format is PostgreSQL's most versatile backup option and the recommended choice for most production environments. It combines compression, selective restoration capabilities and parallel processing support into a single archive file. When you need reliability without overthinking your backup strategy, custom format delivers.

Custom format backups are created with the -F c flag and produce .dump files that can only be restored using pg_restore. The format compresses data automatically, typically reducing backup size by 70-80% compared to plain SQL. More importantly, it allows you to restore specific tables, schemas or data without processing the entire backup.

```bash
# Create a custom format backup
pg_dump -h localhost -U postgres -F c -d myapp -f backup.dump

# Restore the entire database
pg_restore -h localhost -U postgres -d myapp_restored backup.dump

# Restore only specific tables
pg_restore -h localhost -U postgres -d myapp_restored -t users -t orders backup.dump
```
  • Automatic compression — reduces storage requirements by 70-80%
  • Selective restoration — restore individual tables or schemas without full recovery
  • Parallel restore support — use multiple CPU cores for faster restoration

Custom format is ideal for databases ranging from megabytes to hundreds of gigabytes. The only scenario where you might choose differently is when you need human-readable output or must share backups with non-PostgreSQL systems.
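In practice, a scheduled custom-format backup usually adds a timestamp to the filename and uses `-j` on the restore side to exploit parallelism. A minimal sketch of such a wrapper — printed as a dry run here, since actually executing the commands needs a reachable server; the host, database names, and job count are assumptions:

```bash
# Build date-stamped backup/restore commands for a custom-format archive.
# Printed rather than executed: running them requires a live PostgreSQL server.
STAMP=$(date +%Y%m%d_%H%M%S)   # e.g. 20240101_120000
OUT="myapp_${STAMP}.dump"
{
  echo "pg_dump -h localhost -U postgres -F c -d myapp -f ${OUT}"
  echo "pg_restore -h localhost -U postgres -d myapp_restored -j 4 ${OUT}"
} | tee backup_plan.txt
```

Dropping the two printed lines into cron (and removing the `tee`) gives you a dated archive per run plus a parallel restore command ready for recovery drills.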

2. Plain SQL Format (-F p) — Human-Readable and Portable

Plain SQL format produces a text file containing standard SQL statements that recreate your database. It's the most transparent format — you can open the file in any text editor, review the exact commands and even modify them before restoration. This makes it invaluable for debugging, auditing and cross-platform migrations.

Plain is the default format when you omit -F. Output goes to stdout unless you redirect it or pass -f, so the .sql extension is just a convention. While the format lacks compression and parallel restore capabilities, its simplicity and portability make it the right choice for specific scenarios.

```bash
# Create a plain SQL backup
pg_dump -h localhost -U postgres -d myapp > backup.sql

# With explicit format flag
pg_dump -h localhost -U postgres -F p -d myapp -f backup.sql

# Restore using psql
psql -h localhost -U postgres -d myapp_restored < backup.sql

# Compress manually for storage efficiency
pg_dump -h localhost -U postgres -d myapp | gzip > backup.sql.gz
```
| Aspect | Plain SQL | Custom Format |
| --- | --- | --- |
| File size | Large (uncompressed) | Small (compressed) |
| Readability | Human-readable | Binary archive |
| Restore tool | psql | pg_restore |
| Selective restore | No | Yes |
| Parallel restore | No | Yes |

Plain SQL format excels when you need to inspect backup contents, share database structures with team members or migrate data to other database systems that understand standard SQL. For automated production backups, custom format is usually the better choice.
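That inspectability also means you can edit a dump before restoring it — for example, rewriting a schema name with sed to load the data into a different namespace. A sketch using a stand-in two-line dump (the `app`/`staging` schema names are made up for illustration):

```bash
# Stand-in for a plain pg_dump output file
cat > sample.sql <<'EOF'
CREATE SCHEMA app;
CREATE TABLE app.users (id integer PRIMARY KEY);
EOF

# Rewrite the schema name before restoring into a different namespace
sed 's/app\./staging./g; s/SCHEMA app;/SCHEMA staging;/' sample.sql > sample_staging.sql

grep -c 'staging' sample_staging.sql
# → 2
```

The same technique works for retargeting tablespaces or stripping owner statements — things no binary format lets you do with a text editor.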

3. Directory Format (-F d) — Built for Large Databases

Directory format is designed for very large databases where parallel backup and restore operations become essential. Instead of creating a single file, it produces a directory containing multiple files — one per table — plus metadata files that describe the database structure. This architecture enables both parallel dumping and parallel restoration using multiple CPU cores.

The format shines on modern multi-core servers where you can dedicate 4, 8 or more parallel jobs to backup and restore operations. For databases over 100GB, directory format can reduce backup and recovery times by 60-80% compared to single-threaded alternatives.

```bash
# Create a directory format backup with 4 parallel jobs
pg_dump -h localhost -U postgres -F d -j 4 -d myapp -f backup_dir/

# Restore with parallel jobs
pg_restore -h localhost -U postgres -d myapp_restored -j 4 backup_dir/

# List contents of the backup directory
ls backup_dir/
# Output: toc.dat, 3456.dat.gz, 3457.dat.gz, 3458.dat.gz, ...
```
  • Parallel backup — use -j flag to dump multiple tables simultaneously
  • Parallel restore — restore with multiple jobs for faster recovery
  • Per-table files — easier to manage and verify individual table backups

Directory format requires more storage management than single-file formats since you're dealing with a folder structure rather than one archive. It's overkill for databases under 50GB but becomes increasingly valuable as database size grows into the hundreds of gigabytes or terabyte range.
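One practical upside of the per-table layout is that each file can be integrity-checked on its own, so a single corrupted table file doesn't invalidate the whole backup. A sketch using stand-in files that mimic the `NNNN.dat.gz` naming (a real backup also contains a `toc.dat` written by pg_dump):

```bash
# Mimic the per-table layout that pg_dump -F d produces (stand-in contents)
mkdir -p backup_dir
printf 'table data' | gzip > backup_dir/3456.dat.gz
printf 'more data'  | gzip > backup_dir/3457.dat.gz

# Verify each per-table file's gzip integrity independently
for f in backup_dir/*.dat.gz; do
  gzip -t "$f" && echo "OK: $f"
done
```

Running a loop like this after each backup catches truncated transfers early, before you need the backup for a restore.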

4. Tar Format (-F t) — Archive Compatibility

Tar format packages the backup into a standard Unix tape archive file. While less commonly used than custom or directory formats, it provides compatibility with tar-based workflows and archival systems. The format produces a .tar file that contains the same internal structure as directory format but bundled into a single archive.

The main trade-off is that tar format doesn't support parallel restoration and doesn't compress data internally. You'll typically pipe the output through gzip or another compression tool for storage efficiency.

```bash
# Create a tar format backup
pg_dump -h localhost -U postgres -F t -d myapp -f backup.tar

# Create compressed tar backup
pg_dump -h localhost -U postgres -F t -d myapp | gzip > backup.tar.gz

# Restore from tar format
pg_restore -h localhost -U postgres -d myapp_restored backup.tar

# Extract and inspect contents
tar -tvf backup.tar
```
| Format | Compression | Parallel Dump | Parallel Restore | Best For |
| --- | --- | --- | --- | --- |
| Custom (-F c) | Built-in | No | Yes | Most use cases |
| Directory (-F d) | Built-in | Yes | Yes | Large databases |
| Tar (-F t) | Manual | No | No | Archive systems |
| Plain (-F p) | Manual | No | No | Readability |

Tar format is primarily useful when integrating with existing archive management systems or when organizational policies require standard archive formats. For new backup implementations, custom or directory formats are almost always preferable.

5. Compressed Plain SQL — Simple and Space-Efficient

While not a native pg_dump format, piping plain SQL output through compression tools like gzip or zstd is a practical approach for many scenarios. This combines the human-readable nature of SQL dumps with significant storage savings, making it popular for development backups and smaller production databases.

The approach works by streaming pg_dump output directly to a compression utility, avoiding the need to write uncompressed data to disk. Restoration reverses the process, decompressing the stream into psql.

```bash
# Backup with gzip compression
pg_dump -h localhost -U postgres -d myapp | gzip > backup.sql.gz

# Backup with zstd (faster compression)
pg_dump -h localhost -U postgres -d myapp | zstd > backup.sql.zst

# Restore from gzip
gunzip -c backup.sql.gz | psql -h localhost -U postgres -d myapp_restored

# Restore from zstd
zstd -d -c backup.sql.zst | psql -h localhost -U postgres -d myapp_restored
```
  • gzip — universal compatibility, good compression ratio
  • zstd — faster compression and decompression, slightly better ratios
  • lz4 — fastest compression, lower ratios, ideal for speed-critical backups

This approach is ideal for development environments, CI/CD pipelines and scenarios where you want inspectable backups without sacrificing too much storage space. The main limitation is that you lose selective restoration capabilities — it's all or nothing with compressed SQL.
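Because restoration is all-or-nothing, it pays to verify archive integrity on a schedule rather than discover corruption mid-restore. `gzip -t` (or `zstd -t`) checks the archive without writing an uncompressed copy to disk. A sketch with a stand-in dump file:

```bash
# Stand-in for a compressed plain SQL backup
printf 'CREATE TABLE users (id integer);\n' | gzip > sample.sql.gz

# Integrity check: exits non-zero if the archive is truncated or corrupt
gzip -t sample.sql.gz && echo "sample.sql.gz: OK"

# Peek at the contents without decompressing to disk
gunzip -c sample.sql.gz | head -n 5
```

Wiring the `gzip -t` step into the same cron job that takes the backup gives you a cheap daily smoke test of every archive you keep.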

Choosing the Right Format for Your Needs

Selecting a backup format depends on your database size, recovery time objectives and operational requirements. Most teams will use custom format as their default and only switch to alternatives for specific needs.

  • Databases under 10GB — Custom format or compressed plain SQL
  • Databases 10GB-100GB — Custom format with scheduled backups
  • Databases over 100GB — Directory format with parallel jobs
  • Development and CI/CD — Compressed plain SQL for simplicity
  • Compliance and auditing — Plain SQL for human-readable records
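Whichever format you pick, pair it with a retention policy so old backups don't quietly fill the disk. A minimal pruning sketch that keeps seven days of custom-format dumps — the directory name and retention window are assumptions, and GNU `find`/`touch` are assumed (the `-delete` flag and `-d` date parsing):

```bash
# Prune .dump files older than 7 days (GNU find/touch assumed)
BACKUP_DIR=./backups
mkdir -p "$BACKUP_DIR"

# Stand-in files: one fresh, one backdated past the retention window
touch "$BACKUP_DIR/fresh.dump"
touch -d '10 days ago' "$BACKUP_DIR/stale.dump"

# -mtime +7 matches files modified more than 7 * 24 hours ago
find "$BACKUP_DIR" -name '*.dump' -mtime +7 -delete
ls "$BACKUP_DIR"
```

In production you would run the prune step right after a new backup succeeds, never before, so a failed dump can't silently erode your retention window.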

For teams managing multiple PostgreSQL databases, manually choosing formats and managing backups across environments becomes tedious. Postgresus automates PostgreSQL backup by selecting optimal settings based on your database size, handling compression, encryption (AES-256-GCM) and multi-destination storage (S3, Google Drive, Dropbox, NAS) — all through a web interface that takes minutes to configure.

Conclusion

PostgreSQL's backup format options give you flexibility to optimize for size, speed or compatibility depending on your specific situation. Custom format covers 90% of production use cases with its balance of compression, selective restore and parallel processing. Directory format steps in for very large databases where parallel operations are essential. Plain SQL and tar formats serve niche requirements around readability and archive compatibility. Whatever format you choose, the critical factor is having consistent, tested backups stored safely away from your production infrastructure — because the best backup format in the world is worthless if you don't have a backup when disaster strikes.
