Database backups grow quickly. A PostgreSQL database that starts at a few gigabytes can balloon to hundreds of gigabytes over time, and backups multiply that storage requirement. Compression reduces backup sizes by 70-90% in most cases, cutting storage costs and speeding up transfers. This guide covers the practical compression options available for PostgreSQL backups, from built-in pg_dump compression to external tools and automated solutions.
Why compress PostgreSQL backups
Uncompressed backups waste storage and bandwidth. A 100GB database produces a 100GB backup file, which then needs to be stored, transferred and potentially replicated across locations. Compression typically reduces this to 10-30GB depending on your data.
Storage cost reduction
Cloud storage pricing adds up fast when you're keeping multiple backup versions:
| Backup size | Daily backups | Monthly storage | Monthly cost (S3 Standard) |
|---|---|---|---|
| 100GB uncompressed | 30 copies | 3TB | ~$69/month |
| 20GB compressed | 30 copies | 600GB | ~$14/month |
The math gets more dramatic with larger databases or longer retention periods. A database backup that costs $14/month compressed might cost $69/month uncompressed — nearly 5x the expense for the same data.
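For reference, those figures assume S3 Standard's list price of roughly $0.023 per GB per month: 3TB works out to about 3,000GB × $0.023 ≈ $69, while 600GB × $0.023 ≈ $14. Plug in your own provider's rate and retention count to see your actual savings.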
Faster backup transfers
Transferring backups to remote storage takes time. A 100GB file over a 100Mbps connection takes about 2.5 hours. The same data compressed to 20GB transfers in 30 minutes. This matters for backup windows, especially when you need to complete backups during off-peak hours.
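The back-of-envelope math: 100GB is roughly 800,000 megabits, and 800,000Mbit ÷ 100Mbps ≈ 8,000 seconds, a bit over two hours before protocol overhead. The 20GB compressed file needs about 1,600 seconds, or under half an hour.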
Reduced backup windows
Compression adds CPU overhead, but network transfer is often the bottleneck. When sending backups to cloud storage or a remote server, compressed backups complete faster despite the compression time. The savings from smaller transfer sizes usually outweigh the compression cost.
pg_dump compression options
pg_dump has built-in compression support. The approach depends on which output format you choose.
Custom format with compression
The custom format (-Fc) includes zlib compression by default:
# Default compression (level 6)
pg_dump -Fc -f backup.dump mydb
# Maximum compression (level 9)
pg_dump -Fc -Z 9 -f backup.dump mydb
# No compression
pg_dump -Fc -Z 0 -f backup.dump mydb
Compression levels range from 0 (no compression) to 9 (maximum compression). Higher levels produce smaller files but take longer. Level 6 is the default and offers a good balance.
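If your client tools come from PostgreSQL 16 or newer, pg_dump can also compress the custom format with lz4 or zstd instead of the default gzip/zlib. A sketch of that syntax; check pg_dump --version and its documentation before relying on it:
# zstd compression for custom format (PostgreSQL 16+ pg_dump)
pg_dump -Fc --compress=zstd:5 -f backup.dump mydb
# lz4 compression (PostgreSQL 16+ pg_dump)
pg_dump -Fc --compress=lz4 -f backup.dump mydb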
Directory format compression
The directory format (-Fd) compresses each table into a separate file:
# Compressed directory backup
pg_dump -Fd -Z 5 -f backup_dir mydb
# Parallel compression with multiple jobs
pg_dump -Fd -Z 5 -j 4 -f backup_dir mydb
Directory format enables parallel backup and restore. With -j 4, pg_dump uses 4 parallel processes, significantly speeding up backups of large databases.
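The same parallelism applies on the way back: pg_restore can load a directory-format backup with multiple workers, assuming the target database already exists:
# Parallel restore with 4 workers into an existing database
pg_restore -j 4 -d mydb backup_dir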
Plain SQL format
Plain SQL format (-Fp) isn't compressed by default, so the usual approach is to pipe it through an external tool:
# Compress with gzip
pg_dump -Fp mydb | gzip > backup.sql.gz
# Compress with pigz (parallel gzip)
pg_dump -Fp mydb | pigz > backup.sql.gz
# Decompress and restore
gunzip -c backup.sql.gz | psql mydb
Plain format is human-readable and useful for migrations, but custom format is better for routine backups.
Compression algorithms compared
Different compression algorithms offer different tradeoffs between compression ratio, speed and CPU usage.
gzip
The default choice for most PostgreSQL backup tools. Widely supported and well-balanced:
pg_dump mydb | gzip -6 > backup.sql.gz
Levels 1-9 control the compression ratio. Level 6 is a reasonable default.
pigz
Parallel gzip uses multiple CPU cores:
# Use 4 threads
pg_dump mydb | pigz -p 4 > backup.sql.gz
# Decompress
pigz -d -p 4 backup.sql.gz
On multi-core systems, pigz compresses significantly faster than standard gzip, and its output can be decompressed by any gzip-compatible tool.
zstd
Zstandard offers better compression ratios and faster speeds than gzip:
# Default compression
pg_dump mydb | zstd > backup.sql.zst
# Higher compression
pg_dump mydb | zstd -19 > backup.sql.zst
# Fast compression
pg_dump mydb | zstd -1 > backup.sql.zst
# Decompress to stdout and restore (-c keeps zstd from writing a .sql file to disk)
zstd -dc backup.sql.zst | psql mydb
zstd shines at both ends of the spectrum — level 1 is faster than gzip level 1 with similar ratios, and level 19 achieves better compression than gzip level 9.
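zstd can also spread compression across CPU cores with -T, which pairs nicely with the slower high-ratio levels:
# Compress on 4 worker threads (-T0 uses every available core)
pg_dump mydb | zstd -T4 -3 > backup.sql.zst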
lz4
Optimized for speed over compression ratio:
pg_dump mydb | lz4 > backup.sql.lz4
lz4 compresses and decompresses extremely fast but produces larger files than gzip or zstd. It's useful when compression speed is the priority and storage is cheap.
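Restoring mirrors the gzip and zstd examples: decompress to stdout and pipe into psql:
# Decompress to stdout and restore
lz4 -d -c backup.sql.lz4 | psql mydb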
Algorithm comparison
Compression algorithm comparison for a typical 10GB PostgreSQL database:
| Algorithm | Level | Compressed size | Compression time | Decompression time |
|---|---|---|---|---|
| gzip | 6 | 1.8GB | 4m 30s | 1m 15s |
| pigz (4 threads) | 6 | 1.8GB | 1m 20s | 35s |
| zstd | 3 | 1.7GB | 1m 45s | 25s |
| zstd | 19 | 1.4GB | 12m 00s | 25s |
| lz4 | default | 2.5GB | 30s | 10s |
Actual results vary based on data compressibility, CPU, and disk speed. Test with your own data to find the best fit.
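To run that comparison against your own data, dump once to a file so every compressor sees identical input. A rough benchmarking sketch (sample.sql is just a placeholder name):
# Dump once so each tool compresses the same bytes
pg_dump -Fp mydb > sample.sql
# Time each compressor and print the compressed size in bytes
time gzip -6 -c sample.sql | wc -c
time zstd -3 -c sample.sql | wc -c
time lz4 -1 -c sample.sql | wc -c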
Streaming compression to cloud storage
Piping compressed output directly to cloud storage avoids writing large uncompressed files to local disk:
# Stream compressed backup to S3
pg_dump -Fc mydb | aws s3 cp - s3://bucket/backup.dump
# Stream with zstd compression to S3
pg_dump mydb | zstd | aws s3 cp - s3://bucket/backup.sql.zst
# Stream to Google Cloud Storage
pg_dump -Fc mydb | gsutil cp - gs://bucket/backup.dump
This approach requires only enough local disk space for buffering, not the full backup size.
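Restores can stream the same way, downloading to stdout and decompressing in flight (bucket and object names are placeholders):
# Stream a compressed backup back from S3 and restore it
aws s3 cp s3://bucket/backup.sql.zst - | zstd -dc | psql mydb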
pg_basebackup compression
For physical backups, pg_basebackup has long supported gzip compression of tar-format output, and PostgreSQL 15 added zstd and lz4:
# gzip compression
pg_basebackup -D backup -Ft -z
# zstd compression (PostgreSQL 15+)
pg_basebackup -D backup -Ft --compress=zstd:5
# lz4 compression (PostgreSQL 15+)
pg_basebackup -D backup -Ft --compress=lz4
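PostgreSQL 15 can also compress on the server side before the data crosses the network, trading server CPU for bandwidth. A sketch of the syntax; verify against your version's pg_basebackup documentation:
# Server-side zstd compression (PostgreSQL 15+)
pg_basebackup -D backup -Ft --compress=server-zstd:5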
On older versions, or when you want an algorithm pg_basebackup doesn't offer, pipe the tar output through an external compressor:
pg_basebackup -D - -Ft | gzip > backup.tar.gz
Using Databasus for compressed backups
Managing compression settings, storage uploads and retention policies manually gets tedious. Databasus handles compression automatically, with configurable algorithms and levels, and suits both individual developers and enterprise teams that need reliable automated backups without the scripting overhead.
Installing Databasus
Using Docker:
docker run -d \
--name databasus \
-p 4005:4005 \
-v ./databasus-data:/databasus-data \
--restart unless-stopped \
databasus/databasus:latest
Or with Docker Compose:
services:
  databasus:
    container_name: databasus
    image: databasus/databasus:latest
    ports:
      - "4005:4005"
    volumes:
      - ./databasus-data:/databasus-data
    restart: unless-stopped
Start the service:
docker compose up -d
Configuring compressed backups
Access the web interface at http://localhost:4005 and create your account, then:
- Add your database — Click "New Database" and enter your PostgreSQL connection details
- Select storage — Choose your backup destination: S3, Google Cloud Storage, local storage or other supported options
- Configure compression — Databasus compresses backups automatically. Select your preferred algorithm and compression level based on your storage vs speed priorities
- Select schedule — Set backup frequency matching your recovery requirements
- Click "Create backup" — Databasus validates settings and begins scheduled compressed backups
Databasus manages compression, encryption, retention and notifications in one place. No scripts to maintain.
Optimizing compression for your data
Not all PostgreSQL data compresses equally. Text-heavy tables compress extremely well. Binary data, already-compressed images, or encrypted columns don't compress much further.
Checking current compression potential
Estimate how well your data will compress:
-- Check table sizes
SELECT
schemaname || '.' || tablename AS table_name,
pg_size_pretty(pg_total_relation_size(schemaname || '.' || tablename)) AS total_size
FROM pg_tables
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_total_relation_size(schemaname || '.' || tablename) DESC
LIMIT 10;
Tables with lots of text, JSON, or repeated values compress well. Tables storing binary blobs or pre-compressed data don't.
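For a concrete number, dump one representative table and compare the raw and compressed byte counts (big_table is a placeholder):
# Raw dump size of one table, in bytes
pg_dump -Fp -t big_table mydb | wc -c
# The same dump compressed with gzip
pg_dump -Fp -t big_table mydb | gzip -6 | wc -c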
Data type impact on compression
Data that compresses well:
- Text and varchar columns with natural language
- JSON/JSONB with repeated keys
- Timestamps and dates
- Numeric data with patterns
- NULL values (essentially free)
Data that compresses poorly:
- UUID columns (random by design)
- Already-compressed bytea data
- Encrypted columns
- Random binary data
If your database is mostly UUIDs and encrypted blobs, expect 20-40% compression. If it's mostly text and JSON, expect 80-95% compression.
Balancing compression level and speed
Higher compression levels save storage but take longer. Choose based on your priorities:
- Storage-constrained: Use zstd level 15-19 or gzip level 9
- Balanced: Use zstd level 3-5 or gzip level 6
- Speed-focused: Use lz4 or zstd level 1
For automated daily backups, balanced settings work well. For archival backups stored long-term, maximum compression saves ongoing storage costs.
Compression during backup windows
If your backup window is tight, fast compression helps:
# Fast compression for tight backup windows
pg_dump -Fc -Z 1 -f backup.dump mydb
# Or use parallel compression with directory format
pg_dump -Fd -Z 1 -j 4 -f backup_dir mydb
Level 1 compression still achieves 60-80% of maximum compression in a fraction of the time.
Verifying compressed backups
Compressed backups can be corrupted just like uncompressed ones. Verify integrity before relying on them:
# Verify gzip file integrity
gunzip -t backup.sql.gz
# Verify zstd file integrity
zstd -t backup.sql.zst
# Check that a custom-format archive is readable (listing discarded; a non-zero exit code signals corruption)
pg_restore -l backup.dump > /dev/null
For critical backups, periodically restore to a test database:
# Test restore to temporary database
createdb restore_test
pg_restore -d restore_test backup.dump
# Verify row counts match production
psql -c "SELECT count(*) FROM important_table" restore_test
# Cleanup
dropdb restore_test
Common compression mistakes
A few common issues undermine compression benefits.
Double compression
Compressing already-compressed data wastes CPU and sometimes makes files larger:
# Wrong — custom format is already compressed
pg_dump -Fc mydb | gzip > backup.dump.gz
# Right — just use custom format compression
pg_dump -Fc -Z 6 -f backup.dump mydb
If you need a different algorithm than zlib, use plain format:
# Right — plain format with zstd
pg_dump -Fp mydb | zstd > backup.sql.zst
Ignoring disk space during backup
When you pipe dump output straight into a compressor, only the compressed data ever hits disk. If you instead write an uncompressed file and compress it afterwards, you temporarily need space for both:
# Needs 2x disk space
pg_dump -Fp -f backup.sql mydb
gzip backup.sql
# Needs only compressed size
pg_dump -Fp mydb | gzip > backup.sql.gz
Stream compression when disk space is limited.
Using maximum compression for frequent backups
Maximum compression (gzip level 9 or zstd level 19) can take 5-10x longer than the default levels while typically saving only another 10-20% of space. Reserve it for archival backups, not daily ones.
Conclusion
Compression is one of the easiest wins for PostgreSQL backup management. Built-in pg_dump compression handles most cases, and external tools like zstd offer better ratios when needed. The key is matching compression settings to your priorities — storage costs, backup speed, or restoration time. Start with default settings (custom format at level 6), measure your results, and adjust based on what matters most for your infrastructure.
