Piter Adyson

PostgreSQL backup compression — Techniques for reducing PostgreSQL backup size

Database backups grow quickly. A PostgreSQL database that starts at a few gigabytes can balloon to hundreds of gigabytes over time, and backups multiply that storage requirement. Compression reduces backup sizes by 70-90% in most cases, cutting storage costs and speeding up transfers. This guide covers the practical compression options available for PostgreSQL backups, from built-in pg_dump compression to external tools and automated solutions.

Why compress PostgreSQL backups

Uncompressed backups waste storage and bandwidth. A 100GB database produces a 100GB backup file, which then needs to be stored, transferred and potentially replicated across locations. Compression typically reduces this to 10-30GB depending on your data.

Storage cost reduction

Cloud storage pricing adds up fast when you're keeping multiple backup versions:

| Backup size | Daily backups retained | Total storage | Monthly cost (S3 Standard) |
| --- | --- | --- | --- |
| 100GB uncompressed | 30 copies | 3TB | ~$69/month |
| 20GB compressed | 30 copies | 600GB | ~$14/month |

The math gets more dramatic with larger databases or longer retention periods. A database backup that costs $14/month compressed might cost $69/month uncompressed — nearly 5x the expense for the same data.
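
To sanity-check those numbers against your own setup, a quick shell calculation is enough. This is a rough sketch that assumes S3 Standard pricing of roughly $0.023 per GB-month; the variable names are placeholders, so plug in your own backup size, retention count and storage rate:

# Rough monthly cost estimate: backup size x retained copies x price per GB-month
# Assumes ~$0.023/GB-month (S3 Standard); adjust for your storage class and region
BACKUP_GB=20
COPIES=30
PRICE_PER_GB_MONTH=0.023

awk -v size="$BACKUP_GB" -v copies="$COPIES" -v price="$PRICE_PER_GB_MONTH" \
    'BEGIN { printf "Stored: %dGB, estimated cost: $%.2f/month\n", size * copies, size * copies * price }'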

Faster backup transfers

Transferring backups to remote storage takes time. A 100GB file over a 100Mbps connection takes about 2.5 hours. The same data compressed to 20GB transfers in 30 minutes. This matters for backup windows, especially when you need to complete backups during off-peak hours.

Reduced backup windows

Compression adds CPU overhead, but network transfer is often the bottleneck. When sending backups to cloud storage or a remote server, compressed backups complete faster despite the compression time. The savings from smaller transfer sizes usually outweigh the compression cost.

pg_dump compression options

pg_dump has built-in compression support. The approach depends on which output format you choose.

Custom format with compression

The custom format (-Fc) includes zlib compression by default:

# Default compression (level 6)
pg_dump -Fc -f backup.dump mydb

# Maximum compression (level 9)
pg_dump -Fc -Z 9 -f backup.dump mydb

# No compression
pg_dump -Fc -Z 0 -f backup.dump mydb

Compression levels range from 0 (no compression) to 9 (maximum compression). Higher levels produce smaller files but take longer. Level 6 is the default and offers a good balance.

Directory format compression

The directory format (-Fd) compresses each table into a separate file:

# Compressed directory backup
pg_dump -Fd -Z 5 -f backup_dir mydb

# Parallel compression with multiple jobs
pg_dump -Fd -Z 5 -j 4 -f backup_dir mydb

Directory format enables parallel backup and restore. With -j 4, pg_dump uses 4 parallel processes, significantly speeding up backups of large databases.
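
The same -j flag works on the restore side, which is where directory format really pays off. A minimal sketch, assuming the backup_dir created above and an existing target database named mydb:

# Restore a directory-format backup with 4 parallel workers
pg_restore -j 4 -d mydb backup_dir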

Plain SQL format

Plain SQL format (-Fp) doesn't support built-in compression. Use external tools:

# Compress with gzip
pg_dump -Fp mydb | gzip > backup.sql.gz

# Compress with pigz (parallel gzip)
pg_dump -Fp mydb | pigz > backup.sql.gz

# Decompress and restore
gunzip -c backup.sql.gz | psql mydb

Plain format is human-readable and useful for migrations, but custom format is better for routine backups.
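
Because plain format is just SQL text, it also pipes cleanly across the network during migrations. A sketch, assuming SSH access to a hypothetical host new-server with the target database already created there:

# Compress locally, stream over SSH, decompress and restore on the target
pg_dump -Fp mydb | gzip | ssh new-server 'gunzip | psql mydb'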

Compression algorithms compared

Different compression algorithms offer different tradeoffs between compression ratio, speed and CPU usage.

gzip

The default choice for most PostgreSQL backup tools. Widely supported and well-balanced:

pg_dump mydb | gzip -6 > backup.sql.gz

Levels 1-9 control the compression ratio. Level 6 is a reasonable default.

pigz

Parallel gzip uses multiple CPU cores:

# Use 4 threads
pg_dump mydb | pigz -p 4 > backup.sql.gz

# Decompress
pigz -d -p 4 backup.sql.gz

On multi-core systems, pigz compresses significantly faster than standard gzip with identical output.

zstd

Zstandard offers better compression ratios and faster speeds than gzip:

# Default compression
pg_dump mydb | zstd > backup.sql.zst

# Higher compression
pg_dump mydb | zstd -19 > backup.sql.zst

# Fast compression
pg_dump mydb | zstd -1 > backup.sql.zst

# Decompress and restore
zstd -dc backup.sql.zst | psql mydb

zstd shines at both ends of the spectrum — level 1 is faster than gzip level 1 with similar ratios, and level 19 achieves better compression than gzip level 9.
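
Like pigz, zstd can also spread the work across cores with its -T option, which helps most at the higher levels:

# Compress with 4 threads (-T0 auto-detects the core count)
pg_dump mydb | zstd -T4 > backup.sql.zst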

lz4

Optimized for speed over compression ratio:

pg_dump mydb | lz4 > backup.sql.lz4

lz4 compresses and decompresses extremely fast but produces larger files than gzip or zstd. It's useful when compression speed is the priority and storage is cheap.

Algorithm comparison

Approximate results for a typical 10GB PostgreSQL database:

| Algorithm | Level | Compressed size | Compression time | Decompression time |
| --- | --- | --- | --- | --- |
| gzip | 6 | 1.8GB | 4m 30s | 1m 15s |
| pigz (4 threads) | 6 | 1.8GB | 1m 20s | 35s |
| zstd | 3 | 1.7GB | 1m 45s | 25s |
| zstd | 19 | 1.4GB | 12m 00s | 25s |
| lz4 | default | 2.5GB | 30s | 10s |

Actual results vary based on data compressibility, CPU, and disk speed. Test with your own data to find the best fit.
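
One way to run that test is to take a single plain-format dump and time each compressor against the same input. A rough sketch, assuming gzip, pigz, zstd and lz4 are installed and that an uncompressed sample dump fits on local disk:

# Create one sample dump, then compare compressors on identical input
pg_dump -Fp mydb > sample.sql

for cmd in "gzip -6" "pigz -p 4" "zstd -3" "zstd -19" "lz4"; do
    echo "== $cmd =="
    # Time the compression and report the resulting file size
    time $cmd < sample.sql > sample.out
    ls -lh sample.out
done
rm -f sample.sql sample.out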

Streaming compression to cloud storage

Piping compressed output directly to cloud storage avoids writing large uncompressed files to local disk:

# Stream compressed backup to S3
pg_dump -Fc mydb | aws s3 cp - s3://bucket/backup.dump

# Stream with zstd compression to S3
pg_dump mydb | zstd | aws s3 cp - s3://bucket/backup.sql.zst

# Stream to Google Cloud Storage
pg_dump -Fc mydb | gsutil cp - gs://bucket/backup.dump

This approach requires only enough local disk space for buffering, not the full backup size.
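
Restores can stream the same way, so the compressed archive never needs to touch local disk either. A sketch that assumes the objects uploaded above and an existing target database:

# Stream a custom-format backup from S3 straight into pg_restore
aws s3 cp s3://bucket/backup.dump - | pg_restore -d mydb

# Stream and decompress a zstd-compressed SQL backup
aws s3 cp s3://bucket/backup.sql.zst - | zstd -dc | psql mydb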

pg_basebackup compression

For physical backups, pg_basebackup supports gzip compression of tar-format output, and PostgreSQL 15 adds lz4 and zstd:

# gzip compression
pg_basebackup -D backup -Ft -z

# zstd compression (PostgreSQL 15+)
pg_basebackup -D backup -Ft --compress=zstd:5

# lz4 compression (PostgreSQL 15+)
pg_basebackup -D backup -Ft --compress=lz4

Older PostgreSQL versions require external compression:

pg_basebackup -D - -Ft | gzip > backup.tar.gz

Using Databasus for compressed backups

Managing compression settings, storage uploads and retention policies manually gets tedious. Databasus, a backup automation tool for PostgreSQL, handles compression automatically with configurable algorithms and levels. It supports both individual developers and enterprise teams who need reliable automated backups without the scripting overhead.

Installing Databasus

Using Docker:

docker run -d \
  --name databasus \
  -p 4005:4005 \
  -v ./databasus-data:/databasus-data \
  --restart unless-stopped \
  databasus/databasus:latest

Or with Docker Compose:

services:
  databasus:
    container_name: databasus
    image: databasus/databasus:latest
    ports:
      - "4005:4005"
    volumes:
      - ./databasus-data:/databasus-data
    restart: unless-stopped

Start the service:

docker compose up -d

Configuring compressed backups

Access the web interface at http://localhost:4005 and create your account, then:

  1. Add your database — Click "New Database" and enter your PostgreSQL connection details
  2. Select storage — Choose your backup destination: S3, Google Cloud Storage, local storage or other supported options
  3. Configure compression — Databasus compresses backups automatically. Select your preferred algorithm and compression level based on your storage vs speed priorities
  4. Select schedule — Set backup frequency matching your recovery requirements
  5. Click "Create backup" — Databasus validates settings and begins scheduled compressed backups

Databasus manages compression, encryption, retention and notifications in one place. No scripts to maintain.

Optimizing compression for your data

Not all PostgreSQL data compresses equally. Text-heavy tables compress extremely well. Binary data, already-compressed images, or encrypted columns don't compress much further.

Checking current compression potential

Estimate how well your data will compress:

-- Check table sizes
SELECT
    schemaname || '.' || tablename AS table_name,
    pg_size_pretty(pg_total_relation_size(format('%I.%I', schemaname, tablename))) AS total_size
FROM pg_tables
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_total_relation_size(format('%I.%I', schemaname, tablename)) DESC
LIMIT 10;

Tables with lots of text, JSON, or repeated values compress well. Tables storing binary blobs or pre-compressed data don't.
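
For a quick empirical check, dump one large table from that list and compare the raw and compressed byte counts. A sketch, assuming a hypothetical table named big_table:

# Uncompressed size of a single table's dump, in bytes
pg_dump -t big_table mydb | wc -c

# Compressed size of the same dump
pg_dump -t big_table mydb | gzip -6 | wc -c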

Data type impact on compression

Data that compresses well:

  • Text and varchar columns with natural language
  • JSON/JSONB with repeated keys
  • Timestamps and dates
  • Numeric data with patterns
  • NULL values (essentially free)

Data that compresses poorly:

  • UUID columns (random by design)
  • Already-compressed bytea data
  • Encrypted columns
  • Random binary data

If your database is mostly UUIDs and encrypted blobs, expect 20-40% compression. If it's mostly text and JSON, expect 80-95% compression.

Balancing compression level and speed

Higher compression levels save storage but take longer. Choose based on your priorities:

  • Storage-constrained: Use zstd level 15-19 or gzip level 9
  • Balanced: Use zstd level 3-5 or gzip level 6
  • Speed-focused: Use lz4 or zstd level 1

For automated daily backups, balanced settings work well. For archival backups stored long-term, maximum compression saves ongoing storage costs.
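
If you script daily backups yourself, a cron entry with balanced settings is usually enough. A minimal sketch, assuming credentials come from ~/.pgpass and a hypothetical /backups directory exists; note that % must be escaped in crontab entries:

# Daily at 02:00: custom-format dump at level 5, dated filename
0 2 * * * pg_dump -Fc -Z 5 -f /backups/mydb_$(date +\%F).dump mydb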

Compression during backup windows

If your backup window is tight, fast compression helps:

# Fast compression for tight backup windows
pg_dump -Fc -Z 1 -f backup.dump mydb

# Or use parallel compression with directory format
pg_dump -Fd -Z 1 -j 4 -f backup_dir mydb

Level 1 compression still achieves 60-80% of maximum compression in a fraction of the time.

Verifying compressed backups

Compressed backups can be corrupted just like uncompressed ones. Verify integrity before relying on them:

# Verify gzip file integrity
gunzip -t backup.sql.gz

# Verify zstd file integrity
zstd -t backup.sql.zst

# List contents of pg_dump custom format
pg_restore -l backup.dump > /dev/null

For critical backups, periodically restore to a test database:

# Test restore to temporary database
createdb restore_test
pg_restore -d restore_test backup.dump

# Verify row counts match production
psql -c "SELECT count(*) FROM important_table" restore_test

# Cleanup
dropdb restore_test

Common compression mistakes

A few common issues undermine compression benefits.

Double compression

Compressing already-compressed data wastes CPU and sometimes makes files larger:

# Wrong — custom format is already compressed
pg_dump -Fc mydb | gzip > backup.dump.gz

# Right — just use custom format compression
pg_dump -Fc -Z 6 -f backup.dump mydb

If you need a different algorithm than zlib, use plain format (pg_dump in PostgreSQL 16 and later can also compress with lz4 and zstd directly):

# Right — plain format with zstd
pg_dump -Fp mydb | zstd > backup.sql.zst

Ignoring disk space during backup

When you pipe a dump straight into a compressor, only the compressed output is written to disk. But if you compress after creating an uncompressed file, you need space for both:

# Needs 2x disk space
pg_dump -Fp -f backup.sql mydb
gzip backup.sql

# Needs only compressed size
pg_dump -Fp mydb | gzip > backup.sql.gz

Stream compression when disk space is limited.

Using maximum compression for frequent backups

Level 9 or 19 compression takes 5-10x longer than default levels while only saving 10-20% more space. Reserve maximum compression for archival backups, not daily ones.

Conclusion

Compression is one of the easiest wins for PostgreSQL backup management. Built-in pg_dump compression handles most cases, and external tools like zstd offer better ratios when needed. The key is matching compression settings to your priorities — storage costs, backup speed, or restoration time. Start with default settings (custom format at level 6), measure your results, and adjust based on what matters most for your infrastructure.
