Keeping every backup forever sounds safe until you see the storage bill. A PostgreSQL database with daily backups accumulates 365 files a year, and even a modest monthly schedule leaves 120 files after a decade. Retention policies define which backups to keep and which to delete, balancing recovery options against storage costs. This guide covers practical retention strategies for PostgreSQL, from simple time-based rules to grandfather-father-son schemes and compliance-driven requirements.
Why retention policies matter
Without retention rules, backups pile up indefinitely. Storage fills up, costs climb, and eventually something breaks. A retention policy automates cleanup so you keep enough backups for recovery without drowning in old files.
Storage cost control
Cloud storage charges accumulate monthly. Consider a 50GB PostgreSQL database with daily compressed backups at 10GB each:
| Retention period | Number of backups | Storage required | Monthly S3 cost |
|---|---|---|---|
| 7 days | 7 | 70GB | ~$1.60 |
| 30 days | 30 | 300GB | ~$6.90 |
| 90 days | 90 | 900GB | ~$20.70 |
| 365 days | 365 | 3.65TB | ~$84.00 |
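These estimates assume S3 Standard at roughly $0.023/GB-month; other storage classes and regions will shift the numbers, but the shape of the curve stays the same.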
The difference between keeping a week of backups versus a full year is roughly $80/month for a single database. Multiply that across multiple databases and environments.
Recovery point objectives
Retention policies should match your recovery requirements. If you only need to restore yesterday's data, keeping 30 days of backups wastes money. If regulators require 7-year archives, keeping 30 days leaves you non-compliant.
Operational simplicity
Manual cleanup is error-prone. Someone forgets, storage fills up, backups fail. Automated retention rules run consistently without human intervention or calendar reminders.
Common retention strategies
Different strategies suit different needs. The right choice depends on your recovery requirements, compliance obligations and budget.
Simple time-based retention
Keep backups for a fixed number of days, then delete:
- Keep last 7 days
- Keep last 30 days
- Keep last 90 days
This works well for development environments and small projects where recent recovery matters more than historical archives.
Count-based retention
Keep a fixed number of backups regardless of age:
- Keep last 10 backups
- Keep last 30 backups
Count-based retention keeps storage use predictable when backup frequency changes. If you switch from daily to hourly backups, you still keep the same number of files, though the window they cover shrinks: ten daily backups span ten days, while ten hourly backups span ten hours.
Grandfather-father-son (GFS)
GFS combines multiple retention periods into a tiered structure:
- Son (daily): Keep last 7 daily backups
- Father (weekly): Keep last 4 weekly backups
- Grandfather (monthly): Keep last 12 monthly backups
This gives you fine-grained recent recovery plus longer historical coverage. You can restore to any day in the past week, any week in the past month, or any month in the past year. Total storage: roughly 23 backups instead of 365 for full daily retention over a year.
Tiered storage retention
Move older backups to cheaper storage tiers instead of deleting:
- Days 1-30: Hot storage (frequent access)
- Days 31-90: Warm storage (infrequent access, S3 Standard-IA)
- Days 91-365: Cold storage (rare access, S3 Glacier)
- After 1 year: Archive or delete
Tiered approaches keep long retention periods affordable while maintaining quick access to recent backups.
Implementing retention with scripts
Simple retention can be implemented with shell scripts. The examples below remove backups that fall outside your retention policy, whether that policy is defined by age or by count.
Basic age-based cleanup
Delete files older than 30 days:
#!/bin/bash
BACKUP_DIR="/var/backups/postgresql"
RETENTION_DAYS=30
find "$BACKUP_DIR" -name "*.dump" -type f -mtime +$RETENTION_DAYS -delete
Run this script daily via cron after your backup completes.
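As one example, assuming the script above is saved as /usr/local/bin/cleanup-backups.sh (a hypothetical path) and backups finish before 03:00, the crontab entry could look like this:
# Run retention cleanup daily at 03:30, logging output for review
30 3 * * * /usr/local/bin/cleanup-backups.sh >> /var/log/backup-cleanup.log 2>&1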
Count-based cleanup
Keep only the N most recent backups:
#!/bin/bash
BACKUP_DIR="/var/backups/postgresql"
KEEP_COUNT=10
cd "$BACKUP_DIR"
ls -t *.dump | tail -n +$((KEEP_COUNT + 1)) | xargs -r rm
This lists backups by modification time, skips the newest N files, and deletes the rest.
GFS retention script
Implementing GFS requires tracking which backups are daily, weekly or monthly:
#!/bin/bash
BACKUP_DIR="/var/backups/postgresql"
# Naming convention: backup_YYYY-MM-DD.dump
TODAY=$(date +%Y-%m-%d)
DAY_OF_WEEK=$(date +%u) # 1=Monday, 7=Sunday
DAY_OF_MONTH=$(date +%d)
# Tag today's backup
DAILY_BACKUP="$BACKUP_DIR/backup_${TODAY}.dump"
if [ "$DAY_OF_WEEK" -eq 7 ]; then
# Sunday — also weekly backup
cp "$DAILY_BACKUP" "$BACKUP_DIR/weekly_${TODAY}.dump"
fi
if [ "$DAY_OF_MONTH" -eq 1 ]; then
# First of month — also monthly backup
cp "$DAILY_BACKUP" "$BACKUP_DIR/monthly_${TODAY}.dump"
fi
# Cleanup: keep 7 daily, 4 weekly, 12 monthly
find "$BACKUP_DIR" -name "backup_*.dump" -mtime +7 -delete
find "$BACKUP_DIR" -name "weekly_*.dump" -mtime +28 -delete
find "$BACKUP_DIR" -name "monthly_*.dump" -mtime +365 -delete
This approach creates copies with different prefixes and applies separate retention rules to each tier.
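One caveat: each cp duplicates the full backup on disk, so a backup promoted to both the weekly and monthly tiers is stored three times. If the backup directory sits on a single filesystem, hard links (cp -l) tag the weekly and monthly tiers without storing the data twice.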
Cloud storage retention
Major cloud providers offer built-in lifecycle policies that handle retention automatically.
S3 lifecycle rules
AWS S3 lifecycle policies can transition and delete objects based on age:
{
  "Rules": [
    {
      "ID": "PostgreSQL backup retention",
      "Status": "Enabled",
      "Filter": {
        "Prefix": "backups/postgresql/"
      },
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "STANDARD_IA"
        },
        {
          "Days": 90,
          "StorageClass": "GLACIER"
        }
      ],
      "Expiration": {
        "Days": 365
      }
    }
  ]
}
This policy moves backups to cheaper storage after 30 and 90 days, then deletes them after a year. Apply via the S3 console or AWS CLI.
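With the AWS CLI, you could save the policy as lifecycle.json and apply it like this (bucket name is a placeholder):
aws s3api put-bucket-lifecycle-configuration \
--bucket my-backups \
--lifecycle-configuration file://lifecycle.json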
Google Cloud Storage lifecycle
GCS uses similar lifecycle management:
{
  "lifecycle": {
    "rule": [
      {
        "action": { "type": "SetStorageClass", "storageClass": "NEARLINE" },
        "condition": { "age": 30 }
      },
      {
        "action": { "type": "SetStorageClass", "storageClass": "COLDLINE" },
        "condition": { "age": 90 }
      },
      {
        "action": { "type": "Delete" },
        "condition": { "age": 365 }
      }
    ]
  }
}
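Saved as lifecycle.json, the policy can be applied with gsutil (bucket name is a placeholder):
gsutil lifecycle set lifecycle.json gs://my-backups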
Azure Blob Storage lifecycle
Azure provides lifecycle management policies with similar capabilities:
{
  "rules": [
    {
      "enabled": true,
      "name": "postgresql-backup-retention",
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": ["blockBlob"],
          "prefixMatch": ["backups/postgresql"]
        },
        "actions": {
          "baseBlob": {
            "tierToCool": { "daysAfterModificationGreaterThan": 30 },
            "tierToArchive": { "daysAfterModificationGreaterThan": 90 },
            "delete": { "daysAfterModificationGreaterThan": 365 }
          }
        }
      }
    }
  ]
}
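The policy can be applied with the Azure CLI; a sketch, assuming it is saved as policy.json, with placeholder account and resource group names:
az storage account management-policy create \
--account-name mystorageaccount \
--resource-group my-resource-group \
--policy @policy.json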
Cloud lifecycle policies are more reliable than scripts because they run on the provider's infrastructure. You don't need to maintain a server or worry about script failures.
Retention for compliance requirements
Regulated industries have specific retention requirements that override cost optimization.
Common compliance retention periods
Different regulations specify different minimums:
| Regulation | Typical retention | Notes |
|---|---|---|
| GDPR | Varies by purpose | Must delete when no longer needed |
| HIPAA | 6 years | Medical records retention |
| SOX | 7 years | Financial records |
| PCI DSS | 1 year | Audit trails and logs |
| FINRA | 6-7 years | Broker-dealer records |
Check your specific regulatory requirements. These are general guidelines, not legal advice.
Compliance considerations
When compliance drives retention, consider:
- Immutable backups: Some regulations require tamper-proof archives. Use S3 Object Lock or similar features (see the example after this list)
- Audit trails: Document when backups were created, modified, or deleted
- Geographic restrictions: Some data cannot leave specific regions
- Encryption requirements: Many regulations mandate encryption at rest and in transit
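For the immutable backups point above, S3 Object Lock in compliance mode is one option: once a retention date is set, no one, including the root account, can delete the object before that date. A sketch, with placeholder bucket and key names (the bucket must be created with Object Lock enabled):
# Set a compliance-mode retention date on a specific backup
aws s3api put-object-retention \
--bucket my-backups \
--key backups/postgresql/backup_2024-01-15.dump \
--retention '{"Mode":"COMPLIANCE","RetainUntilDate":"2031-01-15T00:00:00Z"}'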
Legal holds
Litigation or regulatory investigations may require preserving specific backups beyond normal retention. Cloud providers support legal holds that prevent deletion regardless of lifecycle policies; in S3, this likewise requires a bucket with Object Lock enabled.
# AWS S3 legal hold
aws s3api put-object-legal-hold \
--bucket my-backups \
--key backups/postgresql/backup_2024-01-15.dump \
--legal-hold Status=ON
Using Databasus for retention management
Manual retention scripts and cloud lifecycle policies work, but they require setup and maintenance. Databasus (an industry standard for PostgreSQL backup) provides built-in retention management alongside automated backups, compression and notifications.
Installing Databasus
Using Docker:
docker run -d \
--name databasus \
-p 4005:4005 \
-v ./databasus-data:/databasus-data \
--restart unless-stopped \
databasus/databasus:latest
Or with Docker Compose:
services:
  databasus:
    container_name: databasus
    image: databasus/databasus:latest
    ports:
      - "4005:4005"
    volumes:
      - ./databasus-data:/databasus-data
    restart: unless-stopped
Start the service:
docker compose up -d
Configuring backup retention
Access the web interface at http://localhost:4005 and create your account, then:
- Add your database — Click "New Database" and enter your PostgreSQL connection details
- Select storage — Choose your backup destination: S3, Google Cloud Storage, local storage or other supported options
- Configure retention — Set how many backups to keep or how long to retain them. Databasus automatically deletes old backups according to your policy
- Select schedule — Set backup frequency that aligns with your retention strategy
- Click "Create backup" — Databasus handles scheduling, compression, storage and cleanup automatically
Databasus removes the need to maintain separate retention scripts or configure cloud lifecycle policies manually. Everything is managed in one place.
Choosing retention periods
Selecting the right retention period requires balancing several factors.
Recovery scenarios
Think about what failures you need to recover from:
- Data corruption discovered same day: Need very recent backups, hourly or daily
- Corruption discovered after a week: Need at least 7+ days retention
- Audit requests for historical data: May need years of archives
- Ransomware with delayed activation: Need weeks or months to find clean backup
Map your retention to realistic recovery scenarios rather than arbitrary numbers.
Backup frequency alignment
Retention and frequency should complement each other:
- Hourly backups: 24-48 hours retention (for most use cases)
- Daily backups: 7-30 days retention
- Weekly backups: 4-12 weeks retention
- Monthly backups: 12-84 months retention
Running hourly backups with 365-day retention creates 8,760 files per year. That's almost certainly overkill. Match retention to the backup frequency tier.
Cost-recovery tradeoff
Longer retention costs more but provides better recovery options. Calculate the breakeven:
- Cost of keeping 90 vs 30 days: $X/month extra
- Probability of needing a day 31-90 backup in a given month: Y (as a fraction, not a percentage)
- Cost of data loss if the needed backup is missing: $Z
If X < (Y × Z), keep the longer retention. This isn't always easy to calculate, but it frames the decision correctly.
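For example, if the extra 60 days of storage adds $14/month, the chance of needing one of those older backups is 1% in any given month (Y = 0.01), and losing the data would cost $5,000, the expected benefit is $50/month, more than three times the extra cost.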
Testing retention and recovery
A retention policy means nothing if backups can't be restored. Regular testing validates both.
Retention verification
Periodically check that retention is working:
# Count backups by age
find /var/backups/postgresql -name "*.dump" -mtime -7 | wc -l # Last 7 days
find /var/backups/postgresql -name "*.dump" -mtime +30 | wc -l # Older than 30 days
# List oldest and newest
ls -lt /var/backups/postgresql/*.dump | head -5 # Newest
ls -lt /var/backups/postgresql/*.dump | tail -5 # Oldest
If backups older than your retention period exist, something is broken.
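A small script can turn this check into an alert; a sketch, assuming a 30-day policy and the directory layout used earlier:
#!/bin/bash
BACKUP_DIR="/var/backups/postgresql"
RETENTION_DAYS=30
# Count backups that the retention policy should already have deleted
STALE=$(find "$BACKUP_DIR" -name "*.dump" -type f -mtime +$RETENTION_DAYS | wc -l)
if [ "$STALE" -gt 0 ]; then
    echo "WARNING: $STALE backups exceed the ${RETENTION_DAYS}-day retention policy" >&2
    exit 1
fi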
Recovery testing
Test restores from different backup ages:
# Restore yesterday's backup
createdb restore_test
pg_restore -d restore_test /var/backups/postgresql/backup_yesterday.dump
# Verify data
psql -c "SELECT count(*) FROM important_table" restore_test
# Cleanup
dropdb restore_test
Test monthly to confirm both recent and older backups restore successfully.
Common retention mistakes
Several patterns cause problems with retention policies.
No retention at all
The most common mistake. Backups accumulate until storage fills, then new backups fail. Always define explicit retention rules.
Retention shorter than detection time
If it takes a week to notice data corruption but you only keep 3 days of backups, you lose data. Retention must exceed your detection window.
Identical retention across environments
Production needs longer retention than development. A blanket 7-day policy might be fine for dev but dangerous for prod.
Manual cleanup dependency
Relying on someone to manually delete old backups guarantees eventual failure. Automate retention via scripts, cloud policies, or tools like Databasus.
Ignoring restore testing
Keeping 365 days of backups provides false confidence if they're corrupted or unrestorable. Test periodically.
Conclusion
Retention policies transform chaotic backup storage into a manageable system. Start simple with time-based or count-based retention, then evolve to GFS or tiered storage as needs grow. Match retention periods to actual recovery requirements and compliance obligations rather than arbitrary numbers. Automate cleanup with scripts, cloud lifecycle policies, or dedicated backup tools. And test regularly — retention rules only matter if backups actually restore when needed.
