I recently upgraded a production PostgreSQL database from version 13.20 to 17.7 on AWS RDS. This post documents the entire journey - including the mistakes I made, the approaches I considered, and how I achieved a switchover with minimal downtime.
The Problem
My application runs on AWS with PostgreSQL 13.20 on RDS. PostgreSQL 13 reaches end of standard support in November 2025, and I wanted to get ahead of forced upgrades. More importantly, PostgreSQL 17 brings significant performance improvements and features I wanted to leverage.
The challenge: my database serves a production application with active users. Traditional major version upgrades on RDS can take 10-30 minutes of downtime depending on database size - unacceptable for my use case.
I needed to answer one question: How do I upgrade PostgreSQL by 4 major versions with minimal disruption?
Understanding the Upgrade Options
Before diving into implementation, I evaluated three approaches:
Option 1: In-Place Upgrade (Native RDS)
The simplest approach - modify the engine_version in Terraform and apply.
resource "aws_db_instance" "postgres_rds" {
  engine_version = "17.7" # Changed from 13.20
}
Problem: RDS performs a full pg_upgrade which:
- Takes the database offline during the entire upgrade
- Duration depends on database size (10-30+ minutes typically)
- No rollback if something goes wrong post-upgrade
I ruled this out immediately.
Option 2: Manual Blue-Green with Read Replica
Create a read replica, upgrade it, then promote and switch DNS.
Problem:
- PostgreSQL read replicas cannot be upgraded independently on RDS
- The replica must match the primary's major version
- This approach works for MySQL on RDS, not PostgreSQL
Option 3: AWS Blue-Green Deployments (My Choice)
AWS introduced Blue-Green Deployments for RDS in late 2022. This creates a synchronized staging environment (Green) that mirrors your production database (Blue), performs the upgrade on Green, then switches over with minimal downtime.
Why I chose this:
- AWS handles replication automatically using logical replication
- Switchover is fast (typically under 1 minute)
- Built-in rollback capability before switchover
- Supported for PostgreSQL major version upgrades
The Mistake: RDS Proxy Doesn't Mix with Blue-Green
My initial plan included RDS Proxy to minimize connection disruption during switchover. You might be wondering - why would I need a proxy for a database upgrade? I'll explain the reasoning later, but for now, just know that I expected it to help during the critical switchover moment.
I added RDS Proxy to my Terraform configuration:
resource "aws_db_proxy" "rds_proxy" {
  name                   = "staging-proxy"
  engine_family          = "POSTGRESQL"
  require_tls            = true
  vpc_security_group_ids = [aws_security_group.proxy_sg.id]
  vpc_subnet_ids         = data.aws_subnets.private.ids

  auth {
    auth_scheme = "SECRETS"
    secret_arn  = aws_secretsmanager_secret.proxy_credentials.arn
    iam_auth    = "DISABLED"
  }
}
resource "aws_db_proxy_default_target_group" "proxy_target" {
  db_proxy_name = aws_db_proxy.rds_proxy.name
}
resource "aws_db_proxy_target" "proxy_target" {
  db_proxy_name          = aws_db_proxy.rds_proxy.name
  target_group_name      = aws_db_proxy_default_target_group.proxy_target.name
  db_instance_identifier = aws_db_instance.postgres_rds.identifier
}
When I attempted to create the Blue-Green deployment, AWS returned this error:
Databases using RDS Proxy are not currently supported for Blue Green Deployments
This is a hard limitation. You cannot create a Blue-Green deployment for any RDS instance that has an RDS Proxy attached to it.
Lesson learned: Always verify service compatibility before architecting a solution. I assumed these two features would work together because both aim to reduce downtime. They don't.
Prerequisites for Blue-Green Deployments
Before you can create a Blue-Green deployment for PostgreSQL, your source database needs specific configuration:
1. Automated Backups Enabled
Blue-Green uses logical replication, which requires point-in-time recovery capability - in practice, automated backups with a retention period greater than zero.
resource "aws_db_instance" "postgres_rds" {
  backup_retention_period = 7 # Must be > 0
  backup_window           = "03:00-04:00"
}
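If you're not sure whether backups are already enabled, a quick check from the CLI (using the same instance identifier as the rest of this post) looks something like this:
# Confirm automated backups are on - BackupRetentionPeriod must be greater than 0
aws rds describe-db-instances \
  --db-instance-identifier dev-postgres-rds \
  --query 'DBInstances[0].BackupRetentionPeriod' \
  --region ap-southeast-2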
2. Logical Replication Enabled
The source database needs a parameter group with logical replication enabled:
resource "aws_db_parameter_group" "postgres13_blue_green" {
  family = "postgres13"
  name   = "postgres13-blue-green"

  parameter {
    name         = "rds.logical_replication"
    value        = "1"
    apply_method = "pending-reboot"
  }

  parameter {
    name         = "max_replication_slots"
    value        = "10"
    apply_method = "pending-reboot"
  }

  parameter {
    name         = "max_wal_senders"
    value        = "10"
    apply_method = "pending-reboot"
  }
}
Important: Enabling rds.logical_replication requires a database reboot. Plan for this before your upgrade window.
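For reference, here's a rough sketch of how you might attach that parameter group and trigger the reboot from the CLI, then confirm the setting took effect. The endpoint, user, and database names are placeholders:
# Attach the logical replication parameter group to the source (Blue) instance
aws rds modify-db-instance \
  --db-instance-identifier dev-postgres-rds \
  --db-parameter-group-name postgres13-blue-green \
  --apply-immediately \
  --region ap-southeast-2

# Reboot so the pending-reboot parameters take effect
aws rds reboot-db-instance \
  --db-instance-identifier dev-postgres-rds \
  --region ap-southeast-2

# After the reboot, wal_level should report "logical"
psql -h <blue-endpoint> -U <user> -d <database> -c "SHOW wal_level;"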
3. Valid Upgrade Path
Not all version combinations are valid. Check available upgrade paths:
aws rds describe-db-engine-versions \
--engine postgres \
--engine-version 13.20 \
--query 'DBEngineVersions[0].ValidUpgradeTarget[?MajorEngineVersion==`17`]' \
--region ap-southeast-2
I initially targeted 17.2 and got this error:
Cannot find upgrade path from 13.20 to 17.2
The valid target for me was 17.7.
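If you'd rather see every target AWS will accept instead of guessing at a minor version, you can filter on the major-upgrade flag rather than a hard-coded version - a variation of the same command:
# List every valid major-version upgrade target for 13.20
aws rds describe-db-engine-versions \
  --engine postgres \
  --engine-version 13.20 \
  --query 'DBEngineVersions[0].ValidUpgradeTarget[?IsMajorVersionUpgrade==`true`].EngineVersion' \
  --region ap-southeast-2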
The Upgrade Process
Here's the step-by-step process I followed:
Step 1: Detach RDS Proxy (if attached)
Since I had already deployed RDS Proxy, I had to detach it first:
# Remove the proxy target
aws rds deregister-db-proxy-targets \
--db-proxy-name staging-proxy \
--db-instance-identifiers dev-postgres-rds \
--region ap-southeast-2
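It's worth confirming the target is actually gone before moving on. A quick check against the same proxy:
# The Targets list should come back empty once deregistration completes
aws rds describe-db-proxy-targets \
  --db-proxy-name staging-proxy \
  --region ap-southeast-2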
Step 2: Create the Blue-Green Deployment
aws rds create-blue-green-deployment \
--blue-green-deployment-name "pg13-to-pg17-upgrade" \
--source "arn:aws:rds:ap-southeast-2:123456789:db:dev-postgres-rds" \
--target-engine-version "17.7" \
--target-db-parameter-group-name "default.postgres17" \
--region ap-southeast-2
This kicks off the following process:
- AWS creates a new RDS instance (Green) with PostgreSQL 17.7
- Takes a snapshot of your Blue database
- Restores it to the Green instance
- Sets up logical replication from Blue to Green
- Syncs all changes
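If you're curious about the plumbing, the logical replication slot AWS creates should be visible on the Blue instance through the standard PostgreSQL catalog. A quick look (connection details are placeholders):
# Run against the Blue (source) instance - the blue/green replication slot should appear here
psql -h <blue-endpoint> -U <user> -d <database> \
  -c "SELECT slot_name, plugin, slot_type, active FROM pg_replication_slots;"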
Step 3: Wait for Provisioning
The Green environment takes time to provision. Monitor the status:
aws rds describe-blue-green-deployments \
--blue-green-deployment-identifier "pg13-to-pg17-upgrade" \
--query 'BlueGreenDeployments[0].Status' \
--region ap-southeast-2
Status progression:
- PROVISIONING - Creating the Green environment
- AVAILABLE - Ready for switchover
For my 5GB database, this took approximately 35 minutes.
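Rather than re-running that command by hand, a small polling loop around the same describe call saves some tedium (60 seconds is an arbitrary interval):
# Poll until the deployment reports AVAILABLE
while true; do
  STATUS=$(aws rds describe-blue-green-deployments \
    --blue-green-deployment-identifier "pg13-to-pg17-upgrade" \
    --query 'BlueGreenDeployments[0].Status' \
    --output text \
    --region ap-southeast-2)
  echo "$(date -u +%H:%M:%S) status: $STATUS"
  [ "$STATUS" = "AVAILABLE" ] && break
  sleep 60
done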
Step 4: Verify Green Environment
Before switching, verify the Green instance:
# Check the Green instance details
aws rds describe-blue-green-deployments \
--blue-green-deployment-identifier "pg13-to-pg17-upgrade" \
--query 'BlueGreenDeployments[0].SwitchoverDetails' \
--region ap-southeast-2
Confirm:
- Engine version shows 17.7
- Replication lag is minimal
- Status is "AVAILABLE"
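Beyond the deployment status, I'd also connect to the Green instance directly and sanity-check the engine. The Green instance gets its own AWS-generated identifier and endpoint, so treat the lookup below as a sketch rather than exact commands:
# Find the Green instance ARN from the deployment's switchover details
aws rds describe-blue-green-deployments \
  --blue-green-deployment-identifier "pg13-to-pg17-upgrade" \
  --query 'BlueGreenDeployments[0].SwitchoverDetails[0].TargetMember' \
  --output text \
  --region ap-southeast-2

# Confirm the engine version and grab the endpoint of that instance
aws rds describe-db-instances \
  --db-instance-identifier <green-instance-identifier> \
  --query 'DBInstances[0].[EngineVersion,Endpoint.Address]' \
  --region ap-southeast-2

# Connect and check what PostgreSQL itself reports
psql -h <green-endpoint> -U <user> -d <database> -c "SELECT version();"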
Step 5: Execute Switchover
This is the critical moment. The switchover:
- Stops writes to the Blue database
- Waits for replication to catch up
- Promotes Green to primary
- Renames instances (Blue gets an -old1 suffix, Green takes the original name)
- Updates the endpoint
aws rds switchover-blue-green-deployment \
--blue-green-deployment-identifier "pg13-to-pg17-upgrade" \
--switchover-timeout 300 \
--region ap-southeast-2
My measured downtime: 20 seconds
The switchover started at 16:26:42 and completed at 16:27:02 (UTC).
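If you'd rather confirm completion programmatically than eyeball timestamps, the deployment status is the thing to watch - after a successful switchover it moves to SWITCHOVER_COMPLETED:
# Status should read SWITCHOVER_COMPLETED once the switchover finishes
aws rds describe-blue-green-deployments \
  --blue-green-deployment-identifier "pg13-to-pg17-upgrade" \
  --query 'BlueGreenDeployments[0].Status' \
  --output text \
  --region ap-southeast-2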
What Happens During Those 20 Seconds?
This is the part I promised to explain earlier - why I wanted RDS Proxy in the first place.
During the switchover window, your database is essentially unreachable. Any write operation attempted during this time will fail - not queue, not wait, just fail. Your application will receive errors like connection refused or connection reset.
This is "minimal downtime," not "zero downtime." The difference matters.
Why I Wanted RDS Proxy
RDS Proxy sits between your application and database, maintaining connection pools. The theory was:
- Proxy holds active connections during the switchover
- Buffers requests briefly while the endpoint changes
- Redirects connections to the new (Green) instance seamlessly
- Writes would wait instead of fail outright
This would have turned that 20-second failure window into a 20-second "slow response" window - much more graceful.
The Unfortunate Reality
AWS doesn't allow this. Blue-Green deployments and RDS Proxy are mutually exclusive. You must choose:
Blue-Green (no proxy)
- Downtime: ~20 seconds
- Write behavior: writes fail during switchover, retry logic required
RDS Proxy (no Blue-Green)
- Downtime: 10–30 minutes (in-place upgrade)
- Write behavior: connections handled gracefully during upgrade
I chose Blue-Green because 20 seconds of failed writes is better than 30 minutes of total unavailability. But your application needs to handle those transient failures - implement retry logic or return appropriate errors to users.
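To make that concrete, here's a minimal retry sketch. It's deliberately generic - DATABASE_URL and the health_check table are placeholders, not something from my setup - but it shows the shape of it: retry writes with backoff during the switchover window instead of surfacing the first failure to users.
# Hypothetical example: retry a write up to 5 times with exponential backoff
for attempt in 1 2 3 4 5; do
  if psql "$DATABASE_URL" -c "INSERT INTO health_check (checked_at) VALUES (now());"; then
    echo "write succeeded on attempt $attempt"
    break
  fi
  echo "write failed (attempt $attempt), retrying..."
  sleep $((2 ** attempt))
done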
Step 6: Cleanup
After verifying the upgrade:
# Delete the old instance
aws rds delete-db-instance \
--db-instance-identifier dev-postgres-rds-old1 \
--skip-final-snapshot \
--region ap-southeast-2
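If you'd rather keep a safety net than use --skip-final-snapshot, the same command can take a final snapshot instead (the snapshot name here is just an example):
# Alternative: keep a final snapshot of the old Blue instance before deleting it
aws rds delete-db-instance \
  --db-instance-identifier dev-postgres-rds-old1 \
  --final-db-snapshot-identifier dev-postgres-rds-pg13-final \
  --region ap-southeast-2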
# Delete the Blue-Green deployment
aws rds delete-blue-green-deployment \
--blue-green-deployment-identifier "pg13-to-pg17-upgrade" \
--region ap-southeast-2
What About Terraform State?
If you manage your RDS instance with Terraform, the Blue-Green switchover creates a state mismatch. The instance identifier remains the same, but AWS has essentially replaced the instance.
After switchover, run:
terraform plan
You'll likely see Terraform wanting to modify parameters to match your configuration. Review the plan carefully - most changes should be no-ops or minor parameter adjustments.
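One way to reconcile state without risking an accidental change is a refresh-only pass first, then inspecting what Terraform now thinks the instance looks like (assumes Terraform 1.x):
# Pull the post-switchover reality into state without changing any infrastructure
terraform apply -refresh-only

# Inspect the refreshed attributes, e.g. confirm engine_version is now 17.7
terraform state show aws_db_instance.postgres_rds | grep engine_version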
I also cleaned up my Terraform config post-upgrade:
- Removed the postgres13 parameter group (no longer needed)
- Updated engine_version to "17.7"
- Removed Blue-Green specific configuration
resource "aws_db_instance" "postgres_rds" {
  identifier        = "dev-postgres-rds"
  engine            = "postgres"
  engine_version    = "17.7"
  instance_class    = "db.t3.micro"
  allocated_storage = 5

  # Backups - good practice to keep enabled
  backup_retention_period = 7
  backup_window           = "03:00-04:00"

  # ... rest of config
}
Trade-offs and Limitations
What Blue-Green Deployments Handle Well
- Major version upgrades with minimal downtime
- Safe rollback option (before switchover)
- Automatic data synchronization
What They Don't Handle
- Databases with RDS Proxy attached
- Multi-AZ DB clusters (as of early 2024)
- Instances that contain more than 100 databases
- Cross-region scenarios
Hidden Costs
- You pay for two RDS instances during the Blue-Green provisioning period
- My upgrade window cost approximately 35 minutes of double billing
- For larger databases, this could be significant
Lessons Learned
- Verify service compatibility before architecting - RDS Proxy and Blue-Green Deployments don't work together. I wasted time setting up proxy infrastructure I had to tear down.
- Check valid upgrade paths early - Not every version combination is valid. aws rds describe-db-engine-versions is your friend.
- Logical replication requires a reboot - Enabling rds.logical_replication needs a database restart. Factor this into your planning.
- Minimal downtime is achievable - For a 5GB database, the actual switchover was remarkably fast. Your application should handle brief connection interruptions gracefully.
- Blue-Green works via CLI better than Terraform - The Terraform blue_green_update block exists, but using the AWS CLI gives you more control over timing and verification steps.
- Clean up immediately - The old instance keeps running and billing you. Delete it promptly after verification.
Key Takeaways
- AWS Blue-Green Deployments are the best option for PostgreSQL major version upgrades with minimal downtime
- RDS Proxy is incompatible with Blue-Green Deployments - choose one approach
- Expect ~30-45 minutes of provisioning time, but only seconds of actual downtime
- Always verify upgrade paths and prerequisites before starting
- Have a rollback plan, even though I didn't need mine
References I Followed
- AWS RDS Blue-Green Deployments Overview
- Switching Over a Blue-Green Deployment
- Blue-Green Deployment Limitations
- RDS Proxy Documentation
- Upgrading PostgreSQL DB Engine Versions
This post documents a real production upgrade performed in January 2026. Your mileage may vary based on database size, AWS region, and specific configuration.
If you're planning a similar upgrade, feel free to ask questions in the comments.