Learn how we achieved zero-downtime PostgreSQL 13 to 15 migration, replicating 60 million rows in 40 minutes and now handling 220,000 rows/second with sub-500ms latency using Helyx.
The 2-Hour Maintenance Window That Changed Everything
"We have a hard deadline. The analytics platform migration from PostgreSQL 13 to 15 needs to happen next week. The maintenance window? Two hours. Total."
My heart sank. We were looking at:
6 Crore (60 million) rows of time-series data
2 TB of total database size
Zero tolerance for data loss
Minimal performance impact on live services
Traditional pg_dump/pg_restore would take 6-8 hours. Logical replication with native PostgreSQL tools would require hours of configuration and still risk falling behind. We needed a different approach.
Why This Migration Was Particularly Tricky
The Technical Challenges:
Version Jump: Moving from PostgreSQL 13 โ 15 introduces new data types and behavior changes
Live Data: The database was receiving 50,000+ writes per second from production applications
Complex Schema: Multiple hypertables, complex indexes, and JSONB columns
Business Critical: This analytics platform drove real-time dashboards for enterprise clients
Why Native Tools Fell Short:
pg_dump: ~4 hours
network transfer: ~1 hour
pg_restore: ~3 hours
total: ~8 hours + replication catch-up time
We had 2 hours. Not 8.
The Helyx Solution: A Two-Phase Approach
Phase 1: The Initial Bulk Migration
We decided to use Helyx for both the initial migration and ongoing replication. The setup was surprisingly simple.
The moment of truth: We triggered the initial sync at 2:00 PM during our maintenance window.
2:10 PM: 1 Crore rows replicated
2:25 PM: 3 Crore rows replicated
2:40 PM: 6 Crore rows replicated - DONE
Yes, you read that correctly. 6 Crore rows migrated in 40 minutes flat.
Phase 2: Real-Time Replication Activation
Once the initial sync completed, we seamlessly switched to Change Data Capture (CDC) mode.
The Performance Numbers That Made Us Smile
Migration Performance
Metric Result
Total Rows Migrated 6,00,00,000 (60 Million)
Migration Time 40 minutes
Effective Throughput ~2.5 Million rows/minute
Peak Memory Usage 4.2 GB
CPU Utilization 65%
Ongoing Replication Performance:
Metric Result
Current Throughput 220,000 rows/second
Replication Latency < 500 ms (p95)
CPU Overhead on Source < 3%
Network Bandwidth ~180 MBps
Key technical insights:
Parallel Workers: Helyx used 8 parallel workers to maximize I/O throughput
Smart Batching: 50,000-row batches optimized network transfer efficiency
Memory Mapping: Efficient memory usage prevented OOM errors
Continuous Validation: Real-time checksum validation ensured data integrity
Take the Next Step
If you're facing an upcoming database migration or building a high-volume replication system, I'd recommend:
๐ Try Helyx with your own dataset - Start Free Trial
๐ Read our technical deep dive - PostgreSQL CDC Architecture Guide
๐ Schedule a custom PoC - Book a Demo (https://helyx.quobotic.com/)
What's your experience with large database migrations? Have you faced similar challenges or found other creative solutions? Share your stories in the comments below - let's learn from each other!
Top comments (0)