Day 15: Replication & High Availability – The Buddy System for PostgreSQL

#tutorial #database #career #beginners

We made it! After two weeks of digging into data types, transactions, security and backups, it's time for the grand finale: replication and high availability. Imagine your database as a solo adventurer. What happens if they sprain an ankle? With replication, you give them a trusty sidekick (or three) so the journey continues.

High availability (HA) keeps your database operational and accessible during system failures【455222259006887†L41-L69】. It combines replication, failover, load balancing and monitoring so your data is always ready【455222259006887†L52-L74】. Let's explore the options in an easy, human way.

🤝 Physical Replication – Cloning the Database

Physical replication copies every byte from the primary server to standby servers via the write‑ahead log (WAL). This log‑shipping (also called warm standby) configuration has the primary running in continuous archiving mode and standbys in continuous recovery mode【167357874626432†L46-L56】. WAL files are shipped after each transaction commit, so there’s a small window of potential data loss【167357874626432†L69-L76】.

Setting Up Streaming Replication

Here’s a simplified workflow to get you started:

On the primary server:

-- In postgresql.conf
wal_level = replica
max_wal_senders = 10
archive_mode = on
archive_command = 'cp %p /var/backups/wal/%f'
hot_standby = on

-- Create a replication user
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'securepass';

On the standby server:

# Take a base backup
pg_basebackup -h primary_host -D /var/lib/postgresql/15/main -U replicator -Fp -Xs -P

# Create standby.signal to enter standby mode
touch /var/lib/postgresql/15/main/standby.signal

# Edit postgresql.conf
primary_conninfo = 'host=primary_host user=replicator password=securepass'

Restart both servers. The standby will start streaming WAL changes from the primary and can serve read‑only queries (hot standby)【167357874626432†L83-L85】. For synchronous replication, set synchronous_standby_names on the primary to ensure commits wait until a standby acknowledges them, eliminating data loss at the cost of latency【455222259006887†L96-L104】.

🧠 Logical Replication – Selective Data Sharing

Logical replication publishes changes at the table level rather than copying the entire data directory. It uses a publisher/subscriber model to replicate chosen tables, ideal for multi‑region deployments and zero‑downtime upgrades【455222259006887†L109-L117】.

-- On the publisher (primary)
CREATE PUBLICATION mypub FOR TABLE customers, orders;

-- On the subscriber
CREATE SUBSCRIPTION mysub
    CONNECTION 'dbname=primary_db host=primary_host user=replicator password=securepass'
    PUBLICATION mypub;

Logical replication is asynchronous but offers fine‑grained control over data and schema. Use it when you need different tables on different nodes or want to migrate gradually.

⚖️ Failover & Load Balancing – Plan B

Replication alone isn’t enough. You need tools to handle failures and distribute traffic:

Failover managers like repmgr or patroni automate promoting a standby to primary when the current primary fails. They monitor node health and update application connection strings.
Load balancers like pgpool‑II or pgbouncer route read queries to standbys and manage connection pooling, improving throughput【455222259006887†L65-L66】.
Monitoring is essential to detect replication lag, disk space issues and network problems【455222259006887†L68-L71】.

Choose synchronous vs asynchronous replication based on your tolerance for data loss vs performance【167357874626432†L69-L76】. Synchronous replication eliminates data loss but slows down commits; asynchronous offers speed but may lose the last few transactions.

🚀 Quick Start Checklist

Component	Purpose	Example Tool
Streaming Replication	Keep a byte‑for‑byte copy of your database	Built‑in WAL shipping & `pg_basebackup`
Failover Management	Detect failures and promote standbys	`repmgr`, `patroni`
Load Balancing & Connection Pooling	Spread read traffic & manage connections	`pgpool‑II`, `pgbouncer`
Logical Replication	Selectively sync tables for upgrades or multi‑region	Built‑in `CREATE PUBLICATION`/`CREATE SUBSCRIPTION`

🎯 Your Final Challenge

Set up a simple streaming replication pair on two Postgres instances.
Experiment with synchronous vs asynchronous replication and watch how commits behave.
Use logical replication to replicate a single table to another database for reporting.

Thank you for staying with me through this journey! I’m sorry the series wasn’t always punctual — life happens and posts got delayed. But we made it together, and I’m grateful for your patience and enthusiasm. This wraps up our 15‑day PostgreSQL series. Keep experimenting, keep asking questions, and keep having fun with your databases. 🧡