丁久

Posted on • Originally published at dingjiu1989-hue.github.io

Read Replicas: Scaling Reads, Replication Lag, and Failover

This article was originally published on AI Study Room. For the full version with working code examples and related articles, visit the original post.

Read replicas are copies of your primary database that serve read queries. They are a common and cost-effective way to scale read throughput, because reads can be spread across replicas while all writes still go to the primary. This article covers setup, load balancing, lag monitoring, and failover strategies.

How Read Replicas Work

The primary database streams changes to replicas via the write-ahead log (WAL). In PostgreSQL, this is called streaming replication. The replica continuously applies WAL records to stay current.
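To make the mechanism concrete, here is a toy model of log shipping in plain Python. It is only a sketch of the idea, not how PostgreSQL is implemented: the `Primary` and `Replica` classes, and the notion of counting lag in records, are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Primary:
    """Toy primary: every write is appended to an ordered log, then applied."""
    wal: list = field(default_factory=list)
    data: dict = field(default_factory=dict)

    def write(self, key, value):
        self.wal.append((key, value))  # log the change first
        self.data[key] = value         # then apply it locally


@dataclass
class Replica:
    """Toy replica: replays log records in order from its last applied position."""
    data: dict = field(default_factory=dict)
    replayed: int = 0  # number of log records applied so far

    def apply_from(self, primary, max_records=None):
        pending = primary.wal[self.replayed:]
        if max_records is not None:
            pending = pending[:max_records]
        for key, value in pending:
            self.data[key] = value
        self.replayed += len(pending)

    def lag_records(self, primary):
        return len(primary.wal) - self.replayed


primary, replica = Primary(), Replica()
primary.write("a", 1)
primary.write("b", 2)
replica.apply_from(primary, max_records=1)  # replica has caught up only partway
print(replica.data, replica.lag_records(primary))
replica.apply_from(primary)                 # replica catches up fully
print(replica.data == primary.data)
```

Because the replica applies records strictly in log order, it always reflects some earlier state of the primary, never a mix of old and new changes.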

PostgreSQL Streaming Replication Setup

On the primary:

# postgresql.conf
wal_level = replica
max_wal_senders = 5
wal_keep_size = 1024   # interpreted as megabytes when no unit is given

Create a replication user:

CREATE ROLE replicator WITH LOGIN REPLICATION PASSWORD 'secure_password';

Allow replication in pg_hba.conf:

host replication replicator replica-host/32 md5

On the replica:

pg_basebackup -h primary-host -D /var/lib/postgresql/data \
    -U replicator -P -v --wal-method=stream

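Create an empty standby.signal file in the replica's data directory, set primary_conninfo so the replica knows where to connect, and start PostgreSQL. Adding -R to the pg_basebackup command does the first two steps for you. Once started, the replica streams WAL continuously.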
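After the replica starts, it is worth confirming that it really is running as a standby. A minimal sketch, assuming `conn` is any DB-API connection to the server (for example one from `psycopg2.connect`); the helper name `is_standby` is made up for this example, while `pg_is_in_recovery()` is a built-in PostgreSQL function.

```python
def is_standby(conn):
    """Return True if the server behind `conn` is in recovery (standby) mode.

    `conn` is assumed to be a DB-API 2.0 connection to PostgreSQL.
    """
    cur = conn.cursor()
    try:
        cur.execute("SELECT pg_is_in_recovery()")
        (in_recovery,) = cur.fetchone()
        return bool(in_recovery)
    finally:
        cur.close()
```

A replica should return True and the primary False. If the new server reports False, it has most likely started as an independent primary, which usually means standby.signal was missing.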

MySQL Replica Setup

# my.cnf on primary
server_id = 1
log_bin = /var/log/mysql/mysql-bin.log
binlog_format = ROW

CREATE USER 'replicator'@'%' IDENTIFIED BY 'secure_password';
GRANT REPLICATION SLAVE ON *.* TO 'replicator'@'%';

On the replica:

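-- Take the log file name and position from SHOW MASTER STATUS on the primary.
-- Newer MySQL 8.0 releases prefer the equivalent CHANGE REPLICATION SOURCE TO,
-- START REPLICA, and SHOW REPLICA STATUS statements.
CHANGE MASTER TO
  MASTER_HOST='primary-host',
  MASTER_USER='replicator',
  MASTER_PASSWORD='secure_password',
  MASTER_LOG_FILE='mysql-bin.000001',
  MASTER_LOG_POS=0;

START SLAVE;

SHOW SLAVE STATUS\G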

Load Balancing Strategies

Application-level read/write splitting is the most common pattern:

import psycopg2


class DatabaseRouter:
    """Route reads to replicas (round-robin) and writes to the primary."""

    def __init__(self, primary_dsn, replica_dsns):
        self.primary = psycopg2.connect(primary_dsn)
        self.replicas = [psycopg2.connect(dsn) for dsn in replica_dsns]
        for conn in self.replicas:
            # Avoid leaving read-only connections idle inside a transaction.
            conn.autocommit = True
        self.round_robin = 0

    def get_connection(self, read_only=False):
        # Fall back to the primary when no replicas are configured.
        if read_only and self.replicas:
            conn = self.replicas[self.round_robin % len(self.replicas)]
            self.round_robin += 1
            return conn
        return self.primary

    def execute_read(self, query, params=None):
        conn = self.get_connection(read_only=True)
        with conn.cursor() as cur:
            cur.execute(query, params or ())
            return cur.fetchall()

    def execute_write(self, query, params=None):
        conn = self.get_connection(read_only=False)
        try:
            with conn.cursor() as cur:
                cur.execute(query, params or ())
            conn.commit()
        except Exception:
            conn.rollback()  # keep the connection usable after a failed statement
            raise

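A proxy layer can handle routing transparently. ProxySQL routes MySQL traffic by query pattern. HAProxy balances at the connection level, so the application typically connects to separate read and write endpoints. PgBouncer is a connection pooler and does not split reads from writes on its own.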

# ProxySQL query rules (illustrative layout; in ProxySQL itself the
# pattern column is match_digest or match_pattern)
mysql_query_rules:
  - rule_id: 1
    active: 1
    match: "^SELECT .*"
    destination_hostgroup: 1  # replicas
  - rule_id: 2
    active: 1
    match: "^(INSERT|UPDATE|DELETE|CREATE|DROP|ALTER) .*"
    destination_hostgroup: 0  # primary
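The same pattern-based routing can be sketched in application code. This is a rough illustration rather than a replacement for a proxy: `choose_hostgroup` and the two regular expressions are hypothetical, and the hostgroup numbers simply mirror the rules above. One deliberate difference is that locking reads such as SELECT ... FOR UPDATE are sent to the primary, which a bare `^SELECT` pattern would not do.

```python
import re

PRIMARY_HOSTGROUP = 0
REPLICA_HOSTGROUP = 1

# Plain SELECTs may go to replicas; locking reads must stay on the primary.
_SELECT = re.compile(r"^\s*SELECT\b", re.IGNORECASE)
_LOCKING = re.compile(r"\bFOR\s+(UPDATE|SHARE)\b", re.IGNORECASE)


def choose_hostgroup(sql):
    """Return the hostgroup a statement should be sent to."""
    if _SELECT.match(sql) and not _LOCKING.search(sql):
        return REPLICA_HOSTGROUP
    # Writes, DDL, and anything unrecognized default to the primary.
    return PRIMARY_HOSTGROUP


print(choose_hostgroup("SELECT * FROM users"))               # 1
print(choose_hostgroup("UPDATE users SET name = 'x'"))       # 0
print(choose_hostgroup("SELECT * FROM users FOR UPDATE"))    # 0
```

Defaulting to the primary for anything the patterns do not recognize is the safer choice: a misrouted read costs some primary capacity, while a misrouted write fails outright on a read-only replica.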

Replication Lag

Replication lag is the time between a commit on the primary and its visibility on a replica. Causes include:

  • Large transactions on the primary that must be applied in full on replicas.

  • Replica hardware that is slower than the primary.

  • Network latency between primary and replica.

  • Long-running queries on the replica competing for I/O.
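Whatever the cause, the visible symptom is usually a user who writes something and then cannot read it back from a replica. One common mitigation is to send a session's reads to the primary for a short window after it writes. The sketch below assumes a hypothetical `StickyReadPolicy` class and an arbitrary two-second window; the right window depends on the lag you actually observe.

```python
import time


class StickyReadPolicy:
    """Send a session's reads to the primary for a short window after it writes."""

    def __init__(self, pin_seconds=2.0, clock=time.monotonic):
        self.pin_seconds = pin_seconds  # assumed upper bound on typical lag
        self.clock = clock              # injectable so the policy can be tested
        self._last_write = {}           # session id -> time of last write

    def record_write(self, session_id):
        self._last_write[session_id] = self.clock()

    def use_primary_for_read(self, session_id):
        last = self._last_write.get(session_id)
        return last is not None and self.clock() - last < self.pin_seconds


policy = StickyReadPolicy(pin_seconds=2.0)
policy.record_write("session-1")
print(policy.use_primary_for_read("session-1"))  # True: just wrote
print(policy.use_primary_for_read("session-2"))  # False: no recent write
```

A time window is a heuristic, not a guarantee: if lag exceeds the window, stale reads are still possible.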

Monitoring Lag

PostgreSQL exposes WAL positions that let you measure lag in bytes. This query reports how much WAL the replica has received but not yet replayed:

-- On the replica
SELECT pg_last_wal_receive_lsn(),
       pg_last_wal_replay_lsn(),
       pg_wal_lsn_diff(pg_last_wal_receive_lsn(),
                       pg_last_wal_replay_lsn()) AS replay_lag_bytes;
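If your monitoring collects the raw LSN values instead of the computed difference, the same arithmetic is easy to reproduce. An LSN is printed as two hexadecimal numbers separated by a slash, the high and low 32 bits of a byte position in the WAL. The helper names below are invented for this sketch.

```python
def lsn_to_int(lsn):
    """Convert a PostgreSQL LSN such as '16/B374D848' to a byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def replay_lag_bytes(receive_lsn, replay_lsn):
    """Bytes of WAL received by the replica but not yet replayed."""
    return lsn_to_int(receive_lsn) - lsn_to_int(replay_lsn)


print(replay_lag_bytes("0/3000060", "0/3000000"))  # 96
```

Note that this measures a backlog in bytes, not a delay in seconds, and that the LSN functions return NULL on a server that is not a standby, so real code should handle missing values.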

In MySQL, run the following on the replica and check the Seconds_Behind_Master field:

SHOW SLAVE STATUS\G
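A monitoring job usually reduces that output to a single health verdict. Here is a minimal sketch, assuming the status row has already been fetched as a dict keyed by column name. The function name `replica_health` and the 30-second threshold are assumptions for illustration; `Slave_IO_Running`, `Slave_SQL_Running`, and `Seconds_Behind_Master` are the column names MySQL reports.

```python
def replica_health(status, max_lag_seconds=30):
    """Classify one row of SHOW SLAVE STATUS, passed as a dict of column -> value."""
    io_ok = status.get("Slave_IO_Running") == "Yes"
    sql_ok = status.get("Slave_SQL_Running") == "Yes"
    if not (io_ok and sql_ok):
        return "broken"   # a replication thread is stopped or still connecting
    lag = status.get("Seconds_Behind_Master")
    if lag is None:
        return "unknown"  # MySQL reports NULL when lag cannot be computed
    return "lagging" if int(lag) > max_lag_seconds else "ok"


print(replica_health({
    "Slave_IO_Running": "Yes",
    "Slave_SQL_Running": "Yes",
    "Seconds_Behind_Master": 4,
}))  # ok
```

Checking the two thread states first matters because a replica whose replication has stopped can otherwise look healthy while falling further and further behind.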
