DEV Community

Cover image for PostgreSQL logical replication — 5 steps to set up real-time data sync across servers
Finny Collins
Finny Collins

Posted on

PostgreSQL logical replication — 5 steps to set up real-time data sync across servers

PostgreSQL has had logical replication built in since version 10, and it remains one of the most practical ways to keep data in sync between two or more servers. Unlike physical replication, which copies the entire database cluster byte-for-byte, logical replication works at the row level. You pick which tables to replicate, and PostgreSQL streams the changes in real time.

This article walks through the full setup in five steps. By the end you'll have a working publisher-subscriber pair and know how to monitor it.

PostgreSQL logical replication

What is logical replication and when do you need it

Logical replication uses a publish-subscribe model. The source database (publisher) defines a publication — a set of tables whose changes should be broadcast. The target database (subscriber) creates a subscription that connects to the publisher, pulls the initial data snapshot and then receives a continuous stream of INSERT, UPDATE and DELETE operations.

The key difference from physical (streaming) replication is granularity. Physical replication mirrors the entire cluster, operates at the WAL byte level and requires identical PostgreSQL major versions on both sides. Logical replication works per-table, decodes WAL into logical change events and allows different major versions or even different schemas on the subscriber.

This makes logical replication useful in a range of scenarios that physical replication simply cannot handle.

Logical replication Physical replication
Granularity Per-table Entire cluster
Cross-version support Yes No (same major version required)
Subscriber writes Allowed Read-only standby
Schema differences Allowed (column subset) Must be identical
DDL replication No Yes (WAL-level)
Typical use case Selective sync, migrations, consolidation High availability, failover

Use logical replication when you need to replicate a subset of tables, consolidate data from multiple sources into one reporting database, migrate between PostgreSQL major versions with minimal downtime or feed changes into a data warehouse alongside normal writes.

Prerequisites

Before you start, make sure you have the following in place:

  • Two PostgreSQL instances running version 10 or later (publisher and subscriber). They can be on different major versions
  • Network connectivity between the two servers on the PostgreSQL port (default 5432)
  • A superuser or a user with the REPLICATION role on the publisher
  • The tables you want to replicate must have a primary key or a REPLICA IDENTITY set. Without one, UPDATE and DELETE operations will fail on the subscriber

If both instances are on the same machine for testing, just use different ports. The rest of the setup is identical.

Step 1 — configure the publisher database

Logical replication requires the WAL level to be set to logical. By default PostgreSQL uses replica, so you need to change this in postgresql.conf on the publisher.

Open the config file and set:

wal_level = logical
Enter fullscreen mode Exit fullscreen mode

You also need at least one replication slot available. Check that max_replication_slots is high enough. The default of 10 is fine for most setups, but if you already use slots for physical replication or other subscribers, increase it:

max_replication_slots = 10
max_wal_senders = 10
Enter fullscreen mode Exit fullscreen mode

Next, update pg_hba.conf to allow the subscriber to connect with replication privileges. Add a line like this:

host    all    replication_user    192.168.1.0/24    md5
Enter fullscreen mode Exit fullscreen mode

Replace the IP range with your subscriber's actual address. After editing both files, restart PostgreSQL:

sudo systemctl restart postgresql
Enter fullscreen mode Exit fullscreen mode

You can verify the WAL level is active by running:

SHOW wal_level;
Enter fullscreen mode Exit fullscreen mode

It should return logical. If it still shows replica, the restart didn't pick up the config change. Double-check the file path and try again.

Step 2 — create a publication

On the publisher database, create a publication for the tables you want to replicate. Connect to the database as a superuser or the replication user and run:

CREATE PUBLICATION my_publication FOR TABLE users, orders;
Enter fullscreen mode Exit fullscreen mode

This publishes changes from the users and orders tables. If you want to publish all tables in the database, use:

CREATE PUBLICATION my_publication FOR ALL TABLES;
Enter fullscreen mode Exit fullscreen mode

You can also control which operations are published. By default all of them (INSERT, UPDATE, DELETE, TRUNCATE) are included. To limit it:

CREATE PUBLICATION my_publication FOR TABLE users
    WITH (publish = 'insert,update');
Enter fullscreen mode Exit fullscreen mode

To check what publications exist:

SELECT * FROM pg_publication;
Enter fullscreen mode Exit fullscreen mode

And to see which tables are in a publication:

SELECT * FROM pg_publication_tables;
Enter fullscreen mode Exit fullscreen mode

Publications are lightweight metadata objects. Creating one doesn't start any replication by itself. That happens when a subscriber connects.

Step 3 — prepare the subscriber database

The subscriber needs to have the same table structure as the publisher. Logical replication does not copy DDL, so you must create the tables manually before starting.

The easiest way is to dump the schema from the publisher and restore it on the subscriber:

pg_dump -h publisher_host -U postgres -s -t users -t orders mydb > schema.sql
psql -h subscriber_host -U postgres -d mydb -f schema.sql
Enter fullscreen mode Exit fullscreen mode

The -s flag dumps schema only, no data. The initial data will come through the replication snapshot when the subscription is created.

Make sure the target tables are empty. If they already contain data, you'll get duplicate key errors during the initial sync. Either truncate them or create the subscription with copy_data = false (more on that in the next step).

The subscriber user needs permissions to write to these tables. If you're using the same superuser, that's already covered.

Step 4 — create a subscription

On the subscriber database, create a subscription that points to the publisher:

CREATE SUBSCRIPTION my_subscription
    CONNECTION 'host=publisher_host port=5432 dbname=mydb user=replication_user password=secret'
    PUBLICATION my_publication;
Enter fullscreen mode Exit fullscreen mode

As soon as you run this, PostgreSQL will:

  1. Connect to the publisher
  2. Create a replication slot on the publisher
  3. Copy the initial table data (snapshot)
  4. Start streaming live changes

If the tables on the subscriber already have data and you only want to start streaming from now, skip the initial copy:

CREATE SUBSCRIPTION my_subscription
    CONNECTION 'host=publisher_host port=5432 dbname=mydb user=replication_user password=secret'
    PUBLICATION my_publication
    WITH (copy_data = false);
Enter fullscreen mode Exit fullscreen mode

To check the subscription status:

SELECT * FROM pg_stat_subscription;
Enter fullscreen mode Exit fullscreen mode

You should see a row with a non-null pid, which means the subscription worker is running and connected.

Step 5 — verify and monitor replication

After creating the subscription, verify that data is actually flowing. Insert a row on the publisher and check if it appears on the subscriber:

-- On publisher
INSERT INTO users (name, email) VALUES ('test_user', 'test@example.com');

-- On subscriber
SELECT * FROM users WHERE name = 'test_user';
Enter fullscreen mode Exit fullscreen mode

If the row shows up, replication is working. For ongoing monitoring, these views are your main tools.

On the publisher, check replication slot status and lag:

SELECT slot_name, active, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots;
Enter fullscreen mode Exit fullscreen mode
SELECT client_addr, state, sent_lsn, write_lsn, flush_lsn, replay_lsn
FROM pg_stat_replication;
Enter fullscreen mode Exit fullscreen mode

On the subscriber, check subscription state and table sync progress:

SELECT subname, pid, received_lsn, latest_end_lsn, latest_end_time
FROM pg_stat_subscription;
Enter fullscreen mode Exit fullscreen mode
SELECT srsubid, srrelid, srsublsn
FROM pg_subscription_rel;
Enter fullscreen mode Exit fullscreen mode

If pg_stat_replication shows a growing gap between sent_lsn and replay_lsn, the subscriber is falling behind. This usually means the subscriber is under heavy load or the network is slow. Check the subscriber's PostgreSQL logs for errors.

Common issues and how to fix them

Most problems with logical replication happen during setup or when the publisher schema changes. Here are the ones you'll run into most often.

Problem Cause Solution
ERROR: logical decoding requires wal_level >= logical wal_level not set or restart not done Set wal_level = logical in postgresql.conf and restart PostgreSQL
Initial sync stuck or very slow Large tables being copied Monitor pg_stat_subscription. Consider increasing max_sync_workers_per_subscription
ERROR: could not create replication slot All slots in use Increase max_replication_slots and restart
UPDATE or DELETE fails on subscriber Table has no primary key Add a primary key or set REPLICA IDENTITY FULL on the publisher table
Subscriber stops receiving changes Network issue or publisher restart Check pg_stat_subscription for errors. The subscription auto-reconnects in most cases
Duplicate key errors during initial sync Target table already has data Truncate the table or use copy_data = false when creating subscription
Schema mismatch after ALTER TABLE DDL changes are not replicated Apply the same DDL on the subscriber manually before the change takes effect

When something breaks, the subscriber logs are the first place to look. PostgreSQL is usually specific about what went wrong.

Logical replication limitations

Logical replication covers a lot, but it has clear boundaries worth knowing before you commit to it:

  • DDL changes (CREATE TABLE, ALTER TABLE, DROP) are not replicated. You need to apply schema changes on both sides manually or use a migration tool
  • Sequences are not synced. If you fail over to the subscriber, sequence values will be out of date. You'll need to reset them manually
  • Large objects (the lo type) are not supported
  • TRUNCATE replication was added in PostgreSQL 11. Earlier versions don't replicate it
  • There's no built-in conflict resolution. If the same row is modified on both publisher and subscriber, you get an error and replication stops until you fix it

These limitations are manageable for most use cases. But if you need full cluster replication with automatic failover, physical replication or a tool like Patroni is a better fit.

Keeping replicated data safe

Replication is not a backup. If someone runs a bad DELETE on the publisher, that delete gets replicated too. You need actual backups alongside replication. Databasus is the most widely used tool for PostgreSQL backup and the industry standard for managing scheduled backups. It supports logical, physical and incremental backups with point-in-time recovery, handles multiple storage destinations and takes a few minutes to set up through its web UI.

Setting up logical replication takes about 15 minutes once you've done it a couple of times. The five steps above cover the core flow. From there, you can add more tables to the publication, create additional subscribers or combine logical replication with physical standby servers for both selective sync and high availability.

Top comments (0)