William Kwabena Akoto

Horizontal Scaling PostgreSQL with Citus: A Practical Deep Dive

Introduction

As databases grow beyond the capabilities of a single server, teams face a critical decision: scale vertically by adding more resources to one machine, or scale horizontally by distributing data across multiple servers. While vertical scaling eventually hits physical and economic limits, horizontal scaling offers virtually unlimited growth potential.

In this hands-on guide, we'll explore Citus—an open-source extension that transforms PostgreSQL into a distributed database. We'll build a real Citus cluster from scratch, demonstrate data distribution, implement foreign key relationships across distributed tables, and understand the principles that make horizontal scaling work.

By the end, you'll have practical experience with distributed databases and understand both their power and their limitations.


What is Horizontal Scaling?

Imagine a pizza restaurant with one chef making all the pizzas. During rush hour, orders pile up because one person can only work so fast. This is vertical scaling—you could train the chef to work faster (add more CPU), give them better tools (add more RAM), or expand their workspace (add more storage). But eventually, you hit limits.

Now imagine hiring three chefs, each handling different orders. This is horizontal scaling—you distribute the workload across multiple workers. The more customers you have, the more chefs you can add.

In database terms:

  • Vertical scaling: Bigger server (more CPU, RAM, storage)
  • Horizontal scaling: More servers working together

Citus brings horizontal scaling to PostgreSQL by:

  1. Splitting your data across multiple worker nodes (sharding)
  2. Keeping related data together on the same node (co-location)
  3. Maintaining PostgreSQL's ACID guarantees and SQL compatibility

Architecture: Understanding Citus Components

A Citus cluster is made up of three kinds of components:

1. Coordinator Node

The coordinator is like a project manager—it doesn't do the heavy lifting but knows where everything is. When you connect to a Citus cluster, you connect to the coordinator. It:

  • Routes queries to the appropriate workers
  • Combines results from multiple workers
  • Manages distributed transactions
  • Stores metadata about data distribution

2. Worker Nodes

Workers are the muscle of your cluster—they store the actual data and execute queries. Each worker:

  • Holds a subset of your data (shards)
  • Executes queries locally on its data
  • Communicates with other workers when needed
  • Functions as a full PostgreSQL instance

3. Shards

Shards are pieces of your distributed tables. Think of them as mini-tables that together form your complete dataset. When you distribute a table:

  • Citus creates 32 shards by default (configurable)
  • Shards are distributed across workers
  • Each row goes to exactly one shard, based on a hash of its distribution column value
  • Related data can be co-located on the same worker

Setting Up a Citus Cluster on DigitalOcean

For this demonstration, we'll create a three-node cluster:

  • citus-01: Coordinator node
  • citus-02: Worker 1
  • citus-03: Worker 2

Step 1: Install PostgreSQL and Citus Extension

On all three droplets:

# Add PostgreSQL repository
sudo apt-get update
sudo apt-get install -y postgresql-common
sudo /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh

# Install PostgreSQL 16
sudo apt-get install -y postgresql-16

# Install Citus extension
curl https://install.citusdata.com/community/deb.sh | sudo bash
sudo apt-get install -y postgresql-16-citus-12.1

# Configure PostgreSQL to load Citus
echo "shared_preload_libraries = 'citus'" | sudo tee -a /etc/postgresql/16/main/postgresql.conf

# Restart PostgreSQL
sudo systemctl restart postgresql
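Once PostgreSQL is back up, it is worth confirming that the Citus library is actually being preloaded before moving on. A quick check from psql:

-- Run in psql on each node; the output should include 'citus'
SHOW shared_preload_libraries;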

Step 2: Configure Network Access

Edit /etc/postgresql/16/main/postgresql.conf on all nodes:

listen_addresses = '*'

Edit /etc/postgresql/16/main/pg_hba.conf on all nodes to allow connections (0.0.0.0/0 keeps the demo simple; in production, restrict this to the cluster's private subnet):

host    all             all             0.0.0.0/0               md5

Restart PostgreSQL:

sudo systemctl restart postgresql

Step 3: Initialize the Coordinator

On the coordinator node (citus-01):

-- Connect as postgres user
sudo -u postgres psql

-- Create the Citus extension
CREATE EXTENSION citus;

-- Add worker nodes to the cluster
SELECT citus_add_node('10.114.0.11', 5432);
SELECT citus_add_node('10.114.0.12', 5432);

-- Verify worker registration
SELECT * FROM citus_get_active_worker_nodes();
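Depending on your Citus version, you may also want to register the coordinator's own address in the cluster metadata so that workers can reach it; a minimal sketch (the coordinator's private IP below is an assumption, since it isn't listed in this walkthrough):

-- Optional: record how the cluster should reach the coordinator itself
-- (10.114.0.10 is an assumed private IP for citus-01)
SELECT citus_set_coordinator_host('10.114.0.10', 5432);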

You should see both worker nodes listed:

  node_name  | node_port 
-------------+-----------
 10.114.0.11 |      5432
 10.114.0.12 |      5432

Step 4: Initialize Workers

On each worker node (citus-02 and citus-03):

sudo -u postgres psql
CREATE EXTENSION citus;

Your Citus cluster is now operational!
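If you want to confirm that every node is running the same build, Citus ships a helper for that:

-- Reports the Citus release and the PostgreSQL version it was compiled against
SELECT citus_version();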


Demonstration 1: Basic Data Distribution

Let's create a simple distributed table and observe how data spreads across workers.

Creating a Distributed Table

-- On the coordinator
CREATE TABLE citus_demo (
    id SERIAL,
    user_name TEXT,
    city TEXT,
    age INT,
    created_at TIMESTAMP DEFAULT NOW()
);

-- Distribute the table by 'id' column
SELECT create_distributed_table('citus_demo', 'id');

The create_distributed_table function tells Citus to:

  1. Create 32 shards (mini-tables) by default
  2. Distribute these shards across available workers
  3. Route future queries based on the distribution column (id)
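You can confirm the result by querying the citus_tables view, which lists every distributed table along with its type and distribution column (exact columns vary slightly between Citus versions):

-- List distributed tables and their distribution columns
SELECT table_name, citus_table_type, distribution_column
FROM citus_tables;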

Inserting Data

INSERT INTO citus_demo (user_name, city, age) VALUES
    ('Alice', 'New York', 25),
    ('Bob', 'Chicago', 30),
    ('Charlie', 'Seattle', 27),
    ('David', 'Boston', 35),
    ('Emma', 'Denver', 22),
    ('Frank', 'Portland', 29),
    ('Grace', 'Miami', 28),
    ('Henry', 'Atlanta', 31),
    ('Iris', 'Phoenix', 26),
    ('Jack', 'Dallas', 33);

Examining Distribution

Check how shards are distributed:

SELECT * FROM citus_shards WHERE table_name::text = 'citus_demo';

Result (abbreviated):

 table_name | shardid |    shard_name     | nodename    | nodeport | shard_size 
------------+---------+-------------------+-------------+----------+------------
 citus_demo |  102488 | citus_demo_102488 | 10.114.0.11 |     5432 |      16384
 citus_demo |  102489 | citus_demo_102489 | 10.114.0.12 |     5432 |      16384
 citus_demo |  102490 | citus_demo_102490 | 10.114.0.11 |     5432 |       8192
 ... (32 shards total)

We see 32 shards split between two workers. Some shards have data (16 KB), others are empty (8 KB).

Understanding Shard Assignment

See which shard each record belongs to:

SELECT id, user_name, city,
       get_shard_id_for_distribution_column('citus_demo', id) as shard_id
FROM citus_demo
ORDER BY id;

Result:

 id | user_name |   city   | shard_id 
----+-----------+----------+----------
  1 | Alice     | New York |   102489
  2 | Bob       | Chicago  |   102512
  3 | Charlie   | Seattle  |   102503
  4 | David     | Boston   |   102496
  5 | Emma      | Denver   |   102494
  6 | Frank     | Portland |   102508
  7 | Grace     | Miami    |   102496
  8 | Henry     | Atlanta  |   102488
  9 | Iris      | Phoenix  |   102516
 10 | Jack      | Dallas   |   102492

Notice that David (id=4) and Grace (id=7) share shard 102496—this happens when their IDs hash to the same shard. With only 10 records across 32 shards, collisions and empty shards are normal.
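Under the hood, Citus hashes the distribution column value and assigns each shard a contiguous range of hash values. You can inspect those ranges in the pg_dist_shard metadata table (output will differ per cluster):

-- Each shard of a hash-distributed table owns a range of 32-bit hash values
SELECT shardid, shardminvalue, shardmaxvalue
FROM pg_dist_shard
WHERE logicalrelid = 'citus_demo'::regclass
ORDER BY shardminvalue::int
LIMIT 5;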

Viewing Data on Individual Workers

Connect directly to Worker 1 (10.114.0.11):

-- Query a specific shard
SELECT * FROM citus_demo_102488;

Result:

 id | user_name |  city   | age |         created_at         
----+-----------+---------+-----+----------------------------
  8 | Henry     | Atlanta |  31 | 2026-01-23 22:42:50.523619

Worker 1 only stores the data in its shards. Query other shards on this worker:

SELECT * FROM citus_demo_102512;  -- Bob
SELECT * FROM citus_demo_102496;  -- David and Grace

Connect to Worker 2 (10.114.0.12) and query:

SELECT * FROM citus_demo_102489;  -- Alice
SELECT * FROM citus_demo_102503;  -- Charlie

Key insight: Each worker only stores a subset of the data. When you query the main table through the coordinator, Citus automatically fetches data from all workers and combines the results.
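You can watch this fan-out happen by asking the coordinator for a query plan; it should contain a Citus custom scan with one task per shard (the exact plan text varies by version):

-- Expect a "Custom Scan (Citus Adaptive)" node with a task list covering both workers
EXPLAIN SELECT count(*) FROM citus_demo;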


Demonstration 2: Foreign Keys and Co-location

One of the biggest challenges in distributed databases is maintaining relationships between tables. Let's explore how Citus handles foreign keys through co-location.

The Challenge

Imagine you have users and their orders:

  • Users table: Alice, Bob, Charlie
  • Orders table: Order 1 (Alice's pizza), Order 2 (Bob's burger), Order 3 (Alice's soda)

In a distributed setup, what if:

  • Alice's user record is on Worker 1
  • Alice's orders are on Worker 2

When inserting an order for Alice, Worker 2 would have to verify that Alice exists, but Alice's record lives on Worker 1. That would mean expensive cross-node communication on every foreign key check, which is why Citus disallows foreign keys between distributed tables unless they are co-located on the distribution column.

The Solution: Co-location

Co-location ensures related data lives on the same worker. Citus achieves this by distributing both tables using the same column.

Creating Co-located Tables

-- Create users table
CREATE TABLE users (
    user_id INT PRIMARY KEY, 
    name TEXT
);

-- Create orders table with composite primary key
CREATE TABLE orders (
    order_id SERIAL,
    user_id INT,
    item TEXT,
    created_at TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (order_id, user_id)  -- Must include distribution column
);

-- Distribute both tables by user_id
SELECT create_distributed_table('users', 'user_id');
SELECT create_distributed_table('orders', 'user_id');

-- Add foreign key constraint
ALTER TABLE orders 
ADD CONSTRAINT fk_user 
FOREIGN KEY (user_id) REFERENCES users(user_id);

Important: The orders primary key includes user_id because Citus requires the distribution column in unique constraints. This ensures uniqueness can be verified locally on each worker.
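To see that requirement enforced, try a variant whose primary key omits the distribution column; Citus refuses to distribute it, because uniqueness could not be checked locally within a single shard (the exact error text depends on the Citus version):

-- Expected to fail at create_distributed_table: the primary key
-- does not include the distribution column (user_id)
CREATE TABLE orders_bad (
    order_id SERIAL PRIMARY KEY,
    user_id INT,
    item TEXT
);
SELECT create_distributed_table('orders_bad', 'user_id');

-- Clean up the local (never distributed) table
DROP TABLE orders_bad;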

Inserting Related Data

INSERT INTO users (user_id, name) VALUES
    (1, 'Alice'),
    (2, 'Bob'),
    (3, 'Charlie'),
    (4, 'David'),
    (5, 'Emma');

INSERT INTO orders (user_id, item) VALUES
    (1, 'Pizza'),
    (1, 'Soda'),
    (1, 'Ice Cream'),
    (2, 'Burger'),
    (2, 'Fries'),
    (3, 'Taco'),
    (3, 'Burrito'),
    (3, 'Nachos'),
    (4, 'Pasta'),
    (5, 'Salad');

Verifying Co-location

Check which shards contain user and order data:

SELECT 
    u.user_id,
    u.name,
    o.item,
    get_shard_id_for_distribution_column('users', u.user_id) as user_shard,
    get_shard_id_for_distribution_column('orders', o.user_id) as order_shard
FROM users u
JOIN orders o ON u.user_id = o.user_id
ORDER BY u.user_id, o.order_id;

Result:

 user_id |  name   |   item    | user_shard | order_shard 
---------+---------+-----------+------------+-------------
       1 | Alice   | Pizza     |     102521 |      102553
       1 | Alice   | Soda      |     102521 |      102553
       1 | Alice   | Ice Cream |     102521 |      102553
       2 | Bob     | Burger    |     102544 |      102576
       2 | Bob     | Fries     |     102544 |      102576
       3 | Charlie | Taco      |     102535 |      102567
       3 | Charlie | Burrito   |     102535 |      102567
       3 | Charlie | Nachos    |     102535 |      102567

Wait, the shard IDs are different! This might seem wrong, but it's actually correct.

Understanding Different Shard IDs

The users table has its own set of shards (102521, 102544, 102535...), and the orders table has different shard IDs (102553, 102576, 102567...). However, what matters is that they're on the same physical worker node.

Verify this:

SELECT 
    table_name,
    shardid,
    nodename
FROM citus_shards 
WHERE shardid IN (102521, 102553)  -- Alice's user and order shards
ORDER BY nodename, table_name;

Result:

 table_name | shardid |  nodename   
------------+---------+-------------
 orders     |  102553 | 10.114.0.11
 users      |  102521 | 10.114.0.11

Both shards are on Worker 1 (10.114.0.11)! This is co-location—different shard tables, same physical location.

Think of it like two filing cabinets (users and orders) in the same office. Even though they're separate cabinets with different drawer labels, they're in the same room, so finding related information is instant.
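Citus also records this relationship explicitly: co-located tables belong to the same co-location group. A quick way to confirm it, assuming the citus_tables view available in recent Citus versions:

-- Both tables should report the same colocation_id
SELECT table_name, colocation_id
FROM citus_tables
WHERE table_name::text IN ('users', 'orders');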

Testing Foreign Key Constraints

Try inserting an order for a non-existent user:

INSERT INTO orders (user_id, item) VALUES (999, 'Ghost Pizza');

Error:

ERROR:  insert or update on table "orders" violates foreign key constraint "orders_user_id_fkey"
DETAIL:  Key (user_id)=(999) is not present in table "users".

Perfect! The foreign key constraint works: user_id=999 does not exist in users, and thanks to co-location the worker enforcing the constraint can verify that locally, with no network calls to other workers.

Performing JOINs

Count orders per user:

SELECT 
    u.user_id, 
    u.name, 
    COUNT(o.order_id) as num_orders
FROM users u
LEFT JOIN orders o ON u.user_id = o.user_id
GROUP BY u.user_id, u.name
ORDER BY u.user_id;

Result:

 user_id |  name   | num_orders 
---------+---------+------------
       1 | Alice   |          3
       2 | Bob     |          2
       3 | Charlie |          3
       4 | David   |          1
       5 | Emma    |          1

This JOIN is fast because each worker can join its local user and order data without coordinating with other workers. This is the power of co-location.


Key Concepts and Best Practices

1. Distribution Column Selection

Choose your distribution column carefully:

Good choices:

  • user_id for multi-tenant applications (each tenant's data together)
  • account_id for SaaS applications
  • customer_id for e-commerce

Bad choices:

  • created_at (causes temporal hotspots)
  • Low-cardinality columns (poor distribution)
  • Columns that change frequently

Rule of thumb: Pick a column that:

  • Appears in most queries (for query efficiency)
  • Enables co-location of related tables
  • Has high cardinality (many unique values)
  • Remains stable (doesn't change)
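Before committing to a key, it can help to measure cardinality and skew with plain SQL. Here is a minimal sketch against a hypothetical events table with a tenant_id column (both names are placeholders for illustration):

-- Hypothetical table and column names; a good key has many distinct
-- values and no single value that dominates the row count
SELECT count(DISTINCT tenant_id) AS distinct_keys,
       max(rows_per_key)         AS busiest_key_rows
FROM (
    SELECT tenant_id, count(*) AS rows_per_key
    FROM events
    GROUP BY tenant_id
) AS per_key;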

2. Co-location Requirements

For foreign keys to work across distributed tables:

  1. Same distribution column: Both parent and child tables must be distributed by the same column
  2. Include in constraints: The distribution column must be part of PRIMARY KEY and UNIQUE constraints
  3. Foreign key on distribution column: The foreign key must reference the distribution column

3. Shard Count

Default: 32 shards per table

  • Too few shards: Limits parallelism and future scalability
  • Too many shards: Increases metadata overhead and query planning time

For most applications, 32-128 shards works well. Adjust based on:

  • Expected data size
  • Number of workers (more workers benefit from a higher shard count)
  • Query patterns
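The default comes from the citus.shard_count setting, and recent Citus versions also accept a per-table shard_count argument; a minimal sketch (citus_demo_big is a hypothetical table name):

-- Applies to tables distributed later in this session
SET citus.shard_count = 64;

-- Or set it per table at distribution time (recent Citus versions)
SELECT create_distributed_table('citus_demo_big', 'id', shard_count := 64);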

4. When to Use Citus

Good use cases:

  • Multi-tenant SaaS applications
  • Real-time analytics on time-series data
  • High-throughput transactional workloads
  • Applications that exceed single-server capacity
  • Workloads with natural sharding keys

Not ideal for:

  • Small databases (< 100GB)
  • Workloads requiring many cross-shard JOINs
  • Applications with no clear distribution key
  • Use cases where a single PostgreSQL instance suffices

Performance Considerations

Query Routing

Citus routes queries differently based on their scope:

1. Single-shard queries (fastest):

-- Routed to one worker because user_id=1 uniquely determines shard
SELECT * FROM orders WHERE user_id = 1;

2. Co-located JOINs (fast):

-- Each worker joins its local data, then results are combined
SELECT u.name, COUNT(o.*)
FROM users u
JOIN orders o ON u.user_id = o.user_id
GROUP BY u.name;

3. Multi-shard queries (slower):

-- Must query all shards and combine results
SELECT * FROM orders WHERE item = 'Pizza';

4. Cross-shard queries (slowest):

-- Joining on a column other than the distribution key may require
-- repartitioning or moving data between workers
SELECT o.item, p.product_name
FROM orders o
JOIN products p ON o.product_id = p.id
WHERE o.user_id = 1;
-- This is only efficient if products is a reference table or co-located
-- with orders; otherwise Citus has to shuffle data to answer the join

Reference Tables

For small lookup tables (countries, products, categories), use reference tables:

CREATE TABLE countries (
    code CHAR(2) PRIMARY KEY,
    name TEXT
);

-- Replicate to all workers
SELECT create_reference_table('countries');

Reference tables are fully replicated to every worker, enabling efficient JOINs with distributed tables.
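Because every worker holds a full copy of a reference table, joins between it and a distributed table stay local to each worker. A sketch of what that looks like (the country_code column on users is assumed here for illustration; it is not part of the earlier schema):

-- Hypothetical join: countries is replicated to every worker, so each
-- worker can join against its local users shards without network traffic
SELECT c.name, count(*) AS user_count
FROM users u
JOIN countries c ON c.code = u.country_code
GROUP BY c.name;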


Monitoring and Maintenance

Checking Cluster Health

-- View active workers
SELECT * FROM citus_get_active_worker_nodes();

-- Check shard distribution
SELECT nodename, COUNT(*) as shard_count
FROM citus_shards
GROUP BY nodename;

-- Monitor distributed queries
SELECT * FROM citus_stat_statements;

-- Check streaming replication lag on each primary (if using an HA setup)
SELECT * FROM pg_stat_replication;

Rebalancing Shards

As you add workers, rebalance shards:

-- Move shards to balance load
SELECT citus_rebalance_start();

-- Check rebalance progress
SELECT * FROM citus_rebalance_status();
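The rebalancer also helps when shrinking the cluster: recent Citus versions can drain a worker's shards onto the remaining nodes before you remove it (a sketch using Worker 2's address from this demo):

-- Move all shards off the worker, then drop it from the metadata
SELECT citus_drain_node('10.114.0.12', 5432);
SELECT citus_remove_node('10.114.0.12', 5432);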


Conclusion

We've successfully demonstrated horizontal scaling with Citus by:

  1. Building a distributed cluster with one coordinator and two workers
  2. Creating distributed tables and observing how data spreads across shards
  3. Implementing foreign key relationships through co-location
  4. Verifying co-location by checking that related data resides on the same workers
  5. Testing foreign key constraints to ensure data integrity
  6. Performing distributed JOINs that execute efficiently due to co-location

Citus transforms PostgreSQL into a horizontally scalable database while maintaining SQL compatibility and ACID guarantees. By distributing data intelligently and co-locating related tables, it achieves impressive performance even as datasets grow beyond single-server capacity.

The Elephant in the Room: Single Points of Failure

However, our current setup has a critical limitation: every node is a single point of failure.

Consider these scenarios:

  • Coordinator failure: If citus-01 goes down, your entire application stops. No queries can be routed, no data can be accessed, even though workers are healthy.
  • Worker failure: If citus-02 crashes, all data on that worker becomes unavailable. Half your users suddenly can't access their orders.
  • Maintenance downtime: Need to apply a security patch? You'll have to take the coordinator offline, causing application downtime.

In production environments, single points of failure are unacceptable. Your database must survive:

  • Hardware failures
  • Network issues
  • Planned maintenance
  • Software crashes
  • Data center outages

What's Next: High Availability with Patroni

In our next article, we'll solve these availability challenges by integrating Patroni—an open-source high availability solution for PostgreSQL. We'll transform our vulnerable single-node cluster into a highly available system:

Architecture we'll build:

  • Coordinator cluster: Primary + Standby (automatic failover)
  • Worker 1 cluster: Primary + Standby
  • Worker 2 cluster: Primary + Standby
  • etcd cluster: Distributed consensus for leader election

What you'll learn:

  • Setting up Patroni with etcd for distributed consensus
  • Configuring automatic failover for coordinators and workers
  • Testing failure scenarios (simulating crashes)
  • Monitoring cluster health and replication lag
  • Performing switchovers for maintenance
  • Understanding trade-offs between availability and consistency

By combining Citus's horizontal scalability with Patroni's high availability, you'll have a production-ready distributed database that can:

  • Scale to billions of rows
  • Handle millions of queries per second
  • Survive node failures automatically
  • Support zero-downtime maintenance
  • Provide strong consistency guarantees

Stay tuned for "High Availability for Citus: Implementing Automatic Failover with Patroni" where we'll make this scalable database truly resilient.


Special thanks to the Citus team for building such an elegant distributed database solution.
