Introduction
As databases grow beyond the capabilities of a single server, teams face a critical decision: scale vertically by adding more resources to one machine, or scale horizontally by distributing data across multiple servers. While vertical scaling eventually hits physical and economic limits, horizontal scaling lets capacity grow with the number of servers you add.
In this hands-on guide, we'll explore Citus—an open-source extension that transforms PostgreSQL into a distributed database. We'll build a real Citus cluster from scratch, demonstrate data distribution, implement foreign key relationships across distributed tables, and understand the principles that make horizontal scaling work.
By the end, you'll have practical experience with distributed databases and understand both their power and their limitations.
What is Horizontal Scaling?
Imagine a pizza restaurant with one chef making all the pizzas. During rush hour, orders pile up because one person can only work so fast. This is vertical scaling—you could train the chef to work faster (add more CPU), give them better tools (add more RAM), or expand their workspace (add more storage). But eventually, you hit limits.
Now imagine hiring three chefs, each handling different orders. This is horizontal scaling—you distribute the workload across multiple workers. The more customers you have, the more chefs you can add.
In database terms:
- Vertical scaling: Bigger server (more CPU, RAM, storage)
- Horizontal scaling: More servers working together
Citus brings horizontal scaling to PostgreSQL by:
- Splitting your data across multiple worker nodes (sharding)
- Keeping related data together on the same node (co-location)
- Maintaining PostgreSQL's ACID guarantees and SQL compatibility
Architecture: Understanding Citus Components
A Citus cluster consists of at least three components:
1. Coordinator Node
The coordinator is like a project manager—it doesn't do the heavy lifting but knows where everything is. When you connect to a Citus cluster, you connect to the coordinator. It:
- Routes queries to the appropriate workers
- Combines results from multiple workers
- Manages distributed transactions
- Stores metadata about data distribution
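For the curious, that metadata lives in ordinary catalog tables on the coordinator. Once the cluster below is running, you can peek at it directly; a minimal sketch using the standard Citus catalogs:
-- Worker registry maintained by the coordinator
SELECT nodename, nodeport, isactive FROM pg_dist_node;
-- One row per distributed table: distribution method and co-location group
SELECT logicalrelid, partmethod, colocationid FROM pg_dist_partition;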
2. Worker Nodes
Workers are the muscle of your cluster—they store the actual data and execute queries. Each worker:
- Holds a subset of your data (shards)
- Executes queries locally on its data
- Communicates with other workers when needed
- Functions as a full PostgreSQL instance
3. Shards
Shards are pieces of your distributed tables. Think of them as mini-tables that together form your complete dataset. When you distribute a table:
- Citus creates 32 shards by default (configurable)
- Shards are distributed across workers
- Each row goes to exactly one shard based on a hash function
- Related data can be co-located on the same worker
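Concretely, each hash-distributed shard owns a contiguous range of 32-bit hash values, and a row lands in whichever shard's range covers the hash of its distribution column. Once a table is distributed (we do this below), you can see those ranges in the pg_dist_shard catalog; a quick sketch:
-- Hash-value range owned by each shard of a distributed table
SELECT shardid, shardminvalue, shardmaxvalue
FROM pg_dist_shard
WHERE logicalrelid = 'citus_demo'::regclass   -- the demo table we create later
ORDER BY shardid
LIMIT 4;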
Setting Up a Citus Cluster on Digital Ocean
For this demonstration, we'll create a three-node cluster:
- citus-01: Coordinator node
- citus-02: Worker 1
- citus-03: Worker 2
Step 1: Install PostgreSQL and Citus Extension
On all three droplets:
# Add PostgreSQL repository
sudo apt-get update
sudo apt-get install -y postgresql-common
sudo /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh
# Install PostgreSQL 16
sudo apt-get install -y postgresql-16
# Install Citus extension
curl https://install.citusdata.com/community/deb.sh | sudo bash
sudo apt-get install -y postgresql-16-citus-12.1
# Configure PostgreSQL to load Citus
echo "shared_preload_libraries = 'citus'" | sudo tee -a /etc/postgresql/16/main/postgresql.conf
# Restart PostgreSQL
sudo systemctl restart postgresql
Step 2: Configure Network Access
Edit /etc/postgresql/16/main/postgresql.conf on all nodes:
listen_addresses = '*'
Edit /etc/postgresql/16/main/pg_hba.conf on all nodes to allow connections (0.0.0.0/0 is convenient for a demo, but in production restrict this to your private subnet):
host all all 0.0.0.0/0 md5
Note that the coordinator and workers also open connections to each other as the postgres user, so inter-node authentication must work non-interactively, for example via a .pgpass file or trust entries scoped to the private network.
Restart PostgreSQL:
sudo systemctl restart postgresql
Step 3: Initialize the Coordinator
On the coordinator node (citus-01):
-- Connect as postgres user
sudo -u postgres psql
-- Create the Citus extension
CREATE EXTENSION citus;
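-- Depending on your Citus version, you may first need to register the coordinator itself
-- (use citus-01's private IP) before adding workers:
-- SELECT citus_set_coordinator_host('<coordinator-private-ip>', 5432);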
-- Add worker nodes to the cluster
SELECT citus_add_node('10.114.0.11', 5432);
SELECT citus_add_node('10.114.0.12', 5432);
-- Verify worker registration
SELECT * FROM citus_get_active_worker_nodes();
You should see both worker nodes listed:
node_name | node_port
-------------+-----------
10.114.0.11 | 5432
10.114.0.12 | 5432
Step 4: Initialize Workers
On each worker node (citus-02 and citus-03):
sudo -u postgres psql
CREATE EXTENSION citus;
Your Citus cluster is now operational!
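Before moving on, it's worth running a couple of optional sanity checks from the coordinator (exact output and available functions depend on your Citus version):
-- Confirm the extension version in use
SELECT citus_version();
-- Every node should report that it can reach every other node
SELECT * FROM citus_check_cluster_node_health();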
Demonstration 1: Basic Data Distribution
Let's create a simple distributed table and observe how data spreads across workers.
Creating a Distributed Table
-- On the coordinator
CREATE TABLE citus_demo (
id SERIAL,
user_name TEXT,
city TEXT,
age INT,
created_at TIMESTAMP DEFAULT NOW()
);
-- Distribute the table by 'id' column
SELECT create_distributed_table('citus_demo', 'id');
The create_distributed_table function tells Citus to:
- Create 32 shards (mini-tables) by default
- Distribute these shards across available workers
- Route future queries based on the distribution column (id)
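The shard count is fixed at distribution time. It comes from the citus.shard_count setting, and newer Citus versions also accept it as an argument to create_distributed_table; a hedged sketch (shown for a hypothetical new table called events, since citus_demo is already distributed):
-- Applies to tables distributed after this point in the session
SET citus.shard_count = 64;
-- Or pass it explicitly (Citus 11+); 'events' is a hypothetical table
SELECT create_distributed_table('events', 'id', shard_count := 64);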
Inserting Data
INSERT INTO citus_demo (user_name, city, age) VALUES
('Alice', 'New York', 25),
('Bob', 'Chicago', 30),
('Charlie', 'Seattle', 27),
('David', 'Boston', 35),
('Emma', 'Denver', 22),
('Frank', 'Portland', 29),
('Grace', 'Miami', 28),
('Henry', 'Atlanta', 31),
('Iris', 'Phoenix', 26),
('Jack', 'Dallas', 33);
Examining Distribution
Check how shards are distributed:
SELECT * FROM citus_shards WHERE table_name::text = 'citus_demo';
Result (abbreviated):
table_name | shardid | shard_name | nodename | nodeport | shard_size
------------+---------+-------------------+-------------+----------+------------
citus_demo | 102488 | citus_demo_102488 | 10.114.0.11 | 5432 | 16384
citus_demo | 102489 | citus_demo_102489 | 10.114.0.12 | 5432 | 16384
citus_demo | 102490 | citus_demo_102490 | 10.114.0.11 | 5432 | 8192
... (32 shards total)
We see 32 shards split between two workers. Some shards have data (16 KB), others are empty (8 KB).
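To see exactly how many rows landed in each shard, you can ask every shard for a count from the coordinator; a quick sketch using Citus's run_command_on_shards helper:
-- %s is replaced with each shard's table name; empty shards return 0
SELECT shardid, result AS row_count
FROM run_command_on_shards('citus_demo', 'SELECT count(*) FROM %s')
ORDER BY shardid;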
Understanding Shard Assignment
See which shard each record belongs to:
SELECT id, user_name, city,
get_shard_id_for_distribution_column('citus_demo', id) as shard_id
FROM citus_demo
ORDER BY id;
Result:
id | user_name | city | shard_id
----+-----------+----------+----------
1 | Alice | New York | 102489
2 | Bob | Chicago | 102512
3 | Charlie | Seattle | 102503
4 | David | Boston | 102496
5 | Emma | Denver | 102494
6 | Frank | Portland | 102508
7 | Grace | Miami | 102496
8 | Henry | Atlanta | 102488
9 | Iris | Phoenix | 102516
10 | Jack | Dallas | 102492
Notice that David (id=4) and Grace (id=7) share shard 102496—this happens when their IDs hash to the same shard. With only 10 records across 32 shards, collisions and empty shards are normal.
Viewing Data on Individual Workers
Connect directly to Worker 1 (10.114.0.11):
-- Query a specific shard
SELECT * FROM citus_demo_102488;
Result:
id | user_name | city | age | created_at
----+-----------+---------+-----+----------------------------
8 | Henry | Atlanta | 31 | 2026-01-23 22:42:50.523619
Worker 1 only stores the data in its shards. Query other shards on this worker:
SELECT * FROM citus_demo_102512; -- Bob
SELECT * FROM citus_demo_102496; -- David and Grace
Connect to Worker 2 (10.114.0.12) and query:
SELECT * FROM citus_demo_102489; -- Alice
SELECT * FROM citus_demo_102503; -- Charlie
Key insight: Each worker only stores a subset of the data. When you query the main table through the coordinator, Citus automatically fetches data from all workers and combines the results.
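You can watch this routing happen in the query plan. A minimal sketch (the plan text varies by Citus version, but expect a Custom Scan (Citus Adaptive) node):
-- No filter on the distribution column: one task per shard (Task Count: 32)
EXPLAIN SELECT count(*) FROM citus_demo;
-- Filter on the distribution column: pruned down to a single task on one worker
EXPLAIN SELECT * FROM citus_demo WHERE id = 8;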
Demonstration 2: Foreign Keys and Co-location
One of the biggest challenges in distributed databases is maintaining relationships between tables. Let's explore how Citus handles foreign keys through co-location.
The Challenge
Imagine you have users and their orders:
- Users table: Alice, Bob, Charlie
- Orders table: Order 1 (Alice's pizza), Order 2 (Bob's burger), Order 3 (Alice's soda)
In a distributed setup, what if:
- Alice's user record is on Worker 1
- Alice's orders are on Worker 2
When inserting an order for Alice, Worker 2 must verify Alice exists—but Alice is on Worker 1! This requires expensive cross-node communication for every foreign key check.
The Solution: Co-location
Co-location ensures related data lives on the same worker. Citus achieves this by distributing both tables using the same column.
Creating Co-located Tables
-- Create users table
CREATE TABLE users (
user_id INT PRIMARY KEY,
name TEXT
);
-- Create orders table with composite primary key
CREATE TABLE orders (
order_id SERIAL,
user_id INT,
item TEXT,
created_at TIMESTAMP DEFAULT NOW(),
PRIMARY KEY (order_id, user_id) -- Must include distribution column
);
-- Distribute both tables by user_id
SELECT create_distributed_table('users', 'user_id');
SELECT create_distributed_table('orders', 'user_id');
-- Add foreign key constraint
ALTER TABLE orders
ADD CONSTRAINT fk_user
FOREIGN KEY (user_id) REFERENCES users(user_id);
Important: The orders primary key includes user_id because Citus requires the distribution column in unique constraints. This ensures uniqueness can be verified locally on each worker.
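To see this requirement in action, try adding a unique constraint that omits the distribution column; Citus should reject it (exact wording varies by version):
-- Expected to fail with an error along the lines of:
--   ERROR: Distributed relations cannot have UNIQUE, EXCLUDE, or PRIMARY KEY constraints
--          that do not include the partition column
ALTER TABLE orders ADD CONSTRAINT orders_order_id_unique UNIQUE (order_id);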
Inserting Related Data
INSERT INTO users (user_id, name) VALUES
(1, 'Alice'),
(2, 'Bob'),
(3, 'Charlie'),
(4, 'David'),
(5, 'Emma');
INSERT INTO orders (user_id, item) VALUES
(1, 'Pizza'),
(1, 'Soda'),
(1, 'Ice Cream'),
(2, 'Burger'),
(2, 'Fries'),
(3, 'Taco'),
(3, 'Burrito'),
(3, 'Nachos'),
(4, 'Pasta'),
(5, 'Salad');
Verifying Co-location
Check which shards contain user and order data:
SELECT
u.user_id,
u.name,
o.item,
get_shard_id_for_distribution_column('users', u.user_id) as user_shard,
get_shard_id_for_distribution_column('orders', o.user_id) as order_shard
FROM users u
JOIN orders o ON u.user_id = o.user_id
ORDER BY u.user_id, o.order_id;
Result:
user_id | name | item | user_shard | order_shard
---------+---------+-----------+------------+-------------
1 | Alice | Pizza | 102521 | 102553
1 | Alice | Soda | 102521 | 102553
1 | Alice | Ice Cream | 102521 | 102553
2 | Bob | Burger | 102544 | 102576
2 | Bob | Fries | 102544 | 102576
3 | Charlie | Taco | 102535 | 102567
3 | Charlie | Burrito | 102535 | 102567
3 | Charlie | Nachos | 102535 | 102567
Wait, the shard IDs are different! This might seem wrong, but it's actually correct.
Understanding Different Shard IDs
The users table has its own set of shards (102521, 102544, 102535...), and the orders table has different shard IDs (102553, 102576, 102567...). However, what matters is that they're on the same physical worker node.
Verify this:
SELECT
table_name,
shardid,
nodename
FROM citus_shards
WHERE shardid IN (102521, 102553) -- Alice's user and order shards
ORDER BY nodename, table_name;
Result:
table_name | shardid | nodename
------------+---------+-------------
orders | 102553 | 10.114.0.11
users | 102521 | 10.114.0.11
Both shards are on Worker 1 (10.114.0.11)! This is co-location—different shard tables, same physical location.
Think of it like two filing cabinets (users and orders) in the same office. Even though they're separate cabinets with different drawer labels, they're in the same room, so finding related information is instant.
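You can also confirm co-location from the metadata: both tables should report the same colocation_id in the citus_tables view.
SELECT table_name, distribution_column, colocation_id
FROM citus_tables
WHERE table_name IN ('users'::regclass, 'orders'::regclass);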
Testing Foreign Key Constraints
Try inserting an order for a non-existent user:
INSERT INTO orders (user_id, item) VALUES (999, 'Ghost Pizza');
Error:
ERROR: insert or update on table "orders" violates foreign key constraint "fk_user"
DETAIL: Key (user_id)=(999) is not present in table "users".
Perfect! The foreign key constraint works: user_id=999 doesn't exist in users, and thanks to co-location the worker holding that shard can verify this locally, without any network calls to other workers.
Performing JOINs
Count orders per user:
SELECT
u.user_id,
u.name,
COUNT(o.order_id) as num_orders
FROM users u
LEFT JOIN orders o ON u.user_id = o.user_id
GROUP BY u.user_id, u.name
ORDER BY u.user_id;
Result:
user_id | name | num_orders
---------+---------+------------
1 | Alice | 3
2 | Bob | 2
3 | Charlie | 3
4 | David | 1
5 | Emma | 1
This JOIN is fast because each worker can join its local user and order data without coordinating with other workers. This is the power of co-location.
Key Concepts and Best Practices
1. Distribution Column Selection
Choose your distribution column carefully:
Good choices:
- user_id for multi-tenant applications (each tenant's data together)
- account_id for SaaS applications
- customer_id for e-commerce
Bad choices:
- created_at (causes temporal hotspots)
- Low-cardinality columns (poor distribution)
- Columns that change frequently
Rule of thumb: Pick a column that:
- Appears in most queries (for query efficiency)
- Enables co-location of related tables
- Has high cardinality (many unique values)
- Remains stable (doesn't change)
2. Co-location Requirements
For foreign keys to work across distributed tables:
- Same distribution column: Both parent and child tables must be distributed by the same column
- Include in constraints: The distribution column must be part of PRIMARY KEY and UNIQUE constraints
- Foreign key on distribution column: The foreign key must reference the distribution column
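Tables distributed by a column of the same type (and with the same shard count) end up co-located by default, but you can also state the relationship explicitly. A sketch with a hypothetical payments table:
-- 'payments' is hypothetical; distribute it by user_id and pin it to users' co-location group
SELECT create_distributed_table('payments', 'user_id', colocate_with => 'users');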
3. Shard Count
Default: 32 shards per table
- Too few shards: Limits parallelism and future scalability
- Too many shards: Increases metadata overhead and query planning time
For most applications, 32-128 shards works well. Adjust based on:
- Expected data size
- Number of workers (more workers = more shards beneficial)
- Query patterns
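You can check the current default and, on recent Citus versions, re-shard an existing table with alter_distributed_table (this rewrites and moves the table's data, so plan for it):
SHOW citus.shard_count;
-- Change the shard count of an existing table; co-located tables can follow along
SELECT alter_distributed_table('citus_demo', shard_count := 64, cascade_to_colocated := true);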
4. When to Use Citus
Good use cases:
- Multi-tenant SaaS applications
- Real-time analytics on time-series data
- High-throughput transactional workloads
- Applications that exceed single-server capacity
- Workloads with natural sharding keys
Not ideal for:
- Small databases (< 100GB)
- Workloads requiring many cross-shard JOINs
- Applications with no clear distribution key
- Use cases where a single PostgreSQL instance suffices
Performance Considerations
Query Routing
Citus routes queries differently based on their scope:
1. Single-shard queries (fastest):
-- Routed to one worker because user_id = 1 maps to exactly one shard
SELECT * FROM orders WHERE user_id = 1;
2. Co-located JOINs (fast):
-- Each worker joins its local data, then results are combined
SELECT u.name, COUNT(o.*)
FROM users u
JOIN orders o ON u.user_id = o.user_id
GROUP BY u.name;
3. Multi-shard (fan-out) queries (slower):
-- Must query all shards and combine results
SELECT * FROM orders WHERE item = 'Pizza';
4. Cross-shard (repartition) joins (slowest):
-- Joining on a non-distribution column forces Citus to move data between workers
SELECT o.item, p.product_name
FROM orders o
JOIN products p ON o.product_id = p.id
WHERE o.user_id = 1;
-- Illustrative only: assumes an orders.product_id column; avoid this pattern by making
-- products a reference table or co-locating it with orders
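By default Citus refuses joins between distributed tables that aren't co-located on the join key. If you genuinely need them, you can opt in to repartition joins and accept the data-shuffling cost:
-- Allows Citus to repartition data between workers to satisfy the join
SET citus.enable_repartition_joins TO on;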
Reference Tables
For small lookup tables (countries, products, categories), use reference tables:
CREATE TABLE countries (
code CHAR(2) PRIMARY KEY,
name TEXT
);
-- Replicate to all workers
SELECT create_reference_table('countries');
Reference tables are fully replicated to every worker, enabling efficient JOINs with distributed tables.
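For example, once countries is replicated to every worker, joining it with a distributed table needs no extra data movement. A sketch (users.country_code is a hypothetical column, not part of the demo schema):
-- Every worker holds the full countries table, so this join runs locally on each worker
SELECT u.name, c.name AS country
FROM users u
JOIN countries c ON c.code = u.country_code;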
Monitoring and Maintenance
Checking Cluster Health
-- View active workers
SELECT * FROM citus_get_active_worker_nodes();
-- Check shard distribution
SELECT nodename, COUNT(*) as shard_count
FROM citus_shards
GROUP BY nodename;
-- Monitor distributed queries
SELECT * FROM citus_stat_statements;
-- Check streaming replication lag on each node (if you run standbys for HA)
SELECT * FROM pg_stat_replication;
Rebalancing Shards
As you add workers, rebalance shards:
-- Move shards to balance load
SELECT citus_rebalance_start();
-- Check rebalance progress
SELECT * FROM citus_rebalance_status();
Conclusion
We've successfully demonstrated horizontal scaling with Citus by:
- Building a distributed cluster with one coordinator and two workers
- Creating distributed tables and observing how data spreads across shards
- Implementing foreign key relationships through co-location
- Verifying co-location by checking that related data resides on the same workers
- Testing foreign key constraints to ensure data integrity
- Performing distributed JOINs that execute efficiently due to co-location
Citus transforms PostgreSQL into a horizontally scalable database while maintaining SQL compatibility and ACID guarantees. By distributing data intelligently and co-locating related tables, it achieves impressive performance even as datasets grow beyond single-server capacity.
The Elephant in the Room: Single Points of Failure
However, our current setup has a critical limitation: every node is a single point of failure.
Consider these scenarios:
- Coordinator failure: If citus-01 goes down, your entire application stops. No queries can be routed, no data can be accessed, even though workers are healthy.
- Worker failure: If citus-02 crashes, all data on that worker becomes unavailable. Half your users suddenly can't access their orders.
- Maintenance downtime: Need to apply a security patch? You'll have to take the coordinator offline, causing application downtime.
In production environments, single points of failure are unacceptable. Your database must survive:
- Hardware failures
- Network issues
- Planned maintenance
- Software crashes
- Data center outages
What's Next: High Availability with Patroni
In our next article, we'll solve these availability challenges by integrating Patroni—an open-source high availability solution for PostgreSQL. We'll transform our vulnerable single-node cluster into a highly available system:
Architecture we'll build:
- Coordinator cluster: Primary + Standby (automatic failover)
- Worker 1 cluster: Primary + Standby
- Worker 2 cluster: Primary + Standby
- etcd cluster: Distributed consensus for leader election
What you'll learn:
- Setting up Patroni with etcd for distributed consensus
- Configuring automatic failover for coordinators and workers
- Testing failure scenarios (simulating crashes)
- Monitoring cluster health and replication lag
- Performing switchovers for maintenance
- Understanding trade-offs between availability and consistency
By combining Citus's horizontal scalability with Patroni's high availability, you'll have a production-ready distributed database that can:
- Scale to billions of rows
- Handle millions of queries per second
- Survive node failures automatically
- Support zero-downtime maintenance
- Provide strong consistency guarantees
Stay tuned for "High Availability for Citus: Implementing Automatic Failover with Patroni" where we'll make this scalable database truly resilient.
Resources
- Citus Documentation: https://docs.citusdata.com
- Citus GitHub: https://github.com/citusdata/citus
- PostgreSQL Documentation: https://www.postgresql.org/docs/
- Digital Ocean Citus Tutorial: https://docs.digitalocean.com/products/databases/postgresql/
Special thanks to the Citus team for building such an elegant distributed database solution.