In a previous post in this series, we discussed the limitations of database sharding. The data model starts as a hierarchy centered on a single sharding key, and all SQL relational features, including referential integrity and unique constraints, remain available within one shard. We used an orders-items model as an example.
In relational databases, a table can have multiple parent tables. The "order_items" table, for instance, references products from a catalog, and the "product" table cannot share the same sharding key as the "orders" and "order_items" tables. To preserve referential integrity and performance, the "product" table must be duplicated across all shards. This is acceptable for small tables that are not frequently updated, since every change must be broadcast to all nodes. This is typical of reference or lookup tables.
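For context, the sharded "orders" table from the previous post might have been created along these lines. This is only a hedged sketch: the exact DDL is in that post, and the shard-key columns and the limitless_create_table_shard_key value are assumptions inferred from the unique index on (year, order_id) that appears in the execution plans below.

-- assumption: "orders" is a sharded table whose shard key covers (year, order_id)
set rds_aurora.limitless_create_table_mode=sharded;
set rds_aurora.limitless_create_table_shard_key='{"year","order_id"}';
create table orders (
  year int not null
  , order_id bigint not null
  , unique (year, order_id)
);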
Given this context, I will consider the "product" table as a reference table. Therefore, I will define limitless_create_table_mode as reference instead of sharded before creating it:
postgres_limitless=> set rds_aurora.limitless_create_table_mode=reference;
SET
postgres_limitless=> create table product (
id uuid default gen_random_uuid() primary key
, name text
);
CREATE TABLE
postgres_limitless=> alter table orders
add product_id uuid references product
;
ALTER TABLE
Since the table is duplicated across all shards, the foreign key can be defined, because it remains a single-shard reference:
postgres_limitless=> \d product
Foreign table "public.product"
Column | Type | Collation | Nullable | Default | FDW options
--------+------+-----------+----------+-------------------+-------------
id | uuid | | | gen_random_uuid() |
name | text | | | |
Server: _rds_aurora_internal_shard_4_0
FDW options: (schema_name 'public', table_name 'product', aurora_table_type 'reference', aurora_replication_tag 'replicationTag')
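Because every shard holds a full copy of "product", the foreign key from "orders" is checked locally on the shard that owns the order row. As a hedged sketch (the "orders" column list is an assumption based on the columns used in the plans below), inserting an order that references a product id absent from the catalog should fail with a foreign-key violation:

-- a random uuid is almost certainly not in "product",
-- so this insert is expected to be rejected by the foreign key
insert into orders (year, order_id, product_id)
values (2025, 999, gen_random_uuid());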
postgres_limitless=> set rds_aurora.limitless_explain_options=shard_plans, single_shard_optimization;
SET
postgres_limitless=> explain select * from product;
QUERY PLAN
--------------------------------------------------------------------
Foreign Scan (cost=100.00..101.00 rows=100 width=0)
Remote Plans from Shard 4:
Seq Scan on product (cost=0.00..20.70 rows=1070 width=48)
Single Shard Optimized
(4 rows)
The \d command and the explain output indicate that the "product" table belongs to the shard with subcluster_id 4. However, the same table exists on all shards. I can verify this by joining it with order rows from different shards, using different values for the sharding key:
postgres_limitless=> explain
select * from product
join orders on product.id=orders.product_id
where orders.year=2024 and orders.order_id=1
;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Foreign Scan (cost=100.00..101.00 rows=100 width=0)
Remote Plans from Shard 5:
Nested Loop (cost=0.30..16.36 rows=1 width=96)
-> Index Scan using orders_ts00231_year_order_id_key on orders_ts00231 orders (cost=0.15..8.17 rows=1 width=48)
Index Cond: ((year = 2024) AND (order_id = 1))
-> Index Scan using product_pkey on product (cost=0.15..8.17 rows=1 width=48)
Index Cond: (id = orders.product_id)
Single Shard Optimized
(8 rows)
postgres_limitless=> explain
select * from product
join orders on product.id=orders.product_id
where orders.year=2025 and orders.order_id=1
;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Foreign Scan (cost=100.00..101.00 rows=100 width=0)
Remote Plans from Shard 4:
Nested Loop (cost=0.30..16.36 rows=1 width=96)
-> Index Scan using orders_ts00401_year_order_id_key on orders_ts00401 orders (cost=0.15..8.17 rows=1 width=48)
Index Cond: ((year = 2025) AND (order_id = 1))
-> Index Scan using product_pkey on product (cost=0.15..8.17 rows=1 width=48)
Index Cond: (id = orders.product_id)
Single Shard Optimized
(8 rows)
I can show the whole picture with a generic plan, which must cover all shards:
postgres_limitless=> explain (generic_plan)
select * from product
join orders on product.id=orders.product_id
where orders.year=$1 and orders.id=$2
;
QUERY PLAN
-----------------------------------------------------------------------------------------
Append (cost=100.00..603.66 rows=24 width=96)
-> Async Foreign Scan (cost=100.00..150.89 rows=6 width=96)
Remote Plans from Shard 7:
Hash Join (cost=42.71..1653.68 rows=128 width=96)
Hash Cond: (r4.product_id = r1.id)
-> Append (cost=8.63..1619.26 rows=128 width=48)
Subplans Removed: 128
-> Hash (cost=20.70..20.70 rows=1070 width=48)
-> Seq Scan on product r1 (cost=0.00..20.70 rows=1070 width=48)
Relations: (orders_fs00004 orders_1) INNER JOIN (product)
-> Async Foreign Scan (cost=100.00..150.89 rows=6 width=96)
Remote Plans from Shard 5:
Hash Join (cost=42.71..1653.68 rows=128 width=96)
Hash Cond: (r5.product_id = r1.id)
-> Append (cost=8.63..1619.26 rows=128 width=48)
Subplans Removed: 128
-> Hash (cost=20.70..20.70 rows=1070 width=48)
-> Seq Scan on product r1 (cost=0.00..20.70 rows=1070 width=48)
Relations: (orders_fs00001 orders_2) INNER JOIN (product)
-> Async Foreign Scan (cost=100.00..150.89 rows=6 width=96)
Remote Plans from Shard 6:
Hash Join (cost=42.71..1653.68 rows=128 width=96)
Hash Cond: (r6.product_id = r1.id)
-> Append (cost=8.63..1619.26 rows=128 width=48)
Subplans Removed: 128
-> Hash (cost=20.70..20.70 rows=1070 width=48)
-> Seq Scan on product r1 (cost=0.00..20.70 rows=1070 width=48)
Relations: (orders_fs00003 orders_3) INNER JOIN (product)
-> Async Foreign Scan (cost=100.00..150.89 rows=6 width=96)
Remote Plans from Shard 4:
Hash Join (cost=42.71..1653.68 rows=128 width=96)
Hash Cond: (r7.product_id = r1.id)
-> Append (cost=8.63..1619.26 rows=128 width=48)
Subplans Removed: 128
-> Hash (cost=20.70..20.70 rows=1070 width=48)
-> Seq Scan on product r1 (cost=0.00..20.70 rows=1070 width=48)
Relations: (orders_fs00002 orders_4) INNER JOIN (product)
(37 rows)
Generic plans are not ideal in a sharded database because they read from all shards. However, since the reference table is duplicated across all shards, the join can still be performed locally on each shard. This can be a hash join, as in this case, or a nested loop, as seen previously.
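When prepared statements are involved, the standard PostgreSQL plan_cache_mode setting (not specific to Limitless) can keep custom plans, so the sharding-key values are known at planning time and the query stays single-shard, as in the earlier plans. A minimal sketch, assuming the same orders-product schema:

-- force custom plans so the bound parameter values drive shard pruning
set plan_cache_mode = force_custom_plan;
prepare order_with_product (int, bigint) as
  select * from product
  join orders on product.id = orders.product_id
  where orders.year = $1 and orders.order_id = $2;
explain (costs off) execute order_with_product (2025, 1);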
The duplication of reference tables affects DML performance, as changes must be broadcast to all shards. For example, inserting a thousand products:
postgres_limitless=> explain (analyze, buffers, wal)
insert into product(name)
select format('Product #%s', generate_series(1,1000))
;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Insert on product (cost=0.00..22.52 rows=0 width=0) (actual time=493.321..493.323 rows=0 loops=1)
Remote Plans from Shard 4:
[plan executed 10 times]
Insert on product (cost=0.00..1.25 rows=0 width=0) (actual time=1.334..1.335 rows=0 loops=1)
Buffers: shared hit=349 written=1
I/O Timings: shared write=0.015
-> Values Scan on "*VALUES*" (cost=0.00..1.25 rows=100 width=48) (actual time=0.002..0.034 rows=100 loops=1)
Remote Plans from Shard 5:
[plan executed 10 times]
Insert on product (cost=0.00..1.25 rows=0 width=0) (actual time=7.888..7.889 rows=0 loops=1)
Buffers: shared hit=325 written=1
I/O Timings: shared write=0.457
-> Values Scan on "*VALUES*" (cost=0.00..1.25 rows=100 width=48) (actual time=0.003..0.045 rows=100 loops=1)
Remote Plans from Shard 7:
[plan executed 10 times]
Insert on product (cost=0.00..1.25 rows=0 width=0) (actual time=1.259..1.260 rows=0 loops=1)
Buffers: shared hit=325 written=1
I/O Timings: shared write=0.011
-> Values Scan on "*VALUES*" (cost=0.00..1.25 rows=100 width=48) (actual time=0.002..0.034 rows=100 loops=1)
Remote Plans from Shard 6:
[plan executed 10 times]
Insert on product (cost=0.00..1.25 rows=0 width=0) (actual time=9.400..9.400 rows=0 loops=1)
Buffers: shared hit=326 written=1
I/O Timings: shared write=0.011
-> Values Scan on "*VALUES*" (cost=0.00..1.25 rows=100 width=48) (actual time=0.003..0.046 rows=100 loops=1)
-> Subquery Scan on "*SELECT*" (cost=0.00..22.52 rows=1000 width=48) (actual time=0.020..0.961 rows=1000 loops=1)
-> Result (cost=0.00..20.02 rows=1000 width=32) (actual time=0.008..0.450 rows=1000 loops=1)
-> ProjectSet (cost=0.00..5.02 rows=1000 width=4) (actual time=0.003..0.091 rows=1000 loops=1)
-> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
Planning Time: 0.078 ms
Execution Time: 578.902 ms
(31 rows)
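To check which tables ended up sharded and which ones are reference tables, the Limitless metadata can be queried. A hedged sketch, assuming the rds_aurora.limitless_tables view and its table_type column are available in this version (consistent with the aurora_table_type FDW option shown by \d above):

-- expected: "orders" reported as sharded, "product" as a reference table
select table_name, table_type
from rds_aurora.limitless_tables
where table_name in ('orders', 'product');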
Reference tables address some limitations of database sharding, such as cross-shard joins and referential integrity. They can be accessed in single-shard transactions and allow join pushdown. However, they should be reserved for tables that are rarely updated. In the next post, we will explore a typical SQL model using PgBench.