DEV Community

Philip McClarence

PostgreSQL Table Partitioning: Range, List & Hash Strategies


Table partitioning splits a large table into smaller physical pieces (partitions) while presenting them as a single logical table. The query planner uses partition pruning to scan only the relevant partitions, making queries on hundreds-of-millions-row tables dramatically faster.

But done wrong, partitioning makes things slower.

When You Need Partitioning

Tables grow. An events table starts at a million rows and works fine. A year later it has 500 million rows and everything hurts:

  • Sequential scans that took milliseconds now take minutes
  • VACUUM processes the entire table and takes hours
  • Index builds lock the table for extended periods
  • Deleting old data means expensive DELETE statements generating massive WAL

The fundamental issue: PostgreSQL stores all rows in a single table file. Every maintenance operation -- VACUUM, ANALYZE, REINDEX -- operates on the entire table regardless of how much data you actually need.

Partitioning solves this by storing data in independent physical tables. A query for events in March only scans the March partition. VACUUM processes each partition independently. Dropping an old month is DROP TABLE (milliseconds) instead of a multi-hour DELETE.

Finding Partitioning Candidates

-- Large tables that might benefit from partitioning
SELECT
    schemaname,
    relname AS table_name,
    pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
    n_live_tup AS estimated_rows,
    n_dead_tup AS dead_tuples,
    CASE WHEN n_live_tup > 0
        THEN round(100.0 * n_dead_tup / n_live_tup, 1)
        ELSE 0
    END AS dead_tuple_pct
FROM pg_stat_user_tables
WHERE pg_relation_size(relid) > 1073741824  -- > 1 GB
ORDER BY pg_total_relation_size(relid) DESC;

Range Partitioning (The Most Common)

Range partitioning is the standard choice for time-series data:

CREATE TABLE events (
    event_id BIGINT GENERATED ALWAYS AS IDENTITY,
    event_timestamp TIMESTAMPTZ NOT NULL,
    event_type TEXT NOT NULL,
    user_id BIGINT,
    payload JSONB,
    PRIMARY KEY (event_id, event_timestamp)  -- must include partition key
) PARTITION BY RANGE (event_timestamp);

-- Monthly partitions
CREATE TABLE events_2025_01 PARTITION OF events
    FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
CREATE TABLE events_2025_02 PARTITION OF events
    FOR VALUES FROM ('2025-02-01') TO ('2025-03-01');
CREATE TABLE events_2025_03 PARTITION OF events
    FOR VALUES FROM ('2025-03-01') TO ('2025-04-01');

-- Default partition for data that doesn't match any range
CREATE TABLE events_default PARTITION OF events DEFAULT;

Important: every primary key or unique constraint on a partitioned table must include the partition key. PostgreSQL can't enforce uniqueness across partitions, so it enforces it within each partition instead.
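You can see this rule in action by declaring a primary key that omits the partition key -- PostgreSQL rejects the DDL outright:

```sql
-- This fails: a unique constraint can't be enforced across partitions
CREATE TABLE events_bad (
    event_id BIGINT GENERATED ALWAYS AS IDENTITY,
    event_timestamp TIMESTAMPTZ NOT NULL,
    PRIMARY KEY (event_id)  -- missing the partition key
) PARTITION BY RANGE (event_timestamp);
-- ERROR:  unique constraint on partitioned table must include all partitioning columns
```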

List Partitioning

For data that falls into discrete categories:

CREATE TABLE orders (
    order_id BIGINT GENERATED ALWAYS AS IDENTITY,
    region TEXT NOT NULL,
    order_date DATE NOT NULL,
    customer_id BIGINT,
    order_total NUMERIC(12, 2),
    PRIMARY KEY (order_id, region)
) PARTITION BY LIST (region);

CREATE TABLE orders_us PARTITION OF orders FOR VALUES IN ('us-east', 'us-west');
CREATE TABLE orders_eu PARTITION OF orders FOR VALUES IN ('eu-west', 'eu-central');
CREATE TABLE orders_apac PARTITION OF orders FOR VALUES IN ('ap-southeast', 'ap-northeast');
CREATE TABLE orders_default PARTITION OF orders DEFAULT;
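Values not covered by any list land in orders_default. One way to later carve such a region out into its own partition (the region name here is hypothetical) is a copy-delete-attach in a single transaction:

```sql
-- Move 'sa-east' rows (hypothetical new region) out of the default partition
BEGIN;
CREATE TABLE orders_sa (LIKE orders INCLUDING DEFAULTS);
INSERT INTO orders_sa SELECT * FROM orders_default WHERE region = 'sa-east';
DELETE FROM orders_default WHERE region = 'sa-east';
-- ATTACH verifies the default no longer holds matching rows
ALTER TABLE orders ATTACH PARTITION orders_sa FOR VALUES IN ('sa-east');
COMMIT;
```

Note that ATTACH scans the default partition while holding a lock, so on a large default partition plan for a maintenance window.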

Hash Partitioning

Distributes rows evenly when there's no natural range or list key. Useful for reducing contention on high-insert tables:

CREATE TABLE user_sessions (
    session_id UUID DEFAULT gen_random_uuid(),
    user_id BIGINT NOT NULL,
    started_at TIMESTAMPTZ DEFAULT now(),
    session_data JSONB,
    PRIMARY KEY (session_id, user_id)
) PARTITION BY HASH (user_id);

CREATE TABLE user_sessions_0 PARTITION OF user_sessions
    FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE user_sessions_1 PARTITION OF user_sessions
    FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE user_sessions_2 PARTITION OF user_sessions
    FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE user_sessions_3 PARTITION OF user_sessions
    FOR VALUES WITH (MODULUS 4, REMAINDER 3);

Hash partitioning only supports pruning on equality conditions (WHERE user_id = 123), not range queries. Use it for I/O distribution, not query performance.
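A quick way to confirm this behavior on the table above (plan shapes are illustrative, not exact):

```sql
-- Prunes to one partition: user_id hashes to a single remainder
EXPLAIN SELECT * FROM user_sessions WHERE user_id = 123;

-- Scans all four partitions: hash values aren't ordered,
-- so a range predicate can't be mapped to specific remainders
EXPLAIN SELECT * FROM user_sessions WHERE user_id BETWEEN 100 AND 200;
```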

Verifying Partition Pruning

EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM events
WHERE event_timestamp >= '2025-06-01' AND event_timestamp < '2025-07-01';

Look for Append nodes listing which partitions are scanned. If all partitions are scanned for a single-month query, pruning is failing -- usually because the WHERE clause doesn't directly reference the partition key, or uses a type/expression that prevents compile-time pruning.
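The most common pruning-defeating pattern is wrapping the partition key in an expression:

```sql
-- Prunes: direct comparison against the partition key
SELECT count(*) FROM events
WHERE event_timestamp >= '2025-06-01' AND event_timestamp < '2025-07-01';

-- Scans every partition: the planner can't push the predicate
-- through date_trunc() to match it against partition bounds
SELECT count(*) FROM events
WHERE date_trunc('month', event_timestamp) = '2025-06-01';
```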

Migrating an Existing Table

You can't convert a table to partitioned in place. The process:

-- 1. Create the new partitioned table
-- (if the old primary key doesn't include event_timestamp, use
--  LIKE events INCLUDING DEFAULTS and redefine the key to include it)
CREATE TABLE events_partitioned (
    LIKE events INCLUDING ALL
) PARTITION BY RANGE (event_timestamp);

-- 2. Create partitions covering the data range
CREATE TABLE events_p_2025_01 PARTITION OF events_partitioned
    FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
-- ... for each month

-- 3. Copy data in batches
INSERT INTO events_partitioned
SELECT * FROM events
WHERE event_timestamp >= '2025-01-01' AND event_timestamp < '2025-02-01';
-- ... repeat per partition

-- 4. Swap in a transaction
BEGIN;
ALTER TABLE events RENAME TO events_old;
ALTER TABLE events_partitioned RENAME TO events;
COMMIT;

-- 5. Verify and drop old table when confident

For zero-downtime migrations, use a trigger or logical replication to capture writes during the copy phase.
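A minimal sketch of the trigger approach (assumes the two tables have identical column lists; it only mirrors inserts -- updates and deletes need similar triggers, or use logical replication instead):

```sql
-- Mirror every new row into the partitioned table during the copy phase
CREATE OR REPLACE FUNCTION mirror_events() RETURNS trigger AS $$
BEGIN
    INSERT INTO events_partitioned VALUES (NEW.*);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER events_mirror
    AFTER INSERT ON events
    FOR EACH ROW EXECUTE FUNCTION mirror_events();
```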

Automating Partition Lifecycle

Never rely on manual partition creation:

-- pg_partman handles creation and retention automatically
-- (it's typically installed into its own schema)
CREATE SCHEMA partman;
CREATE EXTENSION pg_partman SCHEMA partman;

SELECT partman.create_parent(
    p_parent_table := 'public.events',
    p_control := 'event_timestamp',
    p_interval := '1 month',
    p_premake := 3  -- create 3 future partitions in advance
);

-- Run this daily via cron
SELECT partman.run_maintenance();
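pg_partman can also enforce retention: partitions older than the configured window are detached (and optionally dropped) on the next maintenance run. These settings live in its part_config table:

```sql
-- Drop partitions older than 12 months during maintenance
UPDATE partman.part_config
SET retention = '12 months',
    retention_keep_table = false  -- false = DROP TABLE, true = detach only
WHERE parent_table = 'public.events';
```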

Dropping Old Data (The Best Part)

-- Near-instant removal -- minimal WAL, no dead tuples, no VACUUM needed
ALTER TABLE events DETACH PARTITION events_2024_01;
-- (PostgreSQL 14+: DETACH PARTITION ... CONCURRENTLY avoids blocking queries)
DROP TABLE events_2024_01;  -- milliseconds

-- Compare: DELETE FROM events WHERE event_timestamp < '2024-02-01'
-- On 100M rows: ~1 hour, massive WAL, dead tuples, VACUUM required

Partition Count: Keep It Manageable

| Partition Interval | Partitions over 5 Years | Planning Impact |
| --- | --- | --- |
| Monthly | 60 | Minimal |
| Weekly | 260 | Noticeable |
| Daily | 1,825 | Significant |

The planner evaluates every partition during query planning. More than a few hundred partitions causes measurable slowdown. If you need daily granularity for retention, consider monthly partitions with daily sub-partitions, or use TimescaleDB (optimized for high partition counts).
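The sub-partitioning idea looks like this in plain DDL -- a monthly partition that is itself partitioned by day (one daily child shown):

```sql
-- The monthly partition is itself a partitioned table
CREATE TABLE events_2025_06 PARTITION OF events
    FOR VALUES FROM ('2025-06-01') TO ('2025-07-01')
    PARTITION BY RANGE (event_timestamp);

CREATE TABLE events_2025_06_01 PARTITION OF events_2025_06
    FOR VALUES FROM ('2025-06-01') TO ('2025-06-02');
-- ... one per day; daily retention then drops individual children
```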

The Prevention Checklist

  • Partition key = your most common WHERE clause column. Verify with EXPLAIN.
  • Automate creation with pg_partman or cron. Missing partitions cause INSERT failures.
  • Keep partition count under a few hundred. Prefer monthly over daily granularity for most use cases.
  • Always have a DEFAULT partition. It catches data outside defined ranges.
  • Review index usage per partition. Hot partitions may need more indexes than cold historical ones.
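For that last check, pg_inherits links child partitions to their parent, which makes per-partition sizing straightforward:

```sql
-- List each partition of events with its total size (table + indexes)
SELECT c.relname AS partition_name,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
FROM pg_inherits i
JOIN pg_class c ON c.oid = i.inhrelid
WHERE i.inhparent = 'events'::regclass
ORDER BY c.relname;
```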
