1. Introduction
Partitioning splits a large logical table into smaller, independent physical chunks (called partitions) while keeping a single table interface for your application. Queries and inserts target the single partitioned table; PostgreSQL routes them to the correct child partitions behind the scenes.
Partitioning improves performance through three mechanisms:
-
Partition pruning: when a
WHEREclause filters on the partition key, PostgreSQL skips scanning partitions that cannot possibly contain matching rows. A query for a specific month on a monthly-partitioned table touches only the relevant partition instead of the entire table. - Smaller per-partition indexes: a B-tree index on a large unpartitioned table is deep and expensive to traverse. When the table is split into partitions, each partition gets its own smaller, shallower index, which reduces the B-tree depth and makes index scans faster per partition.
- Parallel scans: PostgreSQL can scan multiple partitions in parallel when a query spans several of them. A full-year query on a monthly-partitioned table can distribute the work across multiple worker processes.
Note on terminology: PostgreSQL's default table storage is called a "heap" table (unpartitioned, row-based storage). Indexes themselves use B-tree structures. Throughout this post, "unpartitioned" refers to the original heap-organized table, not a heap data structure.
In PostgreSQL, the most common pattern is PARTITION BY RANGE on a timestamp column:
CREATE TABLE activity_log (
id SERIAL,
description TEXT NOT NULL,
started_at TIMESTAMPTZ NOT NULL,
user_id INTEGER NOT NULL,
...
) PARTITION BY RANGE (started_at);
You then create child partitions for each time bucket: yearly, monthly, or even daily for extremely high throughput.
2. When to Use / When Not To
Use partitioning when:
| Criterion | Why |
|---|---|
| Table exceeds 1-2 million rows | Full scans become expensive; indexes bloat |
| Data is time-series / event-driven | Natural partition key (started_at, created_at, datetime) |
| Most queries filter on the partition key | Enables partition pruning |
| You need efficient bulk deletes |
DROP TABLE activity_log_y_2022 is instant vs. DELETE FROM ... WHERE started_at < '2023-01-01' which bloats the table and runs VACUUM |
Avoid partitioning when:
| Criterion | Why |
|---|---|
| Table is < 100K rows | Overhead exceeds benefit |
| Queries span all partitions without a date filter | Every partition gets scanned, often slower than an unpartitioned table |
| Primary lookup is by a non-partition-key column (e.g., FK) | Partition pruning does nothing; you just pay higher planning time |
| Heavy JOINs that cannot push the partition key down | The planner may end up with a nested loop across every partition |
Quick self-assessment
"Do my queries always include
WHERE <timestamp_column> BETWEEN ...?"
If yes: partition. If the answer is "sometimes" or "only on reports": benchmark first with a copy of production data before committing.
3. Benchmark Example
The benchmark example below comes from a real-world Django + PostgreSQL application. It uses the generic table names activity_log (a log of user activities) and audit_log (a record of status changes).
3.1 Setup
Test data:
- 200 users, 500 applicants, 100 job openings, and other supporting FK tables
-
activity_log_original: 2 million rows spanning 2022-2026 -
audit_log_original: 2 million rows covering the same date range
Partitioned variants:
-
Yearly: 5 partitions (
y_2022throughy_2026) -
Monthly: 60 partitions (
m_2022_01throughm_2026_12)
Both variants copy the same 2M rows from the unpartitioned table, so we compare identical data.
3.2 Queries Tested
We measure three variants UNPARTITIONED (baseline), YEARLY (5 partitions), MONTHLY (60 partitions) across eight query patterns:
| Query | Pattern | What it tests |
|---|---|---|
| Q1 | WHERE started_at BETWEEN '2024-03-01' AND '2024-04-01' |
Single month: ideal pruning case |
| Q2 | WHERE started_at BETWEEN '2024-01-01' AND '2025-01-01' |
Full year: scans multiple partitions |
| Q3 | Q1 + AND user_id = 42 AND status_id IN (2,3)
|
Month filter + FK + status lookup |
| Q4 | Aggregation GROUP BY month over 6 months |
Reporting queries |
| Q5 | WHERE user_id = 42 ORDER BY started_at DESC LIMIT 50 |
Pagination without date filter |
| A1 |
WHERE log_id = 12345 (FK lookup on audit_log) |
Negative control: no date filter |
| A2 |
WHERE datetime BETWEEN ... on audit_log
|
Date range on the "wrong" table |
| A3 | A2 + AND log_id = 12345
|
Combined date + FK |
Timings are collected via EXPLAIN (ANALYZE, BUFFERS), run 10 times per query, and reported as mean plus-minus standard deviation.
3.3 Results
Numbers below are from 10 iterations on a local PostgreSQL 16 instance with 2M rows per table (warm cache). Execution times shown as mean plus-minus std across runs.
activity_log: queries that filter on started_at
| Query | Variant | Plan | Execution (mean plus-minus std) | Planning (mean) | Speedup vs. Unpartitioned |
|---|---|---|---|---|---|
| Q1 Single month | UNPARTITIONED | Bitmap Scan | 255.01 plus-minus 112.11 ms | 1.30 ms | 1x |
| YEARLY | Bitmap Scan | 83.21 plus-minus 44.83 ms | 1.26 ms | 3.1x | |
| MONTHLY | Seq Scan | 15.88 plus-minus 9.95 ms | 1.81 ms | 16.1x | |
| Q2 Full year | UNPARTITIONED | Bitmap Scan | 453.69 plus-minus 244.94 ms | 0.13 ms | 1x |
| YEARLY | Seq Scan | 183.74 plus-minus 97.71 ms | 0.49 ms | 2.5x | |
| MONTHLY | Append (12 partitions) | 229.76 plus-minus 132.12 ms | 7.89 ms | 2.0x | |
| Q3 Month + FK + Status | UNPARTITIONED | Bitmap Scan | 4.97 plus-minus 1.30 ms | 0.41 ms | 1x |
| YEARLY | Bitmap Scan | 2.48 plus-minus 1.00 ms | 0.26 ms | 2.0x | |
| MONTHLY | Bitmap Scan | 0.12 plus-minus 0.05 ms | 0.25 ms | 40.9x | |
| Q4 Aggregation (6 months) | UNPARTITIONED | GroupAggregate | 318.37 plus-minus 314.19 ms | 0.18 ms | 1x |
| YEARLY | GroupAggregate | 235.82 plus-minus 134.10 ms | 0.25 ms | 1.4x | |
| MONTHLY | GroupAggregate | 274.39 plus-minus 171.96 ms | 0.69 ms | 1.2x | |
| Q5 LIMIT 50, no date filter | UNPARTITIONED | Limit | 200.38 plus-minus 80.25 ms | 0.20 ms | 1x |
| YEARLY | Limit | 70.59 plus-minus 32.66 ms | 1.99 ms | 2.8x | |
| MONTHLY | Limit | 47.68 plus-minus 26.70 ms | 24.84 ms | 4.2x |
Key takeaways for activity_log:
-
Q1 gets a 16x improvement with monthly partitioning. The query scans only the
m_2024_03partition instead of the full 2M-row table. -
Q3 shows the strongest win at 41x faster. The date filter prunes to a single partition, then the compound index on
(status_id, started_at)resolves the remaining filters locally. This is the ideal scenario for partitioning. -
Q2 (full year) benefits most from yearly partitioning (2.5x). Monthly is slightly worse here because PostgreSQL must
Appendacross 12 child partitions, and the planning time jumps from 0.13 ms to 7.89 ms. - Q5 (no date filter) is a cautionary tale: monthly execution is faster (4.2x), but planning time skyrockets from 0.20 ms to 24.84 ms (124x worse). If this query runs frequently, the total per-request cost is higher with partitions despite the faster execution.
audit_log: the negative control
| Query | Variant | Plan | Execution (mean plus-minus std) | Planning (mean) |
|---|---|---|---|---|
| A1 FK lookup | UNPARTITIONED | Sort | 0.07 plus-minus 0.01 ms | 0.32 ms |
| YEARLY | Sort | 0.24 plus-minus 0.16 ms | 1.54 ms | |
| MONTHLY | Sort | 2.68 plus-minus 3.14 ms | 18.43 ms | |
| A2 Date range | UNPARTITIONED | Bitmap Scan | 159.18 plus-minus 66.39 ms | 0.12 ms |
| YEARLY | Bitmap Scan | 50.25 plus-minus 24.59 ms | 0.16 ms | |
| MONTHLY | Seq Scan | 13.70 plus-minus 9.59 ms | 0.24 ms | |
| A3 Date + FK | UNPARTITIONED | Index Scan | 0.07 plus-minus 0.04 ms | 0.11 ms |
| YEARLY | Index Scan | 0.05 plus-minus 0.01 ms | 0.21 ms | |
| MONTHLY | Index Scan | 0.04 plus-minus 0.01 ms | 0.20 ms |
Key takeaways for audit_log:
-
A1 (FK lookup on
log_id) demonstrates the cost of unnecessary partitioning. Monthly execution is 38x slower (2.68 ms vs. 0.07 ms) and planning time is 58x worse (18.43 ms vs. 0.32 ms). The planner must consult all 60 partitions to find the matching row. If your application queries by FK alone, partitioning actively degrades performance. - A2 shows pruning does help for date-range queries (monthly is 11.6x faster), but this query pattern does not exist in the application. The important lesson: partition on the column your application actually filters by, not just any timestamp column that happens to exist on the table.
3.4 Running Your Own Benchmark
To reproduce this approach on your own database:
- Set up test data: create a dedicated benchmark database, populate your target table with several million rows of realistic timestamps, and add the indexes your application uses.
-
Create partitioned variants: build yearly and monthly-range-partitioned copies of the same data so you compare identical rows. Let PostgreSQL
ANALYZEall variants. - Run EXPLAIN ANALYZE: execute your actual query patterns against each variant (unpartitioned, yearly, monthly), collect timing data across multiple iterations, and compute mean and standard deviation.
Averaging across runs (e.g., 5-10 iterations) gives you statistically meaningful numbers and reveals whether observed differences are real or just cache warm-up noise.
4. Limitations
4.1 Maintenance Overhead
Partitions do not create themselves. You need:
- New partitions ahead of time: if a new month arrives and the partition does not exist yet, inserts fail with "no partition of relation found". Automate this with a cron job or a management command that creates partitions 3 months ahead.
- VACUUM across all partitions: autovacuum must visit every partition independently. The more partitions, the more vacuum workers compete for resources.
- Index sprawl: each partition gets its own copy of every index. N partitions times M indexes means N x M B-tree structures to maintain, checkpoint, and vacuum.
4.2 Partition Key Restrictions
PostgreSQL enforces strict rules on partitioned tables:
-
PRIMARY KEY must include the partition key. A simple
PRIMARY KEY (id)is invalid on a table partitioned bystarted_at. You must usePRIMARY KEY (id, started_at). This breaks any foreign key that references onlyid. - UNIQUE constraints likewise need the partition key. No single-column unique constraint without including the partition key.
- Foreign Keys from child tables to a partitioned parent work, but FKs from a partitioned table to another table require the partition key in the FK columns unless the referenced table is also partitioned on the same key.
In practice, this means you may need to drop FK constraints at the SQL level and enforce referential integrity in the application layer (e.g., Django's on_delete=CASCADE operates through the ORM's collector, not database-level CASCADE).
4.3 Application-Level Bottlenecks
-
ORM-generated queries without date filters: if a view runs
logs = ActivityLog.objects.filter(user=request.user), the ORM will not add a date filter unless you do. This triggers anAppendacross every partition, often slower than the original unpartitioned table. - Planning time increases with partition count. In our benchmark, unpartitioned planning was 0.32 ms for audit queries; monthly (60 partitions) hit 18.43 ms on A1, a 58x increase. For OLTP workloads doing thousands of small queries per second, this adds up.
- Connection pooling: planning time is incurred per query. If your pooler (PgBouncer) uses transaction pooling, the increased planning overhead applies to every statement.
5. Conclusion
Rule of thumb:
Partition by
RANGEon the date column when your table exceeds roughly 1M rows and the majority of queries filter on that date column. Use monthly partitions for the best pruning granularity; use yearly if you want fewer partitions and simpler maintenance. Do not partition a table if your primary lookup is by a non-partition-key column or you lack a consistent date filter, the overhead will make things slower, not faster.
If you are unsure, benchmark. Clone your production table, partition a copy, run EXPLAIN ANALYZE on your actual query patterns, and let the numbers decide.
Top comments (0)