Philip McClarence

Choosing the Right chunk_time_interval for Your Workload

If there is one TimescaleDB configuration that deserves more attention than it gets, it is chunk_time_interval. This single parameter controls how much time each chunk covers, which cascades into chunk count, chunk size, query planning time, compression granularity, and retention resolution. The default is 7 days, and for most production workloads, it is the wrong value.

Why the Interval Matters So Much

Every chunk in a TimescaleDB hypertable is a separate PostgreSQL table. This is the mechanism that makes time-range queries fast (through constraint exclusion) and retention instant (dropping a chunk is a metadata operation, not a row-by-row DELETE). But the number and size of these chunks has direct performance consequences.

Too many small chunks (e.g., 1-hour intervals on a low-ingest table): Thousands of tiny tables accumulate. The query planner must evaluate every chunk's CHECK constraint during planning, even for simple queries. Planning overhead grows roughly linearly with chunk count and can reach hundreds of milliseconds.

Too few large chunks (e.g., 30-day intervals on a high-ingest table): Each chunk contains billions of rows. Compression and decompression operations take much longer. Retention can only drop data in 30-day blocks. Memory pressure rises because operations such as compression and index builds process far more data per chunk.
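To see where your table sits between these extremes, you can count chunks and inspect their sizes through TimescaleDB's informational views. A sketch, assuming a hypertable named sensor_readings:

```sql
-- How many chunks does the hypertable have?
SELECT count(*) AS chunk_count
FROM timescaledb_information.chunks
WHERE hypertable_name = 'sensor_readings';

-- How large are the biggest chunks?
SELECT chunk_name, pg_size_pretty(total_bytes) AS chunk_size
FROM chunks_detailed_size('sensor_readings')
ORDER BY total_bytes DESC
LIMIT 10;
```

If the first query returns thousands of rows, or the second shows chunks in the single-digit megabytes, the interval likely needs adjusting.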

The Sizing Formula

TimescaleDB's documentation recommends targeting approximately 25 million rows per chunk. This size provides enough data for effective compression while keeping individual chunks manageable.

```
chunk_interval = target_rows_per_chunk / rows_per_second
```

Worked examples:

| Ingest Rate | Rows/Day | Time to 25M Rows | Recommended Interval |
| --- | --- | --- | --- |
| 10 rows/sec | 864,000 | 29 days | 30 days |
| 100 rows/sec | 8,640,000 | 2.9 days | 3 days |
| 1,000 rows/sec | 86,400,000 | 6.9 hours | 6 hours |
| 10,000 rows/sec | 864,000,000 | 42 minutes | 1 hour |

To measure your actual ingest rate on an existing table:

```sql
SELECT
    count(*) AS total_rows,
    round(count(*) / 86400.0, 1) AS rows_per_second,
    -- seconds to reach 25M rows at this rate, converted to hours
    round(25000000 * 86400.0 / count(*) / 3600, 0) AS recommended_interval_hours
FROM sensor_readings
WHERE recorded_at >= now() - INTERVAL '1 day';
```

The Impact of Getting It Wrong

The same dataset partitioned at different intervals produces dramatically different characteristics. Consider 259 million rows (100 rows/sec over 30 days):

| Interval | Chunks/Month | Avg Chunk Size | Planning Overhead | Retention Granularity |
| --- | --- | --- | --- | --- |
| 1 hour | 720 | ~2 MB | High (~200ms+) | Drop any single hour |
| 1 day | 30 | ~50 MB | Low (~8ms) | Drop any single day |
| 7 days | ~4 | ~350 MB | Minimal (~3ms) | Drop in 7-day blocks |

The 1-hour interval creates 720 chunks per month. After a year without retention, that is 8,640 chunks. The query planner evaluates every chunk's constraint for every query, adding hundreds of milliseconds of planning time even to trivial queries.

The 1-day interval is the sweet spot for this ingest rate: manageable chunk counts, daily retention granularity, and planning overhead under 10ms.

You can check planning overhead on your own tables:

```sql
EXPLAIN ANALYZE
SELECT count(*) FROM sensor_readings
WHERE recorded_at >= now() - INTERVAL '1 hour';
-- Look at "Planning Time" in the output
-- Under 10ms: healthy
-- Over 50ms: too many chunks
```

How Interval Affects Compression

Smaller chunks compress faster individually -- a 2 MB chunk compresses in milliseconds. But the compression worker must process more chunks per cycle, adding scheduling overhead. Larger chunks take longer to compress but produce better ratios because the column encoders have more data to find patterns in.

The decompression side matters more for query performance. A 24-hour range query against 1-hour chunks decompresses 24 chunks. Against 1-day chunks, it decompresses 1. The per-segment decompression cost is similar, but the chunk-level overhead (opening tables, reading metadata, building scan plans) multiplies with chunk count.

For compression policies, the compress_after interval should be at least one to two chunk intervals, so a chunk has stopped receiving writes by the time it is compressed:

```sql
-- Enable compression first (required before adding a policy)
ALTER TABLE sensor_readings SET (timescaledb.compress);

-- Good: compress_after is 2x the chunk interval
SELECT set_chunk_time_interval('sensor_readings', INTERVAL '1 day');
SELECT add_compression_policy('sensor_readings', compress_after => INTERVAL '2 days');
```

Compressing a chunk while it is still receiving writes forces repeated decompress-modify-recompress cycles that waste I/O and block other operations.
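To verify how the chosen interval is playing out in practice, per-chunk compression results are visible through chunk_compression_stats(). A sketch, again assuming the sensor_readings hypertable with compression enabled:

```sql
-- Per-chunk before/after sizes for compressed chunks
SELECT chunk_name,
       pg_size_pretty(before_compression_total_bytes) AS before_size,
       pg_size_pretty(after_compression_total_bytes)  AS after_size
FROM chunk_compression_stats('sensor_readings')
WHERE compression_status = 'Compressed'
ORDER BY chunk_name;
```

If the before-compression sizes are only a few megabytes, the chunks are probably too small to compress well.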

Changing the Interval on Existing Tables

set_chunk_time_interval() only affects chunks created after the change. Existing chunks retain their original interval permanently.

```sql
SELECT set_chunk_time_interval('sensor_readings', INTERVAL '3 days');
```

After a change, you will have a mixed-interval hypertable: old 7-day chunks alongside new 3-day chunks. This is functionally harmless -- the query planner handles mixed intervals correctly. But the chunk count reduction happens gradually as old chunks age out through retention.
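You can see the mix of intervals directly by comparing each chunk's time range, for example:

```sql
-- Each chunk's covered interval; mixed sizes appear after a change
SELECT chunk_name, range_start, range_end,
       range_end - range_start AS chunk_interval
FROM timescaledb_information.chunks
WHERE hypertable_name = 'sensor_readings'
ORDER BY range_start;
```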

For a complete migration to a new interval (necessary when old chunks are causing planning overhead), the process requires creating a new hypertable and copying data:

```sql
-- 1. Create a new hypertable with the desired interval
CREATE TABLE sensor_readings_new (LIKE sensor_readings INCLUDING ALL);
SELECT create_hypertable('sensor_readings_new', 'recorded_at',
    chunk_time_interval => INTERVAL '3 days');

-- 2. Copy data in time order (batch this for large tables)
INSERT INTO sensor_readings_new
SELECT * FROM sensor_readings
WHERE recorded_at >= now() - INTERVAL '30 days'
ORDER BY recorded_at;

-- 3. Swap tables
BEGIN;
ALTER TABLE sensor_readings RENAME TO sensor_readings_old;
ALTER TABLE sensor_readings_new RENAME TO sensor_readings;
COMMIT;

-- 4. Drop the old table after verifying row counts
DROP TABLE sensor_readings_old;
```

Schedule this during a maintenance window. The INSERT...SELECT can be I/O-intensive, and for very large datasets, you should batch the copy in time-bounded segments to keep transaction sizes manageable.
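One way to batch the copy is a PL/pgSQL loop that commits after each day's worth of data. This is a sketch, and it relies on PostgreSQL 11+ allowing COMMIT inside a DO block that is not wrapped in an outer transaction:

```sql
-- Hypothetical batching sketch: one day of data per transaction
DO $$
DECLARE
    batch_start timestamptz := now() - INTERVAL '30 days';
BEGIN
    WHILE batch_start < now() LOOP
        INSERT INTO sensor_readings_new
        SELECT * FROM sensor_readings
        WHERE recorded_at >= batch_start
          AND recorded_at < batch_start + INTERVAL '1 day'
        ORDER BY recorded_at;
        COMMIT;  -- requires PostgreSQL 11+, no surrounding transaction
        batch_start := batch_start + INTERVAL '1 day';
    END LOOP;
END $$;
```

Smaller transactions keep WAL volume and lock durations manageable, and a failure partway through loses only the current batch.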

Getting It Right

  1. Calculate from your ingest rate. Measure rows_per_second on production data, target 25M rows per chunk, and round to a clean interval.
  2. Pair with a retention policy. Even with the correct interval, unbounded retention eventually creates too many chunks.
  3. Review when ingest rate changes significantly. A 10x traffic increase means your interval should shrink by roughly 10x.
  4. Monitor planning overhead. Check EXPLAIN ANALYZE planning times periodically. Over 50ms suggests too many chunks.
  5. Set compress_after appropriately. At least 1-2x the chunk interval to avoid compressing active chunks.
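Putting the checklist together for a table ingesting roughly 100 rows per second (table, column, and interval values are illustrative):

```sql
-- ~100 rows/sec => ~3-day chunks hit the 25M-row target
SELECT create_hypertable('sensor_readings', 'recorded_at',
    chunk_time_interval => INTERVAL '3 days');

-- Compress at 2x the chunk interval, once chunks stop receiving writes
ALTER TABLE sensor_readings SET (timescaledb.compress);
SELECT add_compression_policy('sensor_readings', compress_after => INTERVAL '6 days');

-- Bound total chunk count with a retention policy
SELECT add_retention_policy('sensor_readings', drop_after => INTERVAL '90 days');
```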

The default 7-day interval works for tables ingesting 40-50 rows per second. For everything else, do the math first.
