丁久

Posted on • Originally published at dingjiu1989-hue.github.io

Database Compression: Page-Level, Tuple-Level, Columnar, and TOAST

This article was originally published on AI Study Room. For the full version with working code examples and related articles, visit the original post.

Database compression reduces storage footprint and, more importantly, improves query performance by reducing the amount of data read from disk. This article covers compression techniques across PostgreSQL, MySQL, and columnar databases.

Why Compression Matters

Compression provides three benefits:

  • Storage savings: Reduce disk usage by roughly 2x-10x, depending on how compressible the data is.

  • I/O reduction: Fewer pages read per query means faster scans.

  • Cache efficiency: More data fits in shared_buffers or the OS page cache.

The trade-off is CPU usage for compression and decompression. Modern hardware makes this trade-off favorable for most workloads.

PostgreSQL Compression Layers

Page-Level Compression

PostgreSQL stores data in 8 KB pages by default. Page-level compression compresses each page as a unit before it is written to disk. Core PostgreSQL does not ship this feature: tables have no compression storage parameter, and the zstd support added in PostgreSQL 15 applies to WAL full-page images (wal_compression) and base backups, not to table pages. The syntax below should therefore be read as an illustration of what a fork or extension with page compression might accept; stock PostgreSQL rejects it as an unrecognized parameter:

-- Enable page compression on a table (not valid in core PostgreSQL)
CREATE TABLE logs_compressed (
    id BIGSERIAL,
    payload TEXT,
    created_at TIMESTAMPTZ
) WITH (compression = 'zstd');

-- Or alter an existing table (not valid in core PostgreSQL)
ALTER TABLE logs SET (compression = 'pglz');

Where page compression is available, it works transparently: the database decompresses pages when reading and compresses them when writing. For sequential scans, the CPU cost is typically small compared with the I/O saved.

TOAST (The Oversized-Attribute Storage Technique)

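TOAST is PostgreSQL's built-in mechanism for handling large field values. A page is 8 KB, but TOAST is triggered earlier: when a row grows beyond roughly 2 KB (the default TOAST_TUPLE_THRESHOLD), PostgreSQL compresses large values and, if the row is still too big, moves them to a secondary TOAST table. Each column's storage strategy is recorded in pg_attribute: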

-- Check the TOAST storage strategy of each column
SELECT attname,
       CASE attstorage
           WHEN 'p' THEN 'plain'
           WHEN 'm' THEN 'main'
           WHEN 'x' THEN 'extended'
           WHEN 'e' THEN 'external'
       END AS storage_type
FROM pg_attribute
WHERE attrelid = 'documents'::regclass
  AND attnum > 0;

Storage types:

  • PLAIN: No compression, no TOAST. For fixed-width types like INTEGER.

  • EXTENDED (default for TEXT, BYTEA): Try compression first; if still too large, move to TOAST.

  • EXTERNAL: Move to TOAST without compression. Useful for data that is already compressed (e.g., JPEG images or gzip archives), where a second compression pass costs CPU and saves little.

  • MAIN: Try compression but keep in main table if possible.

-- Change storage type for a column
ALTER TABLE documents ALTER COLUMN image SET STORAGE EXTERNAL;
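
To see how much data actually lives out of line, the TOAST relation attached to a table can be looked up through pg_class.reltoastrelid. This is a minimal sketch, assuming the documents table from the examples above exists; the column aliases are illustrative only:

```sql
-- Find the TOAST relation behind "documents" and compare its size
-- with the main heap. reltoastrelid is 0 when the table has no
-- TOAST relation, in which case the join returns no row.
SELECT c.relname                               AS table_name,
       t.relname                               AS toast_table,
       pg_size_pretty(pg_relation_size(c.oid)) AS heap_size,
       pg_size_pretty(pg_relation_size(t.oid)) AS toast_size
FROM pg_class AS c
JOIN pg_class AS t ON t.oid = c.reltoastrelid
WHERE c.oid = 'documents'::regclass;
```

Note that SET STORAGE only sets the strategy for values written afterwards; existing values keep their current representation.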

TOAST compression uses a fast, lightweight algorithm: pglz by default, or lz4 on PostgreSQL 14+ builds that include lz4 support. It is invisible to queries: SELECT statements decompress values transparently.
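
One way to observe this transparency is to compare a value's logical length with the bytes it occupies in storage. The sketch below assumes that documents has a text column named body and a key column named id; both names are hypothetical and do not come from the article:

```sql
-- Compare logical size with stored size for the ten largest values.
SELECT id,
       octet_length(body)   AS raw_bytes,    -- length of the decompressed value
       pg_column_size(body) AS stored_bytes  -- bytes used to store it, possibly compressed
FROM documents
ORDER BY octet_length(body) DESC
LIMIT 10;
```

A stored_bytes figure well below raw_bytes suggests the value was compressed; similar figures suggest it was stored as-is.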

Tuple-Level Compression

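Tuple-level compression compresses individual column values rather than whole pages. In PostgreSQL this is the same compression that TOAST applies, and the COMPRESSION clause (PostgreSQL 14+) selects the algorithm per column:

CREATE TABLE events (
    id BIGSERIAL,
    event_type TEXT COMPRESSION lz4,
    payload JSONB COMPRESSION pglz,  -- core PostgreSQL accepts only pglz and lz4 here
    created_at TIMESTAMPTZ
);

Supported algorithms in core PostgreSQL: pglz (always available) and lz4 (requires a build configured with --with-lz4). The COMPRESSION clause does not accept zstd in core PostgreSQL as of version 16. LZ4 is generally the faster of the two for both compression and decompression.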
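
To check which method was actually applied to stored values, PostgreSQL 14 and later provide pg_column_compression(). The following is a short sketch against the events table defined above, assuming a server built with lz4 support; the n_values alias is illustrative:

```sql
-- Count stored payload values by compression method.
-- pg_column_compression() returns NULL for values that are not compressed.
SELECT pg_column_compression(payload) AS method,
       count(*)                       AS n_values
FROM events
GROUP BY 1
ORDER BY 2 DESC;

-- Switch the method used for newly written values in this column.
ALTER TABLE events ALTER COLUMN payload SET COMPRESSION lz4;

-- Or change the session default for columns that do not name a method.
SET default_toast_compression = 'lz4';
```

Changing the method does not recompress existing values; each value keeps the method it was written with.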

External Compression with pgstattuple

Measure your current compressio

