Mircea Cadariu

Posted on Oct 14 • Edited on Nov 3

Postgres Column Tetris

#database #postgres #webdev #programming

If you've been working with Postgres for a while, you have probably already learned how to write queries and tune them for performance. Today, I'd like to show you a lesser known optimization technique called Column Tetris: the practice of ordering your columns to minimize storage overhead due to CPU alignment requirements.

As far as I can tell, the name of this technique was coined by Erwin Brandstetter here.

Important: The best time to apply this is when you initially create a table, since there's no data to migrate. But as they say, better late than never!

Why bother?

Column Tetris delivers tangible benefits:

More rows per page: Postgres' 8kb pages can fit more rows, reducing I/O operations
Better cache utilization: denser pages mean more data fits in RAM, reducing slow disk reads
Faster sequential scans: Less data to read means faster table scans
Lower backup/restore times: Smaller tables are faster to backup and restore

Convinced? Let's dive in.

Why does column order matter?

In Postgres, the order you define columns in your CREATE TABLE statement affects how much disk space your table consumes. This is because CPUs prefer to read data at memory addresses that are multiples of the data type's size.

An 4-byte integer wants to start at an address divisible by 4
An 8-byte timestamp wants to start at an address divisible by 8
etc

This is called "alignment". Not that alignment, but this one.
When data isn't naturally aligned, Postgres inserts padding bytes to maintain proper alignment. These padding bytes waste space and bloat your storage.

A concrete example

Let's track user logins with a simple schema:

CREATE TABLE logins (
    user_id    INTEGER,        -- 4 bytes
    is_success BOOLEAN,        -- 1 byte (+ 3 bytes padding)
    login_time TIMESTAMP,      -- 8 bytes
    is_mobile  BOOLEAN         -- 1 byte
);

The problem? The login_time TIMESTAMP needs to start at an address divisible by 8. Since is_success (1 byte) comes right before it, PostgreSQL must insert 3 bytes of padding to align the TIMESTAMP properly.
Let's verify this. We know the row header is 24 bytes. Add our data (4 + 1 + 3 padding + 8 + 1 = 17 bytes), and we should get 41 bytes total.

INSERT INTO logins VALUES (12345, true, NOW(), false);

SELECT pg_column_size(logins.*) as row_size_bytes FROM logins;

Result:

 row_size_bytes 
----------------
             41

Confirmed! Now let's optimize by reordering the columns:

CREATE TABLE logins_optimized (
    login_time TIMESTAMP,      -- 8 bytes
    user_id    INTEGER,        -- 4 bytes
    is_success BOOLEAN,        -- 1 byte
    is_mobile  BOOLEAN         -- 1 byte
);

By placing the TIMESTAMP first, it's naturally aligned at the start. The INTEGER fits perfectly after it, and both BOOLEANs last.

INSERT INTO logins_optimized VALUES (NOW(), 12345, true, false);

SELECT pg_column_size(logins_optimized.*) as row_size_bytes FROM logins_optimized;

Result:

 row_size_bytes 
----------------
             38

Savings: 3 bytes per row just from reordering!

Padding between rows

There's more to the story. PostgreSQL also aligns entire tuples to 8-byte boundaries (MAXALIGN on 64-bit systems). This means padding is inserted not just within rows, but also between them.

Let's see this effect at scale by inserting 1 million rows:

TRUNCATE logins, logins_optimized;

-- Insert 1 million rows
INSERT INTO logins (user_id, is_success, login_time, is_mobile)
SELECT 
(random() * 1000000)::INTEGER,
random() > 0.1,
NOW() - (random() * interval '365 days'),
random() > 0.5
FROM generate_series(1, 1000000);

INSERT INTO logins_optimized 
SELECT login_time, user_id, is_success, is_mobile 
FROM logins;

Now let's analyze the storage:

SELECT 
    pg_relation_size('logins') as original_bytes,
    pg_relation_size('logins') / 8192 as original_pages,
    1000000.0 / (pg_relation_size('logins') / 8192.0) as rows_per_page_original,

    pg_relation_size('logins_optimized') as optimized_bytes,
    pg_relation_size('logins_optimized') / 8192 as optimized_pages,
    1000000.0 / (pg_relation_size('logins_optimized') / 8192.0) as rows_per_page_optimized
FROM logins, logins_optimized
LIMIT 1;

Results:

 original_bytes | original_pages | rows_per_page_original  | optimized_bytes | optimized_pages | rows_per_page_optimized 
-------------+-------------+----------------------+-----------------+-----------------+-------------------------
    52183040 |        6370 | 156.9858712715855573 |        44285952 |            5406 |    184.9796522382537921

Breaking this down:

Original:

1,000,000 rows in 6,370 pages
~157 rows per page
52,183,040 bytes / 1,000,000 rows = 52.18 bytes per row (including all overhead)

Optimized:

1,000,000 rows in 5,406 pages
~185 rows per page
44,285,952 bytes / 1,000,000 rows = 44.29 bytes per row (including all overhead)

Difference: 52.18 - 44.29 = 7.89 bytes per row

Notice the difference is larger than the 3 bytes we saved from column alignment alone. The extra savings come from inter-row padding—each tuple is padded to reach an 8-byte boundary, and poorly ordered columns require more padding.

Total savings

Let's see the overall impact:

-- Check the table sizes
SELECT 
    pg_size_pretty(pg_relation_size('logins')) as original_size,
    pg_size_pretty(pg_relation_size('logins_optimized')) as optimized_size,
    pg_size_pretty(
        pg_relation_size('logins) - 
        pg_relation_size('logins_optimized')
    ) as savings;

Results:

 original_size | optimized_size | savings 
------------+----------------+---------
 50 MB      | 42 MB          | 7712 kB

That's a 14% reduction in storage from simply reordering four columns. Not bad, I'll take it!

Guideline

To minimize the padding I showed you in the sections above, order your columns by alignment requirements, largest to smallest:

8-byte types: BIGINT, DOUBLE PRECISION, TIMESTAMP, TIMESTAMPTZ
4-byte types: INTEGER, REAL, DATE
2-byte types: SMALLINT
1-byte types: BOOLEAN
Variable-width types last: TEXT, VARCHAR, BYTEA

When does this matter the most?

Most impactful when you have:

Tables with millions or billions of rows
Tables with many small fixed-width columns
Tables that are frequently scanned in full

Low-impact scenarios:

Small lookup tables (< 1000 rows)
Tables with only a few columns
Tables dominated by large TEXT/VARCHAR fields

For a table with 100 million rows, saving 8 bytes per row translates to ~800 MB less storage, faster scans, better cache utilization, and lower I/O costs.

Final Thoughts

Column Tetris is a simple technique that costs nothing at design time but can yield significant storage savings. Think of it like organizing your closet: arranging items thoughtfully takes the same effort as tossing them in randomly, but the results are much better.
So next time you write a CREATE TABLE statement, take a moment to play Column Tetris. Your database will thank you.

Thanks for reading! Until next time!

DEV Community