DEV Community

Cover image for Postgres or ClickHouse? Row vs Column Storage, and When Each Wins
scubaDEV
scubaDEV

Posted on

Postgres or ClickHouse? Row vs Column Storage, and When Each Wins

PostgreSQL and ClickHouse both speak SQL, both call themselves databases, and both will happily store your data. That surface similarity gets teams into trouble, because under the hood they're built for opposite jobs. Picking the wrong one doesn't show up on day one — it shows up six months later when a query that should take 50ms takes 40 seconds.

The single distinction that explains almost everything is how rows are laid out on disk.

Row storage vs column storage

PostgreSQL is row-oriented. All the fields of one record sit next to each other on disk:

(1, 'error',   'billing', '2026-06-15', 'timeout')
(2, 'info',    'auth',    '2026-06-15', 'login ok')
(3, 'warning', 'billing', '2026-06-15', 'retry')
Enter fullscreen mode Exit fullscreen mode

ClickHouse is column-oriented. Each field is stored as its own contiguous run:

id:        1, 2, 3
level:     error, info, warning
service:   billing, auth, billing
timestamp: 2026-06-15, 2026-06-15, 2026-06-15
message:   timeout, login ok, retry
Enter fullscreen mode Exit fullscreen mode

That's the whole thing. Everything else — performance, compression, what each is good and bad at — falls out of this one choice.

Why this makes them good at opposite jobs

Fetching one whole record is a row store's home turf. SELECT * FROM users WHERE id = 42 touches a single contiguous location and pulls the entire row in one read. Updating that user touches the same one place. This is OLTP — many small, targeted reads and writes, with transactions and consistency. Postgres is built for it.

Scanning a few columns across millions of rows is the column store's home turf. SELECT service, count(*) FROM logs GROUP BY service only needs the service column — so ClickHouse reads only that column off disk and ignores the rest. A row store has to read every full row just to look at one field. This is OLAP — aggregations and scans over huge datasets. ClickHouse is built for it.

Columnar layout also compresses far better, because values in one column are similar to each other (lots of repeated service names, timestamps in a tight range). Better compression means less data read from disk, which compounds the speed advantage. Add vectorized execution — processing whole column blocks at once — and analytical queries that crawl on Postgres fly on ClickHouse.

The trade-offs, stated plainly

PostgreSQL (row) ClickHouse (column)
Best at Transactions, point lookups, updates Aggregations, scans over big data
Single-row read/write Fast Slow / awkward
Updates & deletes Cheap, native Heavy "mutations", avoid
Inserts Row-at-a-time is fine Wants large batches
Transactions / ACID Full support Not its job
Aggregations on millions of rows Slow Extremely fast
Typical use App backend, orders, users Analytics, logs, metrics, events

The ClickHouse weaknesses are the mirror image of its strengths. Updating or deleting a single row is genuinely painful — the column layout that makes scans fast makes targeted edits expensive. And it wants inserts in big batches, not one row per request. If your workload is "update this order's status," ClickHouse is the wrong tool. If it's "show me revenue per region for the last year," Postgres will struggle.

They're complements, not competitors

The mistake I see most is treating this as a versus. In practice the two sit side by side: Postgres runs the transactional core of the app — the data users create, edit, and rely on being consistent — while ClickHouse holds the high-volume, append-only, analytical data: logs, events, metrics, anything you query in aggregate and never update.

The heuristic

Forget the feature lists and ask one question about your dominant query:

  • Few rows, many columns (fetch or change a specific record) → row store, use Postgres.
  • Many rows, few columns (aggregate across a huge table) → column store, use ClickHouse.

Most teams don't need to choose. They need Postgres for the app and, eventually, something columnar for the analytics — and the failure mode is forcing one engine to do the other's job because adding a second database felt like too much. The two layouts exist precisely because no single one is good at both.

Top comments (0)