Picking a Data Store in 2026: 5 Questions Before You Reach for Postgres

Book: Database Playbook: Choosing the Right Store for Every System You Build
Me: xgabriel.com | GitHub

You're starting a new service. Someone in the channel says "Postgres, obviously," and they're probably right. Postgres is the correct default for most systems most of the time. That's not in dispute.

The problem is that "Postgres, obviously" skips the screen that would tell you when it's wrong. And the cost of finding out later isn't a config change. It's a migration, a dual-write window, a backfill, and a quarter you don't get back.

So before you type CREATE DATABASE, run five questions. They take about ten minutes. They don't replace the decision; they front-load it.

Question 1: What's the access pattern?

Not "what's the data." What's the read and write shape.

Write the three queries that will run most often in production. Actually write them, in pseudo-SQL or plain English. Then look at how you reach the row.

By primary key, one row at a time, millions of times a second. That's a key-value access pattern. Postgres can do it, but DynamoDB or a Redis-backed store usually does it at lower cost and with flatter latency at that volume.
By a handful of indexed columns, returning small sets, with joins. That's relational. Postgres wins outright.
By scanning huge ranges and aggregating. That's analytical. ClickHouse or DuckDB will often be one to two orders of magnitude faster, while Postgres spills sorts and hash aggregates to disk once they outgrow work_mem.
By "find me documents where this nested field contains X." That's document or search. Postgres JSONB with a GIN index covers a lot of it. Elasticsearch covers the rest.
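For that last pattern, here's a minimal sketch of what "JSONB with a GIN index" can look like. The table, column, and field names (product_docs, doc, attributes.color) are made up for illustration, and jsonb_path_ops is one reasonable operator class choice, not the only one.

```sql
-- Hypothetical document table; every name here is illustrative.
CREATE TABLE product_docs (
  id   bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  doc  jsonb NOT NULL
);

-- jsonb_path_ops builds a compact GIN index that serves containment (@>) queries.
CREATE INDEX product_docs_doc_gin
  ON product_docs USING GIN (doc jsonb_path_ops);

INSERT INTO product_docs (doc) VALUES
  ('{"name": "mug",   "attributes": {"color": "red",  "tags": ["kitchen"]}}'),
  ('{"name": "shirt", "attributes": {"color": "blue", "tags": ["apparel"]}}');

-- "Find me documents where this nested field contains X":
SELECT id, doc->>'name' AS name
FROM product_docs
WHERE doc @> '{"attributes": {"color": "red"}}';
```

On a two-row table the planner will likely just scan it; the index only matters at volume. Once you need relevance ranking or fuzzy matching across many fields, that's roughly where a dedicated search engine starts to earn its place.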

The trap is starting from the data model instead of the access pattern. You can store a shopping cart in five different stores. The query you run against it is what decides.

-- If this is your hot path, Postgres is fine
SELECT * FROM orders
WHERE customer_id = $1
ORDER BY created_at DESC
LIMIT 20;

-- If THIS is your hot path at 100k QPS, ask harder
SELECT * FROM session_blob WHERE key = $1;
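One way to check how you reach the row instead of guessing: give the first query a matching index and read the plan. The orders columns and the index name below are assumptions for this sketch; adapt them to your own schema.

```sql
-- Hypothetical minimal schema for the orders hot path.
CREATE TABLE IF NOT EXISTS orders (
  id          bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  customer_id bigint NOT NULL,
  created_at  timestamptz NOT NULL DEFAULT now(),
  total_cents bigint NOT NULL
);

-- Equality column first, then the sort column, so the newest rows
-- for one customer can be read in index order.
CREATE INDEX IF NOT EXISTS orders_customer_created_idx
  ON orders (customer_id, created_at DESC);

-- ANALYZE actually executes the query; BUFFERS reports the pages it touched.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders
WHERE customer_id = 42
ORDER BY created_at DESC
LIMIT 20;
```

On production-sized data, the hoped-for plan is an index scan on orders_customer_created_idx with no separate sort step. On an empty or tiny table the planner may pick something else, so run this against realistic volumes before drawing conclusions.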

Question 2: What consistency do you actually need?

Everyone says "strong consistency" until you ask them to pay for it.

Be specific about each write path. Does a stale read break the product, or just annoy someone? A bank ledger and a "like" counter sit at opposite ends, and people reach for the same store for both out of habit.

Three honest buckets:

Transactional, read-your-writes, multi-row invariants. Money, inventory, bookings, anything with a constraint that two rows must respect together. Postgres gives you SERIALIZABLE and exclusion constraints. This is its home turf.
Eventually consistent is fine. Feeds, counters, recommendations, analytics rollups. A few seconds of staleness costs nothing. You're paying for ACID you'll never use.
Causal or session consistency. "This user must see their own edits, but not necessarily everyone else's, instantly." A surprising amount of product traffic lives here, and it relaxes the storage requirement a lot.
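To make the first bucket concrete, here's a small sketch of an exclusion constraint enforcing a two-row invariant: no overlapping bookings for the same room. The room_bookings table and its columns are hypothetical; the btree_gist extension is what lets a GiST index handle plain equality on room_id.

```sql
CREATE EXTENSION IF NOT EXISTS btree_gist;

CREATE TABLE room_bookings (
  id      bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  room_id integer NOT NULL,
  during  tstzrange NOT NULL,
  -- No two rows may share a room_id AND have overlapping time ranges.
  EXCLUDE USING gist (room_id WITH =, during WITH &&)
);

INSERT INTO room_bookings (room_id, during)
VALUES (101, '[2026-03-01 14:00+00, 2026-03-01 15:00+00)');

-- Overlaps the row above, so Postgres rejects it with an
-- exclusion_violation error (SQLSTATE 23P01):
INSERT INTO room_bookings (room_id, during)
VALUES (101, '[2026-03-01 14:30+00, 2026-03-01 15:30+00)');
```

The database holds the invariant even when two requests race, which is the part that's painful to rebuild in application code. If you go further and run transactions at SERIALIZABLE, plan for retries on serialization failures (SQLSTATE 40001).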

If most of your writes land in bucket two, a single Postgres primary is overkill on the consistency side and a potential bottleneck on the write side. That's a signal, not a verdict.

Question 3: What's the scale shape?

Total size is the wrong number. The shape of the growth is the right one.

A 50GB database that grows 1GB a year is a different animal from a 50GB database that grows 1GB a day. Same size today, completely different store in two years.

Ask these:

How big in 18 months, not today? Multiply your current daily write volume out. If you land above a few terabytes of hot data, single-node Postgres starts asking for partitioning and bigger instances.
Is the write rate spiky or steady? Steady is easy. Spiky (sales, telemetry bursts, viral events) means you size for the peak or you queue.
Is the data append-only or mutable? Append-only time-series data is the case where a single Postgres primary stops being the obvious answer. TimescaleDB extends Postgres into that space; ClickHouse or a dedicated time-series store takes over when retention and ingest rates climb.
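The multiply-it-out step is plain arithmetic, so it's worth actually doing. A rough sketch, assuming a 50 GB starting point and two illustrative write rates (these inputs are examples, not measurements):

```sql
-- Starting point: what the current database weighs today.
SELECT pg_size_pretty(pg_database_size(current_database())) AS size_today;

-- Projection: start size plus daily volume times ~548 days (18 months).
SELECT
  scenario,
  start_gb + daily_gb * 548 AS gb_in_18_months
FROM (VALUES
  ('1 GB/day',  50, 1),
  ('1 GB/hour', 50, 24)
) AS t(scenario, start_gb, daily_gb);
-- 1 GB/day  -> 598 GB
-- 1 GB/hour -> 13202 GB, roughly 13 TB
```

This ignores index overhead, bloat, and retention policies, so treat it as a lower bound. The first number sits comfortably on one node. The second is already past "a few terabytes" and into the territory where the store choice deserves a real conversation.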

-- Append-only, high ingest, range-scanned by time:
-- this is the workload that outgrows vanilla Postgres
CREATE TABLE events (
  id        bigint GENERATED ALWAYS AS IDENTITY,
  ts        timestamptz NOT NULL,
  payload   jsonb NOT NULL
);
-- 1GB/day is fine. 1GB/hour, plan for partitions
-- or a different store from the start.
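If you stay on Postgres, "plan for partitions" can look something like the sketch below, using declarative range partitioning by month. The table and partition names are hypothetical, and the id column is left out to keep the example minimal.

```sql
-- Same shape as the events table, partitioned by month on ts.
CREATE TABLE events_by_month (
  ts       timestamptz NOT NULL,
  payload  jsonb NOT NULL
) PARTITION BY RANGE (ts);

CREATE TABLE events_2026_01 PARTITION OF events_by_month
  FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE events_2026_02 PARTITION OF events_by_month
  FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');

-- BRIN stays small on append-only data that arrives roughly in time order.
CREATE INDEX events_by_month_ts_brin
  ON events_by_month USING brin (ts);

-- Range scans only need the partitions that overlap the filter.
SELECT count(*) FROM events_by_month
WHERE ts >= '2026-02-01' AND ts < '2026-02-08';

-- Retention becomes dropping a partition instead of a giant DELETE.
DROP TABLE events_2026_01;
```

Rows with a ts outside every partition are rejected unless you add a default partition, so creating future partitions on a schedule becomes part of the ops story. That recurring chore is a small example of the operational tax mentioned here.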

Postgres scales further than most teams think, with partitioning, read replicas, and careful indexing. The question isn't whether it can. It's whether the headroom you're buying is worth the operational tax you'll pay for it.

Question 4: How complex are the queries?

Count the joins. Count the ad-hoc reports. Count how often product will ask for "just one more breakdown."

This is where relational stores earn their keep and where the NoSQL pitch falls apart. The promise of a document store is that you denormalize once and read fast forever. It holds right up until the business wants a query you didn't denormalize for. Then you're doing joins in application code, which is slower, buggier, and harder to test than letting the database do them.
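Here's the kind of "one more breakdown" request that a relational store answers without a schema change. The customers and orders tables, their columns, and the six-month window are all assumptions for the sketch.

```sql
-- Hypothetical normalized schema.
CREATE TABLE IF NOT EXISTS customers (
  id      bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  country text NOT NULL
);
CREATE TABLE IF NOT EXISTS orders (
  id          bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  customer_id bigint NOT NULL REFERENCES customers (id),
  created_at  timestamptz NOT NULL DEFAULT now(),
  total_cents bigint NOT NULL
);

-- Monthly revenue by country for the last six months.
SELECT
  c.country,
  date_trunc('month', o.created_at) AS order_month,
  count(*)                          AS order_count,
  sum(o.total_cents) / 100.0        AS revenue
FROM orders AS o
JOIN customers AS c ON c.id = o.customer_id
WHERE o.created_at >= now() - interval '6 months'
GROUP BY 1, 2
ORDER BY 2, 4 DESC;
```

Nobody had to predict this question when the tables were designed. In a store built around a fixed set of access patterns, the same request usually means a new denormalized view, a backfill, or a loop in application code.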

Rich, evolving, ad-hoc queries with joins and aggregates. Postgres. The query planner is the feature you're paying for.
A fixed, small set of access patterns you can name today and won't change. A key-value or document store fits, and you get flatter scaling in exchange for that rigidity.
Full-text or fuzzy search as a primary feature. Postgres tsvector plus trigram indexes go a long way. A dedicated search engine goes further once relevance tuning becomes a product surface.
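For the search case, a minimal sketch of tsvector plus trigram indexes, assuming a hypothetical articles table. Generated columns need Postgres 12 or later, and pg_trgm ships as a contrib extension.

```sql
CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE TABLE articles (
  id     bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  title  text NOT NULL,
  body   text NOT NULL,
  -- Postgres keeps this in sync on every insert and update.
  search tsvector GENERATED ALWAYS AS
           (to_tsvector('english', title || ' ' || body)) STORED
);

CREATE INDEX articles_search_idx     ON articles USING GIN (search);
CREATE INDEX articles_title_trgm_idx ON articles USING GIN (title gin_trgm_ops);

-- Full-text: stemmed word matching, ordered by ts_rank.
SELECT id, title, ts_rank(search, q) AS relevance
FROM articles, websearch_to_tsquery('english', 'partition pruning') AS q
WHERE search @@ q
ORDER BY relevance DESC
LIMIT 10;

-- Fuzzy: tolerate typos in the title via trigram similarity.
SELECT id, title, similarity(title, 'postgers partitoning') AS score
FROM articles
WHERE title % 'postgers partitoning'
ORDER BY score DESC
LIMIT 10;
```

This covers a search box on an internal tool or a modest catalog. When you find yourself tuning ranking weights, synonyms, and per-field boosts every sprint, that's the signal that relevance has become a product surface.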

A quick gut check: if you can't list your access patterns on one page, your queries are going to keep evolving, and a query-flexible store protects you. Reach for Postgres.

Question 5: What's the ops budget?

This is the question people skip, and it's the one that actually decides.

A "better" store you can't operate is worse than a "worse" store you can. Cassandra scales writes beautifully and will end your weekends if nobody on the team has run it. The right store is partly a function of who's on call.

Be honest about three things:

Headcount and on-call. Self-hosting Cassandra, running a Kafka-backed event store, or operating a sharded cluster is real work. A team of four picking three different databases is a team that maintains none of them well.
Managed vs self-hosted. RDS, Aurora, Neon, Supabase, and Cloud SQL all take the operational edge off Postgres. The equivalents exist for other stores, but the maturity and the talent pool aren't equal. Postgres has the deepest bench.
Polyglot tax. Every new store is a backup story, a monitoring story, a migration story, a hire-for-this-skill story. The second store costs far more than its license. Two well-run stores beat five badly run ones every time.

The cheapest store to operate is the one your team already runs. That bias is correct more often than not. Fight it only when one of the first four questions gives you a hard reason.

The verdict

Run the five questions and one of three things happens.

Reach for Postgres when the access pattern is relational, the consistency needs are real, the scale shape is "large but bounded," the queries keep evolving, and your team already runs it. That covers the large majority of systems being built right now, which is exactly why it's the default.

Reach for something else when one question screams. A pure key-value hot path at six-figure QPS. An append-only firehose measured in GB per hour. Analytical scans across billions of rows where Postgres melts. A genuine need to scale writes past what one primary can take. In those cases the migration you avoid by choosing right on day one is worth more than the comfort of the default.

Reach for Postgres plus one in the common middle. Postgres as the system of record, with a purpose-built store next to it for the one workload it's bad at. Postgres plus ClickHouse for analytics. Postgres plus Redis for the hot key-value path. Postgres plus a search engine. This is where most mature systems land, and it's a deliberate choice, not an accident you back into.

The point of the screen isn't to talk you out of Postgres. It's to make sure that when you pick it, you picked it. "Obviously" is how you end up migrating in eighteen months. Ten minutes of questions is how you don't.

What's the store you picked by reflex and regretted? Or the one you almost didn't pick that ended up saving you? I'm curious which of the five questions teams skip most.

If this was useful

This post is the one-page version of a decision the Database Playbook spends a book on. It walks store by store through Postgres, MySQL, DynamoDB, Cassandra, ClickHouse, Redis, and the rest, mapping each to the access patterns, consistency needs, and scale shapes that suit it. If you've ever picked a database by reflex, the book is the slow, deliberate version of that decision.