Gabriel Anhaia


Postgres 18 Just Made 80% of Your NoSQL Migration Plan Pointless


PostgreSQL 18 shipped on September 25, 2025. If you skimmed the release notes during sprint planning and filed it under "minor upgrade," go back. The five things that landed in this release are exactly the five reasons your team had a "we may need to move some workloads to Mongo / Dynamo / Cassandra" doc on Confluence in 2023.

You can stop maintaining that doc. Most of the workloads you flagged for it now have a viable Postgres path, with one schema, one operator, one backup story.

Five PG18 features. What they replace. Where NoSQL still wins. And the JSONB schema that does what people went to Mongo for.

Async I/O changes the throughput conversation

The headline feature is the asynchronous I/O subsystem. The PG18 release notes describe sequential scans, bitmap heap scans, and vacuum issuing reads without blocking on each completion, so Postgres can now keep many requests in flight at once. Xata and Crunchy Data both walk through the change, and both report speedups in the 2-3× range on NVMe in their internal tests, on the operations that previously left the disk idle waiting for the kernel.
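If you want to see which AIO backend you are running, PG18 exposes it as the io_method server setting; a minimal sketch, assuming a Linux build compiled with io_uring support ('worker' is the portable default, and io_method only changes across a restart):

-- Which AIO backend is active ('worker' is the PG18 default)
SHOW io_method;

-- Opt in to io_uring on a supported Linux build, then restart
ALTER SYSTEM SET io_method = 'io_uring';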

What this kills: the "we need Cassandra for write-heavy time-series ingest" case. That case was always shaped by Postgres's single-block-at-a-time read pattern hurting analytics queries against fresh data. With AIO, the analytics-side scans run fast enough that you do not need a second store to query the data your OLTP system just wrote. One database, one query language, one consistency story.

What it doesn't kill: genuinely wide horizontal scale across regions. AIO is a per-node feature. If your write throughput exceeds what one Postgres node can sustain, you still need the sharded architectures NoSQL stores are designed around.

UUIDv7 makes the "we use Mongo because ObjectId" argument obsolete

PG18 ships a native uuidv7() function (see the PG18 release notes). UUIDv7 embeds a millisecond timestamp in the high bits, so the IDs sort by creation time. That solves the B-tree fragmentation problem that made UUIDv4 primary keys painful, and it gives you the same monotonically-ordered-ID property MongoDB's ObjectId had. Plus the uniqueness math of a UUID, plus client-side generation without a coordinator.

Where this lands: "we use Mongo because we need IDs we can generate on the client without round-tripping to the server." Postgres-generated UUIDv7s give you the same property. So does any client-side uuidv7() library. You insert with the ID already populated, you get index locality, you get rough timestamp recovery, and you do not have to explain to anyone what an ObjectId is.
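Both properties are easy to check in psql; a quick sketch, using uuidv7() from PG18 and the uuid_extract_timestamp() helper that shipped in Postgres 17:

-- Time-ordered ID, generated server-side
SELECT uuidv7();

-- Rough timestamp recovery from the high bits (millisecond precision)
SELECT uuid_extract_timestamp(uuidv7());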

Still NoSQL territory: workloads where you need cross-region monotonic ordering with no clock skew. UUIDv7 inherits whatever clock skew your fleet has, and even MongoDB hits the same wall.

JSON_TABLE plus JSONB plus GIN beats most Mongo queries

JSON_TABLE landed in Postgres 17 and has had a full release cycle to mature by PG18. The function takes a JSON or JSONB value and a path expression, and projects the matched values into a relational table you can join, filter, and aggregate against. Stack JSON_TABLE on top of the existing JSONB operators (@>, ?, #>>) and a GIN index with the jsonb_path_ops operator class, and you have an indexed path-query language over schemaless documents that holds up against MongoDB on the workloads people actually use Mongo for.

The canonical Mongo collection: user events with arbitrary properties, modeled in Postgres.

CREATE TABLE user_events (
    id          uuid PRIMARY KEY DEFAULT uuidv7(),
    user_id     bigint NOT NULL,
    event_type  text NOT NULL,
    occurred_at timestamptz NOT NULL DEFAULT now(),
    props       jsonb NOT NULL
);

CREATE INDEX user_events_props_gin
    ON user_events USING GIN (props jsonb_path_ops);

CREATE INDEX user_events_type_time
    ON user_events (event_type, occurred_at DESC);

Insert documents with whatever shape:

INSERT INTO user_events (user_id, event_type, props)
VALUES
  (42, 'checkout', '{"cart": {"items": 3, "total": 79.90},
                    "promo": "SPRING10",
                    "device": {"os": "ios", "ver": "17.4"}}'),
  (42, 'view',     '{"page": "/pricing",
                    "ref": "google",
                    "device": {"os": "android", "ver": "14"}}');

Query like Mongo. "Checkouts where the cart total was over $50 and the user was on iOS, in the last day, projected as columns":

SELECT
    e.id,
    e.user_id,
    e.occurred_at,
    j.total,
    j.promo,
    j.os
FROM user_events e
CROSS JOIN LATERAL JSON_TABLE(
    e.props, '$'
    COLUMNS (
        total numeric  PATH '$.cart.total',
        promo text     PATH '$.promo',
        os    text     PATH '$.device.os'
    )
) AS j
WHERE e.event_type = 'checkout'
  AND e.occurred_at > now() - interval '1 day'
  AND e.props @> '{"device": {"os": "ios"}}'
  AND j.total > 50;

The @> containment predicate hits the GIN index. The (event_type, occurred_at) filter hits the B-tree index. JSON_TABLE projects the fields into typed columns the rest of the query can use without manual casts. This is the query you would have written db.events.find({ ... }) for, except now you can join it to your users table without a separate replication pipeline.
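Whether the planner actually picks the GIN index depends on selectivity and table size, so verify on your own data; a minimal check:

-- Confirm the containment predicate is served by the GIN index
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM user_events
WHERE event_type = 'checkout'
  AND props @> '{"device": {"os": "ios"}}';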

Performance? Expect tens of milliseconds on this kind of query when GIN selectivity is good and the working set fits RAM — actual numbers depend on hardware, row width, and cardinality. One independent comparison (binaryigor.com) reports roughly parity on read workloads, with Postgres edging ahead under load on machines with reasonable RAM.

What this kills: "we need Mongo for flexible schemas." JSONB gives you flexible schemas, typed projection on read, and GIN indexes that actually use the structure. Your application teams can keep the document-shaped API they like.

What it doesn't kill: workloads where the document shape varies so wildly across rows that no index helps, or pure event-firehose ingestion where you never query by structure. There, Mongo's WiredTiger engine and BSON storage still have a write-throughput edge on batch inserts.

Logical replication finally feels production-ready

PG18 fixes the things that made logical replication feel like a beta feature. Per the PG18 release notes: generated columns can be replicated, conflicts are reported in pg_stat_subscription_stats and the logs, parallel apply is on by default, and logical replication slot synchronization to physical standbys means a failover does not lose your subscribers. They resume from the new primary.

Where this lands: "we use DynamoDB streams because Postgres CDC is painful." Logical replication is now a first-class CDC source. You can plug Debezium or a native subscriber into a Postgres primary, get a stream of changes with conflict resolution and failover-safe slots, and feed downstream systems without a Kafka tier in between. Most of the architectures that pushed teams to Dynamo for streams existed because Postgres CDC kept dropping events on failover. That stops being a reason in PG18.
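A minimal sketch of the failover-safe wiring (events_pub, events_sub, and the connection string are placeholders; the standby additionally needs sync_replication_slots = on for the slot to follow a promotion):

-- On the primary: publish the tables the downstream consumer needs
CREATE PUBLICATION events_pub FOR TABLE user_events;

-- On the subscriber: failover = true keeps the slot synchronized
-- to physical standbys, so the stream survives a primary failover
CREATE SUBSCRIPTION events_sub
    CONNECTION 'host=primary-host dbname=app user=replicator'
    PUBLICATION events_pub
    WITH (failover = true);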

Still NoSQL territory: cross-region active-active write workloads. Logical replication is primary-to-replica unless you build careful conflict resolution on top of it. Multi-region with low write latency to many regions remains a strong DynamoDB or Cassandra use case.

Temporal constraints, virtual generated columns, OAuth, skip scan

The smaller PG18 features stack up in the same direction. The PG18 release notes cover all four:

  • Temporal PRIMARY KEY and FOREIGN KEY constraints let you express "this row is unique-per-period" without app-level checks, the kind of thing time-series databases sold you on.
  • Virtual generated columns compute on read instead of storing, which removes the "denormalize for query speed" workaround that pushed teams to document stores.
  • OAuth authentication is a first-party feature now.
  • Skip scan over multi-column indexes removes a lot of the manual index gymnastics that made Postgres feel finicky next to schemaless stores that "just worked" until they did not.
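The first two in sketch form, assuming the PG18 syntax (product_prices and device_os are made-up names, and the temporal key needs the btree_gist extension for its scalar part):

-- Temporal PK: one price per product per validity period, enforced in-db
CREATE EXTENSION IF NOT EXISTS btree_gist;

CREATE TABLE product_prices (
    product_id   bigint,
    valid_during tstzrange,
    price        numeric NOT NULL,
    PRIMARY KEY (product_id, valid_during WITHOUT OVERLAPS)
);

-- Virtual generated column: computed on read, nothing stored on disk
ALTER TABLE user_events
    ADD COLUMN device_os text
    GENERATED ALWAYS AS (props #>> '{device,os}') VIRTUAL;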

None of those are headline-grabbing. All of them remove a small reason to leave.

Where NoSQL still wins, honestly

Three places. Be honest about them when you write the doc.

Cross-region write throughput at planet scale. If your write workload truly needs writes accepted in multiple regions with single-digit-millisecond local latency, DynamoDB Global Tables and Cassandra's leaderless replication are built for exactly that. The Postgres ecosystem has Citus, YugabyteDB, and CockroachDB, but those are different products with different tradeoffs.

Pure firehose with no query shape. If you are ingesting hundreds of thousands of writes per second of unstructured documents and the read pattern is "give me everything for user X," document stores still win on raw throughput. The MongoDB benchmarks from the EnterpriseDB whitepaper and the DocumentDatabase comparison show Mongo with a clear edge on batch insert throughput and storage compactness.

Operational simplicity for very small teams. A managed DynamoDB table requires nearly zero ongoing thought. A managed Postgres requires a little more: vacuum tuning, autovacuum metrics, occasional bloat checks. For a team of two that wants to ship a side project, Dynamo's "you never think about it" property is real.
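For calibration, the "little more" mostly looks like a periodic check along these lines:

-- Tables with the most dead tuples, and when autovacuum last ran
SELECT relname, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;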

Outside those three, the case for moving off Postgres in 2026 is mostly historical inertia. Your 2023 Confluence doc was right when it was written. Postgres caught up.

What to do with the 80%

Re-read your "things we want to move to NoSQL" list. For each item, ask:

  • Is it cross-region write throughput? Keep planning the migration.
  • Is it pure firehose ingestion with no query shape? Maybe.
  • Is it "we want flexible schemas," "we want client-side IDs," "we want CDC streams," "we want fast time-series reads," "we want JSON-first APIs"? Stay. PG18 has the feature you wanted.

The boring win is operational: one database, one backup tool, one auth model, one query language your analysts already know, one schema-migration tool that has tests. The things that look like minor footnotes in the PG18 release notes are the things that close the last gaps that made hybrid-architecture diagrams unavoidable.

If this was useful

The Database Playbook is a per-workload decision guide. What JSONB plus GIN actually buys you, when CDC vs. dual-writes is the right pattern, where multi-region writes break Postgres and what to do instead. Written for the engineer who has to defend a database choice in a design review on Tuesday.
