Lie to the Postgres planner: EXPLAIN your query at 10,000 the rows

#database #performance #postgres #sql

Your query is snappy in dev. Then it hits production data and the plan quietly flips from an Index Scan to a Seq Scan, or a Nested Loop to a Hash Join, and everything falls over. The painful part: you can't see that coming on a dev database with 5,000 rows, because the planner picks plans based on how big it thinks the tables are.

So let's change what it thinks.

The planner runs on `pg_class` statistics

When Postgres plans a query, it doesn't count your rows — it reads cached estimates from the catalog, mainly pg_class.reltuples (estimated row count) and relpages (size in pages). ANALYZE refreshes them. Crucially, those are just numbers in a table — and in a transaction you can change them, watch the planner react, and roll the change back so nothing is ever committed.

That gives you a "what does this query's plan look like at scale?" button without loading a single row.

The trick

BEGIN;

-- Tell the planner these tables are 10,000× bigger than they are.
UPDATE pg_class
   SET reltuples = reltuples * 10000
 WHERE relname IN ('orders', 'customers')
   AND relkind IN ('r', 'p');

-- No ANALYZE: this never runs the query, just plans it.
EXPLAIN (FORMAT JSON) SELECT * FROM orders JOIN customers USING (customer_id) WHERE ...;

ROLLBACK;   -- the catalog edit is never committed, invisible to everyone else

Run that at factor 1, 100, 10000 and diff the plans. The factor where a Seq Scan or a join algorithm flips is exactly the data size where your query's behaviour changes — the thing you wanted to know.

A few details that make it safe and correct:

EXPLAIN without ANALYZE never executes the query. It only plans it. So you can do this with a scary DELETE or a 10-minute report and nothing happens.
ROLLBACK unconditionally. The pg_class edit lives only inside the transaction; other sessions never see it, and it's gone the moment you roll back. (Wrap it in try/finally so an error still rolls back.)
Scale the indexes too. An index's row estimate also lives in pg_class (the index has its own row there). If you only bump the table, the planner sees a giant table with a tiny index and makes weird choices. Bump both.

Here's the real version from cli2ui's "scale simulation," which first does a plain EXPLAIN to discover which tables the query actually touches, then scales exactly those plus their indexes:

UPDATE pg_class
   SET reltuples = reltuples * $1            -- the factor
 WHERE oid IN (
   SELECT oid FROM pg_class
     WHERE relname = ANY($2) AND relkind IN ('r','p')
   UNION
   SELECT indexrelid FROM pg_index           -- the tables' indexes too
    WHERE indrelid IN (SELECT oid FROM pg_class
                       WHERE relname = ANY($2) AND relkind IN ('r','p'))
 );

…wrapped in:

conn.autocommit = False
try:
    cur.execute("SET LOCAL statement_timeout = %s", [timeout_ms])
    cur.execute("SET LOCAL lock_timeout = '2s'")   # don't hang on catalog locks
    cur.execute(SCALE_PGCLASS_SQL, [factor, relnames, relnames])
    cur.execute("EXPLAIN (FORMAT JSON) " + sql_text)
    plan = cur.fetchone()[0]
finally:
    conn.rollback()   # never persist the what-if catalog edit

The honest caveats

You need privileges to UPDATE pg_class — superuser, or ownership of those catalog rows. This is a developer/staging trick, not something you hand to an untrusted user on prod.
It's an approximation. You're scaling row counts, not regenerating column statistics (histograms, n_distinct, correlation). Selectivity estimates for specific WHERE values won't shift the way they would with real data. It's excellent for "does the plan shape change as the table grows," less so for precise cost numbers.
lock_timeout matters. Touching pg_class takes catalog locks; cap the wait so a busy server doesn't make your what-if hang.

But for the question that actually bites you — "will the planner abandon my index when this table gets big?" — it's a 2-second answer on your laptop instead of a 2 a.m. incident.

This is one piece of cli2ui — a local-only web UI over the psql commands you keep half-remembering. No AI, no SaaS. It's MIT-licensed on GitHub. What command do you reach for that should be a button?