Your query is snappy in dev. Then it hits production data and the plan quietly flips from an Index Scan to a Seq Scan, or a Nested Loop to a Hash Join, and everything falls over. The painful part: you can't see that coming on a dev database with 5,000 rows, because the planner picks plans based on how big it thinks the tables are.
So let's change what it thinks.
The planner runs on pg_class statistics
When Postgres plans a query, it doesn't count your rows — it reads cached estimates from the catalog, mainly pg_class.reltuples (estimated row count) and relpages (size in pages). ANALYZE refreshes them. Crucially, those are just numbers in a table — and in a transaction you can change them, watch the planner react, and roll the change back so nothing is ever committed.
That gives you a "what does this query's plan look like at scale?" button without loading a single row.
The trick
BEGIN;
-- Tell the planner these tables are 10,000× bigger than they are.
UPDATE pg_class
SET reltuples = reltuples * 10000
WHERE relname IN ('orders', 'customers')
AND relkind IN ('r', 'p');
-- No ANALYZE: this never runs the query, just plans it.
EXPLAIN (FORMAT JSON) SELECT * FROM orders JOIN customers USING (customer_id) WHERE ...;
ROLLBACK; -- the catalog edit is never committed, invisible to everyone else
Run that at factor 1, 100, 10000 and diff the plans. The factor where a Seq Scan or a join algorithm flips is exactly the data size where your query's behaviour changes — the thing you wanted to know.
A few details that make it safe and correct:
-
EXPLAINwithoutANALYZEnever executes the query. It only plans it. So you can do this with a scaryDELETEor a 10-minute report and nothing happens. -
ROLLBACKunconditionally. Thepg_classedit lives only inside the transaction; other sessions never see it, and it's gone the moment you roll back. (Wrap it intry/finallyso an error still rolls back.) -
Scale the indexes too. An index's row estimate also lives in
pg_class(the index has its own row there). If you only bump the table, the planner sees a giant table with a tiny index and makes weird choices. Bump both.
Here's the real version from cli2ui's "scale simulation," which first does a plain EXPLAIN to discover which tables the query actually touches, then scales exactly those plus their indexes:
UPDATE pg_class
SET reltuples = reltuples * $1 -- the factor
WHERE oid IN (
SELECT oid FROM pg_class
WHERE relname = ANY($2) AND relkind IN ('r','p')
UNION
SELECT indexrelid FROM pg_index -- the tables' indexes too
WHERE indrelid IN (SELECT oid FROM pg_class
WHERE relname = ANY($2) AND relkind IN ('r','p'))
);
…wrapped in:
conn.autocommit = False
try:
cur.execute("SET LOCAL statement_timeout = %s", [timeout_ms])
cur.execute("SET LOCAL lock_timeout = '2s'") # don't hang on catalog locks
cur.execute(SCALE_PGCLASS_SQL, [factor, relnames, relnames])
cur.execute("EXPLAIN (FORMAT JSON) " + sql_text)
plan = cur.fetchone()[0]
finally:
conn.rollback() # never persist the what-if catalog edit
The honest caveats
-
You need privileges to
UPDATE pg_class— superuser, or ownership of those catalog rows. This is a developer/staging trick, not something you hand to an untrusted user on prod. -
It's an approximation. You're scaling row counts, not regenerating column statistics (histograms,
n_distinct, correlation). Selectivity estimates for specificWHEREvalues won't shift the way they would with real data. It's excellent for "does the plan shape change as the table grows," less so for precise cost numbers. -
lock_timeoutmatters. Touchingpg_classtakes catalog locks; cap the wait so a busy server doesn't make your what-if hang.
But for the question that actually bites you — "will the planner abandon my index when this table gets big?" — it's a 2-second answer on your laptop instead of a 2 a.m. incident.
This is one piece of cli2ui — a local-only web UI over the
psqlcommands you keep half-remembering. No AI, no SaaS. It's MIT-licensed on GitHub. What command do you reach for that should be a button?
Top comments (0)