EXPLAIN ANALYZE is the standard tool for understanding how PostgreSQL runs a query. It shows the chosen plan, estimated and actual row counts, and execution time. For most engineers, it is the first stop when a query is slow.
It is also frequently misread. A query plan can look clean in EXPLAIN ANALYZE and still be slow in production. A rewritten query can show a much lower cost in development and behave identically when deployed. The output is not wrong — but it answers a narrower question than most readers assume.
This article walks through five common ways EXPLAIN ANALYZE is misinterpreted, and the additional tools and flags that fill in the gaps.
Pitfall #1: Treating "Cost" as a Time Estimate
The first number most readers look at is cost. It is printed next to the row estimates and looks like a measurable quantity, so it is often read as if it were milliseconds.
EXPLAIN ANALYZE
SELECT * FROM orders WHERE customer_id = 42;
QUERY PLAN
------------------------------------------------------------------------
Index Scan using orders_customer_id_idx on orders
(cost=0.43..8.45 rows=1 width=128)
(actual time=0.012..0.018 rows=1 loops=1)
Planning Time: 0.087 ms
Execution Time: 0.034 ms
cost is an arbitrary unit. The PostgreSQL planner uses it for one purpose: comparing candidate plans against each other within a single planning step. A cost value of 8.45 does not correspond to milliseconds, seconds, or any wall-clock measurement. The unit is roughly "the cost of a sequential page read," but it is adjusted by configuration parameters such as cpu_tuple_cost, random_page_cost, and seq_page_cost, so the absolute number has no fixed meaning.
Two plans with identical costs can run at very different speeds, and the planner can pick a higher-cost plan over a lower-cost one when statistics are off (more on that in Pitfall #3).
What to use instead: treat cost as a relative ranking inside one EXPLAIN run, not as a metric to track over time. The numbers that matter for actual performance are at the bottom of the output: actual time, Planning Time, and Execution Time. Those are real milliseconds.
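A quick way to see this dependence on configuration, reusing the orders query from above: change one cost parameter and re-run plain EXPLAIN. The reported costs shift (and the chosen plan may even flip), yet nothing about the query's real speed has changed:

SET random_page_cost = 4.0;  -- the default, tuned for spinning disks
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

SET random_page_cost = 1.1;  -- a common value for SSD storage
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

RESET random_page_cost;      -- back to the server default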
Pitfall #2: Skipping the BUFFERS Flag
Plain EXPLAIN ANALYZE omits one of the most important pieces of information about a query: how much data was read from disk versus served from memory. The BUFFERS flag adds this.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE created_at > now() - interval '7 days';
Seq Scan on orders (actual time=0.014..824.103 rows=148291 loops=1)
Filter: (created_at > (now() - '7 days'::interval))
Rows Removed by Filter: 9851709
Buffers: shared hit=24 read=98217
Execution Time: 824.512 ms
shared hit is the number of 8 KB pages found in PostgreSQL's shared buffer cache. shared read is the number of pages that had to be fetched from disk (or the operating system's file cache) because they were not in the shared buffers. In the example above, read=98217 is roughly 768 MB of cold I/O.
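The arithmetic behind that figure is just pages times 8 KB, which PostgreSQL can compute inline:

-- 98217 pages * 8192 bytes; pg_size_pretty() renders the total
SELECT 98217::bigint * 8192 AS bytes_read,
       pg_size_pretty(98217::bigint * 8192) AS pretty_size;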
This matters because development environments and production environments rarely have the same cache state. A query tuned on a developer laptop with a small, fully cached table will report read=0 and look fast. The same query on a production replica with a working set that does not fit in shared_buffers will show large read values and take much longer — even though the plan is identical.
What to use instead: make (ANALYZE, BUFFERS) the default. A simple way to enforce this in psql is to set an alias in ~/.psqlrc:
\set EXPLAIN 'EXPLAIN (ANALYZE, BUFFERS, VERBOSE, FORMAT TEXT)'
After that, :EXPLAIN SELECT ... always includes buffer information. The read count is the closest signal EXPLAIN gives to how the query will behave under cold-cache production conditions.
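A related sanity check is to compare the table's size against the cache, since a table larger than shared_buffers cannot stay fully resident. A minimal version (pg_table_size covers the table and its TOAST data, not its indexes):

SHOW shared_buffers;
SELECT pg_size_pretty(pg_table_size('orders')) AS table_size;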
Pitfall #3: Assuming the Planner Sees Every Row
The planner does not scan the actual table when building a plan. It consults pg_statistic, a sampled summary of each table built by the ANALYZE maintenance command (which is separate from the ANALYZE flag in EXPLAIN ANALYZE, despite the shared name).
The default sample size is roughly 300 × default_statistics_target rows. With the default default_statistics_target of 100, that means PostgreSQL plans queries against a billion-row table using statistics drawn from a 30,000-row sample.
For uniformly distributed columns, this is usually accurate enough. For skewed columns, recently inserted or updated data, or low-cardinality columns where one value dominates, estimates can be off by one or more orders of magnitude.
The current statistics for any column can be inspected directly:
SELECT
attname,
n_distinct,
most_common_vals,
most_common_freqs
FROM pg_stats
WHERE tablename = 'orders' AND attname = 'status';
When most_common_freqs shows that one value covers, for example, 95% of rows, but the planner still assumes uniform selectivity, queries filtering on that column will pick poor plans.
What to use instead:
- Run ANALYZE table_name before any performance investigation, especially after a large bulk load. Autovacuum updates statistics on its own schedule, which may lag behind recent changes (a query for spotting lagging tables follows this list).
- For skewed columns, raise the sample resolution: ALTER TABLE orders ALTER COLUMN status SET STATISTICS 1000; and run ANALYZE orders again.
- Compare estimated rows= against actual rows= in EXPLAIN ANALYZE output. A gap of 10× or more usually points to a statistics problem rather than a query problem.
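To find tables where the statistics refresh is lagging, pg_stat_user_tables tracks how many rows have changed since the last ANALYZE:

SELECT relname,
       n_mod_since_analyze,  -- rows inserted, updated, or deleted since the last ANALYZE
       last_analyze,
       last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_mod_since_analyze DESC
LIMIT 10;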
Pitfall #4: Assuming the EXPLAIN Plan Is the Plan That Runs
Running EXPLAIN ANALYZE in psql produces one plan. The application running the same query in production may execute a different plan, chiefly because of prepared statements and plan caching.
When a query is executed as a prepared statement, PostgreSQL builds a fresh, parameter-specific plan for each of the first five executions; after that it may switch to a cached generic plan, one that is not specialized to the current parameter values. If parameter values vary widely in selectivity (for example, status = 'open' matches 5% of rows while status = 'archived' matches 94%), a generic plan that works well for one value can be very slow for another.
Ad-hoc EXPLAIN does not reproduce this. Running EXPLAIN ANALYZE SELECT ... WHERE status = 'archived' in psql produces a fresh, parameter-specific plan. The cached plan running in production is something else entirely.
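The switch can, however, be reproduced deliberately in psql. A sketch, assuming the status column from the earlier examples (orders_by_status is just an illustrative name):

PREPARE orders_by_status (text) AS
    SELECT * FROM orders WHERE status = $1;

-- The first five EXECUTEs get custom, parameter-specific plans;
-- from the sixth onward the planner may settle on a generic plan.
EXECUTE orders_by_status('open');  -- run five times, then:
EXPLAIN (ANALYZE, BUFFERS) EXECUTE orders_by_status('archived');

-- Since PostgreSQL 12, plan_cache_mode can force either behavior:
SET plan_cache_mode = force_generic_plan;  -- or force_custom_plan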
Two PostgreSQL features close this gap.
The pg_stat_statements extension records every query executed against the server, normalized into templates with their call counts, total time, and mean time:
SELECT
query,
calls,
mean_exec_time,
total_exec_time,
rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
This shows which queries are actually slow in aggregate, not just which ones happened to be slow during a single debugging session.
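One caveat: pg_stat_statements records nothing until it is both preloaded at server start and created as an extension in the database. A minimal setup (the preload list can be shared with auto_explain, described next):

-- postgresql.conf (requires a server restart):
--   shared_preload_libraries = 'pg_stat_statements'
-- then, once per database:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;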
The auto_explain module logs the actual plan used by any query that exceeds a configured duration:
# postgresql.conf
shared_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = 1000 # log plans for queries > 1s
auto_explain.log_analyze = true
auto_explain.log_buffers = true
With this configured, the database server logs the plan, timing, and buffer usage of any query slower than the threshold — captured at the moment the slowness occurred, on the actual server, with the actual cache state. That log entry is a more reliable source for diagnosing real production slowness than any reproduction in psql.
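Before committing to the postgresql.conf change, auto_explain can be test-driven in a single superuser session with LOAD, no restart required:

LOAD 'auto_explain';
SET auto_explain.log_min_duration = 0;  -- log the plan of every query in this session
SET auto_explain.log_analyze = true;
SELECT count(*) FROM orders;            -- its plan now appears in the server log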
Pitfall #5: Treating Execution Time as the User-Visible Latency
The Execution Time line at the bottom of EXPLAIN ANALYZE measures the time PostgreSQL spent producing the result set inside the server. It does not include several components that contribute to the time the end user actually waits:
- Sending result rows across the network to the application
- Serialization in the client driver (for example, pg for Node.js, psycopg for Python, JDBC for Java)
- Round-trip latency between the application server and the database server
- Application-side parsing, ORM hydration, and object construction
For a query returning a single row, these components are negligible. For a query returning 100,000 rows, transfer and serialization can easily exceed the database execution time. A query that runs in 80 ms inside PostgreSQL can take several seconds before the user sees a response.
What to use instead: measure both layers separately. A simple example in Python:
import time

t0 = time.monotonic()
cursor.execute("SELECT ...")              # executes inside PostgreSQL
rows = cursor.fetchall()                  # also pays for network transfer of the rows
db_time = time.monotonic() - t0

t1 = time.monotonic()
results = [Order(**row) for row in rows]  # hydration; assumes a dict-style cursor and an Order model class
parse_time = time.monotonic() - t1
If db_time is small but the overall request is still slow, the bottleneck is outside PostgreSQL and outside the scope of EXPLAIN ANALYZE. Common causes include unnecessarily large result sets, SELECT * against tables with wide TEXT or JSONB columns, network latency between the application and database, and inefficient client-side processing.
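On PostgreSQL 17 and later, EXPLAIN itself can surface part of this gap: the SERIALIZE option adds the time spent converting result rows into wire format, though network transfer and client-side work remain invisible:

EXPLAIN (ANALYZE, SERIALIZE TEXT, BUFFERS)
SELECT * FROM orders WHERE created_at > now() - interval '7 days';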
Summary: Three Habits That Address Most Misreadings
EXPLAIN ANALYZE is a precise tool with a defined scope. Most confusion comes from asking it questions outside that scope — questions about cold-cache I/O, real production plans, or end-to-end latency. Three habits cover most of the gap:
- Always include BUFFERS. The shared read count is the best available indicator of how a query will behave under cold-cache production conditions. Adding it to a psql alias makes it automatic.
- Enable auto_explain in production with a threshold such as 1000 ms. This captures the plan that actually ran when a query was actually slow, rather than the plan that can be reproduced in development.
- Use pg_stat_statements to find slow patterns. A single EXPLAIN shows one execution. pg_stat_statements shows aggregate behavior across millions of executions. The query that consumes the most total time is rarely the one that is slowest individually; it is often a fast query that runs very frequently.
EXPLAIN ANALYZE reports what the planner and executor know. The remaining performance picture — cache state, plan caching, network transfer, client-side processing — has to be measured with the right tool for each layer.
Source: https://www.kargin-utkin.com/postgresql-explain-analyze-can-mislead-you