PostgreSQL gives you a detailed execution plan for every query. Reading it is a skill — interpreting it correctly under pressure, at scale, or across dozens of slow queries is another thing entirely. Most developers either skip the plan entirely or paste it into an online visualizer and hope for the best.
The problem
You run a slow query. PostgreSQL hands you this:
[
{
"Plan": {
"Node Type": "Hash Join",
"Actual Rows": 923847,
"Actual Loops": 1,
"Plans": [
{ "Node Type": "Seq Scan", "Relation Name": "orders", "Actual Rows": 1000000 },
{ "Node Type": "Hash", "Batches": 8, "Memory Usage": 4096 }
]
},
"Execution Time": 1243.821
}
]
Now what? The plan is telling you something. A sequential scan on a million-row table. A hash join that spilled across 8 batches. But connecting those dots to a concrete action — add an index here, raise work_mem there — takes experience that not everyone has, and time that no one has enough of.
pgexplain: automated plan analysis
pgexplain is a Go library and CLI that parses EXPLAIN (ANALYZE, FORMAT JSON) output and surfaces actionable findings.
$ pgexplain plan.json
[WARN] sequential scan on "orders" discards 8332x more rows than it returns
node: Seq Scan (ID 1)
detail: PostgreSQL read 100000 rows from "orders" but only 12 matched
(customer_id = 42) (8332 rows discarded per row returned).
suggestion: Add an index on "orders" to support the filter (customer_id = 42).
Run EXPLAIN (ANALYZE, BUFFERS) after adding the index to confirm it is used.
1 finding: 0 error(s), 1 warning(s), 0 info
Instead of staring at a wall of JSON, you get a ranked list of what's wrong and what to do about it.
Supported rules
| Rule | What it catches |
|---|---|
SeqScan |
Sequential scan that discards far more rows than it returns |
RowEstimateMismatch |
Planner estimates off by 10× or more |
HashJoinSpill |
Hash joins that spill to disk |
NestedLoopLarge |
Nested loops with large outer input |
MissingIndexOnlyScan |
Heap fetches defeating an index-only scan |
SortSpill |
Sort operations that spill to disk |
TopNHeapsort |
LIMIT queries using slow heapsort |
ParallelNotLaunched |
Parallel plans where workers never started |
MergeJoinUnsortedInputs |
Merge Join with explicit Sort children |
HighTempBlockIO |
High temp block I/O from aggregations, window functions, CTEs |
Use it in CI
pgexplain exits with code 1 if any Warn or Error findings are found, which makes it a drop-in CI gate:
psql -U myuser -d mydb \
-c "EXPLAIN (ANALYZE, FORMAT JSON) SELECT * FROM orders WHERE customer_id = 42" \
| pgexplain || exit 1
Gate your pull requests on query plan quality, not just correctness.
Use it as a library
plan, _ := parser.Parse(explainJSON)
adv := advisor.New(
rules.SeqScan(),
rules.RowEstimateMismatch(),
rules.HashJoinSpill(),
// ...
)
for _, f := range adv.Analyze(plan) {
fmt.Printf("[%s] %s\n → %s\n", f.Severity, f.Message, f.Suggestion)
}
Embed it in your own slow query logger, migration runner, or developer CLI.
Install:
go install github.com/bright98/pgexplain/cmd/pgexplain@latest
# or as a library
go get github.com/bright98/pgexplain
pgwatch: continuous slow query monitoring
Catching bad plans during development is only half the battle. On a live server, slow queries happen continuously — and most teams only notice them after a user complains.
pgwatch is a daemon that tails your PostgreSQL log file, extracts the auto_explain plans that PostgreSQL writes for every slow query, and feeds them through pgexplain automatically.
PostgreSQL log file
│
▼
pgwatch ←─── tails & parses auto_explain JSON blocks
│
▼
pgexplain rule engine ←─── detects the real problems
│
▼
terminal / JSON / HTML report
No database connection required. It's a pure log reader — it never executes EXPLAIN itself.
What the output looks like
=== pgwatch report — Wed, 13 May 2026 14:00:00 UTC ===
#1 2026-05-10 14:23:01 UTC myuser@mydb duration=1243.82ms [auto_explain]
[ERROR] node=1 (Hash Join) — hash batch spill to disk
Inner side wrote 42 MB to temp files across 8 batches.
→ Increase work_mem or reduce the join input size.
#2 2026-05-10 14:31:44 UTC myuser@mydb duration=891.10ms [auto_explain]
[WARN] node=2 (Seq Scan) — sequential scan discards 5000x more rows than it returns
→ Add an index on "events" to support the filter (user_id = 99).
Quick start
Enable auto_explain in postgresql.conf:
shared_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = 1000 # ms
auto_explain.log_format = json
auto_explain.log_analyze = on
log_line_prefix = '%m [%p] %q%u@%d '
Install and run:
go install github.com/bright98/pgwatch/cmd/pgwatch@latest
cp pgwatch.example.yaml pgwatch.yaml
# set log_file to your PostgreSQL log path
pgwatch run -c pgwatch.yaml # daemon mode — flushes a report every hour
pgwatch report -c pgwatch.yaml # one-shot — read the log once and exit
Output formats
pgwatch supports three output formats — terminal (default), json, and html. The HTML report is self-contained with no external dependencies: collapsible plan JSON, color-coded severity badges, and sortable findings.
How they fit together
| Tool | What it does | When to use it |
|---|---|---|
pgexplain (CLI) |
Analyzes a single plan file or psql pipe | During development, in CI |
pgexplain (library) |
Embeds plan analysis into your own Go tool | Custom tooling, migration runners |
pgwatch |
Continuously monitors production slow query logs | Staging and production servers |
- Use pgexplain in development and CI to catch bad plans before they ship.
- Use pgwatch in production to know which queries are hurting you right now, with suggestions attached.
Links
Both are written in Go, require no extensions beyond auto_explain (which ships with PostgreSQL), and are MIT licensed. Feedback and contributions are welcome.
Top comments (0)