Diffing two EXPLAIN plans as trees, not text

#postgres #database #algorithms

You added an index, re-ran EXPLAIN, and now you want the obvious answer: did the plan actually change? So you diff the two outputs as text. And it lights up red everywhere — because cost= and rows= shift on nearly every line, even when the strategy is identical. The one line you cared about — Seq Scan on orders becoming Index Scan using orders_created_idx on orders — is buried in a wall of numeric churn.

Text is the wrong representation. An EXPLAIN plan is a tree, and the question "what changed?" is a tree-diff, not a string-diff.

Get the tree, not the text

EXPLAIN (FORMAT JSON) hands you the plan as structured data instead of pretty-printed lines. Parse it into a small node type — node kind, the table, estimated rows, cost, and children:

@dataclass
class PlanNode:
    node_type: str        # "Seq Scan", "Hash Join", "Sort", …
    relation: str | None  # the table, if it's a scan
    plan_rows: float
    total_cost: float
    children: list["PlanNode"]

Now you have two trees, before and after. The naive diff is to walk both in lockstep and compare node i to node i. That breaks the instant the planner inserts a node: add one Sort or a parallel Gather near the top and every node below it shifts position. A positional diff then reports the entire subtree as changed, which is exactly the noise you were trying to escape.

Align first, then classify

The fix is to align the two node sequences before comparing them — the same job git diff does on lines. Flatten each tree to a list, then let stdlib's difflib.SequenceMatcher find the longest matching run and the insertions/deletions around it.

The trick that makes the output readable is what you align on. If you align on the full node label, a Seq Scan on orders and an Index Scan using … on orders look like two different things — you get a delete and an add, when what actually happened is one node changed its mind. So align on a shape key that deliberately ignores the access method:

_SCANS = {"Seq Scan", "Index Scan", "Index Only Scan",
          "Bitmap Heap Scan", "Bitmap Index Scan"}
_JOINS = {"Nested Loop", "Hash Join", "Merge Join"}

def shape_key(node):
    # Any scan on `orders` keys the same, whatever the method.
    if node.node_type in _SCANS or node.node_type.endswith("Scan"):
        return ("scan", node.relation)
    if node.node_type in _JOINS:
        return ("join",)                  # any join keys the same
    return (node.node_type, node.relation)

With this key, Seq Scan on orders and Index Scan on orders both key to ("scan", "orders"), so they align into one pair. Then you compare the real labels and mark that pair changed — the flip you wanted, as a single row. A genuinely new node (a Sort that wasn't there before) has no partner and falls out as added.

The alignment itself is a dozen lines:

def diff_plans(before, after):
    fa, fb = flatten(before), flatten(after)
    keys_a = [shape_key(n) for _, n in fa]
    keys_b = [shape_key(n) for _, n in fb]
    sm = difflib.SequenceMatcher(a=keys_a, b=keys_b, autojunk=False)

    rows = []
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag == "equal":               # same shape: changed-label or truly same
            for di, dj in zip(range(i1, i2), range(j1, j2)):
                rows.append(pair(fa[di], fb[dj]))
        elif tag == "delete":
            rows += [removed(fa[di]) for di in range(i1, i2)]
        elif tag == "insert":
            rows += [added(fb[dj]) for dj in range(j1, j2)]
        elif tag == "replace":
            rows += [removed(fa[di]) for di in range(i1, i2)]
            rows += [added(fb[dj]) for dj in range(j1, j2)]
    return rows

Every node ends up in one of four buckets — same / changed / added / removed — and the strategy flips you actually care about stop hiding behind shifting cost numbers.

One more deliberate choice: don't put tree depth in the shape key. Inserting a wrapper node bumps the depth of everything beneath it; if depth were part of the key, that one insertion would re-diff the whole subtree. Leaving depth out lets the sequence alignment keep the rest paired up, and you still carry depth along just for indentation when you render.

The honest caveat

Shape-key alignment is a heuristic for human eyeballing, not a proof. A self-join — two scans of the same table — keys identically, so the two can mis-pair; difflib aligns by sequence order, so an unusual reordering can put the wrong nodes together. It's right the vast majority of the time and it makes "what changed?" legible in a glance, but if it ever looks wrong, fall back to the full node detail. And remember plan_rows/total_cost are estimates — only EXPLAIN ANALYZE gives you real counts and timings to diff.

This is one piece of cli2ui — a local-only web UI over the psql commands you keep half-remembering. No AI, no SaaS. It's MIT-licensed on GitHub. The plan-tree diff above is the real plan_diff.py it uses for "before vs after an index." What command do you reach for that should be a button?

DEV Community

Diffing two EXPLAIN plans as trees, not text

Get the tree, not the text

Align first, then classify

The honest caveat

Top comments (0)