After the Analyze stage from 1.2.2, the raw parse tree has become a Query tree. This section is about the shape of that result. How it differs from a raw parse tree, what abstraction sits at its core, and what assumptions the next stages (rewriter and planner) get to start from when they receive a Query tree.
Tokens descend into catalog coordinates
A raw parse tree and a Query tree are both trees, but they sit on different planes. A raw parse tree mirrors the syntactic structure of SQL text, so its nodes look a lot like tokens. Where a table name appeared, you find a node carrying that name as a token. Where a column appeared, you find another name token. The types of expressions are not yet determined.
A Query tree is what those tokens look like after they have been resolved against the catalog. The same positions hold different representations now. Where a table name was, there is now an RTE carrying that table's OID and lock mode. Where a column name was, there is now an index pair: "the M-th attribute of the N-th item in rtable". Where an operator token sat, there is now an OID pointing to the function that implements that operator. The same shape of tree carries the same kinds of information at the same positions, but the information has dropped one level down from surface SQL text into catalog coordinates.
For SELECT id, name FROM users WHERE age > 18, the contrast looks like this:
raw parse tree (sketch) Query tree (sketch)
SelectStmt Query
├─ targetList ├─ commandType = CMD_SELECT
│ ├─ ColumnRef "id" ├─ rtable
│ └─ ColumnRef "name" │ └─ RangeTblEntry
├─ fromClause │ (relid = 16384, RTE_RELATION)
│ └─ RangeVar "users" ├─ jointree (FROM + WHERE)
└─ whereClause ├─ targetList
└─ A_Expr │ ├─ TargetEntry(resno=1, expr=Var(rt=1, att=1))
('>', ColumnRef "age", │ └─ TargetEntry(resno=2, expr=Var(rt=1, att=2))
A_Const 18) └─ qual: OpExpr(opno=...,
args=[Var(rt=1, att=3), Const(int4=18)])
The descent into catalog coordinates is right there in the picture. Table names become OIDs, column names become pairs of (rtable index, attribute number), operator tokens become typed expression nodes. The rewriter and planner work on those coordinates and have nothing left to resolve.
PG's decision to fold every kind of FROM item into a single enum looks unremarkable at first, but its effect compounds. If each kind of input were a different node type, the rewriter and planner would scatter kind-specific handling across every branch they touch. With one node type and an enum tag, transformations that cut across many places (think "scan every RTE and check locks", "inject an RLS policy on every RTE") fit into one place consistently.
Range table: every FROM input in one place
The Query carries a single List that holds every item appearing in the FROM clause. PG calls this list the range table, and an item in it a RangeTblEntry. A plain table, a subquery (FROM (SELECT ...)), a result of an explicit JOIN, a set-returning function call (FROM generate_series(...)), an inline row set built with VALUES (...), a CTE (WITH ... AS (...)). Whatever form a FROM input takes, it gets folded into a single entry in this list.
The kind tag lives in one field, rtekind, on the RTE. There is no separate node type per kind of input; the same RangeTblEntry object carries different rtekind values to distinguish them. The object type is one, and the value of rtekind decides which auxiliary fields are meaningful. For a plain table the OID and lock mode fields matter; for a subquery the sub-Query field matters; and so on.
You can see the payoff of this consolidation in the shape of the rewriter and planner code. Both stages, having received a Query, work to a single model: "take the N items in rtable, glue them together via jointree, and evaluate targetList against the result." A subquery, a JOIN, a function call all flow through the same model. Adding a new kind of input is a matter of adding one line to RTEKind and filling in a few branches.
The range table plays another role. Column references look at it. In a Query tree, a column is not a name; it is two integers, "attribute M of rtable item N". A column reference's meaning is bound to the order of items in the range table. That is why a rewriter or planner that adds, removes, or reorders rtable items must update the indices in existing column references at the same time.
Target list: SELECT, INSERT, UPDATE, RETURNING all in one shape
The other key List on a Query is the target list. The output columns of a SELECT, the values to assign in INSERT/UPDATE, and the columns returned by RETURNING all live in lists of the same shape with the same kind of node. One target list per Query for the SELECT/INSERT/UPDATE body, plus a second one for RETURNING when present.
The shape is one, but its meaning shifts slightly with context. Each entry has a field called resno (short for result number, indicating where the entry lands in the result), and the interpretation of this number depends on whether you are looking at SELECT, INSERT, or UPDATE. Take this table to compare the two cases.
CREATE TABLE users (
id serial PRIMARY KEY, -- attno 1
name text, -- attno 2
age int -- attno 3
);
For SELECT, resno is the ordinal position of the output column.
SELECT name, age FROM users;
This SELECT's target list has two entries. The first, holding the name expression, has resno 1; the second, holding age, has resno 2. They land in the first and second positions of the result set, respectively. Even though name has attno 2 and age has attno 3 in the table, the result positions in SELECT are independent of those attnos: they go 1, 2, in the order they appear in the SELECT list.
For INSERT, resno is the attribute number of the destination column.
INSERT INTO users (name, age) VALUES ('alice', 20);
This INSERT's target list also has two entries, but the first's resno is not 1, the second's not 2. They are 2 and 3, the attnos of name and age in the table definition. The number that lands in resno is not the order in which the columns were written, but the position of the column in the table itself. UPDATE's SET clause follows the same rule: attno in resno.
The fact that resno carries different meanings in different contexts feels odd at first, but folding two meanings into one List buys clear leverage. Expression evaluation code, target list traversal, RETURNING handling all assume one node type. The branching shrinks. The context-dependent meaning is settled when the List is built, and downstream stages just read what is already there.
The target list also carries items invisible to the user. When an expression appears in ORDER BY or GROUP BY but not in the SELECT list, that expression is added to the target list with a "junk" mark. Sorting and grouping need a place to evaluate the value, so the value rides along in the same List. Junk columns get evaluated but stripped from the final output. The visible result is clean from the application side, while internally a few extra columns flow alongside to make sorting work.
Where the Query goes
Right after a Query tree is built, the rewriter takes it. RULE expansion, view expansion, and row-level security (RLS) policy injection all happen on the Query tree at this stage. The output is still a Query tree (or a list of zero or more, when a rule expands one statement into several). Details are covered in chapter 1.3.
The rewriter hands the Query off to the planner. The planner's job is to convert a Query tree into a PlannedStmt: the strategy for how this SQL will be executed. Which indexes to use, what JOIN order to take, whether to go parallel. Chapter 1.4 covers this in depth.
What is interesting is that the next stage, the executor, does not look at the Query tree. The Query definition in parsenodes.h says it in one line: "the Query structure is not used by the executor." The executor's input is a PlannedStmt, and the Query tree, once turned into a plan, does not flow further down. The Query tree is the common currency of parser/analyzer/rewriter/planner, and that is where its lifetime ends.
This separation of stages is reflected directly in how prepared statements work. A prepared statement is a mechanism to avoid going through the parser every time the same SQL runs repeatedly, by holding the analyzed result or plan in memory. There are two layers of holding. One layer holds the Query tree (already resolved against the catalog) in memory; the layer above holds the PlannedStmt that has gone through the planner. What is commonly called the plan cache refers to these two layers together. A held Query tree lets the next execution start from the planner; a held PlannedStmt is handed straight to the executor. If the catalog changes in between, an invalidation message arrives, the held results are discarded, and analysis or planning runs again. Because the per-stage outputs of parser, analyzer, rewriter, planner, and executor are cleanly separated, deciding which layer to invalidate and where to restart is straightforward.
What this means in practice
First, EXPLAIN works in the same shape for SELECT, INSERT, UPDATE, DELETE, and MERGE. Five kinds of statement live in the same Query data structure, and the planner produces a single PlannedStmt regardless of kind. From the application side, you don't have to wonder whether you need a different tool to inspect an INSERT plan than a SELECT plan. From the tooling side, monitoring tools can deal with plans through one interface without branching on statement kind.
Second, INSERT/UPDATE remains safe even when only some columns are written. The resno in the target list carries the attno of the destination column rather than the order in which the user wrote it. The columns the user wrote land in their attno positions; missing columns are filled in with defaults later by the rewriter, which works on the same target list (covered in 1.3.3). This is the structural reason ORM-generated INSERT/UPDATE statements work correctly even when columns are out of order or only partially specified.
Third, the same SQL can produce different Query trees depending on when it was analyzed. Because catalog coordinates get baked into the Query tree, the same SQL text seen at two different times against two different catalog states produces two different trees. A prepared statement that produces different plans across two connections, a plan cache that re-analyzes after a catalog invalidation, the same code pointing at different tables under different search_path settings: all of these emerge on top of this fact. From the application side, this is a single-line summary of where the intuition "same text, same behavior" breaks.
Top comments (0)