The fork visible in 1.1.1 (simple query protocol on one side, extended on the other) is the subject of this section, one level deeper. 1.1.1 set the skeleton: simple is one message, extended is four. The job here is to show how that split translates into four distinct outcomes: plan reuse, parameter safety, pipelining, and error handling.
Message sequence: the shape of one cycle is different
Putting the message sequences side by side makes the difference visible at a glance.
Simple:
```
Client                              Server
  │                                   │
  │── 'Q' (SQL text) ────────────────▶│ parse → analyze/rewrite → plan
  │                                   │ → create portal → execute → drop portal
  │◀── RowDescription, DataRow*, CommandComplete, ReadyForQuery
```
Extended:
```
Client                              Server
  │                                   │
  │── 'P' (SQL template) ────────────▶│ parse + analyze, store prepared statement
  │◀── ParseComplete                  │
  │                                   │
  │── 'B' (parameter values) ────────▶│ choose plan, create portal
  │◀── BindComplete                   │
  │                                   │
  │── 'E' (execute) ─────────────────▶│ run portal, send rows
  │◀── DataRow*, CommandComplete      │
  │                                   │
  │── 'S' (Sync) ────────────────────▶│ close transaction
  │◀── ReadyForQuery                  │
```
Simple finishes one cycle in a single message. Extended slices the cycle into four messages, and that slicing is what produces the four capabilities below.
Capability one: execution plans get reused
The central concept that lets extended split the stages is the prepared statement.
A prepared statement is a SQL template that has already been parsed and analyzed. The places where values would go are left blank with placeholders like $1, $2, and at execution time only the actual values get plugged into those slots. Take INSERT INTO users (id, name) VALUES ($1, $2). Once you turn that into a prepared statement, you can run it later by sending only the values: (1, 'Alice'), (2, 'Bob'). The full SQL text isn't reparsed each time. Give it a name and it becomes a named prepared statement you can call back during the session. Send it without a name and it's an unnamed prepared statement, automatically discarded the moment the next 'P' arrives.
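The named/unnamed lifecycle described above can be sketched as a toy model. The `PreparedStatementStore` class here is purely illustrative (it is not how the PG backend stores statements internally); it only demonstrates the replacement rule for the unnamed statement:

```python
# Toy model of the prepared-statement lifecycle (illustration only,
# not PostgreSQL internals). Named statements persist for the session;
# the unnamed statement ("") is replaced by the next Parse without a name.
class PreparedStatementStore:
    def __init__(self):
        self.statements = {}  # name -> SQL template

    def parse(self, sql, name=""):
        # An unnamed 'P' (empty name) silently overwrites the
        # previous unnamed statement; a named 'P' coexists with it.
        self.statements[name] = sql

store = PreparedStatementStore()
store.parse("SELECT 1")                    # unnamed
store.parse("SELECT 2")                    # replaces the unnamed one
store.parse("SELECT $1", name="get_user")  # named: survives later Parses
store.parse("SELECT 3")                    # replaces the unnamed one again

print(store.statements[""])          # SELECT 3
print(store.statements["get_user"])  # SELECT $1
```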
The four messages of the extended protocol are exactly that flow, sliced.
| Message | What it does |
|---|---|
| 'P' Parse | Take the SQL template, finish parse and analysis, store as a prepared statement |
| 'B' Bind | Bind actual parameter values to the prepared statement and prepare for execution (create a portal) |
| 'E' Execute | Run the prepared portal and send result rows |
| 'S' Sync | End of the cycle, send ReadyForQuery |
What this means is that the same prepared statement can be re-executed many times with different parameters by repeating just 'B' + 'E'. Take inserting 1,000 users.
```python
# Driver pseudocode: 1000 INSERTs via a prepared statement
stmt = conn.prepare("INSERT INTO users (id, name) VALUES ($1, $2)")
for i in range(1000):
    stmt.execute(i, f"user{i}")
```
conn.prepare(...) corresponds to a single 'P' message. Parsing and analysis of the SQL text happen there. Each of the 1000 stmt.execute(...) calls corresponds to a 'B' + 'E' pair. Parse and analyze run only once on the first call; the remaining 999 do bind and execute only. With simple query, the same INSERT text would be sent 1000 times and reparsed 1000 times.
Internally, a prepared statement is held in a structure called CachedPlanSource, which keeps the raw parse tree and the analysis result. When the same prepared statement gets another 'B' + 'E', the backend starts from the saved CachedPlanSource, only redecides the execution plan, and runs. Parsing and analysis are skipped.
Generic plan vs custom plan
One step further. Plan reuse is real, but to be precise there are two kinds of plan.
- Custom plan: recomputed every time using the bound parameter values. Helpful when the optimal path differs by value. Take WHERE status = $1. Suppose status = 'pending' matches 1% of rows and status = 'completed' matches 99%. A distribution where the value-by-value ratios are this lopsided is what's usually called a skewed distribution. An index scan is fast for 'pending'; a sequential scan is fast for 'completed'. A custom plan looks at the value on every call and picks the path that fits it. (Plan construction is the entire subject of chapter 1.4; the kinds and behavior of scan nodes are covered in 1.5.2.)
- Generic plan: planned once without knowing the parameters and cached. Every EXECUTE from then on reuses the cached plan, so from each call's point of view the cost of "planning this one" is zero. The trade-off is that the same path is forced for every parameter value.
PostgreSQL decides between the two on every EXECUTE. The decision function is choose_custom_plan(), and the default policy is:
- For the first 5 EXECUTEs, always use a custom plan. Collect actual cost measurements.
- From the 6th onward, compare the average custom plan cost against the generic plan cost. The custom average includes the cost of planning every time, while the generic side has that cost as zero (for the reason above), so the comparison is intentionally asymmetric.
- If generic is cheaper, switch to generic. Otherwise, stay on custom.
The decision can be forced via the plan_cache_mode GUC. auto (default) runs the policy above; force_custom_plan always uses custom; force_generic_plan always uses generic.
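The policy above can be condensed into a short sketch. This is a simplified Python rendering of the default behavior, not the actual C implementation in plancache.c (the real function compares planner cost estimates and handles more cases):

```python
# Simplified sketch of the choose_custom_plan() decision policy.
# Returns True to plan a fresh custom plan, False to reuse the generic plan.
def choose_custom_plan(num_custom_plans, avg_custom_cost, generic_cost,
                       plan_cache_mode="auto"):
    if plan_cache_mode == "force_custom_plan":
        return True
    if plan_cache_mode == "force_generic_plan":
        return False
    # First 5 executions: always custom, to collect real cost measurements.
    if num_custom_plans < 5:
        return True
    # From the 6th on: the custom average already includes per-call
    # planning cost, while the generic plan's planning cost counts as zero,
    # so the comparison is intentionally asymmetric.
    return avg_custom_cost < generic_cost

# First five calls are custom regardless of cost:
assert choose_custom_plan(0, 0.0, 0.0) is True
# 6th call, generic cheaper than the custom average: switch to generic.
assert choose_custom_plan(5, avg_custom_cost=120.0, generic_cost=100.0) is False
# 6th call, custom still cheaper: stay on custom.
assert choose_custom_plan(5, avg_custom_cost=90.0, generic_cost=100.0) is True
```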
Having worked on another RDBMS engine, the first time I saw the "5 custom plans, then start comparing" rule I spent a while looking for the reason behind that 5. The conclusion: it's an arbitrary constant. The PG source comment literally says "until we have done at least 5 (arbitrary)". Other engines tend to be stricter with plan cache policy (e.g. lock the first plan in as the generic one) and let you override via a knob, while PG chose to decide dynamically on every call. The result is that a PG prepared statement isn't simply a "plan cache"; it's "automatic switching driven by statistics." This is one reason even ORM code that uses prepared statements automatically can show much less plan caching than people expect: a statement called fewer than 5 times gets a freshly computed custom plan every time.
Capability two: SQL injection is structurally blocked
In simple query, putting a parameter into a query means embedding the value inside the SQL text, something like f"SELECT * FROM users WHERE id = {user_input}". If user_input is untrusted, you've just opened the door to SQL injection.
Extended separates the SQL template from the parameter values into different messages. 'P' carries only the template, like SELECT * FROM users WHERE id = $1. 'B' carries the values that fill those slots, in binary or text form. Those values never go through the SQL parser. They're plugged into the already-parsed plan tree as data.
When JDBC PreparedStatement, libpq PQexecParams, or psycopg2 supports ? or $1 placeholders, that's the path being used internally. The real mechanism for SQL injection prevention lives here. It isn't "remember to escape on the client"; it's a structure where the parser has no chance to interpret a user-supplied value as a SQL token.
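The structural difference is easy to demonstrate in any driver that supports placeholders. The illustration below uses Python's stdlib sqlite3 so it runs without a PostgreSQL server; sqlite3 uses ? where PG drivers use $1 or %s, but the separation of template and value is the same idea:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "Alice"), (2, "Bob")])

malicious = "1 OR 1=1"  # attacker-controlled input

# String interpolation: the input is spliced into the SQL text and
# parsed as SQL; the injected OR clause matches every row.
rows_injected = conn.execute(
    f"SELECT * FROM users WHERE id = {malicious}"
).fetchall()

# Placeholder: the input travels as data and never reaches the SQL
# parser; bound as a value, "1 OR 1=1" matches no id.
rows_safe = conn.execute(
    "SELECT * FROM users WHERE id = ?", (malicious,)
).fetchall()

print(len(rows_injected), len(rows_safe))  # 2 0
```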
Capability three: messages can be batched (pipelining)
Simple sends ReadyForQuery back the moment each 'Q' is processed. The client can't send the next query until that response arrives. One query equals one round-trip.
Extended only sends ReadyForQuery when an 'S' (Sync) arrives. That means a sequence like 'P', 'B', 'E', 'B', 'E', 'B', 'E', 'S' can go out as a single batch. 100 INSERTs in one round-trip. In environments with significant network latency (cross-region cloud calls, for instance), the throughput difference is large.
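At the byte level, "batch" just means concatenating frames into one write. The sketch below frames Parse and Sync messages per the v3 wire format (1-byte type tag, then an Int32 length that includes the length field itself, then the payload) and packs them into a single buffer; Bind and Execute are omitted for brevity, and no real connection is involved:

```python
import struct

def frame(tag: bytes, payload: bytes) -> bytes:
    # v3 protocol framing: type tag + Int32 length (self-inclusive) + payload.
    return tag + struct.pack("!I", 4 + len(payload)) + payload

def parse_msg(sql: str, name: str = "") -> bytes:
    # 'P': statement name and query text (NUL-terminated),
    # then an Int16 count of pre-specified parameter type OIDs (0 here).
    payload = name.encode() + b"\x00" + sql.encode() + b"\x00" + struct.pack("!H", 0)
    return frame(b"P", payload)

def sync_msg() -> bytes:
    return frame(b"S", b"")

# Parse plus Sync concatenated: the whole batch can go out in a single
# socket write, and the server sends ReadyForQuery once, after the Sync.
batch = parse_msg("SELECT $1") + sync_msg()
print(batch[-5:])  # b'S\x00\x00\x00\x04'
```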
Built on top of this mechanism, PG 14 introduced an official pipeline mode in libpq (PQpipelineSync, PQenterPipelineMode, etc.). The wire-level capability existed before, but the libpq client API for it wasn't clean.
Capability four: a partial error doesn't break the whole batch
Simple, on error, immediately sends ErrorResponse plus ReadyForQuery. The cycle closes right away and the backend is ready for the next 'Q'. As noted above, simple is a 1-round-trip structure, so when the backend returns to normal mode there's nothing queued in the buffer behind the failed message. Closing out and waiting for the next 'Q' is enough.
Where extended runs into real trouble is the batch case. As we saw in capability three, a typical client pushes 'B', 'E', 'B', 'E', ..., 'S' into the wire all at once. Suppose you send 100 INSERTs by pipelining: one 'P', followed by 100 pairs of 'B' + 'E' and a single 'S' all line up in the backend's buffer. While the backend is processing the 1st 'B', the 51st and 70th messages are already sitting in that buffer waiting their turn.
Now suppose the 50th 'B' fails with something like a unique violation. If the backend behaved like simple (immediately sending ErrorResponse + ReadyForQuery and returning to normal mode), it would pull the 51st 'B' out of the buffer and start processing it next. But that 51st 'B' was sent by the client under the assumption that the first 50 had succeeded. The transaction is already aborted, so processing the 51st errors out too. Same for 52, 53, ..., 100. The client ends up tracking the original error plus 50 more downstream errors.
PG avoids this chaos with a different strategy. The moment an error occurs, the backend enters a special state called ignore_till_sync. While in that state, every message that arrives is dropped without being processed until the client explicitly sends an 'S' (Sync). No additional error responses go out. Once 'S' arrives, the backend finally sends ReadyForQuery and starts accepting messages normally again.
The result is that the client receives exactly two responses: one ErrorResponse (the 50th failure) and one ReadyForQuery (in reply to 'S'). A clean boundary forms: "the batch failed somewhere, and everything past that point was discarded." ignore_till_sync is, in essence, the byproduct that makes pipelining safe.
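The rule can be modeled as a small state machine. This is a toy simulation of the behavior described above, not backend code:

```python
# Toy simulation of ignore_till_sync: after an error, every message is
# silently dropped until Sync arrives, which gets a single ReadyForQuery.
def run_backend(messages, fail_at):
    responses = []
    ignore_till_sync = False
    for i, msg in enumerate(messages):
        if ignore_till_sync:
            if msg == "S":
                responses.append("ReadyForQuery")
                ignore_till_sync = False
            continue  # everything else is discarded without a response
        if msg == "S":
            responses.append("ReadyForQuery")
        elif i == fail_at:
            responses.append("ErrorResponse")
            ignore_till_sync = True
        else:
            responses.append({"B": "BindComplete", "E": "CommandComplete"}[msg])
    return responses

# Three Bind/Execute pairs then Sync; the second Bind fails.
msgs = ["B", "E", "B", "E", "B", "E", "S"]
print(run_backend(msgs, fail_at=2))
# ['BindComplete', 'CommandComplete', 'ErrorResponse', 'ReadyForQuery']
```

The client sees exactly one ErrorResponse and one ReadyForQuery, no matter how many messages were queued behind the failure.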
All four in one table
Compressing the four capabilities into a single comparison.
| Area | Simple | Extended |
|---|---|---|
| Message count | 1 ('Q') | 4+ ('P', 'B', 'E', 'S') |
| Plan reuse | None (parse + plan every time) | Yes (CachedPlanSource + auto generic/custom) |
| Parameters | Inline in SQL text | Separated as data at bind time |
| SQL injection | Client is responsible for escaping | Prevented at the protocol level |
| Round-trips | 1 per query | Batched (1 per Sync) |
| Error handling | Immediate ReadyForQuery | ignore_till_sync (wait until Sync) |
What this means in practice
First, don't assume that "the ORM uses prepared statements" means you're getting full plan caching. PG plans a custom plan for the first 5 EXECUTEs of every prepared statement. If a statement is called only once or twice inside a short transaction, the plan caching benefit is essentially zero. The real benefit shows up in workloads that call the same prepared statement dozens to hundreds of times with different parameters. The ratio of calls to plans in pg_stat_statements, plus a forced plan_cache_mode setting, are the two diagnostic tools.
Second, the answer to "why isn't my prepared statement going generic?" is the wall of 5. Forcing plan_cache_mode = force_generic_plan brings planning cost to zero but locks every parameter value to the same path. With skewed data this can actually be slower. The opposite, force_custom_plan, pays planning cost every time. The default auto, which decides dynamically from the 6th call, is usually safest, but there are environments where explicitly choosing generic is worth the GUC tweak. For example, environments where prepared statements have very short lifetimes due to PgBouncer transaction pooling.
Third, in environments with significant network latency, pipelining is the real lever. Cross-region RDS calls in the cloud, or even same-region setups where there's millisecond-level latency between application and database, will turn 100 simple INSERTs into 100 round-trips. libpq pipeline mode, or JDBC addBatch() + executeBatch(), can collapse that to a single round-trip. Just keep in mind that the error-handling complexity goes up (you need to understand what ignore_till_sync means), so it pays to design batch-level transaction boundaries and a retry policy at the same time as the pipelining itself.