DEV Community

Cover image for 1.1.1 Life of a Query
JoongHyuk Shin
JoongHyuk Shin

Posted on • Edited on

1.1.1 Life of a Query

A single line of SQL goes through five stages before it comes back as a result: message dispatch, parser + analyzer, rewriter, planner, executor. This section follows the broad skeleton of how each stage is implemented in the code and in what order the stages get called. The detailed mechanics of each stage are unpacked in later chapters.

In this book, a statement refers to a single unit of SQL command that the client sends to the backend. One SELECT * FROM users, one INSERT INTO users ..., each is one statement. A single request may carry several statements joined by ;, and in that case each is handled as a separate statement. Every section from here on uses "statement" in this sense.

One backend process owns one query

Every time a client connects, PostgreSQL forks a backend process for it (the parent is postmaster). That process stays alive until the client disconnects, and it handles every query that client sends, by itself. Unlike the thread-pool model common in other RDBMSs, PG uses one OS process per connection. Why PG chose this model is taken up separately in chapter 6.

The actual entry point of that backend is a function called PostgresMain. The name is grand; what it does is unexpectedly simple. Two preparations, that's all.

First, it installs signal handlers. Signals are asynchronous notifications the OS delivers to a process (for example, SIGTERM is a request to shut down, SIGUSR1 is for PG-internal communication). A backend has to react to signals from postmaster and from other backends, so each signal is wired to a handler ahead of time. Signals and IPC (inter-process communication) in general are covered in chapter 6.

Second, it initializes the transaction system. Every SQL statement in PG, even without an explicit BEGIN, runs inside some transaction. The transaction system is the core PG machinery that tracks BEGIN/COMMIT boundaries, MVCC visibility, XID assignment, and so on. Transactions and MVCC are the subject of all of chapter 3. For now, it's enough to know that this machinery is set up before the backend ever sees a SQL statement.

Once those two preparations are done, the real work of the backend begins. An infinite loop.

for (;;)
{
    ...
    ReadyForQuery(whereToSendOutput);
    ...
    firstchar = ReadCommand(&input_message);
    ...
    switch (firstchar)
    {
        case PqMsg_Query:        // 'Q', simple query
            exec_simple_query(query_string);
            break;
        case PqMsg_Parse:        // 'P', extended: parse
            exec_parse_message(...);
            break;
        case PqMsg_Bind:         // 'B', extended: bind
            exec_bind_message(&input_message);
            break;
        case PqMsg_Execute:      // 'E', extended: execute
            exec_execute_message(portal_name, max_rows);
            break;
        case PqMsg_Sync:         // 'S', end of a cycle
            finish_xact_command();
            send_ready_for_query = true;
            break;
        ...
    }
}
Enter fullscreen mode Exit fullscreen mode

This loop is the entire life of a backend.

"Announce that I'm ready, read one message, dispatch on its type." Repeat forever. When the client closes the connection, an 'X' (Terminate) message arrives, the loop exits, and the process dies.

Here the processing path splits into two. The 'Q' path is one road, and 'P' / 'B' / 'E' is the other three. That split is the difference between the simple query protocol and the extended query protocol (the conventions by which the client and the backend exchange messages).

Simple vs extended

Simple is the case where a single message contains the SQL text in full. Type SELECT 1; into psql and hit enter, and that's what flies across the wire. The backend receives that one message and runs the full five-stage cycle (dispatch, parser + analyzer, rewriter, planner, executor) before returning the result.

Extended does the same job but splits it into four messages ('P', 'B', 'E', 'S'). Splitting the stages opens up plan reuse, parameter safety, and pipelining. The semantic differences between the two protocols and how they play out in practice are unpacked in section 1.1.2.

Optimizable vs utility

Everything described so far assumes optimizable statements: SELECT/INSERT/UPDATE/DELETE. These have paths to optimize with a cost model. The planner decides between sequential and index scan, hash join and nested loop.

But statements like CREATE TABLE, VACUUM, SET, and BEGIN (the utility statements) are different. There's nothing for a cost model to optimize. They're DDL or system commands, with no path to choose. In that case the planner produces only an empty shell of a plan and hands the actual work to a utility-statement handler. The executor never gets called on this path.

The detailed branching is the subject of 1.1.3. The takeaway here is just one thing: not every query in PG goes through the planner.

The big picture

We can now compress the journey of a SQL line into a single diagram.

A single line of SQL goes through five stages inside exec_simple_query, parse to analyze+rewrite to plan to portal execution, comes back as a result, and returns to the top of the main loop

Each box in this diagram corresponds to a chapter in the book. 1.2 is parser and analyzer, 1.3 is rewriter, 1.4 is planner, 1.5 is the executor. All of part 1 is essentially one zoomed-in view of this diagram.

Working on another RDBMS engine, I once found this aspect of PG surprising. PG accepts a multi-statement query like SELECT 1; SELECT 2; as a single simple-query message. What's even more surprising is the transaction handling. Without an explicit BEGIN/COMMIT, all those statements get bundled into a single implicit transaction block, and if even one of them fails, the whole batch rolls back.

At first I wondered if this might be standard behavior. Comparing the client protocols of other major databases made it clear this is a PG-specific decision. MySQL has CLIENT_MULTI_STATEMENTS off by default, so multi-statement queries are simply rejected (you have to flip the flag explicitly because of SQL injection risk). Even with the flag on, statements are processed sequentially, and because autocommit is the default, each one commits as its own transaction. Oracle accepts only one statement per OCI call, so to bundle multiple statements you have to wrap them in an anonymous PL/SQL block (BEGIN ... END;). SQL Server accepts multiple statements in a T-SQL batch, but atomic handling still requires an explicit BEGIN TRANSACTION. None of the three does what PG does: bundle automatically as soon as the message arrives.

What this means in practice

This five-stage skeleton turns out to be the foundation for two diagnostic tools you'll use in operations.

First, you can see exactly where EXPLAIN's output comes from. EXPLAIN runs only as far as stage 4 (plan); it skips stage 5 (execute). EXPLAIN ANALYZE actually runs through stage 5 and measures it. That's why EXPLAIN ANALYZE produces real load and shouldn't be casually run in production: an EXPLAIN ANALYZE UPDATE ... actually updates rows. The familiar BEGIN; EXPLAIN ANALYZE UPDATE ...; ROLLBACK idiom exists for exactly this reason.

Second, the fact that one backend means one process means one query at a time explains why connection pooling matters so much. A backend's main loop is essentially single-threaded. While one client runs a long query, that backend can't do anything else. Connection counts therefore drive memory and scheduling costs linearly, and a pooler like PgBouncer becomes effectively mandatory. The answer to "why are PostgreSQL connections so expensive?" lives inside this one-line main loop.

Top comments (0)