Closing Chapter 1: From Query to Data

#postgres #database #internals #query

We opened Chapter 1 with a single line, SELECT * FROM users WHERE id = 1. For that line to leave the client and come back as a result row, the PostgreSQL backend went through five stages. First it decided which processing path the message should take; then the parser and analyzer turned the text into a tree and gave it meaning from the catalog. The rewriter expanded views and injected policies to transform the tree, the planner weighed the possible execution paths by cost and picked the cheapest one, and the executor followed that plan, pulling up one tuple at a time and sending them back to the client.

Chapter 1 was a story about how a query is processed. What tree a given SQL becomes, what plan it turns into, in what order it runs. From start to finish, a chain of logical transformations.

But what every one of those stages ultimately deals with is data. The executor pulls up tuples, yet where on disk those tuples lie and in what shape, how they come up into memory, Chapter 1 never asked. When the planner judged an index scan cheaper than a sequential scan, it never opened up what that index physically is. Chapter 1 followed only the logical journey of a query, leaving untouched the substance of the data that journey stands on.

Chapter 2, Storage & Access Methods, opens up that substance. In what unit data sits on disk (page), where disk and memory meet (buffer manager), where and how a row survives (heap), and how that row is found quickly (B-tree and the specialized indexes). The very tuple the planner weighed by cost and the executor pulled up in Chapter 1, where it actually came from and how it came to be there, is what Chapter 2 reveals.

If Chapter 1 was the logical life of a query, Chapter 2 is the physical dwelling of data. We now look at how the data a query reaches for actually lives on disk.

DEV Community

Closing Chapter 1: From Query to Data

Top comments (0)