Nile Lazarus

Posted on Jun 23, 2023 • Edited on Jul 21, 2023

Demystifying the Internals of PostgreSQL - Chapter 3

#postgres #sql #database

Welcome back once again to my blog series on the internal workings of PostgreSQL.
In the last post, we covered Chapter 2 'Process and Memory Architecture' of The Internals of PostgreSQL by author Hironubu SUZUKI.
Continuing on with this journey, we will now begin covering Chapter 3 'Query Processing'. Query processing is the most complex subsystem in PostgreSQL. This chapter will help us understand this crucial feature better, especially when it comes to query optimization. So let's jump right in.

Overview

As discussed in chapter 2, all queries issued by a connected client are handled by backend processes. This backend is comprised of five subsystems:

1. Parser:

The parser creates a parse tree from an SQL statement in plain text. It checks only the syntax of the statement and returns an error if a syntax error is found. It does not check the semantic correctness of the query, such as if a table name which does not exist is used in the query.

2. Analyzer/ Analyser:

The analyzer performs a semantic analysis of the parse tree to create a query tree and returns an error if a semantic error is found. The query tree contains metadata of the query such as type of command (SELECT, INSERT, etc.) and leaves which hold data on individual particular clauses.

3. Rewriter:

If there are rules stored in the rule system in pg_rules system catalog, the rewriter uses them to transform the query tree.

4. Planner:

The Planner generates the plan tree that could most effectively be executed from the query tree. It is based purely on cost-based optimization and does not handle any rule-based optimization. It is the most complex subsystem in PostgreSQL and is covered in greater detail in later chapters.

5. Executor:

The Executor executes the query using tables and indexes in the order stored in the plan tree provided by the Planner and then retrieves the actual data by following the outlined steps. It interacts with subsystems like the buffer manager, storage manager, and index access methods to efficiently access and process data, ensuring the correct and efficient execution of the plan.

Cost Estimation in Single-Table Query

Query optimization in PostgreSQL is based on cost which is a dimensionless value. Cost is not an absolute performance indicator but instead is used to compare relative performance of operations.
Costs are estimated by functions defined in costsize.c. All operations have corresponding cost functions such as cost of sequential scan is estimated by cost_seqscan() function.
There are three kinds of costs in PostgreSQL:

start-up: cost expended before the first tuple is retrieved
run: cost expended to retrieve all tuples
total: sum of start-up and run costs

The EXPLAIN command can be used to view the start-up and run costs of each operation.

Sequential Scan

The cost of Sequential Scan is calculated using the following formula:

run_cost = (cpu_tuple_cost + cpu_operator_cost) * Ntuple + seq_page_cost * Npage

where seq_page_cost, cpu_tuple_cost and cpu_operator_cost are set in the postgresql.conf file.

Index Scan

The cost_index() function is used to calculate the total cost of an index scan. As explained above, the total cost is the sum of the start-up and run costs of performing the index scan.

Sort

The cost of sorting operations (e.g. ORDER BY) is calculated by the cost_sort() function. If all the tuples are stored in work_mem, the quicksort algorithm is used. If not, a temporary file is created and the file merge sort algorithm is used.

Creating the Plan Tree of a Single-Table Query

Here, we explore the intricate process of creating a plan tree for a single-table query.
The plan tree is a hierarchical structure composed of nodes representing various operations and their relationships. We discuss the basic steps involved in creating the plan tree, starting from determining the access path to setting up scan nodes, applying filtering conditions, and generating projection information. We also introduce different plan node types, such as SeqScan, IndexScan, Filter, and Projection, and elucidate their roles in executing the query effectively.

These topics are covered in greater detail in the chapter along with examples.

How the Executor Performs

In this section, we unravel the inner workings of the Executor component, which is responsible for executing the generated plan tree. We examine the structure of the Executor, which comprises an initialization phase and a loop phase.
During the loop phase, tuples are fetched, processed, and returned to the user. We dive into the tuple processing mechanism, including the handling of filtering conditions, performing projections, and executing various types of join operations. Additionally, we explore the critical role played by the buffer manager in managing data pages and caching frequently accessed data for efficient query execution. Lastly, we shed light on the significance of transaction management in ensuring data consistency throughout the query execution process.

Join Operations

PostgreSQL supports three join operations: nested loop join, merge join, and hash join. These three join methods can perform all join operations such as INNER JOIN, LEFT/RIGHT OUTER JOIN, FULL OUTER JOIN and so on.

Nested Loop Join

This section explains the nested loop join algorithm, which iterates through the join input using nested loops and evaluates the join condition.
PostgreSQL supports nested loop join as well as five variations of it. Explained in this section are nested loop join, materialised nested loop join, indexed nested loop join and also further variations.

Merge Join

This section discusses the merge join algorithm, which requires pre-sorted inputs and merges them based on the join condition efficiently.
Merge join can only be used in natural and equi joins. The cost is estimated using the initial_cost_mergejoin() and final_cost_mergejoin() functions.
Covered in this section is not only merge join but also materialised merge join and other variations.

Hash Join

The hash join section explores the hash join algorithm, which builds hash tables on the join inputs and uses them to perform the join operation.
Like merge join, this can also only be used in natural joins and equi-joins. Its behaviour in PostgreSQL differs depending on the size of the tables. If the target table is small enough, PostgreSQL will implement a simple two-phase in-memory hash join. Otherwise, a hybrid hash join with the skew method is used.
This section covers the in-memory hash join, hybrid hash join with skew, and join access paths and join nodes. I highly recommend reading the chapter for better understanding with examples and illustrations included.

Creating the Plan Tree of Multiple-Table Query

This section focuses on the process of creating a plan tree for multiple-table queries in PostgreSQL and provides a deeper understanding of the steps involved in creating an optimal plan tree for multiple-table queries, considering factors like join ordering, join types, and additional plan nodes. It includes the following subsections:

Preprocessing

The subquery_planner() function defined in planner.c invokes preprocessing. Following are parts of the preprocessing for a multiple-table query:

Planning and Converting CTE
Pulling Subqueries Up
Transforming an Outer Join to an Inner Join

Join Ordering

This section discusses the importance of join ordering in query planning and explores the different strategies used for determining the order in which tables are joined.

Join Type Selection

Here we examine the selection of join types, such as inner join, outer join, and cross join, based on the join conditions and input data.

Other Plan Nodes

Covered here are various other plan nodes involved in multiple-table query processing, such as aggregation, sorting, and limit/offset operations.

DEV Community