Query Processing in PostgreSQL

#sql #database #postgres

This chapter discusses query processing in PostgreSQL, focusing on query optimization, single-table and multiple-table queries, and features like Foreign Data Wrappers and Parallel Query. PostgreSQL's backend comprises five subsystems: Parser, Analyzer/, Rewriter, Planner, and Executor. Query optimization is based on cost estimation for operations, and the planner generates the most effective plan tree from the query tree.
The planner creates a plan tree for single-table queries through preprocessing, finding the cheapest access path, and creating the plan tree from the cheapest path. Access paths are units of processing for estimating costs, and the planner uses a PlannerInfo structure to hold the query tree, relations, and access paths. To get the cheapest access path, the planner estimates costs for all possible access paths and selects the least expensive one.
The executor in PostgreSQL performs single-table queries and join operations, including nested loop join, merge join, and hash join. PostgreSQL supports three join operations: nested loop join with five variations, merge join with four variations, and hash join with a focus on hybrid hash join with skew. Hash join can be used in natural joins and equi-joins and involves a build and probe phase. Hybrid hash join with skew is employed when the inner table is too large to fit in one batch in work_mem.
In multiple-table queries, if the number of tables is less than 12, the planner uses dynamic programming to get the optimal plan; otherwise, it employs the Genetic Query Optimizer. Determining the optimal plan tree through dynamic programming involves creating and estimating costs for RelOptInfo structures in each level. The cheapest paths are calculated efficiently, and examples and explanations for obtaining the cheapest path in a triple-table query are provided.

DEV Community

Query Processing in PostgreSQL

Top comments (0)