Query Processing in PostgreSQL ( Part 1 )

#postgres #database

Always don't forget that,
the query processing is the most complicated system in PostgreSQL.

it all started when the PostgreSQL server side receives a query from the client.
Recall from the last chapter ( chapter 2 ), the process type that's responsible for handling client/end-user requests is the "Backend process".

Backend process System consists of 5 subsystems,

Parser
Analyzer
Rewriter
Planner
Executor

N.B
in this blog, we will focus on this Backend type process and its subsystems.

Take a look at the big picture of the query processing Journey before getting more focus on each subsystem and how it operates and what is its target.

Query processing Journey Briefly,

as we mentioned before it all started when the PostgreSQL server side receives a query from the client/end user.
the backend process is created and starts working based on its subsystems.
first, we meet the "parser" subsystem, which takes the plain text query sent from the client side, checks it for any "Syntax" related errors, and generates the "Parse Tree".
then we meet the "Analyzer" subsystem, which receives the "Parse tree" from the parser, checks for any "Semantics" related errors, and generates the "Query tree".
then we meet the "Rewriter" subsystem, which receives the "Query tree" from the analyzer, and transform it according to the "Rule system".
then the "Planner" subsystem, receives the "Query tree" from the rewriter ( after transforming ), and generates the "Query plan tree" that aims to optimize this query execution as much as possible.
then the "Executor" subsystem, receives the "Query plan tree", and starts executing based on it.

Formally,

1. Parser ( Parser subsystem )

input: client SQL statements
Output: Parse Tree
checks for any "Syntax" errors, and report if happens.
don't check for any "Semantic" errors ( will pass this step if exists )
parse tree has a root node "SelectStmt" structure that is defined in the "parsenodes. h" file.

2. Analyzer ( Analyzer subsystem )

input: Parse tree ( that's generated by Parser subsystem )
Output: Query tree
checks for any "Semantic" errors, and report if happens.
query tree has a root node "Query" structure that is defined in the " parsenodes.h" file.

N.B

semantics error is for example querying a table or attribute that doesn't exist in your database.

3. Rewriter ( Rewriter/Realization subsystem )

input: Query tree ( that's generated by Analyzer subsystem )
Output: Enhanced Query tree
it realizes the "Rule system", and transforms/enhances the provided query tree

N.B

Views in Pg. , implemented using a "rule system".
Views also known as "Virtual tables", are used usually to simplify complex queries ( define a complex query once as view and reuse easily ).

4. Planner ( Planning / Optimizer subsystem )

input: Enhanced Query tree ( that's generated by Rewriter subsystem )
Output: planned query tree ( Query plan tree )
its target is to generate the "best" query plan which can be processed by the executor subsystem most effectively.
Generally planner optimization technique varies, there are
- pure cost-based optimization
- rule-based optimization
- hints
- and more.
in PG, the planner subsystem is based totally on pure cost-based optimization and doesn't support rule-based optimization and hints.

N.B

Hints
- options or strategies specified for enforcement by the SQL Server query processor on SELECT, INSERT, UPDATE, or DELETE statements.
- they override any execution plan the query optimizer might select for a query.
- it's recommended only to use hints as a last resort for experienced developers and database administrators.
- doesn't support in PostgreSQL, and will not be supported, if you want to use it you will need a middleware as (pg_hint_plan)

5.Executor ( Executing subsystem )

input: planned query tree ( Query plan tree ) ( that's generated by Planner subsystem )
Output: Executing the provided query using the set plan by the planner.

N.B
Different trees we mention :

Parse tree ( comes from the parser )
Query tree ( comes from the analyzer )
Enhanced/Transformed Query tree ( comes from the rewriter )
Query plan tree ( comes from the planner )

DEV Community

Query Processing in PostgreSQL ( Part 1 )

Top comments (0)

Read next

Data API for Amazon Aurora Serverless v2 with AWS SDK for Java - Part 6 Comparing cold and warm starts between Data API and JDBC

Handling key-value storage efficiently with Unstorage

Securely connect to an Amazon RDS

Mastering Relational Database Design: A Comprehensive Guide