DEV Community

Cover image for Query Processing in PostgreSQL ( Part 1 )
David George
David George

Posted on

Query Processing in PostgreSQL ( Part 1 )

Always don't forget that,
the query processing is the most complicated system in PostgreSQL.

  • it all started when the PostgreSQL server side receives a query from the client.
  • Recall from the last chapter ( chapter 2 ), the process type that's responsible for handling client/end-user requests is the "Backend process".

Backend process System consists of 5 subsystems,

  • Parser
  • Analyzer
  • Rewriter
  • Planner
  • Executor

N.B
in this blog, we will focus on this Backend type process and its subsystems.

Take a look at the big picture of the query processing Journey before getting more focus on each subsystem and how it operates and what is its target.

Query processing Journey Briefly,

  • as we mentioned before it all started when the PostgreSQL server side receives a query from the client/end user.
  • the backend process is created and starts working based on its subsystems.
  • first, we meet the "parser" subsystem, which takes the plain text query sent from the client side, checks it for any "Syntax" related errors, and generates the "Parse Tree".
  • then we meet the "Analyzer" subsystem, which receives the "Parse tree" from the parser, checks for any "Semantics" related errors, and generates the "Query tree".
  • then we meet the "Rewriter" subsystem, which receives the "Query tree" from the analyzer, and transform it according to the "Rule system".
  • then the "Planner" subsystem, receives the "Query tree" from the rewriter ( after transforming ), and generates the "Query plan tree" that aims to optimize this query execution as much as possible.
  • then the "Executor" subsystem, receives the "Query plan tree", and starts executing based on it.

Formally,

1. Parser ( Parser subsystem )

  • input: client SQL statements
  • Output: Parse Tree
  • checks for any "Syntax" errors, and report if happens.
  • don't check for any "Semantic" errors ( will pass this step if exists )
  • parse tree has a root node "SelectStmt" structure that is defined in the "parsenodes. h" file.

2. Analyzer ( Analyzer subsystem )

  • input: Parse tree ( that's generated by Parser subsystem )
  • Output: Query tree
  • checks for any "Semantic" errors, and report if happens.
  • query tree has a root node "Query" structure that is defined in the " parsenodes.h" file.

N.B

  • semantics error is for example querying a table or attribute that doesn't exist in your database.

3. Rewriter ( Rewriter/Realization subsystem )

  • input: Query tree ( that's generated by Analyzer subsystem )
  • Output: Enhanced Query tree
  • it realizes the "Rule system", and transforms/enhances the provided query tree

N.B

  • Views in Pg. , implemented using a "rule system".
  • Views also known as "Virtual tables", are used usually to simplify complex queries ( define a complex query once as view and reuse easily ).

4. Planner ( Planning / Optimizer subsystem )

  • input: Enhanced Query tree ( that's generated by Rewriter subsystem )
  • Output: planned query tree ( Query plan tree )
  • its target is to generate the "best" query plan which can be processed by the executor subsystem most effectively.
  • Generally planner optimization technique varies, there are
    • pure cost-based optimization
    • rule-based optimization
    • hints
    • and more.
  • in PG, the planner subsystem is based totally on pure cost-based optimization and doesn't support rule-based optimization and hints.

N.B

  • Hints
    • options or strategies specified for enforcement by the SQL Server query processor on SELECT, INSERT, UPDATE, or DELETE statements.
    • they override any execution plan the query optimizer might select for a query.
    • it's recommended only to use hints as a last resort for experienced developers and database administrators.
    • doesn't support in PostgreSQL, and will not be supported, if you want to use it you will need a middleware as (pg_hint_plan)

5.Executor ( Executing subsystem )

  • input: planned query tree ( Query plan tree ) ( that's generated by Planner subsystem )
  • Output: Executing the provided query using the set plan by the planner.

N.B
Different trees we mention :

  • Parse tree ( comes from the parser )
  • Query tree ( comes from the analyzer )
  • Enhanced/Transformed Query tree ( comes from the rewriter )
  • Query plan tree ( comes from the planner )

Top comments (0)