DEV Community

Cover image for Query Processing in PostgreSQL ( Part 2 )
David George
David George

Posted on

Query Processing in PostgreSQL ( Part 2 )

we will summarize the cost estimation of PostgreSQL for "Single-table query"

Recall :

  • The planner subsystem in PostgreSQL is "Cost-based optimization".
  • The cost is just a "Relative" indicator of performance, not an absolute one.

Cost estimation in PostgreSQL

  • cost estimation is done by functions from the "costsize.c" file.
  • each operation has its corresponding function, that is used to measure its performance or calculate its cost.
  • types of cost :
    • 1. Start-up cost
    • 2. Run cost
    • 3. Total cost

Start-up cost

  • the cost expended until the first tuple is fetched.

Run cost

  • the cost expended to fetch all tuples.

Total cost

  • the sum of both start-up and run costs.

N.B

  • both Start-up, Run costs are independent costs since they are calculated independently of any other cost.
  • Total cost is a dependent one since, it's calculated as the sum of both start-up and run costs.

SQL EXPLAIN statement

  • provides information about how PostgreSQL executes statements ( Query plan tree ).
  • also provides cost estimation information, start-up cost, and total cost for each query node ( operation in the query plan ).

N.B

  • Popular operations in query plan tree ( appears using EXPLAIN )
    • sequential (seq.) scan
    • index scan
    • sort

1. Sequential (seq.) scan

  • Sequential scan, scans the entire relation (table) as stored on disk.
  • it's cost estimated by "Cost-seqScan" from "CostSize.c" file we mention before.
  • it has a Start-up cost "Zero"
  • and for the Run cost, there is a defined formula used for its estimation

The formula :

Image description

N.B

  • Here in cost-estimation, we assume the "Worst" case scenario where we consider that all pages will read from storage (neglecting the case of some pages retrieved from the shared buffer)

2. Index scan

  • there are many indexing methods in Pg. for indexing like Btree, Gist, GIN, or BRIN
  • the cost estimated by the "cost_index" function.
  • Start-up cost for the index scan is defined as the cost to read the index pages to access the first tuple in the target table, and estimated using a defined formula,

Image description

  • Run cost for index scan is defined as the cost to read the index pages to access the first tuple in the target table, and estimated using a defined formula,

Image description

Top comments (0)