DEV Community

Cover image for Summary of Chapter# 7 : "Heap Only Tuple and Index-Only Scans" from the book "The Internals of PostgreSQL" Part-2
Vinay Kumar Talreja
Vinay Kumar Talreja

Posted on • Updated on

Summary of Chapter# 7 : "Heap Only Tuple and Index-Only Scans" from the book "The Internals of PostgreSQL" Part-2

This blog aims to assist you in understanding the final concepts of Chapter:7 [Heap Only Tuple and Index-Only Scans] from the book The Internals of PostgreSQL.

Note: Ensure that you have a thorough understanding of
Chapter 7 Part-1 and basics of PostreSQL before we proceed to Chapter 7 Part-2, as it forms the foundation for our exploration.

So, Let's Continue:

  • PostgreSQL removes dead tuples if possible, as in the pruning process, at an appropriate time. In the document of PostgreSQL, this processing is called defragmentation.

Defragmentation of the dead tuples in PostgreSQL is depicted in figure below:

Image description

  • Note: the cost of defragmentation is less than the cost of normal VACUUM processing because defragmentation does not involve removing the index tuples.

  • HOT reduces the consumption of both indexes and tables of pages; this also reduces the number of tuples that the VACUUM processing has to process.

  • HOT has a good influence on performance because it eventually reduces the number of insertions of the index tuples by updating and the necessity of VACUUM processing.

The Cases in which HOT is not available in PostgreSQL is depicted in figure below:

Image description

  • When the updated tuple is stored in the other page, which does not store the old tuple, the index tuple that points to the tuple is also inserted in the index page. Refer to Figure(a) above.

  • When the key value of the index tuple is updated, the new index tuple is inserted in the index page.


Index-Only Scans

  • To reduce the I/O (input/output) cost, index-only scans (often called index-only access) directly use the index key without accessing the corresponding table pages when all of the target entries of the SELECT statement are included in the index key.

  • This technique is provided by almost all commercial RDBMS, such as DB2 and Oracle. PostgreSQL has introduced this option since version 9.2.

  • PostgreSQL utilizes the visibility map of the target table to address this issue.

  • The visibility map indicates whether all the tuples stored on a page are visible or not.

  • If all tuples on a page are visible, PostgreSQL can rely on the index tuple's key and avoid accessing the corresponding table page
    to check visibility.

  • In such cases, the visibility information provided by the index tuple is sufficient.

  • However, if any tuple on a page is not visible, PostgreSQL needs to read the table tuple pointed to by the index tuple and verify its visibility.

  • This process ensures that the visibility of tuples is accurately determined, either by using the index tuple or by accessing the table data when necessary.

  • Tuple_18’ need not be accessed because the 0th page that stores ‘Tuple_18’ is visible, that is, all tuples including Tuple_18 in the 0th page are visible.

  • Tuple_19’ needs to be accessed to treat the concurrency control because the visibility of the 1st page is not visible.

How Index-Only Scans performs in PostgreSQL is depicted in figure below:

Image description


I hope, this blog has helped you in understanding the final concepts of Heap Only Tuple and Index-Only Scans in PostreSQL.

Check out summary of Chapter : 8 Part-1

If you want to understand PostgreSQL In-Depth.

Top comments (0)