DEV Community

Pawan Kukreja
Pawan Kukreja

Posted on • Updated on

[Summary] Chapter#06 "The Internals of PostgreSQL" Vacuum Processing

Vacuum Processing
Vacuum processing is used for two main tasks: removing dead tuples and freezing transaction ids.
To remove dead tuples vacuum processing provides two modes:

Concurrent Vacuum
This vacuum removes dead tuples from each table file and is called simple vacuum and other transactions also can read tables while processing and running.

Full Vacuum
Removes dead tuples and defragments live tuples, the whole file and other transactions cannot access tables while full vacuum is running.

Outline for concurrent Vacuum process.

  • Visibility Map
  • Freeze processing
  • Removing unnecessary clog files
  • Autovacuum demon
  • Full Vacuum

Outline for Concurrent Vacuum

1 Removing dead tuples.

  • Remove Dead tuples and defragment live tuples for each page.
  • Remove index tuples that point to dead tuples.

2 Freezing old txids

  • Freeze old txids of tuples if necessary
  • Update frozen txid related system catalogues
  • Remove unnecessary parts of the clog if possible

3 Others

  • Updates the FSM and VM of processed tables
  • Updates several statistics

First Block:
This block performs freeze processing and removes index tuples that point to dead tuples. PostgreSQL freeze the old tuple list and scan the target table to build a list of dead tuples. List is stored in local memory.
After Scan PostgreSQL removes index tuples by referring to the dead tuple list. If the target index is B-tree, whether the cleanup stage is executed or not is decided by the configuration parameter, and if scanning is incomplete, PostgreSQL proceeds to the next task and then to previous incomplete tasks.

Second Block:
This block removes dead tuples and updates both FSM and VM on a page by page basis.

Third Block:
Third Block performs the cleanup after the deletion of the indexes, and updates the system related to vacuum.

Visibility Map:
Each table has an individual visibility map that holds the visibility of each page in the table file. Visibility determines the dead pages in the tuples.

Freeze Processing:
It has two modes

Lazy Mode:
PostgreSQL calculates the freezelimit txid and freezes tuples whose t_xmin is less than the freezelimit txid.

Eager Mode:
It scans all pages to inspect all tuples in tables, updates relevant system catalogues and removes unnecessary files and pages of the clog.

Improving Freeze Processing in Eager Mode:
VM has information about whether all tuples are frozen in each page or not and when the freeze process is executed in eager mode, pages containing only frozen tuples are skipped.

Autovacuum Daemon:
Vacuum processing has been automated with Autovacuum Daemon, by default it takes 1 min to invoke periodically.

Full Vacuum:
Concurrent Vacuum is not sufficient for essential operations like reducing table size even if many dead tuples are removed, if dead tuples are removed and table size is not reduced this is waste of disk space and has a negative impact on database performance.
Commands to follow if using vacuum command:

  1. Nobody can access the table when Full Vacuum is processing.
  2. At most twice the disk space of the table is used temporarily.

Reference

Top comments (0)