DEV Community

Shaiby014
Shaiby014

Posted on

Vacuum Processing in PostgreSQL: Chapter 6

As we move towards the conclusion of our blog covering 6 chapters from the Internals of PostgreSQL, here is the last blog covering the 6th chapter of the book covering vacuum processing in PostgreSQL.

Introduction

The performance and integrity of PostgreSQL databases are crucially dependent on vacuum processing. Vacuum processing ensures dependable operation and effective data retrieval by eliminating dead tuples and freezing transaction IDs. In this article, we'll examine the subtleties of vacuum processing in PostgreSQL, including its various modes, the Visibility Map, freeze processing, handling of clog files, the auto vacuum daemon, and the requirement for Full VACUUM.

1. Understanding Vacuum Processing:

In PostgreSQL, vacuum processing is a maintenance procedure that eliminates inactive tuples and freezes transaction IDs. Table rows known as "dead tuples" are no longer required, and freezing transaction IDs stops data tampering. Concurrent VACUUM and Full VACUUM are the two modes in which vacuum processing can be carried out.

2. Concurrent Vacuum:

A vacuum processing technique known as concurrent vacuum eliminates dead tuples page by page. While the procedure is active, it permits other transactions to access the table. Concurrent VACUUM removes index tuples pointing to dead tuples and defragments live tuples. By monitoring page visibility and avoiding pointless scans, PostgreSQL 8.4 introduced the Visibility Map (VM) to streamline this procedure.

3. The Visibility Map:

In PostgreSQL 8.4, a table-specific map called the Visibility Map (VM) was added. It complements the Free Space Map (FSM) and boosts the effectiveness of vacuum processing. The VM keeps track of page visibility and reports whether or not a page contains dead tuples. The vacuum process can skip pages without dead tuples by using the VM, which optimizes speed and uses fewer resources.

4. Freeze Processing:

An essential component of vacuum processing is freeze processing, which requires storing outdated transaction IDs (txids) for tuples. Lazy and eager freeze processing modes are available in PostgreSQL. Using data from the VM, only pages containing dead tuples are scanned in lazy mode. All pages are scanned using eager mode, which also completes a thorough freeze operation. Eager mode was significantly optimized in PostgreSQL 9.6 by skipping pages with fully frozen tuples.

5. Removing Unnecessary Clog Files:

Transaction states are kept in PostgreSQL's clog (commit log). PostgreSQL makes an effort to delete unneeded clog files when the minimum frozen transaction ID is adjusted. Older clog files and related pages can be securely erased while still maintaining transaction information for data integrity by locating the clog file that has the minimum frozen transaction ID. This frees up disc space.

6. The Autovacuum Daemon:

The vacuum procedure in PostgreSQL was automated with the advent of the autovacuum daemon. Multiple autovacuum_worker processes are started by the daemon at regular intervals, ensuring gradual and effective removal of dead tuples. Vacuum processing for various tables is carried out concurrently by autovacuum workers, minimizing the impact on ongoing database operations and streamlining database administration.

7. Full VACUUM:

Despite being crucial, concurrent hoover cannot completely reduce table size. A table's physical size does not change, which results in wasted disc space and decreased performance. By carrying out further procedures such table clustering or reorganization, physically compressing the table, and reclaiming unneeded space, full VACUUM resolves this issue.

Conclusion:

Vacuum processing is an indispensable part of PostgreSQL database maintenance. By understanding the different modes, utilizing the Visibility Map, optimizing freeze processing, managing clog files, leveraging the autovacuum daemon, and considering Full VACUUM, database administrators can ensure efficient performance and data integrity. Implementing effective vacuum processing strategies is essential for maintaining the reliability and optimal functioning of PostgreSQL databases.


For extensive study use this link: https://www.interdb.jp/pg/pgsql06.html
Similar articles to this:https://dev.to/hannan2910/vacuum-processing-11m7

Top comments (0)