Namsi Lydia

Posted on Dec 11, 2023

Understanding Pluggable Storage In Postgresql And Its Benefits

#database #postgres #sql #data

What is a pluggable storage?
A pluggable storage refers to the ability to use different storage engines interchangeably within the same database system.A storage engine is a component of a database management system (DBMS) that is responsible for handling how data is stored, indexed, and retrieved. By making the storage engine pluggable, developers can choose and switch between different storage engines based on their specific needs and use cases.

In this article we are going to discuss more about the various storage engines that are currently being used in postgresql and also describe the architecture of some of these pluggable storage engines.

Some of the examples of pluggable storage engines used in postgresql include:

Pluggable table Access Methods.
Zheap.
Oriole Db.

Zheap
Is a way to keep table bloat under control by implementing a storage engine capable of running UPDATE-intense workloads a lot more efficiently. The project was originally started by EnterpriseDB and a lot of effort has been put into the project already.

The main objectives of Zheap include:
1.Reduce write amplification both by avoiding rewrites of heap pages and by making it possible to do an update that touches indexed columns without updating every index.

2.Reduce the tuple size by shrinking the tuple header and eliminating most alignment padding.

3.Provide better control over bloat. zheap will prevent bloat by allowing in-place updates in common cases and by reusing space as soon as a transaction that has performed a delete or non-in-place-update has committed. In short, with this new storage, whenever possible, we’ll avoid creating bloat in the first place.

Basic architecture of Zheap:
Here are the three components that make up the Zheap architecture and they include:

zheap: The table format
undo: Handling transaction rollback, etc.
WAL: Protect critical writes

Zheap:The table format
In the zheap system, transaction details are now stored directly on the page, making data much smaller and improving performance.

Each transaction consumes 16 bytes of storage and includes three pieces of information: transaction id, epoch, and the latest undo record pointer for that transaction. A row in the system points to a transaction slot.

By default, a table has 4 transaction slots, which is usually enough for large tables. However, in some cases, more slots are required. To handle this, zheap uses a feature called "TPD" (Transaction Processing Data), which acts as an overflow area. It stores extra transaction information when the default slots are insufficient.

Zheap :Tuple Formats
In the context of PostgreSQL's zheap implementation, the structure of a single tuple (a row of data) is crucial. In a standard heap tuple in PostgreSQL, a significant portion of the tuple's size is dedicated to storing transactional information within the tuple itself, making the header quite large (20+ bytes). However, in zheap, this approach changes. Transactional details are now stored in page-level structures called transaction slots, allowing the tuple header to shrink to a mere 5 bytes.

Furthermore, zheap optimizes the layout by eliminating CPU alignment (padding) between the tuple header and the actual data in the row. Typically, standard tuples require padding, which consumes additional bytes for each row in the table. Zheap, on the other hand, avoids this padding, resulting in more efficiently packed storage.

Additionally, zheap achieves space savings by removing padding from pass-by-value data types. All these optimizations collectively contribute to saving valuable storage space in every individual row of the table.

In summary, zheap's tuple format streamlines the storage structure by relocating transactional information, reducing header size, eliminating unnecessary padding, and optimizing the handling of specific data types, ultimately leading to more space-efficient storage for each row in the table.

Undo: Handling transaction rollback
One key concept integral to zheap is the concept of "undo." What is the primary purpose of this element?we can understand its primary purpose by describing the following operations and they include:

INSERT: Adding rows
UPDATE: Modifying data
DELETE: Removing rows
UNDO and ROLLBACK in action

INSERT: Adding rows
When you perform an INSERT operation in a zheap it needs to set aside a transaction slot and then record an undo entry. This undo entry helps to correct any issues that may arise in case of an error during the INSERT process. Unlike traditional heap tables in PostgreSQL, zheap allows for prompt reclamation of space if the INSERT operation is rolled back, which is a significant distinction between the two storage mechanisms. The Transaction ID (TID) plays a crucial role in managing and undoing the changes made during the INSERT operation in zheap.

UPDATE: Modifying data
When dealing with an UPDATE statement in a zheap there are two primary scenarios to consider :

New Row Fits in the Old Space:
If the new row can be accommodated within the existing space allocated for the old row, the process is relatively straightforward. The system overwrites the old row with the new data and creates an undo entry that contains the complete old row. This way, if necessary, the system can revert to the old structure by copying the information from the undo entry.

New Row Does Not Fit in the Old Space:
When the new row is longer than the old one, the process becomes more complex. In such cases, zheap essentially has to perform a DELETE followed by an INSERT operation. This is less efficient than an in-place UPDATE, and it can impact performance.

There are specific conditions where space can be promptly reclaimed:

When updating a row to a shorter version.
When non-inplace UPDATEs are executed.

In essence, the efficiency of the UPDATE operation in zheap depends on whether the new data fits into the existing space or requires a more resource-intensive DELETE / INSERT operation.

DELETE: Removing rows
When it comes to the DELETE operation in zheap, the process involves emitting an undo record. This undo record is crucial for restoring the old row in case of a ROLLBACK. During the DELETE operation, the row needs to be eliminated from the zheap to complete the removal process. In simpler terms, when you delete a row in zheap, the system creates a record that allows it to undo the deletion and puts the removed row back in its original place if needed during a rollback.

UNDO and ROLLBACK in action
When a ROLLBACK occurs, it's crucial to ensure that the original state of the table is reinstated. This involves executing the previously scheduled undo action. If errors occur during this process, the undo action is implemented within a new transaction to ensure success.

To optimize efficiency, it's ideal to apply all undo actions associated with a single page simultaneously. This minimizes the amount of Write-Ahead Logging (WAL) that needs to be recorded. Additionally, this approach allows for reducing page-level locking to the bare minimum, decreasing contention and enhancing overall performance.

However, dealing with long transactions poses challenges, especially in scenarios where a substantial amount of data, such as a terabyte, needs to be rolled back. To address this, if the undo action surpasses a configurable threshold, a background worker process is employed to handle the task. This solution is not only elegant but also ensures a positive end-user experience, avoiding prolonged rollbacks. Furthermore, the system must be resilient to potential crashes during the rollback process.

Undo actions can be removed under three circumstances:

When no active transactions can access the data.
Once all undo actions have been successfully completed.
For committed transactions, until they become visible to all transactions.

Sample example implementation of Undo: Handling transaction rollback

-- Example 1: Inserting a Row
BEGIN;
    INSERT INTO my_table (id, name) VALUES (1, 'John Doe');
    -- Undo operation: DELETE FROM my_table WHERE id = 1;
COMMIT / ROLLBACK;

-- Example 2: Deleting a Row
BEGIN;
    DELETE FROM employee WHERE employee_id = 101;
    -- Undo operation: INSERT INTO employee (employee_id, name, department) VALUES (101, 'John Smith', 'IT');
COMMIT / ROLLBACK;

-- Example 3: Updating Data
BEGIN;
    UPDATE products SET price = 15.99 WHERE product_id = 2001;
    -- Undo operation: UPDATE products SET price = 19.99 WHERE product_id = 2001;
COMMIT / ROLLBACK;

In the following examples above ,the BEGIN initiates a transaction, and subsequent operations modify the database state. The crucial aspect is that zheap maintains undo information to revert these changes in case of a ROLLBACK or to finalize them in case of a COMMIT. The undo operations are essentially the reverse of the original operations, ensuring that the database remains in a consistent state despite transaction outcomes.

Pluggable Table Access Methods
Table access methods refer to a capability that enables a different approach to storing data within a table. Before PostgreSQL 11, these methods were mainly used for index data, allowing users to choose storage methods like B-Tree or hash. However, there was no similar flexibility for tables.

With the introduction of PostgreSQL 12, this feature was expanded to include tables as well. Access methods for tables provide a way to implement alternative storage mechanisms. These methods expose APIs allowing PostgreSQL developers to create their own methods. In PostgreSQL 12, the traditional heap format for tables is transitioned into a table access method, which is now the default option. This enhancement provides more flexibility and control over how data is stored in postgresql tables.

How do users use the table access method interface?
To define a table access method, use CREATE ACCESS METHOD with TYPE TABLE. To then have a table use the new access method, specify it in the USING clause in CREATE TABLE, CREATE TABLE AS SELECT, or CREATE MATERIALIZED VIEW. Alternatively, you can specify the default table access method in the postgresql.conf file parameter default _ table _ access _ method.

CREATE ACCESS METHOD heap1 TYPE TABLE HANDLER heap_tableam_handler;
CREATE TABLE tbl1(id int, name text) USING heap1 ...;

basic syntax support for table access methods:

CREATE ACCESS METHOD <new access method> TYPE TABLE
HANDLER <table_am_handler>
CREATE TABLE ... USING <new access method> ...
CREATE TABLE ... USING <new access method> AS ...
CREATE MATERIALIZED VIEW ... USING <new access method>

Advantages of using table access methods
The attractiveness of this functionality lies in the following aspects:

Concise and Pluggable Architecture:
- The feature offers a streamlined and adaptable structure that is both straightforward and easy for PostgreSQL developers to work with.
User-Friendly Customization:
- Users have the ability to explicitly define the access method for each individual table, providing a high degree of customization and control over data storage strategies.
Compatibility Across Databases:
- This functionality caters to a wide range of databases, accommodating both open source and commercial database systems. It ensures flexibility for users who might be using different types of databases within their environment.
Supports Diverse Access Methods:
- The coexistence of various table access methods within the same database is facilitated. This means that different tables can use different storage mechanisms, allowing users to tailor their approach based on specific requirements or performance considerations.

By leveraging pluggable table access methods, it becomes feasible to implement a diverse range of table access methods. This includes:

zHeap as an Alternative to Heap
Provide better control over bloat using in-place updates and undo records for delete and reduce write amplification as compared to heap
Columnar table
The storage layout of the columnar table is column-wise instead of row-wise. With pluggable Table AM, the implementations will be rather focused without bothering to change the core server
In-memory table
Main memory is used as primary storage tables. Rows in the table are read from and written to memory. Some considerations:

Copy of the table data is maintained on disk, only for durability
Follow Only in memory table ( MongoDB has in-memory storage engine )

Index organized table In an index-organized table, as opposed to a conventional heap-organized table where data is stored without a specific order, the information in an index-organized table is stored within a B-tree index structure, arranged in a sorted manner based on the primary key. Each leaf block in this index structure contains both the primary key and the non-key columns of the table.

Conclusion
What is the future of pluggable storage engines in postgresql
The future of pluggable storage engines is bright and is looking very promising.More initiatives from the postgresql community are coming up to improve the already existing extension available and also introduction of orioleDb into the market can be able to optimize some of these challenges too.

Resources:
[https://www.pgcon.org/2019/schedule/attachments/536_pgcon2019_pluggable_table_AM_V1.3.pdf]

[https://wiki.postgresql.org/wiki/Future_of_storage]

[https://www.cybertec-postgresql.com/en/zheap-reinvented-postgresql-storage/]

[https://www.enterprisedb.com/blog/zheap-storage-engine-provide-better-control-over-bloat]

🚀 pgai Vectorizer: SQLAlchemy and LiteLLM Make Vector Search Simple

We built pgai Vectorizer to simplify embedding management for AI applications—without needing a separate database or complex infrastructure. Since launch, developers have created over 3,000 vectorizers on Timescale Cloud, with many more self-hosted.

Read full post →

Top comments (1)

GladysStanford • Dec 12 '23

A pluggable storage system enhances the flexibility and adaptability of a database by allowing the interchangeability of different storage engines within the same database framework. Fairbet 7 registration

Forem

Understanding Pluggable Storage In Postgresql And Its Benefits

🚀 pgai Vectorizer: SQLAlchemy and LiteLLM Make Vector Search Simple

Top comments (1)

Read next

Building a Simple SQLite Library Manager in Python

How to do Review Sentiment Analysis using Python

SQL: Multi field mixed deduplication followed by numbering #eg81

Supabase | My Way of Designing & Managing DB

Okay