DEV Community

Cover image for Understanding Indexes in PostgreSQL: An Introduction
hammadsaleemm
hammadsaleemm

Posted on

Understanding Indexes in PostgreSQL: An Introduction

As an application developer using DBMS, you must have come across indexes in PostgreSQL. Indexes are auxiliary structures in the database, designed to speed up data access and enforce integrity constraints. However, with so many different types of indexes available, it can be challenging to choose the right one for your application.

In this series of articles, we will take a closer look at indexes in PostgreSQL, discussing the available types of indexes, why there are so many different types of them, and how to use them to optimize query performance. We will also delve into the details of the internal workings of indexes, giving you the tools to make informed decisions about which indexes to use.

In this first article, we will discuss the distribution of responsibilities between the general indexing engine related to the DBMS core and individual index access methods, which PostgreSQL enables us to add as extensions.

Types of Indexes in PostgreSQL

PostgreSQL has six different types of indexes built-in as of version 9.6, with one more available as an extension. Despite all differences between types of indexes, each of them associates a key with table rows that contain this key. Each row is identified by TID (tuple id), which consists of the number of block in the file and the position of the row inside the block.

It's important to note that while indexes can speed up data access, there is a cost involved. Each operation on indexed data, such as insertion, deletion, or update of table rows, requires the indexes for that table to be updated as well, and in the same transaction.

Extensibility of Indexes in PostgreSQL

To enable easy addition of a new access method to the system, an interface of the general indexing engine has been implemented in PostgreSQL. Its main task is to get TIDs from the access method and work with them:

Read data from corresponding versions of table rows.
Fetch row versions TID by TID or in a batch using a prebuilt
bitmap.
Check visibility of row versions for the current transaction
taking into account its isolation level.
The indexing engine is involved in performing queries, and is called according to a plan created at the optimization stage. The optimizer evaluates different ways to perform the query, taking into account the capabilities of all access methods that are potentially applicable.

Not only does the optimizer need information about the access method, but when building an index, the system must decide whether the index can be built on several columns and whether this index ensures uniqueness. Each access method should provide all the necessary information about itself, which is achieved through special functions in PostgreSQL.

The indexing engine in PostgreSQL enables the database to work with various access methods uniformly, taking into account their features. The main scanning techniques include index scan and bitmap scan. Index scan returns TID values one by one until the last matching row is reached, and the indexing engine accesses the table rows indicated by TIDs in turn. Bitmap scan, on the other hand, first returns all TIDs that match the condition, and the bitmap of row versions is built from these TIDs before row versions are read from the table. The choice of which scanning technique to use depends on the number of retrieved rows and other factors, and is determined by the optimizer.

Conclusion

In this article, we have introduced the topic of indexes in PostgreSQL, discussing the different types of indexes available and the cost involved in using them. We have also touched on the extensibility of indexes in PostgreSQL, which enables easy addition of new access methods to the system.

In the next article, we will dive deeper into the interface of the access method and critical concepts such as classes and operator families. Armed with this knowledge, we can begin to explore the different types of indexes available in PostgreSQL and how to use them to optimize query performance.

Top comments (0)