DEV Community

Chidera Stella Onumajuru

Internals of Postgres, Understanding the Database Structure

PostgreSQL, also known as Postgres, is a free and open-source relational database management system that emphasizes extensibility and SQL compliance. The Apache AGE team leverages its extensibility to build a graph database using openCypher.

Excerpts in this blog post are taken from the book "The Internals of PostgreSQL" by Hironobu SUZUKI.

This post summarizes the logical and physical structure of a PostgreSQL database cluster.

Logical Structure of Database Cluster

A database cluster in PostgreSQL is a collection of databases managed by a single PostgreSQL server. The term does not mean a group of database servers: one server runs on one host and manages one cluster. In a previous post, I started a database cluster called test with the command bin/pg_ctl -D test -l logfile start and then created a database in it with bin/createdb testdb. A database cluster can contain multiple databases. The following image, taken from "The Internals of PostgreSQL," illustrates this.

Figure: a database cluster containing databases, each of which contains database objects.

Databases

A database is a collection of database objects such as tables, indexes, views, and sequences. In PostgreSQL, a database is itself a database object, and the databases in a cluster are logically separated from one another. The image above shows databases with various database objects inside them.

Object Identifiers (OIDs)
OIDs (Object Identifiers) are unsigned 4-byte integers that PostgreSQL assigns to database objects. They serve as internal references and are stored in the system catalogs: for example, the OIDs of databases are in pg_database, and the OIDs of tables and indexes are in pg_class.
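
As a rough illustration, the OIDs described above can be read from the system catalogs. The sketch below assumes that conn is an open DB-API 2.0 connection from a driver that uses %s placeholders (psycopg2 is one such driver). The function names database_oids and relation_ids are hypothetical helpers, not part of PostgreSQL or any driver.

```python
# pg_database and pg_class are real system catalogs; everything else
# in this sketch (function names, the conn argument) is an assumption.
LIST_DATABASES = "SELECT oid, datname FROM pg_database ORDER BY oid"
FIND_RELATION = "SELECT oid, relfilenode FROM pg_class WHERE relname = %s"


def database_oids(conn):
    """Map each database name in the cluster to its OID."""
    cur = conn.cursor()
    cur.execute(LIST_DATABASES)
    return {name: oid for oid, name in cur.fetchall()}


def relation_ids(conn, relname):
    """Return (oid, relfilenode) for a table or index, or None if absent.

    relname is not unique across schemas, so this returns the first match.
    """
    cur = conn.cursor()
    cur.execute(FIND_RELATION, (relname,))
    return cur.fetchone()
```

Calling database_oids(conn) against the cluster from the previous post should include an entry for testdb, although the OID value itself depends on the installation.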

Physical Structure of Database Cluster

Physically, a database cluster is a single directory, called the base directory. Its base subdirectory holds one subdirectory per database, named after that database's OID, and each of these contains the files for the tables and indexes of that database.

Each table or index smaller than 1GB is stored in a single file. PostgreSQL tracks tables and indexes internally by their OIDs, while the data files themselves are identified by a separate value called relfilenode. The relfilenode usually matches the OID, but it can change, for example after TRUNCATE, REINDEX, or CLUSTER. When a table or index grows beyond 1GB, PostgreSQL keeps the same relfilenode and adds further files named relfilenode.1, relfilenode.2, and so on.
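
To make the file layout concrete, here is a minimal sketch of how data file names follow from a database OID, a relfilenode, and a relation size. It assumes a default build (files split every 1 GiB) and a relation in the default tablespace, so files live under base/<database OID>/. The function name segment_paths and the OID values in the usage note are hypothetical.

```python
from pathlib import PurePosixPath
from typing import List

# Assumption: default build, where a relation file is split every 1 GiB.
SEGMENT_SIZE = 1024 ** 3


def segment_paths(db_oid: int, relfilenode: int, size_bytes: int) -> List[str]:
    """Return the data file paths, relative to the cluster's base directory,
    needed to hold a relation of the given size."""
    db_dir = PurePosixPath("base") / str(db_oid)
    # Ceiling division; an empty relation still has its first file.
    n_segments = max(1, -(-size_bytes // SEGMENT_SIZE))
    paths = []
    for seg in range(n_segments):
        # The first file is named by the relfilenode alone; later
        # segments append .1, .2, ... to the same relfilenode.
        name = str(relfilenode) if seg == 0 else f"{relfilenode}.{seg}"
        paths.append(str(db_dir / name))
    return paths
```

For example, segment_paths(16384, 18740, 2_500_000_000) returns base/16384/18740, base/16384/18740.1, and base/16384/18740.2. On a running server, the built-in function pg_relation_filepath reports the path of the first file for a given relation.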

Conclusion
Understanding the logical structure of a database cluster clarifies how the databases and the objects inside them relate to one another.

Understanding the physical structure shows where each database, table, and index lives on disk, which makes the cluster easier to inspect and manage.

Further reading on specific database objects can be done here.

References
https://www.interdb.jp/pg/pgsql01.html
