DEV Community

Pawan Kukreja


[Summary] Chapter#01 'The Internals of PostgreSQL' Book (Part-01).

Logical Structure of Database Clusters

  • A database cluster is a collection of databases managed by a single running PostgreSQL server. In PostgreSQL, the term "database cluster" does not mean a group of database servers: a PostgreSQL server runs on a single host and manages a single database cluster.

  • A database cluster contains databases that are logically separated from each other. Each database is a collection of database objects, which are data structures used either to store data or to reference it. Typical database objects are tables, indexes, sequences, views, and functions.

  • All database objects in PostgreSQL are internally managed by object identifiers (OIDs), which are unsigned 4-byte integers. The relations between database objects and their OIDs are stored in system catalogs chosen by object type: for example, the OIDs of databases are stored in pg_database and the OIDs of heap tables in pg_class. You can look these OIDs up by querying those catalogs.
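The catalog lookups mentioned above can be sketched as follows. This is a minimal illustration, not the only way to do it: it assumes `conn` is a Python DB-API 2.0 connection whose driver uses `%s` placeholders (psycopg2 is one such driver), and the helper name `lookup_oid` is hypothetical.

```python
# Queries against the two system catalogs named above.
DATABASE_OID_SQL = "SELECT datname, oid FROM pg_database WHERE datname = %s"
RELATION_OID_SQL = "SELECT relname, oid FROM pg_class WHERE relname = %s"

# Largest value that fits in an unsigned 4-byte integer.
OID_MAX = 2**32 - 1


def lookup_oid(conn, sql, name):
    """Return the OID of the first row matching `name`, or None if no row matches.

    `conn` is assumed to be a DB-API 2.0 connection (hypothetical caller-supplied
    object); `sql` is one of the two query strings defined above.
    """
    cur = conn.cursor()
    try:
        cur.execute(sql, (name,))
        row = cur.fetchone()
    finally:
        cur.close()
    # Each row is (name, oid); the OID is the second column.
    return None if row is None else int(row[1])
```

With a live connection, `lookup_oid(conn, DATABASE_OID_SQL, "mydb")` would return the OID of a database named `mydb` (a placeholder name). Note that relation names in pg_class are only unique within a schema, so the second query may match more than one row and this sketch simply takes the first.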

Physical Structure of Database Clusters

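  • Physically, a database cluster is a single directory, called the base directory, that contains subdirectories and files. The initdb utility initialises a new database cluster and creates its base directory at the path you specify, which is usually set in the environment variable PGDATA. Each database is a subdirectory that holds the files of its tables and indexes, while the cluster's configuration files sit in the base directory by default. Tablespaces in PostgreSQL differ from those in other RDBMSs: a tablespace is simply a directory that contains data outside the base directory.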

  • Each database is a subdirectory under the base subdirectory, and the name of each database directory is identical to the OID of that database.

  • Layout of Files Associated with Tables and Indexes:
    A table or index smaller than 1GB is stored as a single file under the directory of the database it belongs to. As database objects, tables and indexes are managed by their OIDs, while their data files are managed by the variable relfilenode. When a table or index grows beyond 1GB, PostgreSQL creates a new file named relfilenode.1, then relfilenode.2, and so on, for as long as the data keeps growing.
    Files with the suffixes _fsm and _vm are the free space map and the visibility map. They store the free space capacity and the visibility information of each page within the table file.

  • A tablespace is an additional data area of PostgreSQL that lies outside the base directory. This feature was introduced in version 8.0. The CREATE TABLESPACE statement creates a tablespace in the directory you specify, and a version-specific subdirectory such as ‘PG_14_202011044’ is created under that directory.
    The tablespace directory is addressed by a symbolic link in the pg_tblspc subdirectory. If you create a new table in a tablespace but the table belongs to a database that lives under the base directory, a directory whose name is the same as the existing database OID is first created under the version-specific subdirectory, and the new table's file is then placed in that directory.
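The naming rules in this section can be summarised in a small path-building sketch. It is illustrative only: the function names are hypothetical, the version-specific directory name is the example from the text, the segment arithmetic is a simplification based on the 1GB limit described above, and the symbolic link is assumed to be named after the tablespace's OID. Paths without a leading slash are relative to the base directory.

```python
from pathlib import PurePosixPath

SEGMENT_BYTES = 1024 ** 3        # 1GB limit per data file, as described above
VERSION_DIR = "PG_14_202011044"  # example version-specific name from the text


def segment_files(relfilenode, size_bytes):
    """Names of the data files for a relation of `size_bytes` (simplified)."""
    # Ceiling division; an empty relation still has its first file.
    count = max(1, -(-size_bytes // SEGMENT_BYTES))
    return [str(relfilenode)] + [f"{relfilenode}.{n}" for n in range(1, count)]


def map_files(relfilenode):
    """Names of the free space map and visibility map files of a table."""
    return [f"{relfilenode}_fsm", f"{relfilenode}_vm"]


def database_dir(db_oid, tablespace_location=None):
    """Directory that holds a database's files.

    Without a tablespace the directory is base/<database OID>; inside a
    tablespace it is <location>/<version-specific dir>/<database OID>.
    """
    if tablespace_location is None:
        return PurePosixPath("base") / str(db_oid)
    return PurePosixPath(tablespace_location) / VERSION_DIR / str(db_oid)


def tablespace_link(tablespace_oid):
    """Symbolic link under pg_tblspc (assumed to be named after the tablespace OID)."""
    return PurePosixPath("pg_tblspc") / str(tablespace_oid)
```

For example, with made-up values, `database_dir(16384) / segment_files(18740, 0)[0]` gives the path of a small table's only data file, and passing a tablespace location to `database_dir` gives the directory where the same database's tables would be placed inside that tablespace.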
