Summary of Chapter 2: "Process and Memory Architecture" from the book "The Internals of PostgreSQL"

#database #postgres #bitnine #apacheage

This blog post aims to help you understand the concepts of Chapter 2, "Process and Memory Architecture", from the book The Internals of PostgreSQL.

Note: Ensure that you have a thorough understanding of Chapter 1 before we proceed to Chapter 2, as it forms the foundation for our exploration.

So, let's start:

Process Architecture

PostgreSQL is a client/server relational database management system with a multi-process architecture that operates on a single host.
A PostgreSQL server refers to a collection of multiple processes that work together to manage a database cluster.

These processes include:

Postgres Server Process: This process acts as the parent process for all other processes related to the management of a database cluster.
Backend Processes: Each backend process handles queries and statements received from connected clients. They are responsible for executing these queries and returning the results.
Background Processes: Various background processes perform specific tasks related to database management features. Examples include the autovacuum and checkpointer processes, which handle maintenance and data consistency tasks.
Replication-associated Processes: These processes are involved in streaming replication, which is a feature for database replication and synchronization.
Background Worker Process: Introduced in version 9.3, the background worker process allows users to implement custom processing tasks within PostgreSQL. It provides flexibility to perform additional processing as required.
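
To make the categories above more concrete, here is a minimal Python sketch that sorts process titles, as they might appear in `ps` output, into those categories. The title strings it matches are assumptions based on how recent PostgreSQL releases label their processes; older versions and servers with a configured cluster name format titles differently, so treat the mapping as illustrative rather than exhaustive.

```python
# Hypothetical helper: the title prefixes below are assumptions about how
# recent PostgreSQL releases label their processes in `ps` output.
BACKGROUND = ("checkpointer", "background writer", "walwriter",
              "autovacuum launcher", "archiver", "logger")
REPLICATION = ("walsender", "walreceiver")
BACKGROUND_WORKERS = ("logical replication launcher",)


def classify_process(title: str) -> str:
    """Map a process title to one of the categories described above."""
    if not title.startswith("postgres:"):
        # The parent process shows its full command line instead of a role.
        return "postgres server process"
    role = title[len("postgres:"):].strip()
    if role.startswith(REPLICATION):
        return "replication-associated process"
    if role.startswith(BACKGROUND_WORKERS):
        return "background worker process"
    if role.startswith(BACKGROUND):
        return "background process"
    # Anything else is assumed to be a backend serving one client.
    return "backend process"
```

For example, under these assumptions a title such as "postgres: checkpointer" lands in the background process category, while a title that names a user, a database, and a client address is treated as a backend process.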

An example of the process architecture in PostgreSQL is depicted in the figure below:

Detailed Description of Some Processes

Postgres Server Process

The postgres server process is the parent process of all processes within a PostgreSQL server. In earlier versions, it was known as 'postmaster'.
The pg_ctl utility is used to start the postgres server process.
Upon startup, the postgres server process allocates a shared memory area.
It initiates various background processes, replication-associated processes, and background worker processes as needed.
The postgres server process waits for connection requests from clients.
When a connection request is received, it starts a backend process to handle the queries from the connected client.
The postgres server process listens on one network port; the default is 5432.
Multiple PostgreSQL servers can run on the same host, but each server should be configured to listen on a different port number (e.g., 5432, 5433, etc.).
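
The "listen, then start one backend per connection" pattern described above can be sketched in a few lines of Python. This is only an analogy and not PostgreSQL's actual code: the names `handle_client` and `serve_one_connection` are made up for illustration, the "backend" merely echoes what it receives, and `os.fork` makes the sketch Unix-only.

```python
import os
import socket


def handle_client(conn: socket.socket) -> None:
    """Stand-in for a backend process: answer until the client disconnects."""
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:  # empty read means the client has disconnected
                break
            conn.sendall(b"backend %d: " % os.getpid() + data)


def serve_one_connection(listener: socket.socket) -> int:
    """Accept one client, fork a child to serve it, and return the child's pid."""
    conn, _addr = listener.accept()
    pid = os.fork()
    if pid == 0:
        # Child: plays the role of the backend process for this one client.
        listener.close()
        try:
            handle_client(conn)
        finally:
            os._exit(0)  # the backend terminates when the client is gone
    # Parent: plays the role of the server process. It drops its copy of the
    # client socket and is free to go back to waiting for the next request.
    conn.close()
    return pid
```

A real server would call `serve_one_connection` in a loop and reap finished children with `os.waitpid`. Running two such listeners on the same host would require two different ports, which mirrors the 5432 and 5433 example above.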

Backend Processes

A backend process, also known as "postgres," is initiated by the postgres server process and handles queries from a single connected client.
Communication between the backend process and the client occurs through a TCP connection.
The backend process terminates when the client disconnects.
When connecting to a PostgreSQL server, the client must explicitly specify the database to use, because a backend process can operate on only one database at a time.
PostgreSQL allows multiple clients to connect concurrently, with the maximum number of clients determined by the max_connections configuration parameter (default is 100).
When many clients, such as web applications, frequently connect to and disconnect from the PostgreSQL server, the cost of establishing connections and creating backend processes increases.
PostgreSQL has no native connection pooling feature, so this overhead can degrade database server performance. To address this issue, a pooling middleware such as pgbouncer or pgpool-II is commonly used.
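
The idea behind connection pooling can be illustrated with a minimal Python sketch. It is not how pgbouncer or pgpool-II are implemented; `SimplePool` and its `connect` argument are hypothetical names, and `connect` stands for any function that opens a database connection. The point is only that connections are opened once and then reused, so the server does not have to create a new backend process for every request.

```python
import queue
from contextlib import contextmanager


class SimplePool:
    """Hypothetical fixed-size pool: open `size` connections once, then reuse them."""

    def __init__(self, connect, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # Each call here would cost one connection setup and one new backend.
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        # Borrow an idle connection; block if all of them are in use.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Hand the connection back instead of closing it.
            self._idle.put(conn)
```

With a pool of size 2, ten consecutive requests reuse the same two connections, so only two backend processes would ever be started on the server side.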

Background Processes

Some of the background processes, with descriptions, are shown in the figure below:

Memory Architecture

Memory architecture in PostgreSQL can be classified into two broad categories:

1. Local Memory Area: Each backend process allocates a local memory area for its own query processing. Each area is divided into several sub-areas, whose sizes are either fixed or variable.

Some of the sub-areas, with descriptions, are shown in the figure below:

2. Shared Memory Area: A shared memory area is allocated by a PostgreSQL server when it starts up and is used by all of the server's processes. This area is also divided into several fixed-size sub-areas.

Some of the sub-areas, with descriptions, are shown in the figure below:
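
To contrast the two categories, here is a small Python sketch of the difference between memory that is private to one process and memory that several processes share. It is an analogy only and not PostgreSQL's implementation: the function name `demo_shared_vs_local` is made up, the forked child stands in for a backend process, and the sketch is Unix-only because it relies on `os.fork` and an anonymous shared `mmap`.

```python
import mmap
import os


def demo_shared_vs_local():
    """Return (shared_bytes, local_bytes) as seen by the parent after a child writes."""
    shared = mmap.mmap(-1, 16)  # anonymous mapping, shared across fork on Unix
    local = bytearray(16)       # ordinary memory, private to each process
    pid = os.fork()
    if pid == 0:
        # Child: writes to both areas, then exits.
        shared[:5] = b"hello"
        local[:5] = b"hello"    # only changes the child's private copy
        os._exit(0)
    os.waitpid(pid, 0)
    result = (shared[:5], bytes(local[:5]))
    shared.close()
    return result
```

The parent sees the child's write in the shared mapping but not in the local buffer, which is the same distinction the chapter draws between the shared memory area and each backend's local memory area.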