<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Haseeb Ashraf</title>
    <description>The latest articles on DEV Community by Haseeb Ashraf (@thehaseebashraf).</description>
    <link>https://dev.to/thehaseebashraf</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1103421%2F06d9382c-d38e-4471-9fb3-64e845404fb7.jpg</url>
      <title>DEV Community: Haseeb Ashraf</title>
      <link>https://dev.to/thehaseebashraf</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/thehaseebashraf"/>
    <language>en</language>
    <item>
      <title>A Brief Overview of Base Backup &amp; Point-in-Time Recovery in PostgreSQL</title>
      <dc:creator>Haseeb Ashraf</dc:creator>
      <pubDate>Sat, 15 Jul 2023 07:37:14 +0000</pubDate>
      <link>https://dev.to/thehaseebashraf/a-brief-overview-of-base-backup-point-in-time-recovery-in-postgresql-57oj</link>
      <guid>https://dev.to/thehaseebashraf/a-brief-overview-of-base-backup-point-in-time-recovery-in-postgresql-57oj</guid>
      <description>&lt;h2&gt;
  Backups in online databases fall into two main categories:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;physical backups&lt;/li&gt;
&lt;li&gt;logical backups&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Although both have their own advantages and disadvantages, a logical backup has one major disadvantage: it is slow. Backing up a very large database takes an unusually long time, and restoring that database from the backup takes even longer.&lt;/p&gt;

&lt;p&gt;In PostgreSQL, however, full physical online backups have been available since &lt;em&gt;version 8.0&lt;/em&gt;; a snapshot of the whole running database cluster is known as a &lt;strong&gt;base backup&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Point-In-Time Recovery (PITR)&lt;/strong&gt;, also available since &lt;em&gt;version 8.0&lt;/em&gt;, restores the database cluster to any point in time using a &lt;strong&gt;base backup&lt;/strong&gt; and the &lt;strong&gt;archive logs&lt;/strong&gt; produced by the continuous archiving feature. Say, for example, that you made a critical mistake such as deleting all the tables: this feature allows you to restore the database to the point just before the mistake was made. &lt;/p&gt;

&lt;p&gt;In this short overview, we will look at the following topics: &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- What base backup is&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;First, the standard procedure for taking a base backup is as follows: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;(1) Issue the pg_backup_start command (Version 14 or earlier, pg_start_backup)&lt;/li&gt;
&lt;li&gt;(2) Take a snapshot of the database cluster with the archiving command you want to use&lt;/li&gt;
&lt;li&gt;(3) Issue the pg_backup_stop command (Version 14 or earlier, pg_stop_backup)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is a relatively simple and easy-to-use procedure that is convenient for system administrators, as it requires no special tools; common tools such as the copy command or any similar archiving tool are enough to create the backup. Additionally, no table locks are required, and all database users can keep querying the database unaffected while the backup runs.&lt;/p&gt;
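
&lt;p&gt;As a sketch, the three steps above could be performed from &lt;code&gt;psql&lt;/code&gt; and a shell roughly as follows (the backup label and destination path are hypothetical):&lt;/p&gt;

&lt;p&gt;&lt;code&gt;-- step 1 (version 15+; use pg_start_backup on 14 or earlier)&lt;br&gt;
SELECT pg_backup_start('nightly-backup');&lt;br&gt;
-- step 2, from a shell: snapshot the cluster with an ordinary copy tool&lt;br&gt;
-- $ tar -czf /backups/base.tar.gz -C $PGDATA .&lt;br&gt;
-- step 3&lt;br&gt;
SELECT pg_backup_stop();&lt;br&gt;
&lt;/code&gt;&lt;/p&gt;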

&lt;p&gt;&lt;strong&gt;- How PITR works&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Now let's take a look at how PITR works. Assume that you made a crucial mistake at &lt;code&gt;5:15 PKT on 21st January 2023&lt;/code&gt;. You would remove the current database cluster and restore a new one from the base backup made earlier. Next, you would set the &lt;code&gt;restore_command&lt;/code&gt; parameter and set the &lt;code&gt;recovery_target_time&lt;/code&gt; parameter to the point just before the mistake was made. When PostgreSQL boots up, it enters PITR mode and starts recovery if a recovery.signal file (recovery.conf in version 11 or earlier) is present in the restored cluster.&lt;/p&gt;
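
&lt;p&gt;As an illustration, the two parameters mentioned above could be set in &lt;code&gt;postgresql.conf&lt;/code&gt; roughly as follows (the archive path and target time are hypothetical):&lt;/p&gt;

&lt;p&gt;&lt;code&gt;restore_command = 'cp /mnt/server/archivedir/%f %p'&lt;br&gt;
recovery_target_time = '2023-01-21 05:14:00+05'&lt;br&gt;
&lt;/code&gt;&lt;br&gt;
An empty &lt;code&gt;recovery.signal&lt;/code&gt; file in the data directory then tells the server to start in recovery mode (version 11 and earlier used &lt;code&gt;recovery.conf&lt;/code&gt; instead).&lt;/p&gt;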

&lt;p&gt;&lt;strong&gt;- What timelineId is&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In PostgreSQL, a timeline is used to distinguish the original database cluster from the recovered one, and it is one of the most fundamental concepts of PITR. Let's take a look at the &lt;strong&gt;timelineId&lt;/strong&gt; in this section.&lt;/p&gt;

&lt;p&gt;Each timeline is given a &lt;strong&gt;timelineId&lt;/strong&gt;, a 4-byte unsigned integer that starts at 1.&lt;br&gt;
Each database cluster is assigned an individual &lt;strong&gt;timelineId&lt;/strong&gt;: the timelineId of the original database cluster, created by the &lt;code&gt;initdb&lt;/code&gt; utility, is 1 by default, and whenever a database cluster is recovered, the &lt;strong&gt;timelineId&lt;/strong&gt; is incremented by 1.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- What timeline history file is&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When a PITR process completes, a timeline history file with a name like &lt;code&gt;00000003.history&lt;/code&gt; is created under the archival directory. This file records which timeline the recovered cluster branched off from and at what point. &lt;/p&gt;

&lt;p&gt;The timeline history file consists of at least one line, and each line is composed of the following three fields: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;timelineId - the timelineId of the archive logs used in the recovery.&lt;/li&gt;
&lt;li&gt;LSN - the log sequence number of the location where the WAL segment switch took place.&lt;/li&gt;
&lt;li&gt;Reason - a human-readable explanation of why the timeline was changed.&lt;/li&gt;
&lt;/ul&gt;
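
&lt;p&gt;As an illustration, a line in a hypothetical &lt;code&gt;00000002.history&lt;/code&gt; file might look like this (the LSN and timestamp are invented for the example):&lt;/p&gt;

&lt;p&gt;&lt;code&gt;1  0/A000198  before 2023-01-21 05:14:00+05&lt;/code&gt;&lt;br&gt;
This says the recovery used archive logs of timelineId 1 up to LSN 0/A000198, stopping before the given target time.&lt;/p&gt;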

</description>
      <category>apache</category>
      <category>postgressql</category>
      <category>apacheage</category>
      <category>bitnine</category>
    </item>
    <item>
      <title>An overview of Write Ahead Logging - WAL in PostgreSQL</title>
      <dc:creator>Haseeb Ashraf</dc:creator>
      <pubDate>Fri, 14 Jul 2023 23:00:33 +0000</pubDate>
      <link>https://dev.to/thehaseebashraf/an-overview-of-write-ahead-logging-wal-in-postgresql-25mi</link>
      <guid>https://dev.to/thehaseebashraf/an-overview-of-write-ahead-logging-wal-in-postgresql-25mi</guid>
      <description>&lt;p&gt;Even when a system failure occurs, a database management system must not lose any data, as that data is extremely crucial; hence transaction logs are an essential part of any database management system.&lt;/p&gt;

&lt;p&gt;A &lt;strong&gt;transaction log&lt;/strong&gt; is a historical log of all the past transactions that have occurred, and it ensures that no data is lost during events such as a power outage or a server crash.&lt;/p&gt;

&lt;p&gt;In computer science, &lt;strong&gt;WAL&lt;/strong&gt; is an acronym for &lt;strong&gt;Write Ahead Logging&lt;/strong&gt;, a protocol for writing both the changes and the actions made in the database to a log before they are applied; in &lt;strong&gt;PostgreSQL&lt;/strong&gt;, the term refers to the &lt;strong&gt;Write Ahead Log&lt;/strong&gt; itself. In this short overview, we will take a look at the following subsections:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- The logical and physical structure of WAL&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The logical and physical structure of &lt;strong&gt;WAL&lt;/strong&gt; follows the conventional insertion and database recovery techniques used in other databases as well as in PostgreSQL. &lt;br&gt;
&lt;strong&gt;PostgreSQL&lt;/strong&gt; writes all data modifications into persistent storage to prepare for failures, and this historical data is known as &lt;strong&gt;XLOG&lt;/strong&gt; records or WAL data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- The internal layout of WAL&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Logically speaking, PostgreSQL writes &lt;em&gt;XLOG&lt;/em&gt; records into the transaction log, a virtual file with an 8-byte address space. Physically, a WAL segment is a 16 MB file by default, internally divided into pages of 8192 bytes (8 KB).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- Writing of WAL data&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Now, moving on to the writing of XLOG records, consider issuing the following statement: &lt;br&gt;
&lt;code&gt;testdb=# INSERT INTO tbl VALUES ('A');&lt;br&gt;
&lt;/code&gt;&lt;br&gt;
Invoking the above statement calls the internal function &lt;code&gt;exec_simple_query()&lt;/code&gt;, which writes the XLOG records into the WAL buffer and flushes them from there to the WAL segment.&lt;br&gt;
&lt;strong&gt;WAL writer process&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Writing operations are usually done using &lt;strong&gt;DML (Data Manipulation Language)&lt;/strong&gt;, but even non-DML operations can perform writes in PostgreSQL. The WAL writer is a background process that periodically checks the WAL buffer and writes any unwritten XLOG records into the WAL segments.&lt;/p&gt;
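
&lt;p&gt;For instance, you can watch the WAL position advance as records are written; the following functions exist since version 10 (the 9.x series used the older &lt;code&gt;pg_current_xlog_*&lt;/code&gt; names):&lt;/p&gt;

&lt;p&gt;&lt;code&gt;testdb=# SELECT pg_current_wal_insert_lsn(); -- insert position in the WAL buffer&lt;br&gt;
testdb=# SELECT pg_current_wal_lsn();        -- position written out to the segment files&lt;br&gt;
&lt;/code&gt;&lt;/p&gt;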

</description>
      <category>apacheage</category>
      <category>postgres</category>
      <category>bitnine</category>
      <category>apache</category>
    </item>
    <item>
      <title>A Brief Overview of Buffer Manager in PostgreSQL</title>
      <dc:creator>Haseeb Ashraf</dc:creator>
      <pubDate>Wed, 12 Jul 2023 10:22:37 +0000</pubDate>
      <link>https://dev.to/thehaseebashraf/a-brief-overview-of-buffer-manager-in-postgresql-5ge6</link>
      <guid>https://dev.to/thehaseebashraf/a-brief-overview-of-buffer-manager-in-postgresql-5ge6</guid>
      <description>&lt;p&gt;The job of a &lt;strong&gt;buffer manager&lt;/strong&gt; is to manage and oversee the data transfers that take place between persistent storage and shared memory, so this part of the DBMS can have a significant impact on the performance of the RDBMS. PostgreSQL's buffer manager works very efficiently.&lt;/p&gt;

&lt;p&gt;In this brief overview, we will take a look at the following sections and get an understanding of the topics:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- Buffer manager structure&lt;/strong&gt;&lt;br&gt;
  The buffer manager in PostgreSQL consists of a &lt;em&gt;buffer table&lt;/em&gt;, &lt;em&gt;buffer descriptors&lt;/em&gt;, and a &lt;em&gt;buffer pool&lt;/em&gt;. The buffer pool stores data file pages, e.g. pages of tables and indexes, as well as &lt;em&gt;freespace maps&lt;/em&gt; and &lt;em&gt;visibility maps&lt;/em&gt;. The buffer pool is an array, which basically means that each slot stores a single page of a data file. The indices of the buffer pool array are known as &lt;strong&gt;buffer_ids&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- Buffer manager locks&lt;/strong&gt;&lt;br&gt;
  The buffer manager uses many different locks for many different purposes. Here we will look at the locks that are necessary for understanding the upcoming sections.&lt;/p&gt;

&lt;p&gt;Firstly, there are &lt;strong&gt;Buffer Table Locks&lt;/strong&gt;, which protect the integrity of the data contained in the entire buffer table. They can be used in both exclusive and shared modes and are considered relatively lightweight locks. Whenever an entry is inserted or deleted, an exclusive lock is held by the backend process.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- How the buffer manager works&lt;/strong&gt;&lt;br&gt;
  The buffer manager comes into play whenever a backend process wants to access a desired page. To perform this task, it uses the &lt;em&gt;ReadBufferExtended&lt;/em&gt; function.&lt;br&gt;
How the &lt;em&gt;ReadBufferExtended&lt;/em&gt; function behaves depends on three different logical cases. Additionally, we will look at the PostgreSQL &lt;em&gt;clock sweep page replacement algorithm&lt;/em&gt; in the final section.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- Ring buffer&lt;/strong&gt;&lt;br&gt;
  Whenever PostgreSQL is reading or writing a very large table, a ring buffer is used instead of a buffer pool. A small and temporary buffer area is assigned as a ring buffer in the main memory. Bulk reading, bulk writing and Vacuum-processing are the three main operations where ring buffer is used.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- Flushing of dirty pages&lt;/strong&gt;&lt;br&gt;
  The background writer and the checkpointer handle the flushing of dirty pages to storage. Both processes perform the same function; however, each has different behaviors and roles.&lt;/p&gt;

&lt;p&gt;The role of the background writer is to reduce the impact of the intensive writing caused by checkpointing. The checkpointer, on the other hand, writes a checkpoint record to the WAL segment and flushes the dirty pages whenever checkpointing starts.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>An Overview of Heap Only Tuple and Index-Only Scans</title>
      <dc:creator>Haseeb Ashraf</dc:creator>
      <pubDate>Tue, 11 Jul 2023 09:25:23 +0000</pubDate>
      <link>https://dev.to/thehaseebashraf/an-overview-of-heap-only-tuple-and-index-only-scans-3b9g</link>
      <guid>https://dev.to/thehaseebashraf/an-overview-of-heap-only-tuple-and-index-only-scans-3b9g</guid>
      <description>&lt;p&gt;In this short article, we will take a look at two features related to index scan. Namely, &lt;strong&gt;Heap Only Tuple&lt;/strong&gt; and &lt;strong&gt;Index-Only Scans&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;Heap Only Tuple&lt;/strong&gt; was initially implemented in &lt;em&gt;Version 8.3&lt;/em&gt; of PostgreSQL, and its main purpose is to use the table and index pages effectively when an updated row is stored in the same table page as the old row. Heap Only Tuple processing, also known as HOT, helps reduce the need for VACUUM processing. &lt;/p&gt;

&lt;p&gt;When updating a row without &lt;strong&gt;HOT&lt;/strong&gt;, inserting the new index tuple consumes index page space, and thus the costs of inserting index tuples and of vacuuming are very high. &lt;strong&gt;HOT&lt;/strong&gt; helps reduce the impact of these problems.&lt;/p&gt;

&lt;p&gt;When updating a row with &lt;strong&gt;HOT&lt;/strong&gt;, if the updated row is stored in the same table page that stores the previous row, PostgreSQL does not insert a corresponding index tuple and instead links the old and new tuples using HOT-related flag bits.&lt;/p&gt;

&lt;p&gt;Now, let's take a look at &lt;strong&gt;Index-Only Scans&lt;/strong&gt;. To reduce I/O cost, an index-only scan uses the index key directly, without accessing the table pages, whenever a &lt;em&gt;SELECT&lt;/em&gt; statement references only indexed columns. This method is provided by practically all commercial RDBMSs, and PostgreSQL introduced this feature in &lt;em&gt;Version 9.2&lt;/em&gt;.&lt;/p&gt;
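
&lt;p&gt;As a quick illustration on a hypothetical table &lt;code&gt;tbl&lt;/code&gt; with an index on &lt;code&gt;id&lt;/code&gt;, &lt;code&gt;EXPLAIN&lt;/code&gt; shows whether the planner avoids the heap:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;testdb=# CREATE INDEX tbl_id_idx ON tbl (id);&lt;br&gt;
testdb=# EXPLAIN SELECT id FROM tbl WHERE id BETWEEN 10 AND 20;&lt;br&gt;
&lt;/code&gt;&lt;br&gt;
When only indexed columns are referenced and the visibility map permits it, the plan should report an &lt;em&gt;Index Only Scan&lt;/em&gt; node instead of an ordinary index scan.&lt;/p&gt;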

</description>
    </item>
    <item>
      <title>A Short Overview on Vacuum Processing in PostgreSQL</title>
      <dc:creator>Haseeb Ashraf</dc:creator>
      <pubDate>Tue, 27 Jun 2023 16:46:04 +0000</pubDate>
      <link>https://dev.to/thehaseebashraf/a-short-overview-on-vacuum-processing-in-postgresql-2kf4</link>
      <guid>https://dev.to/thehaseebashraf/a-short-overview-on-vacuum-processing-in-postgresql-2kf4</guid>
      <description>&lt;p&gt;&lt;strong&gt;Vacuum processing&lt;/strong&gt; in PostgreSQL is a maintenance process that helps to facilitate persistent operations.&lt;br&gt;
The main tasks of vacuum processing are to &lt;em&gt;remove dead tuples&lt;/em&gt; and &lt;em&gt;freeze transaction ids&lt;/em&gt; that are no longer active.&lt;/p&gt;

&lt;p&gt;Two different modes are provided by vacuum processing to remove dead tuples, namely:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Concurrent VACUUM&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Full VACUUM&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Concurrent VACUUM&lt;/strong&gt;, often referred to simply as VACUUM, removes dead tuples for each page of the table file while other transactions can still read the table as the process runs.&lt;/p&gt;

&lt;p&gt;On the other hand, &lt;strong&gt;Full VACUUM&lt;/strong&gt; removes dead tuples and also defragments the whole file, but while &lt;strong&gt;Full VACUUM&lt;/strong&gt; is running, the table cannot be accessed by other transactions.&lt;/p&gt;
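
&lt;p&gt;Both modes are invoked with plain SQL; for a hypothetical table &lt;code&gt;tbl&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;testdb=# VACUUM tbl;      -- concurrent VACUUM: other transactions may still read the table&lt;br&gt;
testdb=# VACUUM FULL tbl; -- full VACUUM: rewrites the table file and locks out other transactions&lt;br&gt;
&lt;/code&gt;&lt;/p&gt;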

&lt;p&gt;Work on improving vacuum has been rather slow compared to other functionality, despite it being essential in &lt;strong&gt;PostgreSQL&lt;/strong&gt;. It is an expensive process because vacuum processing involves scanning the whole table. &lt;/p&gt;

&lt;p&gt;When it comes to &lt;em&gt;freezing old txids&lt;/em&gt;, vacuum processing also removes unnecessary parts of the clog (commit log) that are no longer actively used, where possible. &lt;/p&gt;

&lt;p&gt;In the &lt;strong&gt;first block&lt;/strong&gt;, freeze processing is performed and index tuples that point to dead tuples are removed. First, PostgreSQL scans a target table to build a list of dead tuples and, if possible, freezes old tuples. The list is stored in &lt;em&gt;maintenance_work_mem&lt;/em&gt; in the local memory area.&lt;/p&gt;

&lt;p&gt;After the scanning is done, PostgreSQL removes index tuples by referring to the &lt;strong&gt;dead tuple list&lt;/strong&gt;. Internally, this process is known as the &lt;em&gt;"cleanup stage"&lt;/em&gt; and, needless to say, it is expensive.&lt;/p&gt;

&lt;p&gt;Next, in the &lt;strong&gt;second block&lt;/strong&gt;, dead tuples are removed and both the FSM and VM are updated page-by-page.&lt;/p&gt;

&lt;p&gt;Lastly, the &lt;strong&gt;third block&lt;/strong&gt; performs a cleanup after the indexes have been deleted and it also updates both the system catalogs and statistics that are related to the vacuum processing for each of the target tables.&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>postgressql</category>
      <category>apache</category>
      <category>apacheage</category>
    </item>
    <item>
      <title>An Overview of Concurrency Control in PostgreSQL</title>
      <dc:creator>Haseeb Ashraf</dc:creator>
      <pubDate>Mon, 26 Jun 2023 19:41:44 +0000</pubDate>
      <link>https://dev.to/thehaseebashraf/on-overview-of-concurrency-control-in-postgresql-1ilp</link>
      <guid>https://dev.to/thehaseebashraf/on-overview-of-concurrency-control-in-postgresql-1ilp</guid>
      <description>&lt;p&gt;Today, we will take a look at concurrency control in PostgreSQL.&lt;br&gt;
It is a &lt;strong&gt;control mechanism&lt;/strong&gt; that maintains &lt;strong&gt;isolation&lt;/strong&gt; and &lt;strong&gt;atomicity&lt;/strong&gt;, two of the important properties required to run many transactions in parallel in the database. &lt;br&gt;
Overall, concurrency control can be categorized into three main techniques:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Multi-version Concurrency Control (MVCC)&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Strict Two-Phase Locking (S2PL)&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Optimistic Concurrency Control (OCC)&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Whenever a data item is read by a transaction, the system selects one of the item's versions so that isolation of the individual transactions is ensured. &lt;br&gt;
The main reason MVCC is considered advantageous is that readers and writers don't block each other.&lt;/p&gt;
The main reason MVCC is considered advantageous is that readers and writers don't block each other.&lt;/p&gt;

&lt;p&gt;PostgreSQL and some other RDBMSs use a variation of MVCC known as &lt;strong&gt;Snapshot Isolation (SI)&lt;/strong&gt;.&lt;br&gt;
Some database systems implement &lt;strong&gt;SI&lt;/strong&gt; using rollback segments. Whenever an item is read, PostgreSQL selects the appropriate version of the item for the individual transaction by applying &lt;strong&gt;visibility check rules&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Whenever a transaction is initiated in PostgreSQL, a unique identifier known as a &lt;strong&gt;transaction id (txid)&lt;/strong&gt; is assigned to it. This txid is a 32-bit unsigned integer with approximately 4.2 billion unique values. Within a transaction, the built-in &lt;em&gt;txid_current()&lt;/em&gt; function returns the current txid.&lt;/p&gt;
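
&lt;p&gt;A quick illustration (the value returned depends on the server's history; since version 13, &lt;code&gt;pg_current_xact_id()&lt;/code&gt; is the preferred equivalent):&lt;/p&gt;

&lt;p&gt;&lt;code&gt;testdb=# BEGIN;&lt;br&gt;
testdb=# SELECT txid_current(); -- returns the txid assigned to this transaction&lt;br&gt;
testdb=# COMMIT;&lt;br&gt;
&lt;/code&gt;&lt;/p&gt;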

&lt;p&gt;The above was a short overview of concurrency control in PostgreSQL.&lt;/p&gt;

</description>
      <category>bitnine</category>
      <category>postgres</category>
      <category>postgressql</category>
      <category>apacheage</category>
    </item>
    <item>
      <title>Foreign Data Wrappers in PostgreSQL</title>
      <dc:creator>Haseeb Ashraf</dc:creator>
      <pubDate>Fri, 23 Jun 2023 21:44:26 +0000</pubDate>
      <link>https://dev.to/thehaseebashraf/foreign-data-wrappers-in-postgresql-5l1</link>
      <guid>https://dev.to/thehaseebashraf/foreign-data-wrappers-in-postgresql-5l1</guid>
      <description>&lt;p&gt;In this article, we will discuss two of the most interesting and practical features in PostgreSQL. Namely, &lt;strong&gt;Foreign Data Wrappers (FDW)&lt;/strong&gt; and &lt;strong&gt;Parallel Query&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A table that is located on a remote server is known as a &lt;em&gt;foreign table&lt;/em&gt;. In PostgreSQL, the feature used to manage foreign tables with plain SQL is known as &lt;strong&gt;Foreign Data Wrappers (FDW)&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A foreign table on a remote server can be accessed after installing the required extension and tweaking the appropriate settings. Furthermore, even join operations can be executed against foreign tables located on different servers, just as with local tables.&lt;/p&gt;
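
&lt;p&gt;With the &lt;code&gt;postgres_fdw&lt;/code&gt; extension, for example, the setup might be sketched as follows (the server name, host, table definition, and credentials below are hypothetical):&lt;/p&gt;

&lt;p&gt;&lt;code&gt;CREATE EXTENSION postgres_fdw;&lt;br&gt;
CREATE SERVER remote_srv FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'remote.example.com', dbname 'remotedb');&lt;br&gt;
CREATE USER MAPPING FOR CURRENT_USER SERVER remote_srv OPTIONS (user 'remote_user', password 'secret');&lt;br&gt;
CREATE FOREIGN TABLE remote_tbl (id int, data text) SERVER remote_srv OPTIONS (table_name 'tbl');&lt;br&gt;
&lt;/code&gt;&lt;br&gt;
After this, &lt;code&gt;remote_tbl&lt;/code&gt; can be queried like a local table.&lt;/p&gt;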

&lt;p&gt;Here is a brief overview of how the &lt;strong&gt;FDWs&lt;/strong&gt; perform:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The analyzer generates a query tree from the input SQL.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A connection to the remote server is made using the executor/planner.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If the &lt;em&gt;use_remote_estimate&lt;/em&gt; option is turned on (it is off by default), the planner executes the EXPLAIN command on the remote server to obtain a cost estimate for each plan path.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Next, the executor will transfer a plain text SQL statement to the remote server and will consequently receive the results.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Lastly, the received data is processed by the executor if necessary. E.g. the executor will perform a join processing in case a multi-table query is executed.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are also a number of useful multi-table operations in PostgreSQL, some of which are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Sort operations:&lt;/strong&gt;&lt;br&gt;
Sort operations such as ORDER BY are processed on the local server. The local server fetches all the required rows from the remote server before the sort operation is executed. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Aggregate functions:&lt;/strong&gt;&lt;br&gt;
Similar to sort operations, aggregate functions are also processed on the local server. Some examples of aggregate functions include: AVG() and COUNT(). The executor sends the relevant query to the remote server and then retrieves the relevant query results.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>apache</category>
      <category>postgres</category>
      <category>postgressql</category>
      <category>bitnine</category>
    </item>
    <item>
      <title>Query Processing in PostgreSQL</title>
      <dc:creator>Haseeb Ashraf</dc:creator>
      <pubDate>Fri, 23 Jun 2023 12:02:45 +0000</pubDate>
      <link>https://dev.to/thehaseebashraf/query-processing-in-postgresql-2ank</link>
      <guid>https://dev.to/thehaseebashraf/query-processing-in-postgresql-2ank</guid>
      <description>&lt;p&gt;Query processing in &lt;strong&gt;PostgreSQL&lt;/strong&gt; is its most complicated subsystem, and it efficiently processes the supported SQL. In this article, we will look at query processing, focusing in particular on query optimising.&lt;/p&gt;

&lt;p&gt;In PostgreSQL, parallel query, implemented in &lt;em&gt;version 9.6&lt;/em&gt;, uses multiple background worker processes, but all queries issued by a connected client are handled by a single backend process. This backend process consists of the five subsystems shown below. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Parser&lt;/strong&gt;&lt;br&gt;
This generates a parse tree from the SQL statement written in plain text.&lt;br&gt;
The root node of this parse tree is the &lt;em&gt;SelectStmt&lt;/em&gt; structure defined in &lt;em&gt;parsenodes.h&lt;/em&gt;, and each element of the parse tree corresponds to an element of the SELECT query. Since the parser only checks the input syntax when generating the parse tree, it reports an error only when there is a syntax error in the query.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Analyser&lt;/strong&gt;&lt;br&gt;
This carries out a semantic analysis of the parse tree and generates a query tree.&lt;br&gt;
The &lt;em&gt;Query&lt;/em&gt; structure defined in &lt;em&gt;parsenodes.h&lt;/em&gt; is the root of the query tree, and it includes the metadata of its corresponding query. A query tree includes things such as a &lt;em&gt;targetlist&lt;/em&gt;, the list of columns that form the result of the query, and a join tree, which stores the FROM and WHERE clauses.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Rewriter&lt;/strong&gt;&lt;br&gt;
If applicable rules exist, the rewriter transforms the query tree using the rule system.&lt;br&gt;
The rewriter is the system that realizes the rule system: it looks at the rules stored in the &lt;em&gt;pg_rules&lt;/em&gt; system catalog and rewrites the query tree accordingly. The rule system itself is quite intriguing, but we will omit it from our short summary to prevent it from becoming too long.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Planner&lt;/strong&gt;&lt;br&gt;
The planner generates the plan tree that can be executed most effectively from the query tree.&lt;br&gt;
The planner in PostgreSQL is a purely cost-based optimisation system; it supports neither rule-based optimisation nor hints, and it is the most complicated subsystem in an RDBMS. A plan tree consists of elements called &lt;em&gt;plan nodes&lt;/em&gt; connected to the &lt;em&gt;plantree&lt;/em&gt; list, and each plan node holds the information that the executor requires for processing.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Executor&lt;/strong&gt;&lt;br&gt;
This executes the query by accessing the tables and indices in the order given by the plan tree.&lt;br&gt;
The executor reads and writes tables and indexes in the database cluster with the help of the buffer manager, which will be described in upcoming summaries. It uses memory areas such as &lt;em&gt;temp_buffers&lt;/em&gt; and &lt;em&gt;work_mem&lt;/em&gt;, allocated in advance, and creates temporary files if necessary.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>bitnine</category>
      <category>apacheage</category>
      <category>postgres</category>
      <category>postgressql</category>
    </item>
    <item>
      <title>Process and Memory architecture in PostgreSQL</title>
      <dc:creator>Haseeb Ashraf</dc:creator>
      <pubDate>Wed, 21 Jun 2023 14:05:58 +0000</pubDate>
      <link>https://dev.to/thehaseebashraf/process-and-memory-architecture-in-postgresql-13p6</link>
      <guid>https://dev.to/thehaseebashraf/process-and-memory-architecture-in-postgresql-13p6</guid>
      <description>&lt;p&gt;&lt;strong&gt;PostgreSQL&lt;/strong&gt; is a client/server type relational database management system with a multi-process architecture that runs on a single host.&lt;/p&gt;

&lt;p&gt;A &lt;strong&gt;PostgreSQL&lt;/strong&gt; server is a collection of several processes cooperatively managing one database cluster that contains the following types of processes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;A &lt;strong&gt;postgres server process&lt;/strong&gt; is a parent of all processes related to a database management system.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Every single &lt;strong&gt;backend process&lt;/strong&gt; is responsible for managing all queries and statements that are generated by any connected client.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Different &lt;strong&gt;background processes&lt;/strong&gt; help with processes of each feature for the database management system.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The &lt;strong&gt;replication associated process&lt;/strong&gt; is responsible for performing the streaming replication process.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Lastly, the &lt;strong&gt;background worker process&lt;/strong&gt;, can work on any processing that is demanded by the users.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A &lt;strong&gt;postgres server process&lt;/strong&gt; starts up when the &lt;em&gt;pg_ctl&lt;/em&gt; command is executed with the start option. Next, it allocates a shared memory area, starts various background processes and a few other necessary processes, and waits for connection requests from clients. A backend process is started whenever it receives a connection request from a client.&lt;/p&gt;
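
&lt;p&gt;For reference, the startup command mentioned above looks like this (the data directory path is hypothetical):&lt;/p&gt;

&lt;p&gt;&lt;code&gt;$ pg_ctl -D /usr/local/pgsql/data start&lt;br&gt;
&lt;/code&gt;&lt;/p&gt;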

&lt;p&gt;A &lt;strong&gt;backend process&lt;/strong&gt; is started by the postgres server process and handles all queries issued by the connected client. It communicates with the client over a single TCP connection and terminates when the client disconnects. As a backend process is allowed to operate on only one database, the client has to specify explicitly which database it wants to use when connecting.&lt;/p&gt;

&lt;p&gt;Now let's talk about the memory architecture of &lt;strong&gt;postgreSQL&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The memory architecture of postgreSQL can be classified into two major classes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Local memory area - this is assigned by each backend process for its own use.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Shared memory area - this is used collectively by all the processes running on a PostgreSQL server.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In addition to these, PostgreSQL also allocates various memory areas as mentioned below:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Sub areas for the various access control mechanisms.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Sub areas for various background processes, e.g. autovacuum and the checkpointer.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Sub areas for transaction processing such as two-phase-commit and save-point.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This was a short summary of the memory and process architecture of PostgreSQL.&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>apacheage</category>
      <category>bitnine</category>
      <category>apache</category>
    </item>
    <item>
      <title>An Overview of Database Clusters, Databases, and Tables in PostgreSQL</title>
      <dc:creator>Haseeb Ashraf</dc:creator>
      <pubDate>Sat, 17 Jun 2023 23:11:32 +0000</pubDate>
      <link>https://dev.to/thehaseebashraf/an-overview-of-database-clusters-databases-and-tables-in-postgresql-428d</link>
      <guid>https://dev.to/thehaseebashraf/an-overview-of-database-clusters-databases-and-tables-in-postgresql-428d</guid>
      <description>&lt;p&gt;In &lt;strong&gt;PostgreSQL&lt;/strong&gt;, a database cluster is not a group of database servers; rather, it is a collection of databases managed by a single PostgreSQL server running on a single host. &lt;/p&gt;

&lt;p&gt;The logical structure of a database cluster is a collection of database objects, and these objects are logically separate from one another. All PostgreSQL database objects are internally managed by assigning each of them its own &lt;strong&gt;object identifier (OID)&lt;/strong&gt;. These &lt;strong&gt;OIDs&lt;/strong&gt; are 4-byte unsigned integers, and the relations between database objects and their OIDs are stored in the appropriate system catalogs, based on the kind of object.&lt;/p&gt;

&lt;p&gt;As for the physical structure, the database cluster is essentially a single directory, known as the &lt;strong&gt;base directory&lt;/strong&gt;, which contains a few subdirectories and several files. In this structure, a database is a &lt;em&gt;subdirectory&lt;/em&gt; under the base directory, and each table and index is a single file saved in the &lt;em&gt;subdirectory&lt;/em&gt; of the database to which it belongs.&lt;/p&gt;

&lt;p&gt;A &lt;em&gt;tablespace&lt;/em&gt; in PostgreSQL is an extra data area outside the base directory. It is created in the directory specified when the &lt;strong&gt;CREATE TABLESPACE&lt;/strong&gt; statement is issued. &lt;/p&gt;
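
&lt;p&gt;For example (the directory and names are hypothetical; the directory must already exist and be owned by the PostgreSQL server user):&lt;/p&gt;

&lt;p&gt;&lt;code&gt;CREATE TABLESPACE fast_ssd LOCATION '/mnt/ssd/pgdata';&lt;br&gt;
CREATE TABLE t (a int) TABLESPACE fast_ssd;&lt;br&gt;
&lt;/code&gt;&lt;/p&gt;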

&lt;p&gt;The internal layout of the &lt;strong&gt;Heap Table File&lt;/strong&gt; is as follows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Heap tuple(s) &lt;br&gt;
These are the data records themselves, stacked in order from the bottom of the page.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Line pointer(s)&lt;br&gt;
Each is a 4-byte pointer that points to a heap tuple.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Header data&lt;br&gt;
This is defined by the structure &lt;em&gt;PageHeaderData&lt;/em&gt; and is placed at the beginning of the page.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In conclusion, this was a short overview of database clusters, databases, and tables in PostgreSQL.&lt;/p&gt;

</description>
      <category>database</category>
      <category>apache</category>
      <category>apacheage</category>
      <category>postgres</category>
    </item>
  </channel>
</rss>
