<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: m-hashir</title>
    <description>The latest articles on DEV Community by m-hashir (@mhashir).</description>
    <link>https://dev.to/mhashir</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1102703%2F7306a17e-a004-4a7e-a2c0-d20ae2e459b5.png</url>
      <title>DEV Community: m-hashir</title>
      <link>https://dev.to/mhashir</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mhashir"/>
    <language>en</language>
    <item>
      <title>Learning more about AgensGraph Pt. 1</title>
      <dc:creator>m-hashir</dc:creator>
      <pubDate>Sun, 23 Jul 2023 19:38:59 +0000</pubDate>
      <link>https://dev.to/mhashir/learning-more-about-agengraph-pt-1-4l1c</link>
      <guid>https://dev.to/mhashir/learning-more-about-agengraph-pt-1-4l1c</guid>
      <description>&lt;p&gt;&lt;strong&gt;Introduction&lt;/strong&gt;&lt;br&gt;
AgensGraph is a multi-model database built on PostgreSQL that supports both the relational (SQL) and graph (NoSQL) data models at the same time, so ANSI SQL and openCypher clauses can even be combined in a single AgensGraph query. It is built for enterprise use and offers capabilities that make it a powerful tool, such as ACID transactions, multi-version concurrency control, triggers and a flexible data model, among many more, and it can be extended with external modules such as PostGIS.&lt;br&gt;
&lt;strong&gt;Installation&lt;/strong&gt;&lt;br&gt;
AgensGraph can be installed on both Linux and Windows, either from a binary package or by compiling from source.&lt;br&gt;
&lt;strong&gt;Linux&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Installation of pre-built packages&lt;/em&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Download the pre-compiled binary from the download page, selecting your preferred version&lt;/li&gt;
&lt;li&gt; Extract the package into a directory of your choosing with the Linux command “tar xvf” on the downloaded archive to complete installation.
&lt;strong&gt;Installation via source code&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt; Go to AgensGraph’s GitHub page and fetch the source with the git clone command&lt;/li&gt;
&lt;li&gt; Install the essential libraries, which differ between Linux distributions; the commands are given below:
a.  For CentOS: $ yum install gcc glibc glibc-common readline readline-devel zlib zlib-devel
b.  For Fedora: $ dnf install gcc glibc bison flex readline readline-devel zlib zlib-devel
c.  For Ubuntu: $ sudo apt-get install build-essential libreadline-dev zlib1g-dev flex bison&lt;/li&gt;
&lt;li&gt; In the clone directory, run configure, setting the prefix to the location where AgensGraph is to be installed&lt;/li&gt;
&lt;li&gt; Build and install with make install&lt;/li&gt;
&lt;li&gt; Install the extension module and binary with make install-world
&lt;strong&gt;Post Installation Setup and Config&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt; Environment variable setting is optional and can be done by
export LD_LIBRARY_PATH=/usr/local/AgensGraph/lib:$LD_LIBRARY_PATH
export PATH=/usr/local/AgensGraph/bin:$PATH
export AGDATA=/path/to/make/db_cluster&lt;/li&gt;
&lt;li&gt; To create a database cluster: initdb [-D /path/to/make/db_cluster]&lt;/li&gt;
&lt;li&gt; To start the server, use: ag_ctl start [-D /path/created/by/initdb]&lt;/li&gt;
&lt;li&gt; To create a database: createdb [dbname]. If the DB name is not specified, it defaults to the user name&lt;/li&gt;
&lt;li&gt; To start the interactive terminal: agens [dbname]. If the cluster directory is not specified with the -D option, the environment variable AGDATA is used&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Data Model&lt;/strong&gt;&lt;br&gt;
As noted earlier, AgensGraph is a multi-model database, supporting both the graph and the relational model.&lt;br&gt;
&lt;strong&gt;Property Graph Model&lt;/strong&gt;&lt;br&gt;
The property graph model contains connected entities (called vertices) which can have any number of attributes and are grouped by labels. Edges are the connections between two vertices; they can also have attributes and labels, and must have a start and an end vertex. AgensGraph does not allow broken edges, so before deleting a vertex, all of its edges must be deleted first. The properties of edges and vertices are represented in a JSON format, stored as unordered collections of name-value pairs. AgensGraph uses the JSONB (JSON Binary) format, which allows faster processing at the cost of slightly slower input. &lt;br&gt;
&lt;strong&gt;Data Objects&lt;/strong&gt;&lt;br&gt;
In AgensGraph, a database can have one or more schemas, used for relational tables, as well as one or more graphs, used for graph data, although their names cannot be the same. Users can create several graphs in a database but can only use one graph at a time.&lt;br&gt;
&lt;strong&gt;Labels&lt;/strong&gt;&lt;br&gt;
Labels are used to group vertices and edges; they can be used to provide access control for different users and to maintain inheritance via label hierarchies. The default label for a vertex is ag_vertex, while an edge always has exactly one label. Every label inherits one or more labels, much like classes in OOP. &lt;br&gt;
&lt;strong&gt;Data Definition Language&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Graph&lt;/strong&gt;&lt;br&gt;
Creating a graph: CREATE GRAPH [IF NOT EXISTS] graphname [AUTHORIZATION role_name];&lt;br&gt;
Several graphs can be created in a database; the session parameter graph_path specifies which graph is in use. &lt;br&gt;
IF NOT EXISTS: Do nothing if a graph with the same name exists&lt;br&gt;
AUTHORIZATION: Role name of the user who will own the new graph&lt;br&gt;
    Show the current graph path: SHOW graph_path;&lt;br&gt;
        If graph_path is not set, the newly created graph becomes the graph_path&lt;br&gt;
    Setting the graph path: SET graph_path = graphname;&lt;br&gt;
This is a session variable, so it must be set before querying the graph; querying over multiple graph paths is not allowed. &lt;br&gt;
Drop: DROP GRAPH graphname CASCADE;&lt;br&gt;
    The initial labels for vertices and edges cannot be deleted on their own, so the CASCADE option is used.&lt;br&gt;
&lt;strong&gt;Labels&lt;/strong&gt;&lt;br&gt;
    Create&lt;br&gt;
        Vertices: CREATE VLABEL v1;&lt;br&gt;
CREATE VLABEL v2 inherits (v1);&lt;br&gt;
        Edges:  CREATE ELABEL e1;&lt;br&gt;
CREATE ELABEL e2;&lt;br&gt;
CREATE ELABEL e3 inherits (e1, e2);&lt;br&gt;
VLABEL can only inherit VLABEL (same for ELABEL) and multiple inheritance is possible for creation of complex labels.&lt;br&gt;
Drop&lt;br&gt;
    DROP VLABEL v2;&lt;br&gt;
DROP VLABEL v1;&lt;br&gt;
    v2 inherits v1, so v2 must be dropped before v1 can be dropped&lt;br&gt;
DROP ELABEL e1 CASCADE;&lt;br&gt;
    The CASCADE option drops all dependent labels as well&lt;/p&gt;
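&lt;p&gt;As a toy illustration of the property graph model described above (plain Python standing in for AgensGraph's storage, not its implementation; all names and data are made up): vertices and edges carry a label plus a JSON-like property map, and a vertex cannot be deleted while edges still reference it.&lt;/p&gt;

```python
# Minimal sketch of a property graph: labelled vertices and edges with
# JSON-like property maps, enforcing the "no broken edges" rule.

class PropertyGraph:
    def __init__(self):
        self.vertices = {}   # vid -> (label, properties)
        self.edges = {}      # eid -> (label, start_vid, end_vid, properties)

    def add_vertex(self, vid, label="ag_vertex", **props):
        self.vertices[vid] = (label, props)

    def add_edge(self, eid, label, start, end, **props):
        # every edge must have an existing start and end vertex
        if start not in self.vertices or end not in self.vertices:
            raise ValueError("edges must have existing start and end vertices")
        self.edges[eid] = (label, start, end, props)

    def delete_vertex(self, vid):
        # broken edges are not allowed: delete the vertex's edges first
        if any(s == vid or e == vid for _, s, e, _ in self.edges.values()):
            raise ValueError("delete the vertex's edges first")
        del self.vertices[vid]

g = PropertyGraph()
g.add_vertex("p1", "person", name="Alice")
g.add_vertex("p2", "person", name="Bob")
g.add_edge("k1", "knows", "p1", "p2", since=2020)
try:
    g.delete_vertex("p1")        # fails: edge k1 still references p1
except ValueError as err:
    print(err)
```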

</description>
      <category>apache</category>
      <category>agensgraph</category>
      <category>age</category>
      <category>bitnine</category>
    </item>
    <item>
      <title>Internals of PostgreSQL - Chapter 5</title>
      <dc:creator>m-hashir</dc:creator>
      <pubDate>Thu, 29 Jun 2023 19:26:15 +0000</pubDate>
      <link>https://dev.to/mhashir/internals-of-postgresql-chapter-5-22p6</link>
      <guid>https://dev.to/mhashir/internals-of-postgresql-chapter-5-22p6</guid>
      <description>&lt;p&gt;Concurrency control is a process that maintains atomicity and isolation, two of the ACID properties; there are three broad techniques for it. In Multi-version Concurrency Control (MVCC), each write operation creates a new version of a data item while retaining the old version, in contrast to Strict Two-Phase Locking, which must block readers while a writer writes an item because the writer acquires an exclusive lock on it. PostgreSQL uses Snapshot Isolation, a variation of MVCC that does not use rollback segments: new versions are inserted directly into the table pages. When reading items, PostgreSQL selects the appropriate version of an item for each transaction by applying visibility check rules.&lt;br&gt;
Whenever a transaction begins, it is assigned a unique identifier, referred to as a transaction id (txid), which is a 32-bit unsigned integer, giving a space of approximately 4.2 billion txids. &lt;br&gt;
Heap tuples in table pages are classified as either usual data tuples or TOAST tuples; this section describes only the usual tuple.&lt;br&gt;
While the HeapTupleHeaderData structure contains seven fields, only four are required in the subsequent sections: t_xmin, t_xmax, t_cid and t_ctid.&lt;br&gt;
The FSM is used when inserting and updating tuples. &lt;br&gt;
For insertion, a new tuple is inserted directly into a page of the target table. If the inserting transaction's txid is 99, its header fields are as follows:&lt;br&gt;
t_xmin = 99&lt;br&gt;
t_xmax = 0&lt;br&gt;
t_cid = 0&lt;br&gt;
t_ctid = (0,1)&lt;/p&gt;

&lt;p&gt;For deletion, the target tuple is deleted only logically: in its header fields, t_xmax is set to the deleting transaction's txid (here, 111).&lt;br&gt;
In the update operation, PostgreSQL logically deletes the latest tuple and inserts a new one. If the updating transaction's txid is 100:&lt;br&gt;
Tuple_1:&lt;br&gt;
t_xmax is set to 100.&lt;br&gt;
t_ctid is rewritten from (0, 1) to (0, 2), pointing to the new version.&lt;br&gt;
Tuple_2:&lt;br&gt;
t_xmin is set to 100.&lt;br&gt;
t_xmax is set to 0.&lt;br&gt;
t_cid is set to 0.&lt;br&gt;
t_ctid is set to (0,2).&lt;/p&gt;
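&lt;p&gt;The insert/delete/update bookkeeping above can be sketched in a few lines of Python (txids 99, 100 and 111 follow the text's examples; this models only the four header fields, not PostgreSQL's on-disk format):&lt;/p&gt;

```python
# Toy simulation of the t_xmin/t_xmax/t_ctid changes described above.

class Tuple:
    def __init__(self, txid, tid):
        self.t_xmin = txid      # txid that inserted the tuple
        self.t_xmax = 0         # 0 while the tuple is live
        self.t_cid = 0
        self.t_ctid = tid       # points to itself while it is the newest version

page = []

def insert(txid):
    t = Tuple(txid, (0, len(page) + 1))
    page.append(t)
    return t

def delete(t, txid):
    t.t_xmax = txid             # logical delete only; the data stays in place

def update(old, txid):
    new = insert(txid)
    old.t_xmax = txid           # old version is logically deleted...
    old.t_ctid = new.t_ctid     # ...and points at its successor
    return new

t1 = insert(99)                 # t_xmin=99, t_xmax=0, t_ctid=(0,1)
t2 = update(t1, 100)            # t1: t_xmax=100, t_ctid=(0,2); t2 is fresh
print(t1.t_ctid, t2.t_xmin)     # (0, 2) 100
```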

&lt;p&gt;When inserting a heap or index tuple, PostgreSQL uses the FSM of the corresponding table or index to select a page into which it can be inserted. The status of each transaction is held in the commit log (clog), which is used throughout transaction processing. PostgreSQL defines four transaction states: IN_PROGRESS, COMMITTED, ABORTED, and SUB_COMMITTED. A transaction snapshot is a dataset that stores, for an individual transaction, information about whether each transaction was active at a certain point in time; here an active transaction means one that is in progress or has not yet started. Visibility check rules are a set of rules used to determine whether each tuple is visible or invisible, using the tuple's t_xmin and t_xmax, the clog, and the obtained transaction snapshot.&lt;/p&gt;
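&lt;p&gt;A highly simplified sketch of a visibility check (the real rules also consult the clog, sub-transactions and the reading transaction's own changes; the function and data shapes here are invented for illustration):&lt;/p&gt;

```python
# A tuple version is visible if its inserting txid is committed and was
# not active when the snapshot was taken, and it has not been deleted
# by a txid that is likewise visible.

def visible(tup, snapshot_active, committed):
    xmin_ok = (tup["t_xmin"] in committed
               and tup["t_xmin"] not in snapshot_active)
    if not xmin_ok:
        return False
    xmax = tup["t_xmax"]
    if xmax == 0:               # never deleted
        return True
    # deletion only hides the tuple if the deleter itself is visible
    return not (xmax in committed and xmax not in snapshot_active)

committed = {99, 100}
snapshot = {100}                 # txid 100 was still active at snapshot time
old = {"t_xmin": 99, "t_xmax": 100}
new = {"t_xmin": 100, "t_xmax": 0}
print(visible(old, snapshot, committed))  # True: 100's delete is not yet visible
print(visible(new, snapshot, committed))  # False: 100 was active
```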

</description>
    </item>
    <item>
      <title>Internals of PostgreSQL - Chapter 6</title>
      <dc:creator>m-hashir</dc:creator>
      <pubDate>Wed, 28 Jun 2023 21:26:17 +0000</pubDate>
      <link>https://dev.to/mhashir/internals-of-postgresql-chapter-6-1d4d</link>
      <guid>https://dev.to/mhashir/internals-of-postgresql-chapter-6-1d4d</guid>
      <description>&lt;p&gt;In chapter 6 we discuss vacuum processing in PostgreSQL. It is a maintenance process which helps to ensure efficient operation of the database. It mainly involves removing dead tuples and freezing transaction IDs to improve performance and free up space.&lt;/p&gt;

&lt;p&gt;Vacuum processing was executed manually until version 8.0, when the autovacuum daemon was introduced to automate the process. There are two main modes of vacuum processing: Concurrent VACUUM and Full VACUUM.&lt;/p&gt;

&lt;p&gt;Concurrent VACUUM, also known as VACUUM, removes dead tuples for each page while also allowing other transactions to read the table at the same time. On the other hand, Full VACUUM removes dead tuples and defragments live tuples for the entire file but it locks the table which prevents other transactions from accessing it during the process.&lt;/p&gt;

&lt;p&gt;In this chapter, we look at the steps involved in concurrent vacuum processing. These include removing dead tuples and index tuples which point to dead tuples, freezing old transaction IDs, updating system catalogs and performing various cleanup tasks.&lt;/p&gt;

&lt;p&gt;A key optimization, introduced in version 8.4, is the Visibility Map (VM). It increases the efficiency of removing dead tuples by keeping track of page visibility: the VM allows the vacuum process to skip pages which do not have any dead tuples, reducing the overall cost of the process.&lt;/p&gt;
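&lt;p&gt;The effect of the VM can be sketched as follows (the page and flag layout is invented for illustration; only the "skip all-visible pages" idea is the point):&lt;/p&gt;

```python
# Pages flagged all-visible in the VM are skipped entirely; only the
# remaining pages are scanned for dead tuples.

def lazy_vacuum(pages, vm):
    scanned, removed = 0, 0
    for i, page in enumerate(pages):
        if vm[i]:                      # all-visible: no dead tuples here
            continue
        scanned += 1
        live = [t for t in page if not t["dead"]]
        removed += len(page) - len(live)
        pages[i] = live
        vm[i] = True                   # page is now all-visible again
    return scanned, removed

pages = [[{"dead": False}],
         [{"dead": True}, {"dead": False}],
         [{"dead": False}]]
vm = [True, False, True]
result = lazy_vacuum(pages, vm)
print(result)                          # (1, 1): one page scanned, one tuple removed
```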

&lt;p&gt;&lt;strong&gt;Freeze processing&lt;/strong&gt; is an important part of vacuum processing. It involves freezing old transaction IDs to prevent the wraparound of transaction IDs and to ensure the stability of the database in the long-run. It operates in two modes: lazy mode and eager mode.&lt;/p&gt;

&lt;p&gt;In lazy mode, freeze processing scans only pages that contain dead tuples, meanwhile in eager mode, it scans all pages and also performs some additional tasks such as updating system catalogs and removing unnecessary files. The mode used can depend on certain conditions, such as the value of the freezeLimit transaction ID and the state of the pg_database system catalog.&lt;/p&gt;

&lt;p&gt;Finally, the chapter briefly talks about the PARALLEL option, which allows vacuuming and index cleaning to be performed in parallel for tables with multiple indexes. This is supported in the VACUUM command but not in autovacuum.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Internals of PostgreSQL - Chapter 4</title>
      <dc:creator>m-hashir</dc:creator>
      <pubDate>Wed, 28 Jun 2023 21:22:02 +0000</pubDate>
      <link>https://dev.to/mhashir/internals-of-postgresql-chapter-4-19md</link>
      <guid>https://dev.to/mhashir/internals-of-postgresql-chapter-4-19md</guid>
      <description>&lt;p&gt;This chapter covers Foreign Data Wrappers and Parallel Query&lt;br&gt;
SQL/MED, added to the SQL standard in 2003, allows access to remote data through features like foreign tables. PostgreSQL's Foreign Data Wrappers (FDW) implement SQL/MED and enable management of foreign tables similar to local tables. By installing the necessary extension and configuring the settings, foreign tables on remote servers can be accessed in PostgreSQL. Multiple servers, such as PostgreSQL and MySQL, can have their respective foreign tables. SELECT queries can be used to access these foreign tables from the local server, and join operations between foreign tables from different servers are also possible. The postgres_fdw extension is the officially developed and maintained FDW extension by the PostgreSQL Global Development Group, while other FDW extensions listed in the Postgres wiki may not receive regular maintenance. Detailed information on PostgreSQL's FDW is provided in Section 4.1.1, including an overview and Section 4.1.2 explaining the functionality of the postgres_fdw extension.&lt;br&gt;
To utilize the FDW feature in PostgreSQL, the appropriate extension must be installed and setup commands (e.g., CREATE FOREIGN TABLE, CREATE SERVER, CREATE USER MAPPING) need to be executed. During query processing, the FDW functions are invoked:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1.    Creating a Query Tree&lt;/strong&gt;&lt;br&gt;
The analyzer creates the query tree of the input SQL using the definitions of the foreign tables.&lt;br&gt;
&lt;strong&gt;2.    Connecting to the Remote Server&lt;/strong&gt;&lt;br&gt;
The planner (or executor) uses the library specific to the remote database server to establish the connection.&lt;br&gt;
&lt;strong&gt;3.    Creating a Plan Tree using EXPLAIN commands&lt;/strong&gt;&lt;br&gt;
PostgreSQL's FDW extensions, including postgres_fdw, mysql_fdw, tds_fdw, and jdbc2_fdw, support obtaining statistics from foreign tables for query planning. With use_remote_estimate set to on, the planner queries the remote server for plan cost estimation using the EXPLAIN command. Only postgres_fdw can utilize the results of EXPLAIN commands accurately, as it provides both start-up and total costs, while other DBMS FDW extensions may lack the necessary information for cost estimation.&lt;br&gt;
&lt;strong&gt;4.    Deparsing&lt;/strong&gt;&lt;br&gt;
To create the plan tree, the planner generates a plain-text SQL statement based on the scan paths of the foreign tables. Extensions like mysql_fdw, redis_fdw, or rw_redis_fdw generate SELECT statements specific to their respective foreign data sources.&lt;br&gt;
&lt;strong&gt;5.    Sending SQL Statements&lt;/strong&gt;&lt;br&gt;
After deparsing, the executor sends the deparsed SQL statements to the remote server and receives the result.&lt;/p&gt;

&lt;p&gt;To understand how postgres_fdw performs, we look at how it deals with multi-table queries, sort operations and aggregate functions.&lt;br&gt;
&lt;strong&gt;1.    Multi-Table Queries&lt;/strong&gt;&lt;br&gt;
To execute a multi-table query, postgres_fdw fetches each foreign table using a single-table SELECT statement and then joins them on the local server.&lt;br&gt;
&lt;strong&gt;2.    Sort Operations&lt;/strong&gt;&lt;br&gt;
In version 9.5 or earlier, a sort operation such as ORDER BY is processed on the local server, which fetches all the target rows from the remote server prior to the sort; in version 9.6 or later, postgres_fdw can execute the SELECT statement with an ORDER BY clause on the remote server if possible.&lt;br&gt;
&lt;strong&gt;3.    Aggregate Functions&lt;/strong&gt;&lt;br&gt;
In version 9.6 or earlier, similar to the sort operation above, aggregate functions such as AVG() and count() are processed on the local server, while in version 10 or later, postgres_fdw executes the SELECT statement with the aggregate function on the remote server if possible.&lt;/p&gt;
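&lt;p&gt;The multi-table behaviour in point 1 can be sketched in plain Python, with in-memory lists standing in for the results of the two single-table SELECTs (table names and data are made up):&lt;/p&gt;

```python
# Each foreign table is fetched with its own single-table SELECT; the
# join is then performed locally, here as a simple nested-loop equi-join.

remote_a = [(1, "alice"), (2, "bob")]        # SELECT id, name FROM a
remote_b = [(1, "admin"), (3, "guest")]      # SELECT id, role FROM b

def local_join(rows_a, rows_b):
    # equi-join on the first column, performed on the local server
    return [(ida, name, role)
            for ida, name in rows_a
            for idb, role in rows_b
            if ida == idb]

joined = local_join(remote_a, remote_b)
print(joined)                                # [(1, 'alice', 'admin')]
```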

</description>
    </item>
    <item>
      <title>Internals of PostgreSQL - Chapter 3</title>
      <dc:creator>m-hashir</dc:creator>
      <pubDate>Wed, 28 Jun 2023 20:17:16 +0000</pubDate>
      <link>https://dev.to/mhashir/internals-of-postgresql-chapter-3-3de3</link>
      <guid>https://dev.to/mhashir/internals-of-postgresql-chapter-3-3de3</guid>
      <description>&lt;p&gt;This chapter describes how PostgreSQL implements query processing. It supports three advanced features: Foreign Data Wrappers (FDW), Parallel Query and JIT compilation. A backend process handles all queries issued by a connected client and comprises five subsystems: Parser, Analyzer, Rewriter, Planner and Executor. &lt;br&gt;
The parser generates a parse tree that the subsequent subsystems read. The parser only checks syntax, so it returns syntax errors but does not check semantics. The analyzer, in turn, runs a semantic analysis of the parse tree and generates a query tree. The rewriter transforms the query tree according to the rule system of PostgreSQL. The planner then receives the rewritten query tree and generates a plan tree to be executed; it performs cost-based optimization and is the most complex subsystem in an RDBMS. A plan tree is composed of plan nodes, which hold the information the executor requires for processing, and the executor traverses the tree from the end to the root. The executor reads and writes tables and indexes in the database cluster via a buffer manager, uses some memory areas, can create temporary files if necessary, and also uses the concurrency control mechanism to maintain consistency. &lt;br&gt;
As stated before, query optimization in PostgreSQL is based on costs, which are dimensionless, non-absolute values that serve as indicators of relative performance. They are estimated by functions defined in costsize.c. There are three kinds of costs:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Start-up: Cost expended before the first tuple is fetched&lt;/li&gt;
&lt;li&gt; Run: Cost to fetch all tuples&lt;/li&gt;
&lt;li&gt; Total: Sum of the start-up and run costs.
The cost of a sequential scan is estimated by the cost_seqscan() function. It has zero start-up cost, and the run cost is defined by the following formula: ‘run cost’ = ‘cpu run cost’ + ‘disk run cost’ = (cpu_tuple_cost + cpu_operator_cost) × N_tuple + seq_page_cost × N_page.
For an index scan, the common cost function cost_index() is used. Its start-up cost is calculated by:
‘start-up cost’ = {ceil(log2(N_index,tuple)) + (H_index + 1) × 50} × cpu_operator_cost,
where N_index,tuple is the number of index tuples and H_index is the height of the index tree. Furthermore, the run cost is calculated by:
‘run cost’ = (‘index cpu cost’ + ‘table cpu cost’) + (‘index IO cost’ + ‘table IO cost’).
Here the table IO cost is defined by the following formula:
‘table IO cost’ = max_IO_cost + indexCorrelation² × (min_IO_cost − max_IO_cost).
The total cost is the sum of these costs.&lt;/li&gt;
&lt;/ol&gt;
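&lt;p&gt;As a worked example of the sequential-scan formula, using PostgreSQL's default planner constants and made-up table statistics (10,000 tuples in 45 pages):&lt;/p&gt;

```python
# Default planner cost constants (postgresql.conf defaults).
seq_page_cost = 1.0
cpu_tuple_cost = 0.01
cpu_operator_cost = 0.0025

def seqscan_run_cost(n_tuple, n_page):
    # 'run cost' = (cpu_tuple_cost + cpu_operator_cost) * Ntuple
    #            + seq_page_cost * Npage
    return (cpu_tuple_cost + cpu_operator_cost) * n_tuple \
        + seq_page_cost * n_page

print(round(seqscan_run_cost(10000, 45), 2))   # 170.0
```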

&lt;p&gt;A sort path is used for sorting operations such as ORDER BY or the preprocessing of merge join operations, and its cost is calculated by the cost_sort() function. If all tuples can be stored in work_mem, quicksort is used; otherwise a temporary file is created and the file merge sort algorithm is used. The start-up cost is the cost of sorting the target tuples, while the run cost is the cost of reading through all the sorted tuples. They are defined by the following formulae, where C is the total cost of scanning the input and N_sort is the number of tuples to sort:&lt;br&gt;
‘start-up cost’ = C + comparison_cost × N_sort × log2(N_sort),&lt;br&gt;
‘run cost’ = cpu_operator_cost × N_sort&lt;/p&gt;
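&lt;p&gt;A worked example of the sort-cost formulae, assuming comparison_cost = 2 × cpu_operator_cost (the book's definition), with made-up values for C and N_sort:&lt;/p&gt;

```python
import math

cpu_operator_cost = 0.0025              # default planner constant
comparison_cost = 2.0 * cpu_operator_cost

def sort_costs(C, n_sort):
    # 'start-up cost' = C + comparison_cost * Nsort * log2(Nsort)
    startup = C + comparison_cost * n_sort * math.log2(n_sort)
    # 'run cost' = cpu_operator_cost * Nsort
    run = cpu_operator_cost * n_sort
    return startup, run

startup, run = sort_costs(170.0, 10000)  # C and Nsort are made-up inputs
print(round(startup, 2), round(run, 2))  # 834.39 25.0
```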

&lt;p&gt;The planner performs three steps: preprocessing, getting the cheapest access path, and creating the plan tree. Preprocessing includes simplification, normalization and flattening. Getting the cheapest access path is done by estimating the costs of all access paths and choosing the cheapest one. The last stage generates a plan tree, which has four representative fields; the plan nodes themselves have seven representative fields, such as cost, rows and the left/right subtrees. &lt;br&gt;
In single-table queries, the executor takes the plan nodes from the end of the plan tree to the root and invokes the functions that process them. &lt;br&gt;
PostgreSQL supports three join operations, which are as follows:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Nested Loop Join&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The most fundamental join operation, PostgreSQL supports five variations of it. &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Nested Loop Join: It has zero start-up cost, and a run cost proportional to the sizes of the outer and inner tables: ‘run cost’ = (cpu_operator_cost + cpu_tuple_cost) × N_outer × N_inner + C_inner × N_outer + C_outer, where C_outer and C_inner are the scan costs of the outer and inner tables.
&lt;/li&gt;
&lt;li&gt; Materialized Loop Join: To optimize the scanning cost of the inner table in a nested loop join, PostgreSQL supports materialized nested loop join. Before execution, the inner table tuples are either stored in work_mem or a temporary file, allowing for more efficient processing during the join operation.&lt;/li&gt;
&lt;li&gt; Indexed Nested Loop Join: When an index exists on the inner table and can be used to lookup tuples that satisfy the join condition, PostgreSQL employs the indexed nested loop join. It performs efficiently by processing the join operation based on a single loop of the outer table. The index scan path is used to look up inner tuples based on the outer table's attribute value, known as a parameterized index path.&lt;/li&gt;
&lt;li&gt; Other Variations: Using an index on the outer table, particularly when its attributes are involved in the join condition or act as access predicates, can significantly reduce the cost of nested loop joins by narrowing the search range.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Merge Join&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Merge join can only be used in natural joins and equi-join and is estimated by initial_cost_mergejoin() and final_cost_mergejoin() functions. It has four variations:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Merge Join: If all tuples can be stored in memory, the sorting operations are carried out in memory itself; otherwise temporary files are used. &lt;/li&gt;
&lt;li&gt; Materialized Merge Join: As in the nested loop join, PostgreSQL supports the materialized merge join, materializing the inner table to make the inner table scan more efficient.&lt;/li&gt;
&lt;li&gt; Other Variations: Similar to the nested loop join, the merge join also has variations in which the index scanning of the outer table can be carried out.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Hash Join&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The hash join in PostgreSQL is limited to natural joins and equi-joins. It operates differently based on table sizes, using a two-phase in-memory hash join for small inner tables (≤25% of work_mem) and a hybrid hash join with skew method for larger tables. Cost estimation is complex but generally assumes no conflicts, with start-up and run costs dependent on the sizes of the outer and inner tables.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; In-Memory Hash Join: The in-memory hash join in PostgreSQL operates within the work_mem using a hash table divided into batches or buckets, determined by ExecChooseHashTableSize(). It consists of a build phase, where inner table tuples are inserted into the batch, and a probe phase, where each outer table tuple is compared with the inner tuples in the batch for join conditions. This method improves join performance&lt;/li&gt;
&lt;li&gt; Hybrid Hash Join With Skew: When the inner table's tuples cannot fit into a single batch in work_mem, PostgreSQL utilizes the hybrid hash join with the skew algorithm. It involves multiple batches, some stored in work_mem and others in temporary files, to accommodate and process the tuples efficiently.&lt;/li&gt;
&lt;li&gt; Index Scans in Hash Join: Hash join in PostgreSQL uses index scans if possible.&lt;/li&gt;
&lt;/ol&gt;
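&lt;p&gt;The build and probe phases of the in-memory hash join can be sketched as follows (a single batch with no skew handling; the data is invented):&lt;/p&gt;

```python
# Two-phase in-memory hash join: build a hash table over the (smaller)
# inner table, then probe it with each outer tuple.

def hash_join(outer, inner, key=lambda row: row[0]):
    # build phase: hash every inner tuple on the join key
    table = {}
    for row in inner:
        table.setdefault(key(row), []).append(row)
    # probe phase: look up each outer tuple's key in the hash table
    result = []
    for o in outer:
        for i in table.get(key(o), []):
            result.append(o + i)
    return result

outer = [(1, "x"), (2, "y"), (2, "z")]
inner = [(2, "inner2"), (3, "inner3")]
print(hash_join(outer, inner))   # [(2, 'y', 2, 'inner2'), (2, 'z', 2, 'inner2')]
```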

</description>
      <category>postgres</category>
      <category>postgressql</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Internals of PostgreSQL - Chapter 2</title>
      <dc:creator>m-hashir</dc:creator>
      <pubDate>Tue, 27 Jun 2023 17:10:53 +0000</pubDate>
      <link>https://dev.to/mhashir/internals-of-postgresql-chapter-2-a6k</link>
      <guid>https://dev.to/mhashir/internals-of-postgresql-chapter-2-a6k</guid>
      <description>&lt;p&gt;This chapter focuses on the process and memory architecture of PostgreSQL. &lt;br&gt;
PostgreSQL is a &lt;strong&gt;client/server based RDBMS&lt;/strong&gt;, with a multi-process architecture running on a single server. Some of these processes are as follows:&lt;br&gt;
&lt;strong&gt;1.    Postgres Server Process:&lt;/strong&gt; The parent of all processes, it allocates the shared memory area, starts the background processes, and waits for client requests. Upon receiving one, it starts a backend process. &lt;br&gt;
&lt;strong&gt;2.    Backend Process:&lt;/strong&gt; Handles all the queries of one connected client, communicating over a single TCP connection; multiple simultaneous clients are served by starting one backend process per connection. &lt;br&gt;
&lt;strong&gt;3.    Background Processes:&lt;/strong&gt; Carry out a variety of tasks for specific features and internals. &lt;br&gt;
The memory architecture of PostgreSQL can be classified into two types: &lt;br&gt;
&lt;strong&gt;1.    Local Memory Area:&lt;/strong&gt; Each backend process allocates some memory for query processing. Examples are work_mem, maintenance_work_mem and temp_buffers.&lt;br&gt;
&lt;strong&gt;2.    Shared Memory Area:&lt;/strong&gt; Allocated by a server when it starts up. Examples are the shared buffer pool, the WAL buffer and the commit log.&lt;br&gt;
In addition to these, PostgreSQL allocates memory for tasks such as control mechanisms, various background processes and transaction processing. &lt;/p&gt;

</description>
      <category>postgres</category>
      <category>postgressql</category>
      <category>opensource</category>
    </item>
    <item>
      <title>The Internals of PostgreSQL - Chapter 1</title>
      <dc:creator>m-hashir</dc:creator>
      <pubDate>Tue, 27 Jun 2023 17:06:39 +0000</pubDate>
      <link>https://dev.to/mhashir/the-internals-of-postgresql-chapter-1-c80</link>
      <guid>https://dev.to/mhashir/the-internals-of-postgresql-chapter-1-c80</guid>
      <description>&lt;p&gt;This chapter describes the logical and physical structure of a database cluster, the internal layout of the heap table file, and the methods of writing and reading data to a table in PostgreSQL. &lt;br&gt;
PostgreSQL runs as a single database cluster that manages a collection of databases; it does not run a group of database servers, as a PostgreSQL instance runs on a single host. Logically, a database is a collection of &lt;strong&gt;database objects&lt;/strong&gt;, which are data structures that store or reference data; tables and views are examples, and in PostgreSQL even the databases themselves are database objects. These objects are managed by their object identifiers (OIDs), unsigned 4-byte integers whose relations to the objects are stored in system catalogs.&lt;br&gt;
In regard to physical structure, the cluster is basically one directory, known as the &lt;strong&gt;base directory&lt;/strong&gt;, whose path is usually set by the environment variable PGDATA. A database is a subdirectory under the &lt;strong&gt;base&lt;/strong&gt; subdirectory, and each database subdirectory stores the files of its tables and indexes. The base directory also contains other subdirectories for data and configuration files. Tablespaces in PostgreSQL differ from those in conventional RDBMSs: a tablespace is a directory that contains data outside of the base directory.&lt;br&gt;&lt;br&gt;
The layout of the complete cluster is a set of subdirectories, in which the &lt;strong&gt;base&lt;/strong&gt; subdirectory contains the database subdirectories, and the cluster directory also contains subdirectories for configuration and other metadata. The databases, which are subdirectories of the base subdirectory, are named after their OIDs. The files for tables and indexes are managed by their OIDs, while the data files themselves are managed by relfilenode; the two values can be identical, but commands such as TRUNCATE, CLUSTER and REINDEX can change the relfilenode value. The database subdirectories also contain files for the &lt;strong&gt;free space map&lt;/strong&gt; and &lt;strong&gt;visibility map&lt;/strong&gt;, referred to as the 1st and 2nd forks of the data files. Tablespaces reside outside of the base directory; a tablespace directory is addressed by a symbolic link whose name is the same as that of the tablespace. &lt;br&gt;
Data files are divided into pages of a fixed length of 8 KB, numbered starting from 0, with new pages appended when the current one is full. A page holds three kinds of data: &lt;strong&gt;heap tuples&lt;/strong&gt;, the records themselves; &lt;strong&gt;line pointers&lt;/strong&gt;, which each hold a pointer to a heap tuple; and &lt;strong&gt;header data&lt;/strong&gt;, which contains general information about the page. &lt;br&gt;
When data is written to a table with one page and one heap tuple, pd_lower points to the first line pointer, and both the line pointer and pd_upper point to the first tuple. When a second tuple is inserted, a second line pointer is added after the first, pd_lower is updated to the end of the line-pointer array, pd_upper to the second tuple, and other header data is adjusted accordingly. Data is read in two manners: &lt;strong&gt;sequential scan&lt;/strong&gt;, where all tuples in all pages are read via the line pointers in each page, and &lt;strong&gt;B-tree index scan&lt;/strong&gt;, where an index file contains index tuples consisting of an index key and a TID pointing to the target heap tuple. When the index tuple with the desired key is found, the corresponding TID is used to retrieve the desired heap tuple directly, eliminating the need to scan additional pages.&lt;/p&gt;
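&lt;p&gt;The pd_lower/pd_upper movement described above can be modelled with a toy page (line pointers grow from the front while tuples are stacked from the back; the constants are illustrative, not PostgreSQL's exact header layout):&lt;/p&gt;

```python
# Toy model of an 8 KB heap page: pd_lower marks the end of the
# line-pointer array, pd_upper the start of the tuple area.

PAGE_SIZE, HEADER_SIZE, LP_SIZE = 8192, 24, 4

class Page:
    def __init__(self):
        self.pd_lower = HEADER_SIZE     # end of line-pointer array
        self.pd_upper = PAGE_SIZE       # start of tuple area
        self.line_pointers = []

    def insert(self, tuple_len):
        self.pd_upper -= tuple_len      # tuple placed at the back of the page
        self.line_pointers.append(self.pd_upper)
        self.pd_lower += LP_SIZE        # one more line pointer at the front
        return len(self.line_pointers)  # the tuple's offset number

page = Page()
page.insert(40)                         # first tuple
page.insert(40)                         # second tuple
print(page.pd_lower, page.pd_upper)     # 32 8112
```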

</description>
      <category>postgressql</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
