What is Partitioning?
The process of dividing the tables into smaller manageable parts is called partitioning, and these smaller manageable parts are called partitions
.
In the process of partitioning, we divide one logical big table into multiple physical smaller parts.
Why do we partition?
Before actually creating the partitions, let’s understand why we need partitions. The first and most demanding reason to use partitions in a database is to increase the performance of the database. This is achieved by partition-wise joins
; if a user’s queries perform a lot of full-table scans, partitioning will help vastly, because partitions will limit the scope of this search. The second important reason to partition is ease of managing large tables.
Partitioning always helps manage large objects. Although it is applicable to objects of any size, the advantages are more apparent in large tables. When a user recreates an index on a nonpartitioned table, the only option is to build the entire index in one statement. On the other hand, if the table is partitioned, the user can rebuild partitions of the local indexes one at a time.
Creating a table partition
PostgreSQL supports table partitioning through table inheritance, which means every partition will be created as a child table of a single parent table. Partitioning is performed in such a way that every child table inherits a single parent table. The parent table will be empty; it exists just to describe the whole dataset. Currently in PostgreSQL, partitioning can be implemented in range partitioning
or list partitioning
.
Steps to create partition in Postgrsql
- Createthemastertable 2.Create multiple child tables without having an overlapped table constraint
- Create indexes
- Create a trigger function to insert data into childtables
- Enable the constraint exclusion
We will focus on creating a
range partition
in this article.
Range partition
The range partition is the partition in which we partition a table into ranges defined by a single column or multiple columns. When defining the ranges, the user will have to take care that ranges should be connected and not overlap with each other; moreover, ranges must be defined using the < value operator. For instance, one can partition by date ranges or ranges of identifiers for specific business objects.
While working with Range Partitioning
, it is important to note the following:
- In PostgreSQL, range partitioning is a method of dividing a table into multiple partitions based on a specified range of values in one or more columns.
- Each partition holds a subset of the data based on the range values, making it an efficient way to manage and query large datasets.
- Range partitioning is particularly useful when your data can be logically divided into non-overlapping ranges, such as date ranges or numerical ranges.
Creating the master table
So, let’s start by creating a simple master table. This table will contain data on sales stored on a daily basis. This will be done in the following manner:
CREATE TABLE sales_record
(
id NUMERIC PRIMARY KEY,
sales_amount NUMERIC,
sales_date DATE NOT NULL DEFAULT CURRENT_DATE
);
This is the parent table; all the records that a user inserts in this table will move to the child table based on the criteria of sales_date that we are going to create in the following section.
Creating a range partition table
To implement the partition, we will create child tables. All child tables will inherit the master table. We will add the CHECK
constraint for dates because we want to make sure that we have only the correct data on each partition. Partitions starts from the date 2014-01-01
and end on the date 2014-12-31
. Each partition will have two months data.
CREATE TABLE sales_record_m1_to_m2
(
PRIMARY KEY (id, sales_date),
CHECK (sales_date >= DATE '2014-01-01'
AND sales_date < DATE '2014-03-01')
)
INHERITS (sales_record);
This child table will contain data of January and February.
Now, create the sales_record_m3_to_m4
child table as follows:
CREATE TABLE sales_record_m3_to_m4
(
PRIMARY KEY (id, sales_date),
CHECK (sales_date >= DATE '2014-03-01'
AND sales_date < DATE '2014-05-01')
)
INHERITS (sales_record);
This child table will contain data of March and April.
Create the sales_record_m5_to_m6
child table as follows:
CREATE TABLE sales_record_m5_to_m6
(
PRIMARY KEY (id, sales_date),
CHECK (sales_date >= DATE '2014-05-01'
AND sales_date < DATE '2014-07-01')
)
INHERITS (sales_record);
This child table will contain data of May and June.
Create the sales_record_m7_to_m8
child table as follows:
CREATE TABLE sales_record_m7_to_m8
(
PRIMARY KEY (id, sales_date),
CHECK (sales_date >= DATE '2014-07-01'
AND sales_date < DATE '2014-09-01')
)
INHERITS (sales_record);
This child table will contain data of July and August.
CREATE TABLE sales_record_m9_to_m10
(
PRIMARY KEY (id, sales_date),
CHECK (sales_date >= DATE '2014-09-01'
AND sales_date < DATE '2014-11-01')
)
INHERITS (sales_record);
This child table will contain data of September and October.
Now, create the sales_record_m11_to_m12
child table as follows:
CREATE TABLE sales_record_m11_to_m12
(
PRIMARY KEY (id, sales_date),
CHECK (sales_date >= DATE '2014-11-01'
AND sales_date < DATE '2015-01-01')
)
INHERITS (sales_record);
This child table will contain data of November and December.
You can verify that the tables are linked and the partition is successfully created using the following query:
DB_name=# \d+ sales_record
In the next article, we'll dive into operations we can perform in a partitioned table.
Top comments (0)