DEV Community

Cover image for Understanding the Performance of Postgres_fdw Extension in PostgreSQL 4.1.2 (Part 2)
hammadsaleemm
hammadsaleemm

Posted on

Understanding the Performance of Postgres_fdw Extension in PostgreSQL 4.1.2 (Part 2)

In our previous article, we discussed Foreign Data Wrappers (FDW) in PostgreSQL and how they enable PostgreSQL to interact with other data sources. We also introduced the postgres_fdw extension, which is a module maintained by the PostgreSQL Global Development Group that allows users to create foreign tables in PostgreSQL for remote data sources.

In this article, we will dive deeper into the postgres_fdw extension and its performance capabilities in PostgreSQL 4.1.2. We will cover the release notes related to postgres_fdw and how it processes multi-table queries, sort operations, and aggregate functions. Additionally, we will discuss how postgres_fdw can process other Data Manipulation Language (DML) statements, including INSERT, UPDATE, and DELETE.

Release Notes Related to Postgres_fdw

Postgres_fdw has undergone numerous updates over the years. Table shows the release notes related to postgres_fdw as cited from the official document:

Image description

Processing Multi-Table Queries, Sort Operations, and Aggregate Functions

Postgres_fdw can execute queries involving multiple tables by fetching each foreign table with a single-table SELECT statement and then joining them on the local server.

In version 9.5 or earlier, even if the foreign tables were stored on the same remote server, postgres_fdw would fetch them individually and join them. However, in version 9.6 or later, postgres_fdw has been improved to execute the remote join operation on the remote server when the foreign tables are on the same server and the use_remote_estimate option is on.

Let us consider how PostgreSQL processes the following query that joins two foreign tables, tbl_a and tbl_b:

SELECT * FROM tbl_a AS a, tbl_b AS b WHERE a.id = b.id AND a.id < 200;

The result of the EXPLAIN command of the query shows that the executor selects the merge join and is processed as the following steps:

Merge Join (cost=532.31..700.34 rows=10918 width=16)
Merge Cond: (a.id = b.id)
-> Sort (cost=200.59..202.72 rows=853 width=8)
Sort Key: a.id
-> Foreign Scan on tbl_a a (cost=100.00..159.06 rows=853 width=8)
-> Sort (cost=331.72..338.12 rows=2560 width=8)
Sort Key: b.id
-> Foreign Scan on tbl_b b (cost=100.00..186.80 rows=2560 width=8)

The above result shows that the executor fetches the table tbl_a using the foreign table scan and sorts the fetched rows on the local server before carrying out the merge join operation. The executor then fetches the rows of tbl_b using the foreign table scan and sorts them on the local server before continuing with the merge join operation. Finally, the result of the merge join operation is returned as the output of the query.

Sort Operation

When a query involves sorting the result set, postgres_fdw can perform the sort operation on the remote server if possible, thus reducing the amount of data that needs to be transferred to the local server. However, this depends on the planner's decision, which considers the cost of transferring the data versus the cost of performing the sort operation remotely.

Aggregate Functions

Postgres_fdw can also push aggregate functions to the remote server if possible. This means that the remote server will perform the aggregation operation and return the aggregated result set to the local server. However, not all aggregate functions can be pushed down to the remote server, and the planner will make a decision based on the cost of transferring the data versus the cost of performing the aggregation operation remotely.

Conclusion

Postgres_fdw is a powerful tool that allows PostgreSQL to work with data stored on remote servers. The postgres_fdw extension has been gradually improved in recent versions, allowing it to perform more operations on the remote server, such as joins, sorts, and aggregate functions. However, it's important to note that postgres_fdw and the FDW feature do not support the distributed lock manager and the distributed deadlock detection feature, which means that deadlocks can occur in certain situations. Nevertheless, postgres_fdw remains a valuable tool for working with remote data sources in PostgreSQL.

Top comments (0)