In PostgreSQL, when executing a multi-table query using the postgres_fdw extension, each foreign table is fetched individually using single-table SELECT statements, and then the join operation is performed on the local server.
In versions 9.5 or earlier, even if the foreign tables are stored on the same remote server, postgres_fdw fetches them individually and performs the join locally.
However, in versions 9.6 or later, significant improvements have been made to postgres_fdw. It can now execute the remote join operation on the remote server when the foreign tables are on the same server and the use_remote_estimate option is enabled.
Let's explore the execution details for a multi-table query involving two foreign tables, tbl_a and tbl_b:
SELECT * FROM tbl_a AS a, tbl_b AS b WHERE a.id = b.id AND a.id < 200;
The result of the EXPLAIN command for this query is as follows:
EXPLAIN SELECT * FROM tbl_a AS a, tbl_b AS b WHERE a.id = b.id AND a.id < 200;
QUERY PLAN
------------------------------------------------------------------------------
Merge Join (cost=532.31..700.34 rows=10918 width=16)
Merge Cond: (a.id = b.id)
-> Sort (cost=200.59..202.72 rows=853 width=8)
Sort Key: a.id
-> Foreign Scan on tbl_a a (cost=100.00..159.06 rows=853 width=8)
-> Sort (cost=331.72..338.12 rows=2560 width=8)
Sort Key: b.id
-> Foreign Scan on tbl_b b (cost=100.00..186.80 rows=2560 width=8)
(8 rows)
The execution plan reveals that the executor selects a merge join and executes it in the following steps:
- The executor fetches the rows from tbl_a using a foreign table scan (line 8).
- The fetched rows of tbl_a are sorted on the local server (line 6).
- The executor fetches the rows from tbl_b using a foreign table scan (line 11).
- The fetched rows of tbl_b are sorted on the local server (line 9).
- The executor performs a merge join operation on the local server (line 4).
During the execution, the following steps are taken for fetching the rows:
- Start the remote transaction.
- Declare cursor c1 with the SELECT statement:
SELECT id, data FROM public.tbl_a WHERE (id < 200)
. - Execute FETCH commands to obtain the result from cursor c1.
- Declare cursor c2 with the SELECT statement:
SELECT id, data FROM public.tbl_b
. Note that the original WHERE clause of the multi-table query is "tbl_a.id = tbl_b.id AND tbl_a.id < 200." However, postgres_fdw cannot infer the WHERE clause "tbl_b.id < 200" logically. Therefore, the executor executes a SELECT statement without any WHERE clauses and fetches all rows of the foreign table tbl_b. This process is inefficient as unnecessary rows are read from the remote server over the network, and the received rows must be sorted for the merge join operation. - Execute FETCH commands to obtain the result from cursor c2.
- Close cursor c1.
- Close cursor c2.
- Commit the transaction.
When executing multi-table queries with the postgres_fdw extension in PostgreSQL, the foreign tables are fetched individually, and the join operation is performed on the local server. In earlier versions, this approach was followed regardless of whether the foreign tables were on the same remote server. However, in recent versions, postgres_fdw has been improved to execute the remote join operation when the foreign tables are on the same server and the use_remote_estimate option is enabled. These optimisations enhance the performance and efficiency of multi-table query execution in FDW.
Top comments (0)