DEV Community

Cover image for Foreign Data Wrappers and Parallel Query Chapter 4
Fatema Samir
Fatema Samir

Posted on

Foreign Data Wrappers and Parallel Query Chapter 4

Introduction:

Welcome to Chapter 4 of our exploration into the depths of PostgreSQL. PostgreSQL, renowned for its comprehensive SQL support and extensibility, continues to impress with its advanced features. In Chapter 4 of the official PostgreSQL documentation, two notable features take center stage: Foreign Data Wrappers (FDW) and Parallel Query. These powerful functionalities enhance the flexibility, performance, and integration capabilities of your PostgreSQL database. In this blog post, we will delve into a concise summary of one of these features, Foreign Data Wrappers (FDW).

4.1 Foreign Data Wrappers (FDW):

Foreign Data Wrappers (FDW) in PostgreSQL refer to a feature that allows seamless access to remote data sources. Introduced in PostgreSQL version 9.1, FDW is an implementation of the SQL Management of External Data (SQL/MED) specification. This powerful capability enables PostgreSQL to interact with data sources outside its native environment, including other databases, web services, or flat files.

The PostgreSQL documentation describes FDW in two sections: an overview (4.1.1) and the specifics of the postgres_fdw extension (4.1.2).

4.1.1 Overview:
To leverage the power of Foreign Data Wrappers (FDW) in PostgreSQL, you must first install the relevant extension and execute a series of setup commands. These commands, including CREATE FOREIGN TABLE, CREATE SERVER, and CREATE USER MAPPING, allow you to configure the necessary settings for accessing foreign tables seamlessly.

Image description

Once the initial setup is complete, FDW utilizes the functions defined in the extension to facilitate query processing and interact with foreign tables. This process involves several steps:

- Creating a Query Tree:
The analyzer/analyser takes the input SQL and creates a query tree. This tree incorporates the definitions of foreign tables stored in the pg_catalog.pg_classand pg_catalog.pg_foreign_table catalogs, typically defined using CREATE FOREIGN TABLE or IMPORT FOREIGN SCHEMA commands.

- Establishing Connection to the Remote Server: The planner (or executor) establishes a connection with the remote server where the foreign tables reside. This connection allows for seamless data retrieval and processing.

- Creating a Plan Tree Using EXPLAIN Commands (Optional):
FDW supports obtaining statistics from foreign tables to estimate the plan tree of a query. Some FDW extensions, including postgres_fdw, mysql_fdw, tds_fdw, and jdbc2_fdw, utilize this feature. If the use_remote_estimate option is enabled (disabled by default), the planner executes EXPLAIN commands to estimate the cost of different plan paths. This helps in selecting the most efficient execution plan.

- Deparsing:
The planner generates a plain text SQL statement from the plan tree's scan paths of the foreign tables. This process, known as deparsing, involves recreating a plain SELECT text from the query tree created during parsing and analysis.

- Sending SQL Statements and Receiving Results:
After deparsing, the executor sends the deparsed SQL statements to the remote server for execution. The executor then receives the result. The method of sending SQL statements varies depending on the specific FDW extension. For example, mysql_fdw sends SQL statements without using a transaction.

By following these steps, FDW in PostgreSQL enables smooth integration and interaction with foreign tables, seamlessly incorporating them into your database environment.

4.1.2 How the postgres_fdw Extension Performs:

The postgres_fdw extension is a special module officially maintained by the PostgreSQL Global Development Group. It is included in the PostgreSQL source code tree.

- Multi-Table Query:
When executing a multi-table query, the postgres_fdw extension fetches each foreign table individually using single-table SELECT statements. The tables are then joined on the local server, providing an efficient and integrated approach to accessing data.

- Sort Operations:
In PostgreSQL versions 9.5 or earlier, sort operations, such as ORDER BY, are processed on the local server. This means that all the relevant rows are fetched from the remote server before the sort operation is applied. The EXPLAIN command can provide insights into the execution plan for queries involving ORDER BY clauses.

- Aggregate Functions:
Similar to sort operations, aggregate functions like AVG() and COUNT() are processed on the local server in PostgreSQL versions 9.6 or earlier.

Conclusion:

Chapter 4 of the PostgreSQL documentation sheds light on two remarkable features: Foreign Data Wrappers (FDW) and Parallel Query. While FDW enables seamless integration with remote data sources, Parallel Query, described in Section 4.2 (under construction), empowers concurrent execution of queries, enhancing performance in multi-core systems.

FDW broadens PostgreSQL's reach, allowing it to interact with data beyond its own boundaries. By installing the relevant extension and leveraging the postgres_fdw module, users can seamlessly access and process data from diverse sources.

Stay tuned for next chapters.

Reference The Internals of PostgreSQL CH 4

Top comments (0)