DEV Community

HRmemon
HRmemon

Posted on

Exploring PostgreSQL's Foreign Data Wrappers: A Comprehensive Guide

PostgreSQL is an open-source relational database management system that has been developed since 1986, and one of its most interesting features is Foreign Data Wrappers (FDW). This feature allows users to access and manage remote data sources as if they were native to their PostgreSQL database, which can simplify data management and improve query performance. In this blog post, we will explore FDWs in PostgreSQL and provide a comprehensive guide on their architecture, extensions, and how to use them to manage data from different sources. We will also discuss the benefits of using FDWs and provide examples of how they can be used to integrate data from various external sources.

TLDR:

Foreign Data Wrappers (FDW) and Parallel Query are two practical features in PostgreSQL that allow users to work with remote data and optimize query performance. FDWs are PostgreSQL extensions that allow users to define how to communicate with external data sources, such as different databases or file systems, and create foreign tables that are linked to the external data source. PostgreSQL's FDW is highly flexible and extensible, and comes with several built-in FDWs that support different data sources. FDWs are a powerful tool for integrating data from different sources into a single PostgreSQL database.
FDW's architecture is based on the SQL/MED specification and allows users to manage foreign tables similarly to local tables. The FDW feature requires the installation of the appropriate extension and setup commands to create foreign servers and foreign tables. Overall, FDWs and Parallel Query are valuable tools for managing data in PostgreSQL and can greatly expand the capabilities of your database.

What are Foreign Data Wrappers (FDW)?

Foreign Data Wrappers, or FDWs, are a feature in PostgreSQL that allow you to access data stored in other databases or file systems as if it were native to your PostgreSQL database. This means you can query and manipulate data from different sources within a single PostgreSQL database, without having to transfer the data between databases or systems.
FDWs are essentially PostgreSQL extensions that provide a way to define how to communicate with an external data source, such as a different database or a file system. Once an FDW is installed and configured, you can create foreign tables in your PostgreSQL database that are connected to the external data source. These foreign tables can then be queried and updated just like regular tables in your PostgreSQL database.
FDWs are a powerful tool for integrating data from different sources into a single PostgreSQL database, which can simplify data management and improve query performance. With FDWs, you can leverage the strengths of multiple data storage technologies without having to migrate data between them.
Overview of PostgreSQL's FDW
PostgreSQL's FDW feature is highly flexible and extensible, which makes it easy to create custom FDWs that can communicate with a wide range of external data sources.
PostgreSQL comes with several built-in FDWs that support different data sources, such as:
• postgres_fdw: This FDW allows you to access data from another PostgreSQL database, either on the same server or on a remote server.
• file_fdw: This FDW allows you to access data stored in flat files, such as CSV or TSV files, as if they were tables in your PostgreSQL database.
• oracle_fdw: This FDW allows you to access data from an Oracle database.
• mysql_fdw: This FDW allows you to access data from a MySQL database.
• mongodb_fdw: This FDW allows you to access data from a MongoDB database.
In addition to these built-in FDWs, there are also many third-party FDWs available that can support other data sources, such as NoSQL databases or web services.
To use an FDW, you first need to install and configure it in your PostgreSQL database. This typically involves creating a foreign server object that specifies the connection details for the external data source, and then creating foreign tables that are linked to the foreign server object. Once the FDW is set up, you can query and update the foreign tables just like regular tables in your PostgreSQL database.
Overall, FDWs are a powerful feature in PostgreSQL that can greatly expand the capabilities of your database. With FDWs, you can easily integrate data from different sources and take advantage of the strengths of different data storage technologies, all within a single PostgreSQL database.

FDW's Architecture in PostgreSQL

PostgreSQL is an open-source relational database management system (RDBMS) that has been developed since 1986. It supports a wide range of features including transaction control, high availability, and extensibility. One of the most interesting features is the ability to manage remote data, called Foreign Data Wrappers (FDW), which is used to access data from different servers. This section will discuss the architecture of FDW in PostgreSQL.
SQL/MED Specification
In 2003, the SQL Management of External Data (SQL/MED) specification was added to the SQL standard. PostgreSQL has been developing this feature since version 9.1 to realize a portion of SQL/MED. In SQL/MED, a table on a remote server is called a foreign table.
Foreign Data Wrappers
PostgreSQL's FDW uses SQL/MED to manage foreign tables which are similar to local tables. After installing the necessary extension and making the appropriate settings, you can access the foreign tables on the remote servers.
For example, suppose there are two remote servers, postgresql and mysql, which have foreign_pg_tbl table and foreign_my_tbl table, respectively. In this example, you can access the foreign tables from the local server by issuing the SELECT queries.
sqlCopy code
localdb=# SELECT count() FROM foreign_pg_tbl; count ------- 20000 localdb=# SELECT count() FROM foreign_my_tbl; count ------- 10000
Moreover, you can execute the join operation with the foreign tables stored in different servers which are similar to the local tables.
sqlCopy code
localdb=# SELECT count(*) FROM foreign_pg_tbl AS p, foreign_my_tbl AS m WHERE p.id = m.id; count ------- 10000
Many FDW extensions have been developed and listed in Postgres wiki. However, almost all extensions are not properly maintained except for postgres_fdw, which is officially developed and maintained by the PostgreSQL Global Development Group as an extension to access a remote PostgreSQL server.

FDW Architecture

To use the FDW feature, you need to install the appropriate extension and execute setup commands, such as CREATE FOREIGN TABLE, CREATE SERVER and CREATE USER MAPPING. After providing the appropriate setting, the functions defined in the extension are invoked during query processing to access the foreign tables.
The architecture of FDW in PostgreSQL can be described as follows:

  1. The analyzer/analyser creates the query tree of the input SQL using the definitions of the foreign tables, which are stored in the pg_catalog.pg_class and pg_catalog.pg_foreign_table catalogs using the CREATE FOREIGN TABLE or IMPORT FOREIGN SCHEMA command.
  2. The planner (or executor) connects to the remote server using the specific library to connect to the remote database server. For example, to connect to the remote PostgreSQL server, postgres_fdw uses the libpq. To connect to the MySQL server, mysql_fdw uses the libmysqlclient.
  3. If the use_remote_estimate option is on (the default is off), the planner executes EXPLAIN commands for estimating the cost of each plan path.
  4. The planner creates the plain text SQL statement from the plan tree which is internally called deparsing.
  5. The executor sends the plain text SQL statement to the remote server and receives the result. The executor then processes the received data if necessary. For example, if the multi-table query is executed, the executor performs the join processing of the received data and other tables. The details of each processing are described in the following sections.

How FDW Works in PostgreSQL:

Foreign Data Wrappers (FDW) in PostgreSQL allow you to manage foreign tables that are similar to local tables using SQL/MED, a specification to access remote data. This feature has been available in PostgreSQL since version 9.1. After installing the necessary extension and making appropriate settings, you can access foreign tables on remote servers. With FDW, you can also execute join operations with foreign tables stored on different servers, similar to local tables. PostgreSQL provides a few FDW extensions; among them, only postgres_fdw is officially developed and maintained by the PostgreSQL Global Development Group.

To use the FDW feature, you need to install the appropriate extension and execute setup commands, such as CREATE FOREIGN TABLE, CREATE SERVER, and CREATE USER MAPPING. Functions defined in the extension are invoked during query processing to access foreign tables. The planner (or executor) connects to the remote server, uses the specific library to connect to the remote database server, and creates a plan tree using EXPLAIN commands. The planner queries the cost of plans to the remote server by executing the EXPLAIN command.

However, some FDW extensions use only the embedded constant values. The executor sends the plain text SQL statement to the remote server and receives the result. The executor processes the received data if necessary, such as performing join processing of the received data and other tables if a multi-table query is executed. PostgreSQL's FDW is described in detail in the following sections, including an overview, how the postgres_fdw extension works, and creating a query tree, connecting to the remote server, and creating a plan tree using EXPLAIN commands.

Foreign Data Wrappers (FDW) Extension in PostgreSQL

PostgreSQL is an open-source object-relational database system known for its advanced features and extensibility. One of the interesting features of PostgreSQL is Foreign Data Wrappers (FDW), which enables users to access and manage remote data as if it were a part of the local database. In this article, we will discuss the FDW extension in PostgreSQL, its basic concept, and how it works.

Foreign Data Wrappers (FDW)

Foreign Data Wrappers (FDW) is a PostgreSQL extension that uses the SQL Management of External Data (SQL/MED) specification to manage foreign tables that are similar to local tables. With FDW, you can access foreign tables on remote servers and execute join operations with tables stored on different servers.

The FDW feature has been developing by PostgreSQL since version 9.1. To use the FDW feature, you need to install the appropriate extension and execute setup commands, such as CREATE FOREIGN TABLE, CREATE SERVER, and CREATE USER MAPPING. After providing the appropriate settings, the functions defined in the extension are invoked during query processing to access the foreign tables.

Basic Concept of FDW

After installing the necessary extension and making the appropriate settings, you can access the foreign tables on the remote servers. For example, suppose there are two remote servers, postgresql and mysql, which have foreign_pg_tbl table and foreign_my_tbl table, respectively. In this example, you can access the foreign tables from the local server by issuing the SELECT queries as shown below.

localdb=# -- foreign_pg_tbl is on the remote postgresql server.
localdb-# SELECT count(*) FROM foreign_pg_tbl;
 count 
-------
 20000

localdb=# -- foreign_my_tbl is on the remote mysql server.
localdb-# SELECT count(*) FROM foreign_my_tbl;
 count 
-------
 10000
Enter fullscreen mode Exit fullscreen mode

Overview of FDW in PostgreSQL

To use the FDW feature in PostgreSQL, you need to install the appropriate extension and execute setup commands, such as CREATE FOREIGN TABLE, CREATE SERVER, and CREATE USER MAPPING. After providing the appropriate settings, the functions defined in the extension are invoked during query processing to access the foreign tables.

The analyzer/analyser creates the query tree of the input SQL using the definitions of the foreign tables, which are stored in the pg_catalog.pg_class and pg_catalog.pg_foreign_table catalogs using the CREATE FOREIGN TABLE or IMPORT FOREIGN SCHEMA command. To connect to the remote server, the planner (or executor) uses the specific library to connect to the remote database server. The connection parameters, such as username, server's IP address, and port number, are stored in the pg_catalog.pg_user_mapping and pg_catalog.pg_foreign_server catalogs using the CREATE USER MAPPING and CREATE SERVER commands.

PostgreSQL's FDW supports the feature to obtain statistics of the foreign tables to estimate the plan tree of a query, which are used by some FDW extensions, such as postgres_fdw, mysql_fdw, tds_fdw, and jdbc2_fdw. If the use_remote_estimate option is set to on using the ALTER SERVER command, the planner queries the cost of plans to the remote server by executing the EXPLAIN command; otherwise, the embedded constant values are used by default.

Conclusion

FDW is a powerful feature in PostgreSQL that allows users to access and manage remote data as if it were a part of the local database. With FDW, you can execute join operations with tables stored on different servers. PostgreSQL's FDW extension is officially developed and maintained by the PostgreSQL Global Development Group. In this article, we have discussed the basic concept of FDW, how it works, and its overview in PostgreSQL.

Top comments (0)