DEV Community

Cover image for Introduction to Foreign Data Wrappers (FDW)
David George
David George

Posted on

Introduction to Foreign Data Wrappers (FDW)

back in 2003,
a feature called "SQL Management of External Data" added to the SQL standard.
this specification aims to access and work with remote data ( on remote servers ).


Before the SQL/MED

  • there was no standard way of accessing data that was stored and managed externally outside the database management system (DBMS), from within an SQL query.
  • those data are external data sources, such as files, web services, or even other databases.
  • since the standard SQL doesn't support/provide these functionalities, each DBMS vendor had made their custom methods to access external data, which makes a problem that , it becomes difficult to write portable applications that could work with different databases.

SQL Management of External Data

  • abbreviated as "SQL/MED"
  • provides a unified way of accessing and managing data that is stored outside of the database management system (DBMS).
  • a table on a remote server is known as foreign table
  • SQL/MED provides a set of SQL extensions that allow the SQL quires to access and manipulate data from external sources as if it were part of a regular table in the database.
  • Benefits :
    • provides a standardized way of accessing external data ( that resolves one of the main problems and allows easy portable applications development that can work with different databases ).
    • write SQL queries that work across different DBMS vendors without having to rewrite the query for each vendor's proprietary interface.
    • allows external data sources to be integrated into an SQL query, making it easier to combine data from multiple sources and perform complex data transformations.

for PostgreSQL, this SQL/MED feature was realized and developed from version 9.1. ( first release September 12, 2011 )

Foreign Data Wrappers (FDW)

  • FDW is the use of SQL/MED to manage the foreign tables.
  • Foreign tables are defined as tables on the remote server.
  • After installing the necessary extension and making the appropriate settings, you can access the foreign tables on the remote servers just by using FROM table_name in your SQL query.
  • Also you can do cross-database operations as doing a Join operation on two foreign tables comes from two different remote servers ( external sources )
  • since the external data sources can be any of other databases, SQL or NoSQL, PostgreSQL or not, not even a database as flat files or web services and more, There are many types of FDW where each works with different types of data sources, either built-in or Third-party FDW.

some built-in FDW in PostgreSQL :

  • postgres_fdw :

    • This FDW allows you to access tables in other PostgreSQL databases, which can be useful for integrating data from multiple PostgreSQL databases ( cross-database queries for example ) or performing distributed queries.
  • mysql_fdw:

    • This FDW allows you to access tables in MySQL databases, which can be useful for integrating data from MySQL databases into your PostgreSQL database.
  • file_fdw:

    • This FDW allows you to access data in flat files, such as CSV or TSV files, as if they were tables in your PostgreSQL database.
  • dblink:

    • This FDW allows you to execute SQL queries on remote PostgreSQL databases and fetch the results as if they were local tables in your PostgreSQL database.
  • ogr_fdw:

    • This FDW allows you to access geospatial data in various formats, such as ESRI shapefiles or GeoJSON files, and make them appear as tables in your PostgreSQL database.

some Third-party FDW :

  • postgresql_fdw :

    • This FDW is a fork of the built-in postgres_fdw and provides additional functionality, such as support for PostgreSQL extensions and the ability to execute remote functions.
  • oracle_fdw:

    • This FDW allows you to access tables in Oracle databases and make them appear as tables in your PostgreSQL database.

let's take a real-life use case to explain what FDW introduces,

suppose having two PostgreSQL databases,
database_1: "customer" ( Local database, managed by Pg local server )
database_2: "order" ( remote database, managed by another remote server )

for the "Customer" database,

  • it's a PostgreSQL database
  • contains customers table, that have info/records about different customers
  • customers table columns are "customer_id", "first_name", and "last_name"

for the "Order" database,

  • it's a PostgreSQL database
  • contains orders table, that have info/records about different orders.
  • orders columns are "order_id", "customer_id", and "order_date",

what do we want to do?
we want to get a report with info from both databases, about
each customer's first name, last name, and total number of orders this customer ordered.

Before FDW ,

  • we need to export the data from both databases, combine data using the separate program and then perform operations of join and aggregation as we wish, which is a time-consuming and error-prone process especially if we work with large-size data.

After FDW ,

  • there is no need to combine the data from both databases to work with, the FDW feature we are discussing eliminates the need to move data between databases, and allows us to perform joins and aggregations on remote data as if it were local.
  • With FDW, you can create a mapping between the "customers" table in the customer database and the "orders" table in the order database, and then use a single SQL query to perform the join and aggregation
  • for our case, we create a foreign table that maps to the "orders" table of the order database ( remote server/database).

Top comments (0)