Migrating or syncing data between PostgreSQL databases is a common task for developers. Whether consolidating user data, archiving old records, or migrating production data to a reporting database, having a reliable and simple method can save hours of headaches.
PostgreSQL’s dblink
extension makes this process straightforward, allowing you to query one database from another as if the two were connected tables.
Before using dblink
, you need to make sure the extension is created in your target database:
CREATE EXTENSION IF NOT EXISTS dblink;
This only needs to be done once per database. After that, you can safely use dblink
to query other PostgreSQL databases.
Example Scenario
Imagine you have:
- Source database:
user_data_archive
- Target database:
user_data_main
- Target table:
users
The users
table has unique constraints on:
username
email
The goal is to copy users from the archive to the main database without violating these constraints.
Using dblink
to Move Data
Here’s a simple example:
INSERT INTO users (username, email, created_at, updated_at)
SELECT DISTINCT
username,
lower(email) AS email,
NOW() AS created_at,
NOW() AS updated_at
FROM dblink(
'dbname=user_data_archive user=app_user password=securepass',
'SELECT username, email FROM archived_users'
) AS t(username text, email text)
ON CONFLICT DO NOTHING;
How This Works:
dblink
allows the target database (user_data_main
) to query the source (user_data_archive
) directly.
DISTINCT
removes duplicate rows from the incoming data.
ON CONFLICT DO NOTHING
ensures that unique constraints on the target table (like unique usernames or emails) do not cause the query to fail.
This keeps the insert process safe while letting you move large amounts of data efficiently.
Why Use This Approach?
- Simple setup: No need to dump and restore SQL files manually.
- Flexible: You can filter, transform, or normalize data as part of the select.
- Reliable: Works even when the source and target databases live on the same server or different servers.
Practical Use Cases:
- Migrating
users
,products
, or content from a legacy database to a new system. - Populating a reporting or analytics database from production tables.
- Syncing staging and production data safely during testing.
dblink
is a powerful tool for querying one PostgreSQL database from another.
Combining DISTINCT
and ON CONFLICT DO NOTHING
allows you to safely insert data without worrying about violating unique constraints.
This approach keeps your workflow clean, efficient, and maintainable, especially for large tables or ongoing migrations.
Top comments (0)