Kelson V

Moving Data from SQL Server to Postgres with NiFi

Learn how to build a containerized ETL pipeline using Apache NiFi to move data from SQL Server to PostgreSQL, featuring fixes for JDBC driver setup and timestamp type mismatches.

I recently set up a pipeline to move sales records from SQL Server to PostgreSQL using NiFi and Docker.

It’s a standard task, but there are a few manual steps and annoying errors that I had to fix. Here is exactly how I did it.

Why move the data?

Basically, you don't want to run heavy analytics on your main SQL Server while people are trying to use it. Moving the data to a separate Postgres database keeps things fast for everyone.

The Setup

I ran everything in Docker so I didn't have to install anything directly on my machine. Three containers did the job (a sketch of the startup commands follows the list):

  • SQL Server: Where the data starts.
  • Postgres: Where the data ends up.
  • NiFi: The tool that moves the data.
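
For reference, here's a minimal sketch of how the three containers could be started. The image tags, passwords, database container names (mssql-source, postgres-warehouse), and the shared network are my placeholders; nifi-orchestrator matches the container name used in the docker cp commands later.

# shared network so NiFi can reach both databases by container name
docker network create etl-net
docker run -d --name mssql-source --network etl-net -p 1433:1433 -e "ACCEPT_EULA=Y" -e "MSSQL_SA_PASSWORD=YourStrong!Passw0rd" mcr.microsoft.com/mssql/server:2022-latest
docker run -d --name postgres-warehouse --network etl-net -p 5432:5432 -e POSTGRES_PASSWORD=postgres postgres:16
docker run -d --name nifi-orchestrator --network etl-net -p 8443:8443 apache/nifi:latest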

Step 1: Getting the Drivers Ready

NiFi is a Java app and doesn't know how to talk to these databases by default. You have to manually give it the JDBC driver files (the .jar files).

Download the drivers from their official pages:

  • Microsoft JDBC Driver for SQL Server: https://learn.microsoft.com/sql/connect/jdbc/download-microsoft-jdbc-driver-for-sql-server
  • PostgreSQL JDBC Driver: https://jdbc.postgresql.org/

Since everything is in Docker, I used these commands to "push" the drivers into the NiFi container:

docker cp C:\drivers\mssql-jdbc-13.2.1.jre11.jar nifi-orchestrator:/opt/nifi/nifi-current/lib/
docker cp C:\drivers\postgresql-42.7.9.jar nifi-orchestrator:/opt/nifi/nifi-current/lib/

Note: You have to restart the NiFi container after this or it won't see the drivers.
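
A restart plus a quick listing confirms the jars actually landed (nifi-orchestrator is the container name from the commands above):

docker restart nifi-orchestrator
docker exec nifi-orchestrator ls /opt/nifi/nifi-current/lib/ | grep -E 'mssql|postgresql'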

Step 2: Setting up the NiFi Flow

I used three processors (the boxes you drag onto the NiFi canvas) to get this done:

  • GenerateTableFetch: This checks SQL Server for new rows. I set the Maximum-value Columns property to id. This is important because it tells NiFi: "Only grab rows with an ID higher than the last one we moved." No duplicates.

  • ExecuteSQL: This runs the queries GenerateTableFetch produces and pulls the rows out as Avro records.

  • PutDatabaseRecord: This sends the data into the Postgres table (stg_sales).
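
Both database-facing processors point at a DBCPConnectionPool controller service, one per database. Here's a rough sketch of the two services, assuming the placeholder container and database names from the setup above (the driver class names are the standard ones for these two drivers):

DBCPConnectionPool (SQL Server source):
  Database Connection URL:    jdbc:sqlserver://mssql-source:1433;databaseName=sales;encrypt=false
  Database Driver Class Name: com.microsoft.sqlserver.jdbc.SQLServerDriver

DBCPConnectionPool (Postgres target):
  Database Connection URL:    jdbc:postgresql://postgres-warehouse:5432/warehouse
  Database Driver Class Name: org.postgresql.Driver

Recent versions of the Microsoft driver default to encrypted connections, so you need encrypt=false (or a proper certificate) when talking to a plain dev container.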

Step 3: Fixing the Timestamp Error

This was the only real problem I hit. By default, ExecuteSQL writes dates and timestamps into its Avro output as plain strings, so PutDatabaseRecord ends up trying to insert text into a timestamp column and Postgres rejects it with a "... is of type timestamp but expression is of type character varying" error.

The Fix: Go to the ExecuteSQL properties and set Use Avro Logical Types to true. Timestamps then travel through the flow as real timestamp values (Avro logical types) instead of strings, and Postgres accepts them.
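
To make the mismatch concrete, here's a minimal sketch of a staging table that would trigger it; the container, database, and column names are made up for illustration:

docker exec postgres-warehouse psql -U postgres -d warehouse -c "CREATE TABLE IF NOT EXISTS stg_sales (id integer PRIMARY KEY, sale_ts timestamp NOT NULL, amount numeric(10,2));"

Without the logical-types setting, every insert into sale_ts fails; with it, the rows go straight in.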

Step 4: How to Reset for Testing

If you want to test the flow again with the same data, NiFi won't move it because it "remembers" the last ID.

  • Stop the GenerateTableFetch processor (that's where the last-seen ID is stored).
  • Right-click → View State.
  • Click Clear State.

Also, don't leave the Run Schedule at 0 sec. That means "run as fast as possible", which hammers both your CPU and the source database. I set mine to 10 sec.

Conclusion

Once everything was green, I checked DBeaver and the data was all there. It’s a simple setup once you get the drivers and the timestamp setting right.
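
If you'd rather check from the terminal than DBeaver, a quick count does the same job (again using the placeholder container and database names):

docker exec postgres-warehouse psql -U postgres -d warehouse -c 'SELECT count(*) FROM stg_sales;'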
