DEV Community


Posted on

MSSQL Server and POSTGRES Installation

Setting up a data engineering environment

When diving into the data engineering world, it is important for the set up to be ready.
To set up a data engineering environment it is advisable to work on Linux Environment.
To start with we started by using linux as a host machine.
Then we installed Virtual machine VMware. Then on the Vmware we installed the downloaded Windows operating system.
Now in order to make communication between host(Ubuntu) and virtual machine(Windows) we need to make configurations and networking on Vmware.You need to enable VNC connections as shown below.
Image description

Installation of MSSQL Server.

In this guide we walk you through the MSSQL Server installation in Windows on VMware.
The first step is to download SQL SERVER 2012. In this case it's the one we are using currently.
You can download it from here or download a later version. Then follow the instructions to install it,
then launch the SSMS(Sql Server Management Studio).The UI should be something like this:

Image description

Installation of Postgres Database on Ubuntu 20.04.

We will now install Postgres in our Ubuntu. To install this follow the installation steps here.

For Postgres configuration steps:

To check for Postgres version
psql --version
It is advisable to check if the Postgres is running using this command before you proceed
sudo systemctl status postgresql
Connection To PostgreSQL running.
sudo -u postgres psql

Installation of AdventureWorks2019 and AdventureWorksDW2019

First to start using the databases we start with installing Adventureworks2019 and DW.
To install follow the steps found here. Once installed we need to configure them in MSSQL Server on a virtual machine using these steps here.
After installation and configuration open SSMS and this should be the UI in SSMS:
Image description

3. Introduction toTalend OpenStudio

It is important to start with low code ETL tools. In this article I will show how to install and get started with Talend. We will use Talend later to Migrate Data from MSSQL Server to Postgres.

To get started we download Talend open Studio for linux here.After downloading you will receive multiple files,run the linux file which has a .sh file and launch talend. This is how it should open:
Image description

Migrating Data from SQL Server to Postgres Using Talend ETL

At the Talend Studio we create a job then create tBDInput for SQL Server and tBDOutput for Postgres. At this point, you can create a connection between the two databases.
Right Click on tBDInput Trigger the On subjob OK and drag it to tBDOutput to make the connection complete.
Ensure to enter correctly the database credentials i.e username, password and database name ,table name ,which hosts the table or database to migrate.
Image description

Image description

After all the configuring and connecting both databases it's time now to start moving our data from Sql server to postgres. We write an Sql query to migrate the data.To start with we start with SQL server by using the following command which gets data from users table:
“select userid, user_name, user_email,convert(varchar, getdate(), 0)run_date_time from users"
This is used for migration and logging using the get_date function.
Now we now have to write an SQL query to insert data into postgres.In the UI we create a table and drop it if it exists then perform the action of insertion to custdatademo table as shown;
Image description
Image description
Image description
Now it's time where we have been waiting for so long and it's time to see our output in Postgres. Let's dive in fast and confirm our hard work.
Image description

As you can see we have migrated the data successfully. Thank you for being with us in this long journey.
This is a complete Extract, Transform and Load(ETL) using Talend.

Nicholas Data Eng Tips: Nicholas
Thanks: Christopher
Respects: Neville

Top comments (0)