DEV Community

abbazs
abbazs

Posted on • Edited on

2

How to migrate Microsoft Sql database to postgresql database using pandas?

Recently had to migrate a database in mssql to postgresql. It is a flask based dashboard application. We were developing the application in a trial environment where the hosting and db instance will expire every 120 days. And we used to take back up of the application near to the trial expiry and restore the application in a new trial instance. So to take back up the database and restore it we used flask_alchemydumps. It worked fine, restoring millions of records used to take forever from local machine, but from a trial VM it was fairly quick, within an hour the db was restored.

We got the budget approvals for paid gcp instance and this time we chose postgresql for the db instead of mssql for the cost factor. mssql instance was at least twice expensive than postgresql.

Here comes the challenge, trial VM's were stopped, no more trial VM's. So restoring the db from local machine was not happening at all. And the VPN to connect to db will frequently disconnect and the restoration of db using flask_alchemydumps became impossible.

The web application is hosted in a cloud foundry instance and hence I tried with various options cloud foundry provided (creating worker only instance, a sidecar process, running a task etc..) to restore the data from flask_alchemydumps to postgresql. It failed in all the process with some or the other error message. Most frequent error message was Exit status 137 (out of memory).

Then I tried to restore the data from yet to expire mssql instance using MSSQL Server Management Studio following stackoverflow.com answers migrate data from MS SQL to PostgreSQL? and Passing ssl certificates paths to PostgreSQL ODBC driver. It worked but it was slow and stop after restoring few thousand records.

So what actually worked?

pandas read_sql and to_sql
Here is the code that did the data base migration:

# Imports first!
import pymssql
import pandas as pd
# Probably context manager is not helping much in this case
from contextlib import contextmanager
from sqlalchemy import create_engine

# Postgres instance can only be accessed with sslmode
ssl_args = {
    "sslcert": r"C:\.postgresql\postgresql.crt",
    "sslkey": r"C:\.postgresql\postgresql.key",
    "sslmode": "require",
}

# create a sqlalchemy engine
sqlengine = create_engine(
"postgresql+psycopg2://admin:password@x.x.x.x:5432/thedatabase",
connect_args=ssl_args,
)

# The mssql database connection
@contextmanager
def get_old_db_connection():
    creds = {
        "server": "x.x.x.x",
        "user": "admin",
        "password": "password",
        "database": "old_database",
        "port": "1433",
    }
    conn = pymssql.connect(**creds)
    try:
        yield conn
    finally:
        conn.close()
# The migration code!
with get_old_db_connection() as conn:
    # read_sql with chunksize will return a iterator
    df = pd.read_sql(
        "SELECT * FROM table_1 ORDER BY id;",
        con=conn,
        chunksize=100000,
        parse_dates=["created_date"],
    )
    # Record the completed row index, in case if the 
    # process stopped before completing start again
    # with last updated index 
    # `SELECT * FROM table_1 WHERE id > 1000 ORDER BY id;`  
    for di in df:
        di.to_sql(
           name="table_1", 
           if_exists="append", 
           con=sqlengine, 
           index=False
        )
        print(f"Rows completed = {di['id'].iloc[-1]}")
Enter fullscreen mode Exit fullscreen mode

While flask_alchemydumps is a good choice for backup and restore it didn't work in this case. Ideally it shall have worked.

After the updating the db you most probably need to do this...

Heroku

This site is built on Heroku

Join the ranks of developers at Salesforce, Airbase, DEV, and more who deploy their mission critical applications on Heroku. Sign up today and launch your first app!

Get Started

Top comments (0)

Image of AssemblyAI

Automatic Speech Recognition with AssemblyAI

Experience near-human accuracy, low-latency performance, and advanced Speech AI capabilities with AssemblyAI's Speech-to-Text API. Sign up today and get $50 in API credit. No credit card required.

Try the API

👋 Kindness is contagious

Explore a sea of insights with this enlightening post, highly esteemed within the nurturing DEV Community. Coders of all stripes are invited to participate and contribute to our shared knowledge.

Expressing gratitude with a simple "thank you" can make a big impact. Leave your thanks in the comments!

On DEV, exchanging ideas smooths our way and strengthens our community bonds. Found this useful? A quick note of thanks to the author can mean a lot.

Okay