Erlan Akbaraliev

Posted on Oct 28

Pandas DataFrame into a PostgreSQL Table

#postgres #datascience #python #tutorial

Final Result

We have this DataFrame in Jupyter Notebook. We clean it using Pandas — filling or deleting null values, adding new columns, converting data types, etc.
Now we want to view the cleaned DataFrame as a table inside a SQL database, so we can perform further analysis using SQL.

How to Convert a Pandas DataFrame into a Database Table

We'll use Pandas and SQLAlchemy to connect Jupyter Notebook to a PostgreSQL database.

Steps Overview

Install Jupyter Notebook
Install PostgreSQL
Create a database to store the DataFrame as a table
Prepare the DataFrame in Jupyter
Connect Jupyter to PostgreSQL and convert the DataFrame into a table

1. Install Jupyter Notebook

Open your terminal or PowerShell and navigate (cd) to the directory where you want to store your Jupyter Notebook files.
For example:

cd ~/Documents/

Install and start Jupyter:

pip install notebook
jupyter notebook

You should see a new browser window open:

Click “New” → “Python 3 (ipykernel)” to create a new notebook.
This is where we’ll create and clean our DataFrame.

💡 Your laptop is now running a Jupyter Notebook server locally.
You have successfully installed Jupyter Notebook which we will use to write python code.

2. Install PostgreSQL

Open your terminal and run the following commands, for macOS only:

brew install postgresql              # install PostgreSQL
brew services list                   # check if PostgreSQL is running
brew services start postgresql       # start it if not running
psql postgres                        # enter psql as the default 'postgres' user

You should see something like this:

psql (14.19 (Homebrew))
Type "help" for help.
postgres=#

This means you’re connected to a locally running PostgreSQL server.
Use \l to list all databases.

3. Create a Database

Inside the PostgreSQL terminal (psql), create a new database:

postgres=# CREATE DATABASE mydatabase;
postgres=# \l                     -- list all databases
postgres=# \c mydatabase          -- connect to the new database

4. Prepare the DataFrame in Jupyter

Run the following code in Jupyter:

!pip install pandas

import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5],
    'name': ['Po', 'Tai Lung', 'Shifu', 'Oogway', 'Mr Ping'],
    'salary': [3000, 2000, 2500, 4000, 1000],
    'hire_date': pd.to_datetime(pd.Series(
        ['2018.08.31', '2001.01.01', '1970.10.30', '1950.01.01', '2019.01.01']),
        format='%Y.%m.%d')
})

df.head()

5. Connect Jupyter to PostgreSQL and Load the DataFrame

!pip install psycopg2-binary SQLAlchemy

from sqlalchemy import create_engine

# Step 1: Connect to PostgreSQL
username = "postgres"      # default user
password = ""              # the password you set during installation
host = "localhost"         # if running locally
port = "5432"              # default PostgreSQL port
database = "mydatabase"    # the database you created earlier

engine = create_engine(f"postgresql+psycopg2://{username}:{password}@{host}:{port}/{database}")

# Step 2: Load the DataFrame into PostgreSQL
table_name = "mytable"
df.to_sql(table_name, engine, if_exists="replace", index=False)

print(f"✅ Data successfully loaded into table '{table_name}' in database '{database}'.")

6. Verify the Table in PostgreSQL

In your terminal, run:

mydatabase=# SELECT * FROM mytable;

You should see the same data you created in Jupyter:

 id |   name    | salary |     hire_date
----+------------+--------+---------------------
  1 | Po         |   3000 | 2018-08-31 00:00:00
  2 | Tai Lung   |   2000 | 2001-01-01 00:00:00
  3 | Shifu      |   2500 | 1970-10-30 00:00:00
  4 | Oogway     |   4000 | 1950-01-01 00:00:00
  5 | Mr Ping    |   1000 | 2019-01-01 00:00:00

🏹 We successfully connected our Jupyter Notebook with the locally running psql database and made the dataframe on Jupyter available on our local database.

Additional Useful Commands

Command	Description
`\l`	List all databases
`\c database_name`	Connect to a specific database
`\dt`	List all tables in the current database
`\d table_name`	Show details of a specific table
`\dn`	List all schemas
`\du`	List all users
`\q`	Quit PostgreSQL
`\x`	Toggle expanded output for wide tables

DEV Community