DEV Community

Cover image for Pandas DataFrame into a PostgreSQL Table
Erlan Akbaraliev
Erlan Akbaraliev

Posted on

Pandas DataFrame into a PostgreSQL Table

Final Result

  1. We have this DataFrame in Jupyter Notebook. We clean it using Pandas — filling or deleting null values, adding new columns, converting data types, etc.
    Pandas DataFrame

  2. Now we want to view the cleaned DataFrame as a table inside a SQL database, so we can perform further analysis using SQL.
    psql


How to Convert a Pandas DataFrame into a Database Table

We'll use Pandas and SQLAlchemy to connect Jupyter Notebook to a PostgreSQL database.

Steps Overview

  1. Install Jupyter Notebook
  2. Install PostgreSQL
  3. Create a database to store the DataFrame as a table
  4. Prepare the DataFrame in Jupyter
  5. Connect Jupyter to PostgreSQL and convert the DataFrame into a table

1. Install Jupyter Notebook

Open your terminal or PowerShell and navigate (cd) to the directory where you want to store your Jupyter Notebook files.
For example:

cd ~/Documents/
Enter fullscreen mode Exit fullscreen mode

Install and start Jupyter:

pip install notebook
jupyter notebook
Enter fullscreen mode Exit fullscreen mode

You should see a new browser window open:
jupyter notebook result

Click “New” → “Python 3 (ipykernel)” to create a new notebook.
This is where we’ll create and clean our DataFrame.
jupyter run

💡 Your laptop is now running a Jupyter Notebook server locally.
You have successfully installed Jupyter Notebook which we will use to write python code.


2. Install PostgreSQL

Open your terminal and run the following commands, for macOS only:

brew install postgresql              # install PostgreSQL
brew services list                   # check if PostgreSQL is running
brew services start postgresql       # start it if not running
psql postgres                        # enter psql as the default 'postgres' user
Enter fullscreen mode Exit fullscreen mode

You should see something like this:

psql (14.19 (Homebrew))
Type "help" for help.
postgres=#
Enter fullscreen mode Exit fullscreen mode

This means you’re connected to a locally running PostgreSQL server.
Use \l to list all databases.


3. Create a Database

Inside the PostgreSQL terminal (psql), create a new database:

postgres=# CREATE DATABASE mydatabase;
postgres=# \l                     -- list all databases
postgres=# \c mydatabase          -- connect to the new database
Enter fullscreen mode Exit fullscreen mode

4. Prepare the DataFrame in Jupyter

Run the following code in Jupyter:

!pip install pandas

import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5],
    'name': ['Po', 'Tai Lung', 'Shifu', 'Oogway', 'Mr Ping'],
    'salary': [3000, 2000, 2500, 4000, 1000],
    'hire_date': pd.to_datetime(pd.Series(
        ['2018.08.31', '2001.01.01', '1970.10.30', '1950.01.01', '2019.01.01']),
        format='%Y.%m.%d')
})

df.head()
Enter fullscreen mode Exit fullscreen mode

5. Connect Jupyter to PostgreSQL and Load the DataFrame

!pip install psycopg2-binary SQLAlchemy

from sqlalchemy import create_engine

# Step 1: Connect to PostgreSQL
username = "postgres"      # default user
password = ""              # the password you set during installation
host = "localhost"         # if running locally
port = "5432"              # default PostgreSQL port
database = "mydatabase"    # the database you created earlier

engine = create_engine(f"postgresql+psycopg2://{username}:{password}@{host}:{port}/{database}")

# Step 2: Load the DataFrame into PostgreSQL
table_name = "mytable"
df.to_sql(table_name, engine, if_exists="replace", index=False)

print(f"✅ Data successfully loaded into table '{table_name}' in database '{database}'.")
Enter fullscreen mode Exit fullscreen mode

6. Verify the Table in PostgreSQL

In your terminal, run:

mydatabase=# SELECT * FROM mytable;
Enter fullscreen mode Exit fullscreen mode

You should see the same data you created in Jupyter:

 id |   name    | salary |     hire_date
----+------------+--------+---------------------
  1 | Po         |   3000 | 2018-08-31 00:00:00
  2 | Tai Lung   |   2000 | 2001-01-01 00:00:00
  3 | Shifu      |   2500 | 1970-10-30 00:00:00
  4 | Oogway     |   4000 | 1950-01-01 00:00:00
  5 | Mr Ping    |   1000 | 2019-01-01 00:00:00
Enter fullscreen mode Exit fullscreen mode

🏹 We successfully connected our Jupyter Notebook with the locally running psql database and made the dataframe on Jupyter available on our local database.


Additional Useful Commands

Command Description
\l List all databases
\c database_name Connect to a specific database
\dt List all tables in the current database
\d table_name Show details of a specific table
\dn List all schemas
\du List all users
\q Quit PostgreSQL
\x Toggle expanded output for wide tables

Top comments (0)