DEV Community

likhitha manikonda
likhitha manikonda

Posted on

๐Ÿ“š Reading Data from Various Data Sources Using Pandas: A Beginnerโ€™s Guide

Pandas makes it incredibly easy to read data from different sources like CSV, Excel, JSON, SQL databases, HTML tables, and even clipboard. This guide walks you through each method with simple examples and clear outputs.


๐Ÿ“ฆ Getting Started

First, install Pandas if you havenโ€™t already:

pip install pandas
Enter fullscreen mode Exit fullscreen mode

Then import it in your Python script:

import pandas as pd
Enter fullscreen mode Exit fullscreen mode

๐Ÿ“ Reading CSV Files

CSV (Comma-Separated Values) is one of the most common formats.

df = pd.read_csv('data.csv')
print(df.head())
Enter fullscreen mode Exit fullscreen mode

๐Ÿ“ Explanation: read_csv() loads the CSV file into a DataFrame. head() shows the first 5 rows.

Output:

     Name  Age     City
0   Alice   25  New York
1     Bob   30     Paris
2  Charlie  22   Chicago
Enter fullscreen mode Exit fullscreen mode

๐Ÿ“Š Reading Excel Files

df = pd.read_excel('data.xlsx', sheet_name='Sheet1')
print(df.head())
Enter fullscreen mode Exit fullscreen mode

๐Ÿ“ Explanation: read_excel() loads data from an Excel file. You can specify the sheet name.

Output:

     Name  Age     City
0   Alice   25  New York
1     Bob   30     Paris
2  Charlie  22   Chicago
Enter fullscreen mode Exit fullscreen mode

๐ŸŒ Reading HTML Tables

url = 'https://en.wikipedia.org/wiki/List_of_countries_by_GDP_(nominal)'
tables = pd.read_html(url)
print(tables[0].head())
Enter fullscreen mode Exit fullscreen mode

๐Ÿ“ Explanation: read_html() reads tables from a webpage. It returns a list of DataFrames.


๐Ÿงพ Reading JSON Files

df = pd.read_json('data.json')
print(df.head())
Enter fullscreen mode Exit fullscreen mode

๐Ÿ“ Explanation: read_json() loads structured JSON data into a DataFrame.


๐Ÿ—ƒ๏ธ Reading from SQL Databases

import sqlite3

conn = sqlite3.connect('my_database.db')
df = pd.read_sql_query("SELECT * FROM users", conn)
print(df.head())
Enter fullscreen mode Exit fullscreen mode

๐Ÿ“ Explanation: Connect to a SQLite database and run SQL queries directly into Pandas.


๐Ÿ“‹ Reading from Clipboard

df = pd.read_clipboard()
print(df.head())
Enter fullscreen mode Exit fullscreen mode

๐Ÿ“ Explanation: read_clipboard() reads tabular data copied to your clipboard (e.g., from Excel or a webpage).


๐Ÿ“„ Reading Text Files with Custom Separators

df = pd.read_csv('data.txt', sep='\t')
print(df.head())
Enter fullscreen mode Exit fullscreen mode

๐Ÿ“ Explanation: Use sep='\t' for tab-separated files or change it to any delimiter like |, ;, etc.


๐Ÿง  Tips for Beginners

  • Always inspect your data with df.head(), df.info(), and df.describe().
  • Use df.columns to see all column names.
  • Handle missing data with df.dropna() or df.fillna().

โœ๏ธ Writing Data to Various File Formats Using Pandas

Once you've manipulated your data, youโ€™ll often want to save it. Pandas makes this easy with built-in functions to export data to formats like CSV, Excel, JSON, SQL, and more.


โœ… Writing to a CSV File

df.to_csv('output.csv', index=False)
Enter fullscreen mode Exit fullscreen mode

๐Ÿ“ Explanation: Saves the DataFrame to a CSV file. index=False prevents writing row numbers.

Output: A file named output.csv containing:

Name,Age,City
Alice,25,New York
Bob,30,Paris
Charlie,22,Chicago
Enter fullscreen mode Exit fullscreen mode

โœ… Writing to an Excel File

df.to_excel('output.xlsx', sheet_name='Sheet1', index=False)
Enter fullscreen mode Exit fullscreen mode

๐Ÿ“ Explanation: Saves the DataFrame to an Excel file. You can name the sheet and skip the index.


โœ… Writing to a JSON File

df.to_json('output.json', orient='records', lines=True)
Enter fullscreen mode Exit fullscreen mode

๐Ÿ“ Explanation: Saves the DataFrame as a JSON file. orient='records' makes each row a JSON object. lines=True writes each object on a new line.

Output:

{"Name":"Alice","Age":25,"City":"New York"}
{"Name":"Bob","Age":30,"City":"Paris"}
{"Name":"Charlie","Age":22,"City":"Chicago"}
Enter fullscreen mode Exit fullscreen mode

โœ… Writing to a SQL Database

import sqlite3

conn = sqlite3.connect('my_database.db')
df.to_sql('users', conn, if_exists='replace', index=False)
Enter fullscreen mode Exit fullscreen mode

๐Ÿ“ Explanation: Saves the DataFrame to a table named users in the SQLite database. if_exists='replace' overwrites the table if it already exists.


โœ… Writing to Clipboard

df.to_clipboard(index=False)
Enter fullscreen mode Exit fullscreen mode

๐Ÿ“ Explanation: Copies the DataFrame to your clipboard so you can paste it into Excel or a document.


โœ… Writing to a Text File with Custom Separator

df.to_csv('output.txt', sep='\t', index=False)
Enter fullscreen mode Exit fullscreen mode

๐Ÿ“ Explanation: Saves the DataFrame as a tab-separated text file.

Output:

Name    Age City
Alice   25  New York
Bob 30  Paris
Charlie 22  Chicago
Enter fullscreen mode Exit fullscreen mode

Top comments (0)