
ABHISHEK N M


Data Cleaning Challenge with Pandas (Google Colab)

Data cleaning is one of the most crucial steps in any data science or analytics project. In this challenge, I worked on a real-world Kaggle dataset of around 120,000 rows, using Pandas to clean and preprocess it for further analysis.

Dataset Details
For this challenge, I selected the E-commerce Sales Dataset from Kaggle containing around 120,000 rows and 12 columns.

It includes data such as:

🧾 Order ID
👤 Customer Name
🛒 Product & Quantity
💰 Sales & Discount
🌍 Region
📅 Order Date

Before Cleaning:

Rows → 120,000
Columns → 12
File format → .csv

⚙️ Tools & Environment
Python 3
Google Colab
Libraries: Pandas, NumPy, Matplotlib

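```python
# Upload the CSV from your local machine into the Colab session
from google.colab import files
uploaded = files.upload()

# Load the uploaded file (its name must match the one passed here)
import pandas as pd
df = pd.read_csv('ecommerce_sales.csv')
```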
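
Once the file is loaded, a quick inspection is a reasonable way to confirm the "Before Cleaning" numbers and see what needs fixing. The sketch below is only an illustration: it uses a tiny in-memory sample instead of the real CSV, and the column names and values are assumptions based on the fields listed above, not the actual Kaggle schema.

```python
import io
import pandas as pd

# Hypothetical stand-in for ecommerce_sales.csv; column names and values are assumed.
sample_csv = io.StringIO(
    "Order ID,Customer Name,Product,Quantity,Sales,Discount,Region,Order Date\n"
    "1001,Asha Rao,Laptop,1,55000.0,0.10,South,2023-01-05\n"
    "1002,Ravi Kumar,Mouse,2,800.0,0.00,North,2023-01-06\n"
    "1002,Ravi Kumar,Mouse,2,800.0,0.00,North,2023-01-06\n"
    "1003,Meena Iyer,Keyboard,1,,0.05,East,2023-01-07\n"
    "1004,,Monitor,1,12000.0,,West,2023-01-08\n"
)
sample_df = pd.read_csv(sample_csv)

# Shape gives the row and column counts reported under "Before Cleaning"
n_rows, n_cols = sample_df.shape

# Count missing values per column and fully duplicated rows
missing_per_column = sample_df.isna().sum()
n_duplicates = int(sample_df.duplicated().sum())

print(f"Rows → {n_rows}, Columns → {n_cols}")
print(missing_per_column)
print(f"Duplicate rows: {n_duplicates}")
```

On the real dataset, the same three calls (`shape`, `isna().sum()`, `duplicated().sum()`) can be run directly on `df`.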

