jayanth anbu

Hands-on Data Cleaning Using Pandas in Google Colab

Data cleaning is one of the most crucial steps in any data science or analytics project. In this challenge, I worked on a real-world dataset from Kaggle with roughly 120,000 rows, using Pandas to clean, preprocess, and prepare it for further analysis.

📂 Dataset Details
For this challenge, I selected the E-commerce Sales Dataset from Kaggle containing around 120,000 rows and 12 columns.

It includes data such as:

🧾 Order ID
👀 Customer Name
🛒 Product & Quantity
💰 Sales & Discount
🌍 Region
📅 Order Date

Before Cleaning:

Rows → 120,000
Columns → 12
File format → .csv
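
A "before cleaning" snapshot like the one above can be captured in code, so it is easy to compare against the dataset after cleaning. This is only a minimal sketch: the inline sample, its column names, and the `snapshot` helper are assumptions made for illustration, not the actual Kaggle file or the exact steps used in this challenge. In Colab you would pass the path of the uploaded CSV to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Hypothetical stand-in for the Kaggle CSV (made-up rows and column names).
# In Colab, replace io.StringIO(SAMPLE_CSV) with the path to the uploaded file.
SAMPLE_CSV = """Order ID,Customer Name,Product,Quantity,Sales,Discount,Region,Order Date
1001,Asha,Laptop,1,55000,0.10,South,2023-01-05
1002,Ravi,Mouse,2,800,,North,2023-01-06
1002,Ravi,Mouse,2,800,,North,2023-01-06
1003,,Keyboard,1,1500,0.05,East,2023-01-07
"""


def snapshot(df):
    """Summarise a DataFrame: size, missing cells, and fully duplicated rows."""
    return {
        "rows": df.shape[0],
        "columns": df.shape[1],
        "missing_cells": int(df.isna().sum().sum()),
        "duplicate_rows": int(df.duplicated().sum()),
    }


df = pd.read_csv(io.StringIO(SAMPLE_CSV))
before = snapshot(df)
print(before)
```

Calling `snapshot` again once the cleaning steps are done gives a like-for-like "after" summary.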

βš™οΈ Tools & Environment
Python 3
Google Colab
Libraries: Pandas, NumPy, Matplotlib
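
As a quick sanity check of the environment, a first notebook cell might look like the sketch below. It assumes the libraries are already installed, which is normally the case in Google Colab. The aliases `pd` and `np` are the usual conventions, and the `versions` dictionary is just an illustrative name.

```python
import numpy as np
import pandas as pd

# Matplotlib is conventionally imported as: import matplotlib.pyplot as plt
# (left out here so the cell runs even where plotting is not set up).

# Record the library versions so results can be reproduced later.
versions = {"pandas": pd.__version__, "numpy": np.__version__}
print(versions)

# Tiny smoke test: build a Series from a NumPy array and sum it.
total = int(pd.Series(np.arange(5)).sum())
print(total)
```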