DEV Community

ABHISHEK N M


Data Cleaning Challenge with Pandas (Google Colab)

Data cleaning is one of the most crucial steps in any data science or analytics project. In this challenge, I worked on a real-world Kaggle dataset with over 100,000 rows, using Pandas to clean and preprocess it for further analysis.

Dataset Details
For this challenge, I selected the E-commerce Sales Dataset from Kaggle, which contains around 120,000 rows and 12 columns.

It includes data such as:

🧾 Order ID
👤 Customer Name
🛒 Product & Quantity
💰 Sales & Discount
🌍 Region
📅 Order Date
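
To make the column types concrete, here is a minimal sketch of how fields like these might be typed when read with Pandas. The header names and the two rows are made up for illustration and are assumptions, not taken from the actual Kaggle file.

```python
import io

import pandas as pd

# Two made-up rows; the header names are assumptions based on the fields listed above.
raw_csv = io.StringIO(
    "Order ID,Customer Name,Product,Quantity,Sales,Discount,Region,Order Date\n"
    "1001,Asha Rao,Keyboard,2,1499.00,0.10,South,2023-01-15\n"
    "1002,John Lee,Monitor,1,8999.50,0.05,West,2023-02-03\n"
)

# Ask read_csv to parse the date column instead of leaving it as plain text.
orders = pd.read_csv(raw_csv, parse_dates=["Order Date"])

# Categorical storage suits a low-cardinality column such as Region.
orders["Region"] = orders["Region"].astype("category")

print(orders.dtypes)
```

Parsing dates and setting categories at load time is one possible choice; the same conversions can also be done later with `pd.to_datetime` and `astype`.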

Before Cleaning:

Rows → 120,000
Columns → 12
File format → .csv

โš™๏ธ Tools & Environment
Python 3
Google Colab
Libraries: Pandas, NumPy, Matplotlib

python
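# Upload the CSV from the local machine into the Colab session
from google.colab import files
uploaded = files.upload()

# Load the uploaded file; the name must match the file that was uploaded
import pandas as pd
df = pd.read_csv('ecommerce_sales.csv')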
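
Once the file is loaded, a quick sanity check against the "Before Cleaning" numbers could look something like the sketch below. `summarize_raw` is a hypothetical helper, and the small `sample` frame with its column names is a stand-in for the real CSV, not the actual data.

```python
import pandas as pd


def summarize_raw(df: pd.DataFrame) -> dict:
    """Collect basic 'before cleaning' facts about a DataFrame."""
    return {
        "rows": df.shape[0],
        "columns": df.shape[1],
        # Count of missing values in each column
        "missing_per_column": df.isna().sum().to_dict(),
        # Rows that exactly repeat an earlier row
        "duplicate_rows": int(df.duplicated().sum()),
    }


# Tiny stand-in for the real CSV; the column names are assumed for illustration.
sample = pd.DataFrame(
    {
        "Order ID": [1001, 1002, 1002, 1003],
        "Region": ["East", "West", "West", None],
        "Sales": [250.0, 99.5, 99.5, None],
    }
)

summary = summarize_raw(sample)
print(summary)
```

In Colab, the same helper could be called as `summarize_raw(df)` on the loaded dataset to confirm the row and column counts before any cleaning starts.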
