🧹 Data Cleaning Challenge with Pandas (Google Colab)
Data cleaning is one of the most crucial steps in any data science or analytics project. In this challenge, I worked on a real-world Kaggle dataset of around 120,000 rows, using Pandas to clean, preprocess, and prepare it for further analysis.
📂 Dataset Details
For this challenge, I selected the E-commerce Sales Dataset from Kaggle containing around 120,000 rows and 12 columns.
It includes data such as:
- 🧾 Order ID
- 👤 Customer Name
- 🛒 Product & Quantity
- 💰 Sales & Discount
- 🌍 Region
- 📅 Order Date
Before Cleaning:
- Rows → 120,000
- Columns → 12
- File format → .csv
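A minimal sketch of how before-cleaning figures like these could be recorded, so they can be compared with the same numbers after cleaning. The `baseline_summary` helper and the tiny `sample` frame are hypothetical stand-ins; the real dataset's column names may differ from the ones assumed here.

```python
import pandas as pd

def baseline_summary(df):
    """Collect simple before-cleaning figures for a DataFrame (hypothetical helper)."""
    rows, cols = df.shape
    return {
        "rows": rows,
        "columns": cols,
        # Total number of NaN/None cells across all columns
        "missing_cells": int(df.isna().sum().sum()),
        # Rows that exactly repeat an earlier row
        "duplicate_rows": int(df.duplicated().sum()),
    }

# Small in-memory stand-in for the real file; column names are assumed
sample = pd.DataFrame({
    "Order ID": [1, 2, 2, 3],
    "Region": ["East", "West", "West", None],
    "Sales": [100.0, 250.5, 250.5, 80.0],
})

summary = baseline_summary(sample)
print(summary)
```

Running the same helper again after cleaning gives a quick before/after comparison.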
⚙️ Tools & Environment
- Python 3
- Google Colab
- Libraries: Pandas, NumPy, Matplotlib
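```python
import pandas as pd
from google.colab import files

# Upload ecommerce_sales.csv from the local machine (works only inside Colab)
uploaded = files.upload()

# Load the uploaded file into a DataFrame
df = pd.read_csv('ecommerce_sales.csv')
```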
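In Colab, `files.upload()` returns a dictionary that maps each uploaded filename to its raw bytes, so the DataFrame can also be built from that dictionary without hard-coding the filename. Below is a rough sketch of that idea; `load_first_csv` is a hypothetical helper, and `fake_upload` is a made-up stand-in for the real upload so the snippet also runs outside Colab.

```python
import io
import pandas as pd

def load_first_csv(uploaded):
    """Read the first .csv entry of a {filename: bytes} mapping into a DataFrame."""
    for name, content in uploaded.items():
        if name.lower().endswith(".csv"):
            # Wrap the raw bytes in a file-like object for read_csv
            return pd.read_csv(io.BytesIO(content))
    raise ValueError("no .csv file found in the upload")

# Stand-in for the dict returned by files.upload(); contents are invented
fake_upload = {
    "ecommerce_sales.csv": b"Order ID,Region,Sales\n1,East,100.0\n2,West,250.5\n"
}

df_demo = load_first_csv(fake_upload)
print(df_demo.shape)
```

In a real Colab session, `load_first_csv(uploaded)` would take the place of the `pd.read_csv('ecommerce_sales.csv')` call.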