DEV Community

Daniel Szakacs

Feedback needed: Mini Data Cleaning & Feature Engineering Project (Café Sales)

Hey all,

I'm fairly new to data work and just finished a small project to get hands-on experience with data cleaning and feature engineering. It’s based on a simulated café sales dataset from Kaggle.

This is my first real attempt at tackling messy data, and I’d love to hear from anyone - especially those of you working with data professionally or regularly - about how I did and how I can improve.

About the Project:

  • Dataset: Artificially generated café sales data (10,000 rows)
  • Tools used: Python (Pandas, NumPy), Jupyter Notebook
  • Goal: Learn and demonstrate data cleaning techniques

What I worked on:

  • Handling missing values
  • Fixing inconsistent text formatting
  • Correcting data types
  • Replacing unclear placeholders like "error" or "unknown"
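To give a concrete picture of these four steps, here is a minimal sketch in Pandas. It is not the notebook's actual code: the column names (`item`, `quantity`, `price`), the sample rows, and the helper `clean_sales` are all made up for illustration, and the real dataset may use different names and placeholder values.

```python
import pandas as pd

# Placeholder strings treated as "missing" (an assumption for this sketch).
PLACEHOLDERS = ["error", "unknown", ""]

# Hypothetical sample rows; the real dataset's schema may differ.
raw = pd.DataFrame(
    {
        "item": [" Coffee", "coffee ", "TEA", "unknown", None],
        "quantity": ["2", "error", "1", "3", "unknown"],
        "price": ["3.5", "3.5", "ERROR", "2.0", "2.0"],
    },
    dtype=object,  # keep every raw column as plain Python objects
)


def clean_sales(df):
    """Return a cleaned copy of the hypothetical sales table."""
    out = df.copy()
    for col in ["item", "quantity", "price"]:
        # Fix inconsistent formatting: trim whitespace and lowercase.
        text = out[col].str.strip().str.lower()
        # Turn placeholder strings into real missing values.
        out[col] = text.mask(text.isin(PLACEHOLDERS))
    # Correct data types; anything unparseable becomes missing.
    # Int64 is pandas' nullable integer type, so gaps stay as <NA>.
    out["quantity"] = pd.to_numeric(out["quantity"], errors="coerce").astype("Int64")
    out["price"] = pd.to_numeric(out["price"], errors="coerce")
    return out


cleaned = clean_sales(raw)
# Count the missing values per column before deciding how to handle them.
print(cleaned.isna().sum())
```

Converting placeholders to real missing values first keeps the later decision (drop the rows, fill them in, or leave them) separate from the formatting fixes, which makes each step easier to review.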

GitHub:
Check it out here

I'd be super grateful for your feedback on:

  • How clean and readable my code is
  • Whether my cleaning approach makes sense
  • Ideas on what I could have done better or differently

Thank you so much in advance! I truly appreciate every single comment or suggestion you might have. If you have any tips on how I can continue learning or what to explore next, I'd love to hear them!

Thank you.
