DEV Community

Zeba Mushtaq


Don't Skip the Dataset Description (I Almost Did, and It Would've Cost Me)

I started looking for a tourism dataset on Kaggle for a new project. I found one with real UNWTO data, but it only went up to 2022, which wasn't enough for the post-COVID trends I wanted to look at.
Then I found a better-looking one: "Global Tourism & Travel Trends (2019-2024)," with 24 upvotes and a date range that covered the years I wanted. I almost picked it on the spot.
Then I actually read the full description. Turns out it's synthetic: 10,000 generated records, not real recorded statistics.
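I had to rename the whole project, from "Travel Recovery Analysis" to "Travel Behavior & Satisfaction Trends (2019-2024)". Same dataset, just honest framing: generated records can't show how travel actually recovered, but they can still be used to practice analyzing behavior and satisfaction patterns. It's still great for practice, with 33 features, zero nulls, and coverage of spend, satisfaction, eco-choices, and transport modes.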
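For anyone who wants to check numbers like "33 features, zero nulls" themselves, here's a minimal sketch using only Python's standard library. The `profile_csv` helper and the sample column names are made up for illustration; they are not taken from the Kaggle file. Counts like these describe the file's shape, not where the data came from, so the description still needs a read.

```python
import csv
import io


def profile_csv(handle):
    """Return (row_count, column_names, missing_per_column) for a CSV file object."""
    reader = csv.DictReader(handle)
    columns = reader.fieldnames or []
    missing = {name: 0 for name in columns}
    rows = 0
    for row in reader:
        rows += 1
        for name in columns:
            value = row.get(name)
            # Treat absent or blank cells as missing values.
            if value is None or value.strip() == "":
                missing[name] += 1
    return rows, columns, missing


# Tiny made-up sample (hypothetical column names, not the real dataset).
sample = io.StringIO(
    "year,transport_mode,satisfaction\n"
    "2019,air,4\n"
    "2021,rail,\n"
    "2024,car,5\n"
)
rows, columns, missing = profile_csv(sample)
print(rows, len(columns), missing)
```

To run it on a downloaded file, pass `open("your_file.csv", newline="")` instead of the in-memory sample (the filename here is a placeholder).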
Anyone else ever almost build a project around the wrong assumption about their data? 👀
