Daniel Jaouen

Lambda School Data Science Project

We recently finished a project for the DSPT2 section of Lambda School's part-time data science course. For my project, I used a Kaggle dataset of video game sales. The goal is to predict global sales from features such as critic score, critic count, publisher, and platform.

For this project, I leaned heavily on randomized search cross-validation and xgboost's XGBRegressor (using mae as the eval_metric). First, though, I set a baseline: predicting the average global sales of the entire dataset for every game. This gave a baseline mean absolute error (MAE) of 0.6605.
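A minimal, dependency-free sketch of how a mean baseline like this can be scored. The function name and the sales figures are made up for illustration; they are not the Kaggle data or the code from the project.

```python
from statistics import mean


def mean_baseline_mae(y_reference, y_eval):
    """MAE obtained by always predicting the mean of y_reference."""
    prediction = mean(y_reference)
    return mean(abs(y - prediction) for y in y_eval)


# Hypothetical global sales values (illustrative only).
sales = [0.5, 1.0, 1.5, 3.0]

# Passing the same list twice scores the mean against the whole dataset,
# which is how the baseline above is described.
baseline_mae = mean_baseline_mae(sales, sales)
print(baseline_mae)
```

Any model worth keeping should score a lower MAE than this number on the same rows.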

After applying randomized search cross-validation to xgboost's XGBRegressor, I ended up with an MAE of 0.4875. This beats the baseline by 0.173, a reduction of about 26%.

I also plotted the permutation importances of all the features used, which can be viewed below:

[Figure: permutation_importances]
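For readers who want to compute something similar, here is a small sketch using scikit-learn's permutation_importance. It is not necessarily the tool behind the plot above. It uses a RandomForestRegressor as a stand-in for the tuned XGBRegressor, and the data and feature names are synthetic and hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data (assumption): one strong feature, one weak, one pure noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=300)
feature_names = ["strong", "weak", "noise"]  # hypothetical names

X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# Shuffle each column of the validation set in turn and measure how much
# the (negated) MAE score drops; bigger drops mean more important features.
result = permutation_importance(
    model,
    X_val,
    y_val,
    scoring="neg_mean_absolute_error",
    n_repeats=10,
    random_state=0,
)

ranked = sorted(
    zip(feature_names, result.importances_mean),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```

Scoring on held-out rows rather than the training set keeps the importances from rewarding features the model has merely memorized.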

That's it. Thanks for reading!
