In this Udacity Data Scientist Nanodegree project, I want to find out the underlying factors behind a video game's success. As an avid gamer myself, I thought it would make an interesting project, so I will aim to answer the following questions:
Q1: Which genres perform the best?
Q2: Which platform is used by the most users, PS4 or Xbox?
Q3-Q4: How do a game's platform, genre, critic score, user score, and rating influence sales? Can we predict sales using these attributes?
Before I start, I should be clear about the dataset used for this project: an open-source Kaggle dataset, which can be found here.
Now let's dive into the findings:
Q1: Which genres perform the best?
Figure — List of genre.
Figure — List of genre by continent.
As we can see, the top three genres (Action, Shooter, and Sports) account for by far the largest share of sales.
When we break it down by continent, North America favors shooters above any other genre, while for pure action the EU is slightly ahead.
For Sports, the two regions are practically evenly matched; the EU takes the majority for Racing, and the EU and NA are again tied for Role-Playing.
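A per-genre breakdown like the one described above can be sketched with the standard library alone. This is a minimal illustration, not the code behind the figures: the column names `NA_Sales` and `EU_Sales` are assumptions about the dataset, the helper `sales_by_genre` is hypothetical, and the rows are made-up toy values rather than real sales numbers.

```python
from collections import defaultdict

# Toy rows with made-up numbers; NOT taken from the Kaggle dataset.
# The column names "NA_Sales" and "EU_Sales" are assumed.
ROWS = [
    {"Genre": "Action", "NA_Sales": 2.0, "EU_Sales": 2.5},
    {"Genre": "Shooter", "NA_Sales": 4.0, "EU_Sales": 1.5},
    {"Genre": "Action", "NA_Sales": 1.0, "EU_Sales": 1.0},
    {"Genre": "Sports", "NA_Sales": 1.5, "EU_Sales": 1.5},
]


def sales_by_genre(rows, region_column):
    """Sum one regional sales column per genre, largest total first."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["Genre"]] += float(row[region_column])
    # Sort by total sales, descending, so the top genres come first.
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


na_totals = sales_by_genre(ROWS, "NA_Sales")
eu_totals = sales_by_genre(ROWS, "EU_Sales")

for genre in na_totals:
    print(genre, na_totals[genre], eu_totals[genre])
```

The same function works on rows read with `csv.DictReader`, since it converts each value with `float` before summing.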
Q2: Which platform is used by the most users, PS4 or Xbox?
Figure — List of data by Platform.
Figure — Popularity of Platform.
Here we can see that for the PS4, Global_Sales peaked in its release year, with the following year seeing only a slight decline. For Xbox the opposite is true: global sales rose slightly in its second year. Overall, the PS4 outsells the Xbox globally, as the bar chart shows.
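The year-by-year comparison above boils down to summing Global_Sales per platform and year. Here is a small sketch under stated assumptions: the column name `Year` and the platform code `XOne` are guesses at how the dataset labels things, `yearly_sales` is a hypothetical helper, and the rows are made-up toy values.

```python
from collections import defaultdict

# Toy rows with made-up numbers; NOT taken from the Kaggle dataset.
# The column name "Year" and the platform code "XOne" are assumptions.
ROWS = [
    {"Platform": "PS4", "Year": 2013, "Global_Sales": 3.0},
    {"Platform": "PS4", "Year": 2013, "Global_Sales": 2.0},
    {"Platform": "PS4", "Year": 2014, "Global_Sales": 4.0},
    {"Platform": "XOne", "Year": 2013, "Global_Sales": 2.0},
    {"Platform": "XOne", "Year": 2014, "Global_Sales": 2.5},
]


def yearly_sales(rows, platform):
    """Total Global_Sales per year for one platform, ordered by year."""
    totals = defaultdict(float)
    for row in rows:
        if row["Platform"] == platform:
            totals[int(row["Year"])] += float(row["Global_Sales"])
    return dict(sorted(totals.items()))


ps4 = yearly_sales(ROWS, "PS4")
xbox = yearly_sales(ROWS, "XOne")

print("PS4 by year:", ps4, "total:", sum(ps4.values()))
print("Xbox by year:", xbox, "total:", sum(xbox.values()))
```

Comparing the two dictionaries year by year gives the trend, and comparing the sums gives the overall ranking.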
Q3: Looking at the correlation of features with overall sales
Figure — Correlation of Global_Sales and Critic_Score.
Figure — Correlation of Global_Sales and User_Score.
Figure — Heatmap of All the features.
Figure — Pair Grid of Global_Sales, Critic_Score and User_Score.
Looking at the plots above, we can see that Critic_Score and Global_Sales are positively correlated. User_Count, Critic_Count, and Critic_Score have the highest positive correlations with Global_Sales, and Genre also has a relatively high positive correlation.
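For readers who want to reproduce a single cell of the heatmap, the Pearson correlation between two numeric columns can be computed by hand. This is only a sketch: `pearson` and `column_correlation` are hypothetical helpers, the rows are made-up toy values, and skipping rows with a missing score is an assumption about how gaps in the data might be handled. A categorical column such as Genre would first need a numeric encoding; the article does not say which encoding was used.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / sqrt(var_x * var_y)


def column_correlation(rows, col_a, col_b):
    """Correlate two columns, skipping rows where either value is missing."""
    pairs = [
        (float(row[col_a]), float(row[col_b]))
        for row in rows
        if row.get(col_a) not in (None, "") and row.get(col_b) not in (None, "")
    ]
    xs = [a for a, _ in pairs]
    ys = [b for _, b in pairs]
    return pearson(xs, ys)


# Toy rows with made-up numbers; NOT taken from the Kaggle dataset.
ROWS = [
    {"Critic_Score": 60, "Global_Sales": 1.0},
    {"Critic_Score": 70, "Global_Sales": 2.0},
    {"Critic_Score": "", "Global_Sales": 9.0},  # missing score, skipped
    {"Critic_Score": 80, "Global_Sales": 3.0},
    {"Critic_Score": 90, "Global_Sales": 4.0},
]

r = column_correlation(ROWS, "Critic_Score", "Global_Sales")
print(f"correlation: {r:.3f}")
```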
Q4: Can we predict sales with these attributes?
Figure — Prediction of sales.
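The article does not state which model produced the prediction figure. As one hedged illustration of the idea, here is a single-feature least-squares line that predicts Global_Sales from Critic_Score. The helpers `fit_line` and `r_squared` are hypothetical, the numbers are made-up toy values, and a real model would likely use several features and a held-out test set.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("cannot fit a line when all x values are equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Toy values, made up for illustration; NOT taken from the Kaggle dataset.
critic_scores = [50, 60, 70, 80, 90]
global_sales = [0.5, 1.0, 1.4, 2.1, 2.5]

slope, intercept = fit_line(critic_scores, global_sales)
score = r_squared(critic_scores, global_sales, slope, intercept)
predicted = slope * 75 + intercept  # predicted sales for a critic score of 75

print(f"slope={slope:.3f} intercept={intercept:.2f} R^2={score:.3f}")
print(f"predicted sales at score 75: {predicted:.2f}")
```

An R² close to 1 on data the model has not seen would suggest the attributes carry real predictive signal; a high value on the training data alone does not.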
So, we can safely assume that regional sales correlate with global sales as a whole. The more interesting aspect, I think, is that neither Europe nor North America appears to make up the biggest part of overall sales; that most likely comes from places like Australia, Asia, South America, etc.
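A claim about regional shares is easy to check directly by dividing each region's total by the global total. The sketch below assumes column names `NA_Sales` and `EU_Sales`, uses a hypothetical helper `regional_share`, and runs on made-up toy rows, so its numbers say nothing about the real dataset.

```python
def regional_share(rows, region_column, total_column="Global_Sales"):
    """Fraction of total sales that comes from one regional column."""
    total = sum(float(row[total_column]) for row in rows)
    if total == 0:
        raise ValueError("total sales are zero; share is undefined")
    region = sum(float(row[region_column]) for row in rows)
    return region / total


# Toy rows with made-up numbers; NOT taken from the Kaggle dataset.
# The column names "NA_Sales" and "EU_Sales" are assumed.
ROWS = [
    {"NA_Sales": 1.0, "EU_Sales": 0.5, "Global_Sales": 2.0},
    {"NA_Sales": 3.0, "EU_Sales": 1.5, "Global_Sales": 6.0},
]

na_share = regional_share(ROWS, "NA_Sales")
eu_share = regional_share(ROWS, "EU_Sales")
rest_share = 1 - na_share - eu_share  # everything outside NA and the EU

print(f"NA: {na_share:.0%}  EU: {eu_share:.0%}  rest: {rest_share:.0%}")
```

Running this on the full dataset would show directly which region contributes the largest share.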
Conclusion:
In this article, we looked at the most popular genres according to the Kaggle dataset.
We also saw that the PS4 outsells the Xbox globally in this dataset.
Finally, we looked at some of the correlations and found an interesting result: neither Europe nor North America appears to make up the biggest part of overall sales; that most likely comes from places like Australia, Asia, South America, etc.
To see more of this analysis, check out my GitHub repository, linked in the references below.
References:
GitHub :
https://github.com/ankit1797/Write-A-DS-Blog-Post
Video Game Sales Dataset :
https://www.kaggle.com/gregorut/videogamesales