DEV Community

Mohammad Meezan
Mohammad Meezan

Posted on

My 3rd excel project

  • I picked up a random dataset of amazon book sales of 10 years from kaggle and started exploring and analysing it.

1.Data Cleaning

  • At first I opened my dataset in power query and found there were many duplicate values in my book name section then I removed the duplicates.Then I found in the price section there was some 0 values so the price of book can't be zero so I took the mean of the price array and replaced all zeroes with mean to make my analysis better and then I loaded my data and started to analyse it.

2.Data Analysing

  • I did various analysis using pivot tables and some excel function.

1.Year wise sales analysis:In this analysis I found in which year the sales was high in the first few years then it started droping down and as the sales goes down also the moeny generation goes down.

2.Genre analysis:In this analysis I found highest no of boos that sold were non fiction over the last 10 years while the difference in the no of books between the fiction and non fiction books is of just 31.

3.User rating analysis:In this analysis I found the books below 4.3 rating are sold less as compared to books above 4.2 ratings.Books above 4.2 are sold in highest amount.Only 22 books were sold in past 10 years of below 4.3 rating while 329 books above 4.2 rating were sold in past 10 years.

4.Reviews analysis:In this analysis I learned two new functions and I found the insight that the book that got the highest number of reviews by people is Where the crawdads sing and the no of reviews it got is 87841.

  • The two new functions I learned was match and index.

At first I tried to implement the vlookup function but vlookup function can only look from right to left not from left to right and I had to look from right to left then I used the match function.At first I calculated the maximum value in reviews section using max function.Then I got 87841 as result then I implemented match function to know in which row 87841 lies then I found it lies in the 344 row and then I used index function to analyse in the 344 row which book is lying in the name of book section then I found it is the book Where the crawdads sing.

  • The important decisions the business owner should take is:

1.He should bring more stocks of books above the rating of 4.3 cause the books above these ratings are highly sold.

2.He should sell more non fiction books than fiction books but he should also keep fiction books in a good amount cause there is just a difference of 31 books.

3.He should also bring Where the crawdads sing in higher numbers cause its reviews are also high and its also rated 4.8.

And here my analysis got completed.

Top comments (0)