Analyzing Popularity on StackOverflow

Akul Mehra ・2 min read

My Final Project

Part of the project studies the popularity on Stack Overflow of three of the most used programming languages for Machine Learning (ML) and Big Data (BD) projects, namely Python, R and Scala. The aims of the project can be formulated as the following research questions: What is the popularity of each programming language with respect to a certain topic, and how is that popularity distributed across countries?

Demo Link


Link to Code

akul08 / mbd-stackoverflow

Analyzing StackOverflow dataset for MBD course


How I built it

Toolkit: HTML, CSS, jQuery, D3.js, PySpark (Python) for Big Data
Deployment: GitHub

The interactive map we created to show how popularity is distributed across countries is available via the demo link above.

In Machine Learning, Python is more popular than R and Scala. The top three countries are India, the United States of America and France, with 196, 102 and 39 questions respectively.

In Big Data, Sweden ranks first among the countries using Scala, followed by the United States of America, while in India Python is more popular than R and Scala.
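
To make the counting behind these figures concrete, here is a minimal sketch in plain Python. It is not the project's PySpark code: the record layout (a list of tags plus a country per question), the tag sets that define each topic, and the sample data are all assumptions made for illustration. The idea is simply to count questions per (topic, language, country).

```python
from collections import Counter

# Hypothetical topic definitions: the post does not say which tags
# the project used to decide that a question is about ML or Big Data.
TOPIC_TAGS = {
    "machine-learning": {"machine-learning", "scikit-learn", "tensorflow"},
    "big-data": {"bigdata", "apache-spark", "hadoop"},
}
LANGUAGES = {"python", "r", "scala"}


def count_questions(questions):
    """Count questions per (topic, language, country).

    Each question is assumed to be a dict with a "tags" list and a
    "country" string (an illustrative layout, not the real schema).
    """
    counts = Counter()
    for question in questions:
        tags = set(question["tags"])
        for topic, topic_tags in TOPIC_TAGS.items():
            if tags & topic_tags:  # the question belongs to this topic
                for language in tags & LANGUAGES:
                    counts[(topic, language, question["country"])] += 1
    return counts


# Made-up sample records, only to show the shape of the result.
sample = [
    {"tags": ["python", "machine-learning"], "country": "India"},
    {"tags": ["python", "scikit-learn"], "country": "India"},
    {"tags": ["scala", "apache-spark"], "country": "Sweden"},
    {"tags": ["r", "ggplot2"], "country": "France"},
]

counts = count_questions(sample)
for key, total in sorted(counts.items()):
    print(key, total)
```

In a Spark job the same grouping would typically be expressed as a group-by over the three keys followed by a count, with the per-country totals then fed to the map.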

Additional Thoughts / Feelings / Stories

This was our first project working with Big Data, and we learnt a lot about how to process the data and build the visualization as part of our course. We look forward to using these skills in future projects and to continuing to learn.

Akul Mehra


Studying for a Master's in Computer Science (Data Science & Technology) in the Netherlands.

