Deepjyoti Barman

Happiness status of your GitHub repo: repostatus

People always ask "why this repo" and never ask "how is this repo", so I created an app that finds how happy a repository is.

TL;DR: The app runs a sentiment analysis engine on your repo and finds how happy it is. Check out repostatus.

How?

So how exactly do you find the sentiment of a non-living thing? Well, even I had that thought in my mind. Any repository is made up of the people who contribute to it and the people who interact in its comments.

So, if we are able to run a sentiment analysis engine on the interactions of the people that are contributing to that repo, we might get somewhere?

repostatus extracts three important parts of the repo (by using GitHub's API):

  • the commit messages
  • the comments on the issues
  • the comments on PRs

Once we have these three things, we can combine them, filter out the unnecessary data and run our engine on it.
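A minimal sketch of the extraction step, assuming GitHub's REST API endpoints for commits, issue comments and pull request review comments. The function names and the `fetch_json` helper are illustrative and not taken from repostatus's source, and only the first page of each endpoint is read.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def fetch_json(url, token=None):
    """GET a GitHub API URL and decode the JSON body (hypothetical helper)."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def commit_messages(payload):
    """Pull the message out of each item returned by the commits endpoint."""
    return [item["commit"]["message"] for item in payload]


def comment_bodies(payload):
    """Pull the non-empty body out of each item returned by a comment endpoint."""
    return [item["body"] for item in payload if item.get("body")]


def collect_texts(owner, repo, token=None):
    """Gather the three kinds of text, one list per source."""
    base = f"{API_ROOT}/repos/{owner}/{repo}"
    return {
        "commits": commit_messages(fetch_json(f"{base}/commits", token)),
        "issues": comment_bodies(fetch_json(f"{base}/issues/comments", token)),
        "pulls": comment_bodies(fetch_json(f"{base}/pulls/comments", token)),
    }
```

Keeping the parsing functions separate from the network call makes them easy to check against a saved API response.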

Tech

The backend is written in Python (FastAPI) and the frontend is written in Vue.

First things first, the sentiment analysis engine used by repostatus is the textblob library. It is very easy to use and works great.

So now that we have the engine, what's next?

Backend

My go-to tech stack is Python, so no wonder I went with that for the API.

I used FastAPI for the backend. Of late, I have started liking FastAPI more and more, and it was a no-brainer to go with it for the backend. Its ease of use combined with its efficiency is just awesome. If you haven't checked it out, do that. I'm sure you'll love it if you're a Python developer.

Services

Services that the API will offer are:

  1. Internal API for the webapp
  2. Public API
  3. Badge API (Yep, you can use repostatus badges on your README).

I wanted to provide a Public API so that people would be able to use it for their own fun projects. The API is capable of working with both private and public repos. The details for that can be found here.

The badge is another thing that I thought would be a nice little addition. This badge works like the Travis build badges, or any other badge. You can simply use the URL to embed it into your repository's README.

More details about the badge can be found here.

Frontend

I love using VueJS. It was obvious that I would use that to build the frontend. Now for the frontend I wanted to make sure that it doesn't restrict the user too much.

Thus, repostatus works with both private and public repos. For private repos, GitHub's OAuth is used, which gives us access to that particular repo, and then we run the engine over it.

One issue that I faced while implementing the OAuth was that I wanted to make the process seamless. If you go to the app now and select the OAuth option, you will see the process is pretty neat.

Here's what it does:

  1. Opens a new window and asks the user to give access
  2. User gives access and GitHub redirects the user to my site's callback endpoint.
  3. Window closes and the app shows all the user's repos.

The above steps make it look really seamless. However, the hard part for me was figuring out how the app would know when the OAuth is done, so that it could then show the repos.

Seamless OAuth

So in order to make it seamless, I implemented the following flow:

  1. User clicks on OAuth button, new window is opened and the app keeps waiting for it to close.
  2. In the new window, the user is redirected to the callback URL, which returns a nice HTML page telling the user that the window will close in 5 seconds. After 5 seconds the window closes and the app knows that the OAuth is done.
  3. The app then tries to find the repos of that user and shows them all to the user.
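The callback side of this flow could look roughly like the sketch below. The `/callback` path, the page markup and the function names are assumptions made for illustration, not repostatus's actual code, and the token exchange is left out.

```python
def callback_page(seconds=5):
    """Build the HTML returned by the OAuth callback (illustrative markup)."""
    return f"""<!DOCTYPE html>
<html>
  <body>
    <p>Access granted. This window will close in {seconds} seconds.</p>
    <script>
      // Closing the popup is the signal the main app is waiting for.
      setTimeout(function () {{ window.close(); }}, {seconds * 1000});
    </script>
  </body>
</html>"""


def build_app():
    """Wire the page into a FastAPI app; the /callback path is an assumption."""
    # Imported lazily so callback_page() can be used without FastAPI installed.
    from fastapi import FastAPI
    from fastapi.responses import HTMLResponse

    app = FastAPI()

    @app.get("/callback", response_class=HTMLResponse)
    def oauth_callback(code: str = ""):
        # A real handler would exchange `code` for an access token here.
        return callback_page()

    return app
```

On the frontend, one way to notice that the popup has gone away is to poll its `closed` property on a timer and load the repos once it becomes true.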

I know, it's not much. I have to say, though, I really liked implementing this one little feature and making it so seamless. I'm not even kidding, I just kept on doing OAuth on repeat after implementing it, for a while.

Badge

An example of the badge can be seen below.

[RepoStatus](https://repostatus.deepjyoti30.dev/badge)

Isn't it cool?! It supports options such as the badge style, for which for-the-badge can be used. It changes the color of the badge based on the happiness status of the repo.

How exactly is happiness calculated

As I mentioned earlier, the happiness of any repo depends on certain parts of the repo. Thus, once the commit messages, issue comments and PR comments are extracted, they are run through a filter.

Filtering the data

This step makes sure that the data is cleared of any unreadable content, like an image that the user might have posted in the comments, or some code inside a code block that the user might have added in an issue report.
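A filter along these lines can be sketched with a few regular expressions. The patterns below are an assumption about what counts as "unreadable" (Markdown code and images, plus raw HTML image tags), not the exact rules repostatus applies.

```python
import re

# Build the three-backtick fence marker without writing it literally.
FENCE = "`" * 3

# Fenced code blocks, possibly spanning several lines.
FENCED_CODE = re.compile(FENCE + r".*?" + FENCE, re.DOTALL)
# Inline code spans wrapped in single backticks.
INLINE_CODE = re.compile(r"`[^`\n]*`")
# Markdown images of the form ![alt](url).
MARKDOWN_IMAGE = re.compile(r"!\[[^\]]*\]\([^)]*\)")
# Raw HTML <img> tags.
HTML_IMAGE = re.compile(r"<img\b[^>]*>", re.IGNORECASE)


def clean_text(text):
    """Drop code and images, then collapse the leftover whitespace."""
    for pattern in (FENCED_CODE, INLINE_CODE, MARKDOWN_IMAGE, HTML_IMAGE):
        text = pattern.sub(" ", text)
    return " ".join(text.split())
```

Fenced blocks are removed before inline spans so that the backticks of a fence are never mistaken for inline code.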

Running the engine

Once the data is cleared of all the unreadable content, it is passed to the textblob library, which runs the engine on the data.

Now, this library returns a float score. This score is between -1 and +1 where +1 indicates happy and -1 indicates sad.

Thus, when the engine is run on the comments, let's say we get a score of 0.8. This means that, based on the comments, the repo is not perfectly happy, but it is much closer to happy than to sad. So repostatus considers that, based on the comments, the repo is happy.

We do the above process on other aspects of the repo too, the commit messages etc.
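In code, scoring a batch of texts could look like the sketch below. `TextBlob(text).sentiment.polarity` is textblob's polarity score in the range -1 to +1. Scoring each text separately and then averaging is an assumption here (the texts could equally be joined and scored once), and both function names are made up for the example.

```python
def textblob_polarity(text):
    """Polarity of `text` according to TextBlob, a float in [-1.0, 1.0]."""
    # Imported lazily; requires the textblob package to be installed.
    from textblob import TextBlob

    return TextBlob(text).sentiment.polarity


def mean_polarity(texts, scorer=textblob_polarity):
    """Average the per-text polarity; an empty input scores as neutral (0.0)."""
    scores = [scorer(text) for text in texts]
    return sum(scores) / len(scores) if scores else 0.0
```

Passing the scorer as a parameter keeps the averaging logic independent of the engine, so a different sentiment library could be swapped in later.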

Once all the individual scores are available, they are added up and an average is calculated. This average, mathematically, is also between -1 and +1. This is the overall happiness status of the repo.

Based on this score, it is decided what color is to be assigned to the repo.
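The averaging and the mapping to a state can be sketched as follows. The four states (Happy, Balanced, Sad, Angry) are the ones repostatus reports, but the cut-off values and color names below are invented for illustration and are not the ones the app uses.

```python
def overall_score(scores):
    """Plain average of the per-source scores; stays within [-1, 1]."""
    return sum(scores) / len(scores)


def status_for(score):
    """Map a score to a (state, color) pair using hypothetical cut-offs."""
    if score >= 0.3:
        return "Happy", "green"
    if score >= 0.0:
        return "Balanced", "yellow"
    if score >= -0.5:
        return "Sad", "orange"
    return "Angry", "red"
```

For example, per-source scores of 0.8, 0.2 and -0.1 average to about 0.3.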

Caching

Since the engine takes a pretty hefty amount of memory, the score of any repo is cached for 15 days.

The badges are cached for 24 hours and are updated only after that.

This caching was necessary in order to make sure the performance of the API was not affected.
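A time-based cache of this kind can be sketched in a few lines. The in-memory dictionary and the `TTLCache` class are assumptions made for the example; where repostatus actually stores its cached scores is not covered here.

```python
import time

SCORE_TTL = 15 * 24 * 60 * 60  # 15 days, in seconds
BADGE_TTL = 24 * 60 * 60  # 24 hours, in seconds


class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl` seconds."""

    def __init__(self, ttl, clock=time.time):
        self.ttl = ttl
        self.clock = clock  # injectable so expiry can be tested without waiting
        self._entries = {}

    def get(self, key):
        """Return the cached value, or None if it is missing or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if self.clock() - stored_at >= self.ttl:
            del self._entries[key]  # expired: forget it and report a miss
            return None
        return value

    def set(self, key, value):
        """Store the value together with the time it was cached."""
        self._entries[key] = (value, self.clock())
```

A caller would check `get()` first and only run the engine, followed by `set()`, on a miss.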

Try repostatus here

Source

repostatus is open source. Source for the backend and the frontend can be found below.

trotsly / repostatus

Get Happiness status of your repo


Status of repostatus

Backend for repostatus. Repostatus lets you calculate the happiness status of your repository.

What we do?

We consider various parts of the repo like commit messages, comments on issues, pulls etc. and run a sentiment analysis engine on the data in order to find out the happiness status.

Setup

You'll need to set up an environment variable named GITHUB_TOKEN that will contain an access token. In order to get the token, follow this article and save it to the environment accordingly.

One way to save something to the environment is:

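from os import environ
# environ is a mapping, so assign the key directly (it has no set() method).
# This only affects the current Python process and its children.
environ['GITHUB_TOKEN'] = '<your_token>'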

Otherwise, it can also be set through the rc file, i.e. zshrc, bashrc etc.

Tests

For the tests, we are using pytest.

If you wish to run the tests yourself, make sure you have it installed. The tests can be run by the…

Top comments (9)

Doaa Mahely

Awesome idea and great implementation! Today I learned my commit style is unfortunate 😂

Artest repo stats

Artest backend repo stats

Deepjyoti Barman

Glad you like it. Feel free to use the badge on your repo!

ヘンタイちゃん

What's the scale for this happiness status? Is there something above balanced? Can it go worse than sad?

Deepjyoti Barman • Edited

Yeah, these are the states.

Happy
Balanced
Sad
Angry

If you check the api docs you'll get an idea.

Thomas Bnt ☕

Oh nice !

shadowtime2000

Nice project! One piece of feedback I have is allow the Icon in the navbar to be a link because my instinct after doing something is always to click the icon in the navbar to get back to the homepage.

King Asix

I don't understand what about the commits and comments determines if they are "happy" or not

Olivier Guimbal • Edited

I cant select the repo I'm interested into scanning :( ... its name begins by a "p", but the last repo visible in the listing begins by a "m"... I guess I have too many public repos !

Deepjyoti Barman • Edited

I think that's a bug. The thing is, GitHub's APIs return only a certain number of repos at a time, so in order to get all the repos the app would have to iterate over the list, which becomes a bit hefty, but I guess I can find a workaround.

Updated: Should be fixed now!