How I Made My First (Real) Open Source Contribution

#opensource #github #beginners

In this article, I will explain in detail how I (finally) made changes to an open-source project and the thought process behind them.

Background

I was interested in competitive programming at the time and was looking for coding problems to practice. There was a site called A2 Online Judge (A2OJ) that had ladders of problems to follow, grouped by difficulty and by Codeforces rating. It also had a feature that tracked the problems you had solved. After I had used it for some time, though, the site was permanently shut down.

Like any programmer, my first instinct was to build it myself... but I chose not to reinvent the wheel this time and looked up whether anyone had already made one.

Fortunately, I found a web app on GitHub that clones the original site, and many people were already using it (it has 81 stars! 😲). Still, it didn't have the ladders of problems based on Codeforces rating that I had enjoyed using before.

It started with an Issue

Since it didn't have the feature I wanted, I started by creating a new issue on GitHub.

After thinking about it for a while, I decided it would be better to contribute the feature myself, since many people already use the app and it would be helpful to them as well.

Scoping out the codebase

Since I was only adding new data, I could reuse the existing code and work out the rest from there. So I searched for where the problem lists are stored in the code and where that data is used in the front-end.

# ladder.py
div_a = [
    [1, 'Watermelon', '4', 'A'] ,
    [2, 'Way Too Long Words', '71', 'A'] ,
    [3, 'String Task', '118', 'A'] ,
    [4, 'Petya and Strings', '112', 'A'] ,
    ....

I found out that the problems are stored in arrays like this, so I just had to collect the data from the problem archive and copy-paste it! Simple enough, right? 😅
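Each row appears to hold the position in the ladder, the problem title, the Codeforces contest number, and the problem letter. As a minimal sketch of how such a row maps back to a problem page (the `problem_url` helper is hypothetical and not taken from the repo, and the row layout is my reading of the data):

```python
# Sample rows copied from the div_a list above.
div_a = [
    [1, 'Watermelon', '4', 'A'],
    [2, 'Way Too Long Words', '71', 'A'],
]


def problem_url(row):
    """Build a Codeforces problem link from one ladder row.

    Assumes the row layout is [position, title, contest_id, problem_letter].
    """
    _position, _title, contest_id, letter = row
    return f"https://codeforces.com/problemset/problem/{contest_id}/{letter}"


for row in div_a:
    print(row[1], "->", problem_url(row))
```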

There are 11 ladders, and every ladder has 100 problems. In total that's about 1,100 problems, so there was no way I was going to do it manually... Luckily, there's a solution for exactly that!

Web Scraping

So I learned how to do web scraping with a Node.js library called Cheerio from a quick YouTube tutorial, and it turned out to be straightforward. This code fetches the A2OJ archive and formats it just like the array in the repo that I had looked at before.

const axios = require("axios")
const cheerio = require("cheerio")

// Fetch html function
async function fetchHTML(url) {
  const { data } = await axios.get(url)
  return cheerio.load(data)
}

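(async () => {
  const $ = await fetchHTML("https://a2oj.com/Ladder11.html")

  const div = []

  // Turn every link on the page into a [number, title, contest, letter] row
  $('a').each(function (i, elem) {
    const title = $(this).text()
    const url = $(this).attr('href')
    // Drop the first 41 characters (the fixed URL prefix before the
    // contest number), leaving "<contest>/<letter>"
    const splitted = url.slice(41).split("/")
    div.push([i + 1, title, splitted[0], splitted[1]])
  })

  console.log(div)
})()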

Perfect! The array is formatted just like the ones in the codebase. Now I can just add those arrays to the code.
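Since the target repo is written in Python, the same extraction could also be sketched with only the standard library. This is not the script I ran. The `ProblemLinkParser` class and the sample markup are made up for illustration, and the real archive page may be laid out differently. Unlike the fixed 41-character cut, it reads the contest number and letter from the URL path and skips links that are not problem links:

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ProblemLinkParser(HTMLParser):
    """Collect [position, title, contest_id, letter] rows from problem links."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._href = None  # href of the <a> tag currently open, if any
        self._text = []    # text pieces seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != "a" or self._href is None:
            return
        # Expected path shape: /problemset/problem/<contest_id>/<letter>
        parts = urlparse(self._href).path.strip("/").split("/")
        if parts[:2] == ["problemset", "problem"] and len(parts) == 4:
            title = "".join(self._text).strip()
            self.rows.append([len(self.rows) + 1, title, parts[2], parts[3]])
        self._href = None


# Made-up sample markup; the real archive page layout may differ.
SAMPLE_HTML = """
<a href="index.html">Home</a>
<a href="http://codeforces.com/problemset/problem/69/A">Young Physicist</a>
<a href="https://codeforces.com/problemset/problem/263/A">Beautiful Matrix</a>
"""

parser = ProblemLinkParser()
parser.feed(SAMPLE_HTML)
print(parser.rows)
```

Here the position counts only problem links, so a navigation link at the top of the page does not shift the numbering.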

Diving into the codebase

Adding the Problems

I went back to the file that stores all the problems and followed the variable naming pattern. The existing lists were named div_a, div_b, etc., so I named the new ladders rating_1, rating_2, etc.

rating_1 = [
  [ 1, 'Young Physicist', '69', 'A' ],
  [ 2, 'Beautiful Matrix', '263', 'A' ],
  [ 3, 'Queue at the School', '266', 'B' ],
  [ 4, 'Borze', '32', 'B' ],
  [ 5, 'Beautiful Year', '271', 'A' ],
    ..
    ..
rating_11 = ..
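I pasted the scraper output in by hand, but with eleven ladders that step could also be scripted. A small sketch, assuming the scraped rows are available as Python lists (the `format_ladder` helper is hypothetical, not part of the repo):

```python
def format_ladder(name, rows):
    """Return Python source text that defines `name` as a list of rows."""
    lines = [f"{name} = ["]
    for row in rows:
        # repr() gives a valid Python literal, including quoting of titles
        lines.append(f"    {row!r},")
    lines.append("]")
    return "\n".join(lines)


rows = [
    [1, 'Young Physicist', '69', 'A'],
    [2, 'Beautiful Matrix', '263', 'A'],
]
print(format_ladder("rating_1", rows))
```

The printed text can be appended to ladder.py as is, one call per ladder.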

Link it into the front-end

I imitated the existing code style and made as few changes as I could.

    try:
        handle = request.GET['handle']
        div = request.GET['div']
+       rating = request.GET['rating']
...
    div = int(div)
+   rating = int(rating)
...
    elif div == 7:
        division = div_a
        div_head = "DIV 1.E"
    ...
+   elif rating == 1:
+       division = rating_1
+       div_head = "Codeforces Rating < 1300"
...

And it turned out to work!
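With eleven rating ladders, an `elif` chain like this gets long. One possible alternative, which is not what my pull request does, is a lookup table. The sketch below is hypothetical: `RATING_LADDERS` and `pick_rating_ladder` are names I made up, and only the first heading comes from the diff above. It also tolerates a missing or non-numeric `rating` value, which a Django-style view could pass in as `request.GET.get('rating')`:

```python
rating_1 = [[1, 'Young Physicist', '69', 'A']]  # shortened for the example

# Maps the `rating` query value to (problem list, page heading).
# Only the first entry comes from the diff above; the others would follow
# the same pattern.
RATING_LADDERS = {
    1: (rating_1, "Codeforces Rating < 1300"),
}


def pick_rating_ladder(raw_rating):
    """Return (problems, heading) for a rating value, or None if unknown.

    `raw_rating` is the raw query-string value, which may be missing (None)
    or not a number.
    """
    try:
        rating = int(raw_rating)
    except (TypeError, ValueError):
        return None
    return RATING_LADDERS.get(rating)


print(pick_rating_ladder("1"))
print(pick_rating_ladder(None))
```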

So I pushed the code and created a pull request.

I was sooo satisfied once it was done and I finally hit the pull request button.

Things that I learned

We don't have to understand the entire codebase. Just focus on the changes you're going to make and how they connect. Once we break the work into these tiny pieces, it becomes so much easier.
We just have to imitate the existing code and extend it from there. This way, we can keep our changes to a minimum and still work effectively.
We don't necessarily need to know which libraries or frameworks the codebase uses. We can learn them along the way.

This contribution really 'exposed' me to the world of open source and to how awesome it is to make changes to a public project that many people use.

I hope this post sparks your interest and encourages you to dive into the world of open source! And to the folks who have contributed before: how was your first open-source contribution experience? I'd love to hear about it!