How to Find Good First Issues to Contribute to OSS

Masato Ohba (ohbarye) ・3 min read

This post is based on my presentation titled "How to find Good First Issues".

Target Reader

This post would be helpful to you if you:

  • want to contribute to OSS
  • are struggling to find a repository or an issue to contribute to
  • (ideally) prefer major/popular OSS 😇

Honestly speaking, the target reader was just me. I was an OSS newbie wondering how I could contribute to any OSS project.

Good First Issue List

In this article, I'll share a tip for finding good first issues: a simple script that generates an issue list like the one below.

list

This list has tons of issues waiting for beginners' contributions. You can see the full list as a spreadsheet here.

BTW, what is a "good first issue"?

It is one of the labels that GitHub provides to every repository by default.

labels

According to the official documentation, the label is meant to be used as follows:

Apply the help wanted and good first issue labels to issues in your repository to highlight opportunities for people to contribute to your project.

How to list good first issues

To list those issues, I wrote a simple script named goofi and published it on GitHub: https://github.com/ohbarye/goofi

It does just the following three things.

  1. Call GitHub GraphQL API to fetch issues
  2. Format its response
  3. Create CSV

Easy?

GraphQL Query

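Although I could do the same with the REST version of the GitHub API (v3), I chose the GraphQL API to avoid the N+1 query problem: with REST, I would typically need one request to search repositories plus one more request per repository to fetch its issues.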

Here is the query that the script runs. It fetches issues:

  • whose repository has more than 500 stars.
  • whose repository has more than one issue labeled good first issue.
  • whose repository's language is JavaScript.
  • that are labeled good first issue.
{
  search(first: 100, query: "language:javascript good-first-issues:>1 stars:>500", type: REPOSITORY) {
    repositoryCount
    pageInfo {
      startCursor
      endCursor
      hasNextPage
    }
    nodes {
      ... on Repository {
        owner {
          login
        }
        name
        url
        issues(first: 100, labels: ["good first issue"], states: OPEN, orderBy: {field: UPDATED_AT, direction: DESC}) {
          totalCount
          nodes {
            title
            url
          }
        }
        stargazers {
          totalCount
        }
      }
    }
  }
}
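
For reference, a query like this can be sent with a single HTTP POST. The following is a minimal sketch, not goofi's actual code: it assumes you have a GitHub personal access token, and the names buildGraphQLRequest and runQuery are made up for illustration. The fetch function is passed in as a parameter so that the sketch can also be tried without network access.

```javascript
// Endpoint of the GitHub GraphQL API (v4).
const GITHUB_GRAPHQL_URL = "https://api.github.com/graphql";

// Build the URL and fetch options for a GraphQL query.
// `token` is assumed to be a GitHub personal access token.
function buildGraphQLRequest(query, token) {
  return {
    url: GITHUB_GRAPHQL_URL,
    options: {
      method: "POST",
      headers: {
        Authorization: `bearer ${token}`,
        "Content-Type": "application/json",
      },
      // A GraphQL request body is a JSON object with a "query" string.
      body: JSON.stringify({ query }),
    },
  };
}

// Send the query and return the parsed JSON body.
// `fetchImpl` defaults to the global fetch (Node.js 18+ and browsers).
async function runQuery(query, token, fetchImpl = fetch) {
  const { url, options } = buildGraphQLRequest(query, token);
  const response = await fetchImpl(url, options);
  if (!response.ok) {
    throw new Error(`GitHub API responded with status ${response.status}`);
  }
  return response.json();
}
```

With a token in an environment variable, something like runQuery(query, process.env.GITHUB_TOKEN) should resolve to the JSON body, whose data.search field holds the result.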

What surprised me here is that the GitHub API can handle quite specific filters:

  • "good-first-issues:>1"
  • "stars:>500"

If you're interested in the API specification, see https://help.github.com/articles/searching-repositories/.
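
If you want to try other languages or thresholds, the search string can be assembled from a few parameters. This is only a sketch: buildSearchQuery and its parameter names are hypothetical, and it covers just the three qualifiers used in this post.

```javascript
// Build a repository search string from a few filters.
// Each filter becomes a "qualifier:value" pair, joined by spaces.
function buildSearchQuery({ language, minGoodFirstIssues, minStars }) {
  return [
    `language:${language}`,
    `good-first-issues:>${minGoodFirstIssues}`,
    `stars:>${minStars}`,
  ].join(" ");
}

// Prints: language:javascript good-first-issues:>1 stars:>500
console.log(
  buildSearchQuery({ language: "javascript", minGoodFirstIssues: 1, minStars: 500 })
);
```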

Response

Here is part of the response to the query above.

{
  "data": {
    "search": {
      "repositoryCount": 196,
      "pageInfo": {
        "startCursor": "Y3Vyc29yOjE=",
        "endCursor": "Y3Vyc29yOjEwMA==",
        "hasNextPage": true
      },
      "nodes": [
        {
          "owner": {
            "login": "vuejs"
          },
          "name": "vue",
          "url": "https://github.com/vuejs/vue",
          "issues": {
            "totalCount": 4,
            "nodes": [
              {
                "title": "warn if $set is used on a property that already exist",
                "url": "https://github.com/vuejs/vue/issues/8129"
              }
            ]
          },
          "stargazers": {
            "totalCount": 105267
          }
        }
      ]
    }
  }
}
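
Since hasNextPage is true in this response, a single request does not return every repository. Below is a minimal sketch of cursor-based paging. It assumes the query is extended with an after argument on search that receives the previous endCursor; fetchPage is a hypothetical function that runs the query for one cursor and resolves to the search object of the response.

```javascript
// Collect the repository nodes of every page.
// `fetchPage(cursor)` is a hypothetical function: it runs the search query
// with `after: cursor` (null for the first page) and resolves to the
// `search` object of the response.
async function collectAllRepositories(fetchPage) {
  const repositories = [];
  let cursor = null;
  let hasNextPage = true;

  while (hasNextPage) {
    const search = await fetchPage(cursor);
    repositories.push(...search.nodes);
    // Continue from where this page ended.
    cursor = search.pageInfo.endCursor;
    hasNextPage = search.pageInfo.hasNextPage;
  }
  return repositories;
}
```

The loop stops as soon as a page reports hasNextPage as false, so a result that fits in one page costs exactly one request.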

Format data ~ Write CSV

Once we have the data from the GitHub API, formatting it is not a hard task.

writeIssues(repository) {
  const owner = repository.owner.login;
  const name = repository.name;
  const stars = repository.stargazers.totalCount;

  repository.issues.nodes.forEach((issue) => {
    const title = issue.title;
    const url = issue.url;
    this.writer.write({owner, name, stars, title, url});
  });
}

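const nodes = response.data.data.search.nodes;
// Use an arrow function so that `this` inside writeIssues still refers to the instance.
nodes.forEach((repository) => this.writeIssues(repository));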
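
The writer in the snippet above takes care of the CSV output. As a rough illustration of what that step involves, here is a minimal, dependency-free sketch. The function names are hypothetical and this is not goofi's actual implementation. The escaping matters because issue titles often contain commas or quotes.

```javascript
// Quote a value when it contains a comma, a double quote, or a line break;
// embedded double quotes are doubled (the usual CSV convention).
function escapeCsvValue(value) {
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Join already-ordered values into one CSV line.
function toCsvLine(values) {
  return values.map(escapeCsvValue).join(",");
}

// Turn one repository node into CSV lines, one per issue.
function repositoryToCsvLines(repository) {
  const owner = repository.owner.login;
  const name = repository.name;
  const stars = repository.stargazers.totalCount;
  return repository.issues.nodes.map((issue) =>
    toCsvLine([owner, name, stars, issue.title, issue.url])
  );
}
```

Returning lines instead of writing them directly keeps the formatting easy to check by hand before anything is written to a file.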

Then?

Now that we have the candidate list, all you have to do is go through the issues one by one and find one that you can contribute to.


Another way

I initially used Jasper, a really cool GitHub issue reader, to find such issues by creating a stream that gathers them.

jasper

jasper_stream

But...

  • It cannot sort repositories by star count
  • It cannot filter out misuses of the "good first issue" label
  • It notifies me every time an issue is updated

In short, this approach did not work for me because it was too noisy.


Are they really nice ways? 🤔

At least, I was able to contribute to some popular repositories even though I was initially not familiar with them.

e.g., Node.js

contribution

Next Try

I'd like to build a GUI for the script so that anyone can find good first issues whenever they want to contribute. Once I'm done, I'll write another article!


Let's find your "good first issues" and contribute to making this world better! 💪

Discussion

Kamran Hamid

The csv looks great and nice technique to fetch all the issues. But it would be nice if you add the year column too like when the issue was created