A short intro
A bit of background, since this is my first post on dev.to: I am a first-year engineering student from India. I recently got selected as a student developer for Google Summer of Code 2020. I am working on the project VulnerableCode, which is basically a Django app that collects data about software vulnerabilities and exposes it through an API (our goal is much larger, and much of it will be accomplished by the end of this GSoC). We welcome any contributions (there are many, many things to work on). Unfortunately, the documentation is still a WIP; sorry for that in advance. Do star it!

These are the top 5 things I learned in the process of going from a total beginner to getting selected for the prestigious GSoC (the TLDR section is at the bottom).
The skill of finding the right tools

So the project I am working on basically collects data on all reported software vulnerabilities. These are typically JSON documents that contain the name of the software vulnerability, its short description, and finally the names of the vulnerable software packages. Typically a vulnerability doesn't affect just a single version of a software package; rather, it affects some range of versions of that package. Something like this:
"django": {
    "advisory": "The administrative interface in django.contrib.admin in Django before 1.1.3, 1.2.x before 1.2.4, and 1.3.x before 1.3 beta 1 does not properly restrict use of the query string to perform certain object filtering, which allows remote authenticated users to obtain sensitive information via a series of requests containing regular expressions, as demonstrated by a created_by__password__regex parameter.",
    "cve": "CVE-2010-4534",
    "id": "pyup.io-33058",
    "specs": [
        "<1.1.3",
        ">=1.2,<1.2.4"
    ]
},
The problem was: given a list of all versions of a package (obtained via the corresponding package manager's API; in this example, PyPI), split it into two lists, one with all the versions that lie inside any of the version ranges given in the JSON document and the other with the versions that don't lie inside any range. Old me would've written a version class, then implemented comparators, then realized in the middle of it that not all versions are entirely numeric. We have to deal with 'rc', 'beta', 'alpha', 'pre', etc. Some ecosystems use altogether weird comparison symbols (Ruby folks, '~>', seriously?). As you can see, the complexity keeps increasing.
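To make the problem concrete, here is a rough sketch of the naive, hand-rolled approach, using only the Python standard library. It is not the module we ended up using, and the names (`parse_version`, `matches_range`, `partition_versions`) and the sample version list are made up for illustration. It only understands plain numeric versions and the five basic comparison symbols, which is exactly where this approach starts to hurt.

```python
import operator

# Comparison symbols this sketch understands. Two-character symbols are
# listed first so that ">=" is not mistaken for ">".
OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text):
    """Turn '1.2.4' into (1, 2, 4); trailing zeros are dropped so '1.2.0' == '1.2'.

    Assumption: purely numeric versions. Something like '1.3rc1' raises ValueError.
    """
    parts = [int(part) for part in text.split(".")]
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def matches_range(version, spec):
    """Return True if version satisfies every comma-separated clause of spec."""
    for clause in spec.split(","):
        clause = clause.strip()
        for symbol, compare in OPERATORS.items():
            if clause.startswith(symbol):
                bound = clause[len(symbol):]
                if not compare(parse_version(version), parse_version(bound)):
                    return False
                break
        else:
            # No known symbol matched (for example Ruby's '~>').
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


def partition_versions(all_versions, specs):
    """Split versions into (affected, unaffected) given a list of range specs."""
    affected, unaffected = [], []
    for version in all_versions:
        if any(matches_range(version, spec) for spec in specs):
            affected.append(version)
        else:
            unaffected.append(version)
    return affected, unaffected


# "specs" comes from the JSON document above; the version list is a
# hand-picked sample, not real output from the PyPI API.
specs = ["<1.1.3", ">=1.2,<1.2.4"]
versions = ["1.1.2", "1.1.3", "1.2", "1.2.3", "1.2.4", "1.3"]
affected, unaffected = partition_versions(versions, specs)
print(affected)
print(unaffected)
```

Even this toy version needs special handling for trailing zeros and operator ordering, and it still gives up on anything like '1.3rc1' or '~>'.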
Instead, new me searches GitHub to see whether anybody has already solved the problem. If yes (which is true 89% of the time), I fork it; otherwise I implement it myself. For those who are curious about how we solved the version range problem: we basically pip installed this lesser-known module, and it had everything we needed already implemented. Problem solved. It even came with tests. Lesson learned: avoid reinventing the wheel.
Writing readable, beautiful code

- It looks so beautiful, I honestly stare at it to feel good about myself
- It still makes sense when I read it again a month later.
I totally agree with enforcing a style across the entire codebase to maintain consistency and readability. It also helps to increase the 'bus factor' of a project quickly. The Zen of Python is my compass when writing code now :P
The skill of diving into a huge codebase and being productive

Contributing to open source projects is easy

Technical Communication

I found it really useful to be more descriptive than necessary when communicating low-level ideas (sometimes down to pointing at the line number), to avoid missing out on context (opinions please). In my proposal I relied on pictures to show changes in the database schema. I also included a GIF to show how users will interact with the yet-to-be-made UI for the project.
Things I wished to put here
There were many programming-language-specific things I resisted mentioning; I believe those demand a separate post. As for git-fu, well, my git skills are a classic Karate Kid story (spoiler: I used to edit code on GitHub's web interface). I also found a new hobby: thinking about the software architecture of various products. I think the design aspect of software is what earns us the 'engineer' in 'software engineer'; at the same time, the saying that software development is an art has started making sense. Object Oriented Programming has started making sense too. I really recommend that folks learning Object Oriented Programming read some open source code to understand how OOP concepts are applied.
TLDR/summary
Top 5 things I learned:
- Avoid reinventing the wheel
- Readability counts, and the Zen of Python makes sense. Follow PEP 8 or something similar.
- The barrier to entry for contributing to open source projects is probably lower than I thought.
- Diving into a huge codebase and making sense of it is a very valuable, underrated skill.
- Mastering technical communication makes life easier as a developer.