A short intro
A bit of background, as this is my first post on dev.to: I am a first-year engineering student from India. I recently got selected as a student developer for Google Summer of Code 2020. I am working on the project VulnerableCode, which is basically a Django app that collects data about software vulnerabilities and exposes it through an API (our goal is much larger, and much of it will be accomplished by the end of this GSoC). We welcome any contributions (there are many, many things to work on); unfortunately the documentation is still WIP, sorry for that in advance. Do star it!
These are the top 5 things I learned on the way from total beginner to getting selected for the prestigious GSoC (the TLDR section is at the bottom).
The skill of finding the right tools
As software developers, the problems we face are most likely not as unique as we think they are. What do I mean by this? Consider this real example:
The project I am working on basically collects data on all reported software vulnerabilities. These are typically JSON documents which have the name of the software vulnerability, its short description and finally the names of the vulnerable software packages. Typically a vulnerability doesn't affect just a single version of a software package; rather, it affects some range of versions of that package. Something like this:
"django": {
    "advisory": "The administrative interface in django.contrib.admin in Django before 1.1.3, 1.2.x before 1.2.4, and 1.3.x before 1.3 beta 1 does not properly restrict use of the query string to perform certain object filtering, which allows remote authenticated users to obtain sensitive information via a series of requests containing regular expressions, as demonstrated by a created_by__password__regex parameter.",
    "cve": "CVE-2010-4534",
    "id": "pyup.io-33058",
    "specs": [
        "<1.1.3",
        ">=1.2,<1.2.4"
    ]
},
The problem was: given a list of all versions of a package (obtained via the corresponding package manager's API, PyPI in this example), split it into two lists, one with all the versions that lie inside any of the version ranges given in the JSON document and the other with the versions that don't lie inside any range. Old me would've written a version class, then implemented comparators, and realized in the middle of it that not all versions are entirely numeric. We have to deal with 'rc', 'beta', 'alpha', 'pre' etc. Some ecosystems use altogether weird comparison symbols (Ruby folks, '~>', seriously?). As you can see, the complexity keeps increasing.
Instead, new me searches GitHub to see if anybody has already solved the problem; if yes (which is true 89% of the time), I fork it, else I implement it myself. For those who are curious how we solved the version range problem: we basically pip installed this lesser-known module and it had everything we needed already implemented, problem solved. It even had tests. Lesson learned: avoid reinventing the wheel.
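To make the filtering problem concrete, here is a minimal sketch of what the hand-rolled route looks like. To be clear, this is not the module we used; it is a standard-library-only illustration, and all function names and the sample version list are made up for this example. It assumes purely numeric dotted versions and only the plain comparison symbols, where a comma inside a spec means "and" and separate specs mean "or".

```python
import operator

# Map each comparison symbol to its operator function.
# Two-character symbols come first so ">=" is not misread as ">".
OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text):
    """Turn "1.2.4" into (1, 2, 4). Assumes purely numeric parts."""
    parts = [int(part) for part in text.split(".")]
    # Drop trailing zeros so that "1.2.0" and "1.2" compare as equal.
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def matches_clause(version, clause):
    """Check one clause such as ">=1.2" against a parsed version."""
    for symbol, compare in OPERATORS.items():
        if clause.startswith(symbol):
            return compare(version, parse_version(clause[len(symbol):]))
    raise ValueError(f"unsupported clause: {clause!r}")


def in_range(version, spec):
    """A spec like ">=1.2,<1.2.4" matches only if every clause matches."""
    return all(matches_clause(version, clause.strip()) for clause in spec.split(","))


def split_versions(all_versions, specs):
    """Return (affected, unaffected); a version is affected if any spec matches."""
    affected, unaffected = [], []
    for text in all_versions:
        version = parse_version(text)
        if any(in_range(version, spec) for spec in specs):
            affected.append(text)
        else:
            unaffected.append(text)
    return affected, unaffected


# Hypothetical sample data; the specs are the ones from the JSON above.
specs = ["<1.1.3", ">=1.2,<1.2.4"]
versions = ["1.1.2", "1.1.3", "1.2", "1.2.3", "1.2.4", "1.3"]
affected, unaffected = split_versions(versions, specs)
print(affected)    # ['1.1.2', '1.2', '1.2.3']
print(unaffected)  # ['1.1.3', '1.2.4', '1.3']
```

Even this toy version needs a trick for trailing zeros, and it blows up with a ValueError the moment it sees something like "1.3b1" or a '~>' clause. That is exactly the growing complexity I mentioned, and why an existing, tested library (in the Python world, for example, `packaging.specifiers.SpecifierSet`) is the better choice for real code.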
Writing readable, beautiful code
- It looks so beautiful, I honestly stare at it to feel good about myself
- It still makes sense when I read it again a month later.
The skill of diving into a huge codebase and being productive
Contributing to open source projects is easy
Technical Communication
Things I wished to put here
TLDR/summary
Top 5 things I learned:
- Avoid reinventing the wheel
- Readability counts; the Zen of Python makes sense. Follow PEP 8 or something similar.
- The barrier to entry to contribute to open source projects is probably lower than I thought.
- Diving into a huge codebase and making sense of it is a very valuable, underrated skill.
- Mastering technical communication makes life easy as a developer.