DEV Community: Menelaos Kotoglou

Maintaining a Python package

Menelaos Kotoglou — Mon, 01 Feb 2021 11:25:40 +0000

As previously posted here: https://dev.to/koti/updating-an-important-but-stale-python-library-3o6i, Dimitry and I collaborated on updating a Python library that is important for the smooth operation of his company.

Unfortunately, it seems that the original author never accepted our Pull Request to revive the profanity-check package. As a result, other developers kept opening GitHub issues asking about the library and the problems they had while using it. What surprised us the most was this one: https://github.com/vzhou842/profanity-check/issues/28, where someone actually recommended our fork of the project as a "new version available here, that solves the issue very well".

Considering this, and seeing that the package seemed to have many users, either in production or for personal use, we decided that re-publishing the library would be a great way to bring it back to life and make it accessible to more developers to experiment with. The main problem was keeping up with updates to the scikit-learn package, which rendered the initial models problematic as new versions were released.

After taking this quick and handy tutorial from Real Python: https://realpython.com/courses/how-to-publish-your-own-python-package-pypi/, we made some minor modifications to the project's repository and uploaded our updated version to PyPI.

The process turned out to be really beneficial and made us better both as people and as developers. For example, we found out that when you upload a package to PyPI, your repository is not synced to the PyPI project page. You only upload the built archives from the "dist" directory, such as the wheel produced by running the python setup.py bdist_wheel command.
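To make the upload step a bit more concrete, here is a minimal sketch that lists the files an upload tool such as twine would send to PyPI. The function name find_distribution_files and the default "dist" location are assumptions made for illustration, not part of our project.

```python
from pathlib import Path


def find_distribution_files(dist_dir="dist"):
    """Return the built archives in dist_dir.

    Only wheels (.whl) and source archives (.tar.gz) are considered;
    everything else in the directory (and the rest of the repository)
    is ignored, which mirrors the fact that PyPI never sees the repo itself.
    """
    dist = Path(dist_dir)
    return sorted(
        path for path in dist.iterdir()
        if path.is_file() and path.name.endswith((".whl", ".tar.gz"))
    )


if __name__ == "__main__":
    # Hypothetical usage: run from the project root after building the package.
    if Path("dist").is_dir():
        for archive in find_distribution_files("dist"):
            print(archive.name)
```

Running it after a build shows exactly which archives exist, which is a quick way to check what is about to be published.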

We also realised that many developers can sometimes be demanding out of nowhere. They may request features and bug fixes without understanding that the main purpose of open-source software is the community's contribution in every possible way. They should consider putting some work into the projects themselves, contributing in other ways, or at least saying "thanks".

In any case, taking on the "responsibility" of maintaining a Python package that is widely used by the community seems to be an exciting journey in the open-source software adventure. It makes you feel proud and responsible for ensuring that other people's projects continue to operate smoothly.

You can find the package and instructions for its installation here: https://pypi.org/project/alt-profanity-check/. Also, the source code can be found here: https://gitlab.com/dimitrios/alt-profanity-check. Contributions and new feature ideas are more than welcome.

The Power of Study Groups

Menelaos Kotoglou — Wed, 11 Nov 2020 10:03:25 +0000

Undoubtedly, the Covid-19 crisis has changed students’ habits a lot. Old habits, such as hanging out, drinking beers, watching movies and even studying together, have almost fallen apart, since most countries are experiencing the second Coronavirus wave.

As a student, I decided to take advantage of the time gained from the situation to work on myself, my skills and my academic performance. However, when the university’s workload increases, keeping yourself disciplined and focused on your goals becomes even harder.

During these times, working in a study group is really helpful. Working as a team with your fellow students helps you maintain your focus and motivation. Even on days when you don’t feel like studying, although you need to, your team encourages you to keep completing your tasks.
Also, the whole team benefits from each individual’s strengths. Students usually like some courses more than others. Building on this, team members help each other and save valuable time and effort otherwise spent browsing through course notes on the e-learning platform.

Choosing the platform that best suits your needs is crucial. My group prefers Google Meet. We consider it the most stable and user-friendly free video calling software right now. Occasionally, we also use Skype, Zoom, Messenger or Discord.

I should also mention the social side of group studying. Social distancing will last for at least a few more months, so socialising, even virtually, is crucial for students’ mental health. As shown in the pie chart, communicating with your teammates about extracurricular activities is a main part of this, helping students keep their social life balanced. Also, after completing your coursework, you can work on your hobbies, such as side projects or gaming.

To conclude, as the chart suggests, the percentage of “actually studying” time is small, but it is so efficient that forming a study group is worth a try.

The article is also published on Medium

Updating an important but stale Python library

Menelaos Kotoglou — Wed, 01 Jul 2020 16:10:47 +0000

The project is based on the “profanity-check” library created by Victor Zhou. You can find it online here: https://github.com/vzhou842/profanity-check. First, we installed the library in a virtual environment and experimented with different samples.

We tested the model with an internal dataset consisting of 850 tweets, retrieved through Twitter’s sampling API and then labeled manually. This produced the following results:

Confusion Matrix

Actual \ Predicted	Not Profane (0)	Profane (1)
Not Profane (0)	703	14
Profane (1)	93	39

Accuracy Score: 87.4%

The issue that came up was that, with newer Python and scikit-learn versions, importing the library raised a series of warnings. The library’s dependencies had gradually been deprecated, and we had to update them to make them compatible with newer versions of Python and scikit-learn. This is important because, at any given point in time, new releases of these libraries might not be able to deserialize the joblib file (joblib is an alternative to pickle) stored with the library.

As the scikit-learn documentation states (https://scikit-learn.org/stable/modules/model_persistence.html):

“… pickle (and joblib by extension), has some issues regarding maintainability and security. Because of this,

Never unpickle untrusted data as it could lead to malicious code being executed upon loading.
While models saved using one version of scikit-learn might load in other versions, this is entirely unsupported and inadvisable. It should also be kept in mind that operations performed on such data could give different and unexpected results.”
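One way to guard against the version problem described in the quote is to record the library version next to the serialised model and refuse to load on a mismatch. The sketch below uses only the standard library, and the helper names save_with_version and load_checked as well as the ".meta.json" sidecar file are assumptions made for illustration. With scikit-learn, the version string would typically come from sklearn.__version__ and the model would be written with joblib instead of pickle.

```python
import json
import pickle
from pathlib import Path


def _meta_path(model_path):
    """Sidecar file that stores the version, e.g. model.pkl.meta.json."""
    model_path = Path(model_path)
    return model_path.with_name(model_path.name + ".meta.json")


def save_with_version(model, model_path, library_version):
    """Serialise the model and record the library version that produced it."""
    Path(model_path).write_bytes(pickle.dumps(model))
    _meta_path(model_path).write_text(
        json.dumps({"library_version": library_version})
    )


def load_checked(model_path, current_version):
    """Load the model only if it was saved with the current library version.

    The version is checked BEFORE unpickling. Only use this on files you
    created yourself: unpickling untrusted data can execute arbitrary code.
    """
    meta = json.loads(_meta_path(model_path).read_text())
    saved_version = meta["library_version"]
    if saved_version != current_version:
        raise RuntimeError(
            f"Model was saved with version {saved_version}, "
            f"but version {current_version} is installed; retrain the model."
        )
    return pickle.loads(Path(model_path).read_bytes())
```

Keeping the version in a separate file means the check does not depend on the model file being loadable at all, which is exactly the case that breaks across releases.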

Simply retraining the model was not possible in this case, as the original author had provided the input dataset and the serialised model files, but not the script that creates them from the data. To recreate it, I installed Python 3.8 and scikit-learn 0.23.1. After lots of experiments, I substituted the CountVectorizer from Victor Zhou’s blog post with TfidfVectorizer, trained the model on the same “clean_data.csv” file that the initial version was trained on, and got roughly the same accuracy score as the previous model. In detail:
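The retraining step can be sketched roughly as follows. This is not the actual training script: TfidfVectorizer is the vectorizer named above, but the LinearSVC classifier, the function name train_and_save, the output file name and the toy sentences standing in for “clean_data.csv” are all assumptions made for illustration.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def train_and_save(texts, labels, model_path):
    """Fit a TF-IDF + linear SVM pipeline and serialise it with joblib.

    The classifier choice (LinearSVC) is an assumption for this sketch.
    """
    pipeline = make_pipeline(TfidfVectorizer(), LinearSVC())
    pipeline.fit(texts, labels)
    joblib.dump(pipeline, model_path)
    return pipeline


# Hypothetical toy data; the real model was trained on clean_data.csv.
texts = [
    "have a nice day",
    "you are a damn fool",
    "lovely weather today",
    "damn this stupid thing",
]
labels = [0, 1, 0, 1]  # 0 = not profane, 1 = profane

# Train, save to a temporary location, and load the model back.
model_path = os.path.join(tempfile.mkdtemp(), "model.joblib")
train_and_save(texts, labels, model_path)
reloaded = joblib.load(model_path)
print(reloaded.predict(["what a lovely day"]))
```

Because the vectorizer and the classifier live in one pipeline, a single file holds everything needed for prediction, and it can be regenerated whenever the scikit-learn version changes.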

Confusion Matrix

Actual \ Predicted	Not Profane (0)	Profane (1)
Not Profane (0)	697	20
Profane (1)	87	45

Accuracy Score: 87.4%
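The accuracy scores can be re-derived from the two confusion matrices. The short sketch below does that, assuming rows are the actual labels and columns the predicted ones; the helper name metrics is introduced only for this example. It also computes precision and recall for the profane class, which the accuracy score alone does not show.

```python
def metrics(tn, fp, fn, tp):
    """Accuracy, plus precision and recall for the positive (profane) class."""
    total = tn + fp + fn + tp
    return {
        "accuracy": (tn + tp) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


# Cell values copied from the two confusion matrices in this post.
original = metrics(tn=703, fp=14, fn=93, tp=39)
retrained = metrics(tn=697, fp=20, fn=87, tp=45)

for name, result in [("original", original), ("retrained", retrained)]:
    print(
        f"{name}: accuracy={result['accuracy']:.1%} "
        f"precision={result['precision']:.1%} "
        f"recall={result['recall']:.1%}"
    )
```

Both models classify 742 samples correctly, which is why the accuracy is identical. Under the same assumption about rows and columns, the retrained model catches a few more profane samples at the cost of a few more false positives.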

Working on this project helped me to:

Enhance my Python and scikit-learn knowledge.
Work with the pandas, NumPy and joblib libraries.
Get familiar with open source development workflows.
Use Git alongside another collaborator to solve a problem.

The biggest surprise was a versioning issue. Specifically, I had to find a way to update the model to a version compatible with Python 3.8, since the scikit-learn version used by the existing model was not compatible with it. Fortunately, scikit-learn 0.23.1 works with Python 3.8, so I chose this version.

Overall, it’s been a great experience. I was able to use my academic knowledge to solve a real-world problem. I was also lucky to be supervised and mentored by Dimitrios, yourself.online’s current CTO: I felt confident raising and discussing every question that came up, even the dumbest ones. Constant and immediate feedback was provided at every step of the project.