DEV Community

Discussion on: Who's looking for open source contributors? (June 18 edition)

Tata Ganesh • Edited

Posted last week as well. :-D
pdfminer.six

Description
Pdfminer.six is a Python library for extracting information from PDF documents. It is a fork of pdfminer, created because the original repository is no longer maintained. Thanks to a number of contributors, pdfminer.six is now 100 commits ahead of the original pdfminer. Parsing PDFs is extremely difficult, and pdfminer provides APIs that help with extracting both text and non-text information from them. I am one of the maintainers / admins of the project, but unfortunately I am not able to give it the time it deserves. PRs are still pouring in, and there are several open issues as well. I am trying to address them, but it would be great if more people started contributing to the project.
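For anyone who has not used the library, here is a minimal sketch of what text extraction can look like. It assumes a pdfminer.six release that ships the `pdfminer.high_level` module with `extract_text`; the `pdf_to_text` helper and its `extractor` parameter are hypothetical names made up for this illustration and are not part of the library.

```python
from typing import Callable, Optional


def pdf_to_text(path: str, extractor: Optional[Callable[[str], str]] = None) -> str:
    """Return the text of a PDF with runs of whitespace collapsed.

    `pdf_to_text` and `extractor` are hypothetical names for illustration;
    they are not part of pdfminer.six.
    """
    if extractor is None:
        # Assumption: the installed pdfminer.six provides pdfminer.high_level.
        # Importing lazily keeps the helper usable with a custom extractor.
        from pdfminer.high_level import extract_text
        extractor = extract_text

    raw = extractor(path)
    # Extracted PDF text often has irregular spacing and line breaks,
    # so split on any whitespace and rejoin with single spaces.
    return " ".join(raw.split())
```

With pdfminer.six installed, calling `pdf_to_text("sample.pdf")` on a file of your own (the file name is a placeholder) should return the document's text as one normalized string.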
Some Tasks

  • Reviewing existing PRs / resolving open issues.
  • There is a lot of documentation work. Somebody could work on improving the README, perhaps by tidying it up and showing off the capabilities of the library. Documentation has been pending for a long time on this project.
  • This is a pure Python implementation, but there have been suggestions to use Cython for increased speed. Perhaps, at some later point, the library itself could be written in C++ with a simple Python interface (something like opencv-python; I am not sure about this, though).
  • I am pretty sure there are people who are far more experienced in working with open source projects than I am. I would love it if people pitched in with their ideas on the future of this project. I don't have any specific roadmap in mind, so I am open to any suggestion.

Please note that this is in no way an exhaustive list of tasks. I have just written down the ones I could think of immediately.

Tech Stack - Primarily Python

This project is alive primarily because people have actively contributed to it and haven't let it die. It started with goulu creating the repo, and others contributing from time to time. Hopefully, we can keep maintaining it to make it easier for people to work with PDFs! I will keep adding more open tasks as I come up with them. Thank you for your time!
Thank you dev.to for this wonderful platform!
Also, I have created a Gitter chatroom for discussions about the project. There isn't much content there yet, but I am looking forward to some discussions in the near future!

Eric Gitonga

Newish to the world of dev, but most definitely interested. Going to peruse the repo to see how and where I can fit in...

Tata Ganesh

Awesome, Eric! Please look through the repo, and contribute if you find it interesting! Thank you!

gokulkanand

The project looks interesting. I'll look at the README to see what needs to be done.