Previous blog posts of this series
Reflection on these results
In the first post, I shared my aspirations and the challenges I anticipated in contributing to open-source projects. By the second post, I had made significant progress, not only overcoming those challenges but also discovering new insights about collaboration in open source.
Challenges
I faced many challenges in my final contributions. One of the biggest was understanding the complexities of the project codebases for DocsGPT and ChatCraft. Both are large projects, yet they are entirely different. DocsGPT uses Python on the backend and TypeScript on the frontend, while ChatCraft is purely TypeScript with the Chakra UI framework for its UI. Initially, I had no idea what I was doing: where to start, which components to focus on, or which functionalities to modify or leave untouched. Through rounds of commits and discussions with the maintainers, I eventually learned which functions could be adapted and which did not need any changes.
Additionally, balancing my time between multiple courses in my final semester was tough. Despite the challenges, I managed to complete all of my pull requests except one. This remaining PR involves separating image embeddings into a vector space using the CLIP model.
GenAI eatting image from DOCX #1462
- What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)
  #1451: Change from docx2txt.process to extracting the content manually, reading images as Base64, tables as HTML tags, and text as paragraphs.
- Why was this change needed? (You can also link to an open issue here)
  To use multiple vector stores so the correct image is retrieved in paragraph order, instead of zipping them together via docx2txt.process().
- Other information:
  - Progress: the open issue is that, on retrieval, we still need a way to convert the Base64 back so the AI can understand the image.

@dartpain let me know if this is the correct approach or the wrong direction.
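To make the idea in that PR description more concrete, here is a minimal sketch of keeping text, tables, and images in document order, with images stored as Base64 and tables as HTML. It is not the code from the PR: the `Paragraph`, `Table`, and `Image` classes and the `serialize_blocks` and `decode_image` helpers are hypothetical names I am using for illustration, and it assumes the blocks have already been pulled out of the DOCX file.

```python
import base64
from dataclasses import dataclass
from html import escape
from typing import List, Union


# Hypothetical containers for content already extracted from a DOCX file.
@dataclass
class Paragraph:
    text: str


@dataclass
class Table:
    rows: List[List[str]]


@dataclass
class Image:
    data: bytes  # raw image bytes


Block = Union[Paragraph, Table, Image]


def table_to_html(table: Table) -> str:
    """Render a table as a minimal HTML string, escaping the cell text."""
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(cell)}</td>" for cell in row) + "</tr>"
        for row in table.rows
    )
    return f"<table>{body}</table>"


def serialize_blocks(blocks: List[Block]) -> List[dict]:
    """Turn blocks into dicts while remembering their position in the document."""
    items = []
    for position, block in enumerate(blocks):
        if isinstance(block, Paragraph):
            kind, content = "text", block.text
        elif isinstance(block, Table):
            kind, content = "table", table_to_html(block)
        elif isinstance(block, Image):
            kind, content = "image", base64.b64encode(block.data).decode("ascii")
        else:
            raise TypeError(f"Unsupported block: {block!r}")
        items.append({"position": position, "type": kind, "content": content})
    return items


def decode_image(content: str) -> bytes:
    """Convert a stored Base64 string back into raw image bytes."""
    return base64.b64decode(content)
```

Because every item carries its `position`, text and images could live in separate vector stores and still be put back in paragraph order after retrieval. `decode_image` only restores the bytes; how to hand those bytes to the model is the part that was still open in the PR.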
On the other hand, one of my other pull requests was closed due to the large number of changes, which made it difficult to review. However, I successfully contributed to another critical PR that enabled MongoDB to efficiently handle search, sorting, and pagination for documents uploaded to the app.
Table Styling & Add search feature to backend #1442
- What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)
  #1440
- Why was this change needed? (You can also link to an open issue here)
  To style the tables uniformly.
- Other information:
  Updates the table styling so it is uniform.
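The search, sorting, and pagination work mentioned above can be sketched roughly as follows. This is not the code that was merged into DocsGPT: `build_query`, the `name` and `date` fields, and the limit of 100 rows are assumptions I picked for illustration. The function only builds the arguments, so it runs without a database.

```python
import re
from typing import Any, Dict, Optional

# Hypothetical whitelist of fields the client is allowed to sort by.
ALLOWED_SORT_FIELDS = {"name", "date"}
MAX_ROWS = 100  # assumed upper bound on page size


def build_query(
    search: Optional[str],
    sort_field: str = "date",
    order: str = "desc",
    page: int = 1,
    rows: int = 10,
) -> Dict[str, Any]:
    """Build the filter, sort, skip, and limit values for a MongoDB find()."""
    mongo_filter: Dict[str, Any] = {}
    if search:
        # Escape user input so it is matched literally, case-insensitively.
        mongo_filter["name"] = {"$regex": re.escape(search), "$options": "i"}

    if sort_field not in ALLOWED_SORT_FIELDS:
        sort_field = "date"
    direction = 1 if order == "asc" else -1

    page = max(page, 1)
    rows = max(1, min(rows, MAX_ROWS))

    return {
        "filter": mongo_filter,
        "sort": [(sort_field, direction)],
        "skip": (page - 1) * rows,
        "limit": rows,
    }
```

With PyMongo, the result could be used as `collection.find(q["filter"]).sort(q["sort"]).skip(q["skip"]).limit(q["limit"])`, so the database does the filtering, ordering, and paging instead of the application loading every document first.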
Moreover, I had to ingest documents using a Celery worker, which was both interesting and challenging: ingestion runs in the background, behind the backend, and understanding that workflow gave me a hard time.
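What finally helped me was picturing the workflow as "the backend drops a job on a queue, and a separate worker picks it up later". The sketch below shows that pattern using only Python's standard library. It is not Celery and not DocsGPT's code: the `submit`, `worker`, and `ingest` functions and the status strings are hypothetical stand-ins for what a real task queue does.

```python
import queue
import threading
from typing import Dict, Optional, Tuple

# Shared job queue and a status table the "backend" can poll.
tasks: "queue.Queue[Optional[Tuple[str, str]]]" = queue.Queue()
status: Dict[str, str] = {}


def ingest(text: str) -> int:
    """Hypothetical ingest step: here it only counts words."""
    return len(text.split())


def worker() -> None:
    """Process jobs until a None sentinel arrives."""
    while True:
        item = tasks.get()
        if item is None:
            tasks.task_done()
            break
        doc_id, text = item
        status[doc_id] = "PROGRESS"
        try:
            count = ingest(text)
            status[doc_id] = f"SUCCESS ({count} words)"
        except Exception:
            status[doc_id] = "FAILURE"
        finally:
            tasks.task_done()


def start_worker() -> threading.Thread:
    """Start the background worker thread."""
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def submit(doc_id: str, text: str) -> None:
    """What the backend does: record the job and return immediately."""
    status[doc_id] = "PENDING"
    tasks.put((doc_id, text))
```

The key point is that `submit` returns right away while the real work happens elsewhere, which is why the upload request finishes before the document is actually ingested and why the frontend has to check the status afterwards.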
Conclusion
Reflecting on this journey, contributing to large open-source projects like DocsGPT and ChatCraft has been a challenging yet rewarding experience. It tested my ability to adapt, learn quickly, and collaborate effectively with the open-source community. The process of navigating complex codebases, managing multiple responsibilities, and refining my contributions through feedback has significantly enhanced my technical and problem-solving skills.
While not every pull request resulted in success, each one provided valuable lessons that have shaped me into a more confident developer. From understanding project workflows to implementing meaningful features like MongoDB search, sort, and pagination, I’ve gained insights into how impactful even small contributions can be. The unfinished CLIP model PR serves as a reminder that open-source is not just about completion but about consistent progress and persistence.
As I move forward, I am motivated to tackle more complex issues, collaborate with diverse teams, and continue giving back to the community. This experience has been a stepping stone toward my future goals, and I’m excited to apply these lessons in my upcoming open-source contributions.