Building ContentLens: My Journey Creating an AI-Powered Document Processing App
Introduction
Over the past weekend, I embarked on an exciting project to build ContentLens - a web application that uses AI to analyze and transform documents. In this blog post, I'll share my experience building this application, the technologies I used, challenges I faced, and what I learned along the way.
What is ContentLens?
ContentLens is a simple yet powerful application that:
- Accepts various document formats (text, markdown, JSON, DOCX, and images)
- Processes them using Google's Gemini AI
- Returns the results in markdown format that you can download
Whether you need to summarize a long document, extract key points, translate content, or transform it into a different format, ContentLens can help. The application is designed with simplicity and privacy in mind: every uploaded file is deleted immediately after processing.
The Technology Stack
For this project, I chose to work with:
- FastHTML and MonsterUI: These frameworks provided a clean way to build server-rendered interfaces with minimal JavaScript (see the minimal sketch after this list)
- Python: As the backend language, handling file processing and API integration
- Google Gemini API: For the AI capabilities that power the document analysis
- Railway: For deployment and hosting
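To give a sense of how little boilerplate FastHTML requires, here is a minimal sketch of an upload page. This is illustrative rather than the actual ContentLens code: the route, form fields, and placeholder text are assumptions, and MonsterUI components would normally replace the bare tags shown here.

```python
# Minimal FastHTML sketch of an upload page (illustrative, not the real ContentLens code)
from fasthtml.common import *

app, rt = fast_app()  # fast_app() returns the app and a route decorator

@rt("/")
def get():
    return Titled(
        "ContentLens",
        Form(
            Input(type="file", name="document"),
            Textarea(name="instructions", placeholder="e.g. summarize the key points"),
            Button("Process"),
            method="post", action="/process", enctype="multipart/form-data",
        ),
    )

serve()  # starts a local uvicorn server
```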
Building the Application: Step by Step
1. Planning the Architecture
I began by planning a clean object-oriented architecture with these main components (sketched just after this list):
- Document class for handling different file types
- Processor class for interacting with the Gemini API
- Web routes for handling user requests
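In rough terms, the pieces fit together like the sketch below. The class and method names are my reconstruction rather than the ones in the repository, and the Gemini model name is just a placeholder; the google-generativeai calls shown (configure, GenerativeModel, generate_content) are the standard SDK surface.

```python
import os
import google.generativeai as genai  # the google-generativeai package

class Document:
    """Wraps an uploaded file and knows how to extract its content."""
    def __init__(self, path: str):
        self.path = path

    def extract(self):
        ...  # dispatch on file type; see the next section

class Processor:
    """Sends extracted content plus the user's instructions to Gemini."""
    def __init__(self, model_name: str = "gemini-1.5-flash"):  # model name is a placeholder
        genai.configure(api_key=os.environ["GEMINI_API_KEY"])
        self.model = genai.GenerativeModel(model_name)

    def process(self, content, instructions: str) -> str:
        response = self.model.generate_content([instructions, content])
        return response.text  # markdown returned by the model
```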
2. File Processing Challenges
One of the more challenging aspects was handling different file types. Each format required a different approach:
- Text and markdown files needed simple reading
- DOCX files required parsing with python-docx
- Images needed special handling for the AI
I implemented a strategy pattern where the Document class would handle extraction differently based on file type.
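A stripped-down version of that dispatch might look like the following. The helper name and supported extensions are illustrative; python-docx handles DOCX parsing, and images are handed back as Pillow objects because the Gemini SDK can accept them directly as prompt parts.

```python
from pathlib import Path
from docx import Document as DocxDocument  # python-docx
from PIL import Image                      # Pillow

def extract_content(path: str):
    """Return text for text-like files, or a PIL image for image uploads."""
    suffix = Path(path).suffix.lower()
    if suffix in {".txt", ".md", ".json"}:
        return Path(path).read_text(encoding="utf-8")
    if suffix == ".docx":
        doc = DocxDocument(path)
        return "\n".join(p.text for p in doc.paragraphs)
    if suffix in {".png", ".jpg", ".jpeg", ".webp"}:
        return Image.open(path)  # passed straight to Gemini as an image part
    raise ValueError(f"Unsupported file type: {suffix}")
```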
3. Privacy and Security Considerations
From the beginning, I wanted to ensure user privacy. I implemented the following safeguards (see the cleanup sketch after this list):
- Immediate deletion of uploaded files after processing
- Removal of processed results after download
- Environment variables for API keys
- Input validation and error handling
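The deletion guarantee boils down to a try/finally around the processing call, with the API key read from an environment variable rather than hard-coded. This is a sketch of the idea, reusing the hypothetical helpers from the earlier snippets:

```python
import os

def process_upload(path: str, instructions: str) -> str:
    """Process an uploaded file and remove it no matter what happens."""
    try:
        content = extract_content(path)                 # from the sketch above
        return Processor().process(content, instructions)
    finally:
        # Runs on success and on failure, so uploads never linger on disk
        if os.path.exists(path):
            os.remove(path)
```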
4. User Experience Enhancements
Based on feedback from early testers, I added the following (an indicator sketch follows the list):
- File upload indicators
- Processing status feedback
- Dark mode compatibility
- Helpful example instructions
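Because FastHTML builds on HTMX, the processing indicator can be as simple as an element with the htmx-indicator class that HTMX reveals while the request is in flight. The markup below is an illustrative sketch, not the exact ContentLens form:

```python
from fasthtml.common import *

def upload_form():
    return Form(
        Input(type="file", name="document"),
        Button("Process", hx_indicator="#spinner"),
        # Hidden by default via the htmx-indicator class; shown during the POST
        Div("Processing your document...", id="spinner", cls="htmx-indicator"),
        hx_post="/process", hx_target="#result",
        hx_encoding="multipart/form-data",  # required by HTMX for file uploads
    )
```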
Lessons Learned
This project taught me several valuable lessons:
The power of separation of concerns: By keeping document handling, AI processing, and web interfaces separate, the code remained clean and maintainable.
The importance of user feedback: Adding visual indicators for uploads and processing made the application much more user-friendly.
Deployment considerations: Environment variables had to be configured correctly in Railway, and file paths had to work in the deployed environment.
The value of iterative development: Starting with a minimal viable product and adding features based on feedback proved effective.
Future Enhancements
While ContentLens is functional, there are several enhancements I'm considering:
- Support for more file formats (PDF, EPUB)
- Batch processing of multiple files
- Custom AI model selection
- User accounts for saving processing history
- Additional output formats beyond markdown
Try It Out!
You can try ContentLens yourself at contentlens-production.up.railway.app or check out the code on GitHub.
I welcome any feedback or suggestions for improvement!
Conclusion
Building ContentLens taught me a lot about integrating AI APIs and designing a clean architecture. Next on my list are PDF support and batch processing.
Questions for you:
- What other file formats would you find useful in a tool like this?
- Have you worked with the Gemini API? How does it compare to other LLMs?
- What challenges have you faced when deploying Python web apps?
I'd love to hear your thoughts and suggestions in the comments!