Building ContentLens: My Journey Creating an AI-Powered Document Processing App
Introduction
Over the past weekend, I embarked on an exciting project to build ContentLens - a web application that uses AI to analyze and transform documents. In this blog post, I'll share my experience building this application, the technologies I used, challenges I faced, and what I learned along the way.
What is ContentLens?
ContentLens is a simple yet powerful application that:
- Accepts various document formats (text, markdown, JSON, DOCX, and images)
- Processes them using Google's Gemini AI
- Returns the results in markdown format that you can download
Whether you need to summarize a long document, extract key points, translate content, or transform it into a different format, ContentLens can help. The application is designed with simplicity and privacy in mind: every uploaded file is deleted immediately after processing.
The Technology Stack
For this project, I chose to work with:
- FastHTML and MonsterUI: These frameworks provided a clean way to build server-rendered interfaces with minimal JavaScript (see the minimal sketch after this list)
- Python: As the backend language, handling file processing and API integration
- Google Gemini API: For the AI capabilities that power the document analysis
- Railway: For deployment and hosting
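To give a sense of how little boilerplate FastHTML requires, here is a minimal sketch of an upload page. This is illustrative rather than the actual ContentLens code: the route, form fields, and placeholder text are assumptions, and MonsterUI components would normally replace the bare tags shown here.

```python
# Minimal FastHTML sketch of an upload page (illustrative, not the real ContentLens code)
from fasthtml.common import *

app, rt = fast_app()  # fast_app() returns the app and a route decorator

@rt("/")
def get():
    return Titled(
        "ContentLens",
        Form(
            Input(type="file", name="document"),
            Textarea(name="instructions", placeholder="e.g. summarize the key points"),
            Button("Process"),
            method="post", action="/process", enctype="multipart/form-data",
        ),
    )

serve()  # starts a local uvicorn server
```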
Building the Application: Step by Step
1. Planning the Architecture
I began by planning a clean object-oriented architecture with these main components (sketched just after this list):
- Document class for handling different file types
- Processor class for interacting with the Gemini API
- Web routes for handling user requests
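In rough terms, the pieces fit together like the sketch below. The class and method names are my reconstruction rather than the ones in the repository, and the Gemini model name is just a placeholder; the google-generativeai calls shown (configure, GenerativeModel, generate_content) are the standard SDK surface.

```python
import os
import google.generativeai as genai  # the google-generativeai package

class Document:
    """Wraps an uploaded file and knows how to extract its content."""
    def __init__(self, path: str):
        self.path = path

    def extract(self):
        ...  # dispatch on file type; see the next section

class Processor:
    """Sends extracted content plus the user's instructions to Gemini."""
    def __init__(self, model_name: str = "gemini-1.5-flash"):  # model name is a placeholder
        genai.configure(api_key=os.environ["GEMINI_API_KEY"])
        self.model = genai.GenerativeModel(model_name)

    def process(self, content, instructions: str) -> str:
        response = self.model.generate_content([instructions, content])
        return response.text  # markdown returned by the model
```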
2. File Processing Challenges
One of the more challenging aspects was handling different file types. Each format required a different approach:
- Text and markdown files needed simple reading
- DOCX files required parsing with python-docx
- Images needed special handling for the AI
I implemented a strategy pattern where the Document class would handle extraction differently based on file type.
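A stripped-down version of that dispatch might look like the following. The helper name and supported extensions are illustrative; python-docx handles DOCX parsing, and images are handed back as Pillow objects because the Gemini SDK can accept them directly as prompt parts.

```python
from pathlib import Path
from docx import Document as DocxDocument  # python-docx
from PIL import Image                      # Pillow

def extract_content(path: str):
    """Return text for text-like files, or a PIL image for image uploads."""
    suffix = Path(path).suffix.lower()
    if suffix in {".txt", ".md", ".json"}:
        return Path(path).read_text(encoding="utf-8")
    if suffix == ".docx":
        doc = DocxDocument(path)
        return "\n".join(p.text for p in doc.paragraphs)
    if suffix in {".png", ".jpg", ".jpeg", ".webp"}:
        return Image.open(path)  # passed straight to Gemini as an image part
    raise ValueError(f"Unsupported file type: {suffix}")
```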
3. Privacy and Security Considerations
From the beginning, I wanted to ensure user privacy. I implemented the following safeguards (see the cleanup sketch after this list):
- Immediate deletion of uploaded files after processing
- Removal of processed results after download
- Environment variables for API keys
- Input validation and error handling
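The deletion guarantee boils down to a try/finally around the processing call, with the API key read from an environment variable rather than hard-coded. This is a sketch of the idea, reusing the hypothetical helpers from the earlier snippets:

```python
import os

def process_upload(path: str, instructions: str) -> str:
    """Process an uploaded file and remove it no matter what happens."""
    try:
        content = extract_content(path)                 # from the sketch above
        return Processor().process(content, instructions)
    finally:
        # Runs on success and on failure, so uploads never linger on disk
        if os.path.exists(path):
            os.remove(path)
```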
4. User Experience Enhancements
Based on feedback from early testers, I added the following (an indicator sketch follows the list):
- File upload indicators
- Processing status feedback
- Dark mode compatibility
- Helpful example instructions
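Because FastHTML builds on HTMX, the processing indicator can be as simple as an element with the htmx-indicator class that HTMX reveals while the request is in flight. The markup below is an illustrative sketch, not the exact ContentLens form:

```python
from fasthtml.common import *

def upload_form():
    return Form(
        Input(type="file", name="document"),
        Button("Process", hx_indicator="#spinner"),
        # Hidden by default via the htmx-indicator class; shown during the POST
        Div("Processing your document...", id="spinner", cls="htmx-indicator"),
        hx_post="/process", hx_target="#result",
        hx_encoding="multipart/form-data",  # required by HTMX for file uploads
    )
```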
Lessons Learned
This project taught me several valuable lessons:
The power of separation of concerns: By keeping document handling, AI processing, and web interfaces separate, the code remained clean and maintainable.
The importance of user feedback: Adding visual indicators for uploads and processing made the application much more user-friendly.
Deployment considerations: Environment variables had to be configured correctly in Railway, and file paths had to work in the deployed environment.
The value of iterative development: Starting with a minimal viable product and adding features based on feedback proved effective.
Future Enhancements
While ContentLens is functional, there are several enhancements I'm considering:
- Support for more file formats (PDF, EPUB)
- Batch processing of multiple files
- Custom AI model selection
- User accounts for saving processing history
- Additional output formats beyond markdown
Try It Out!
You can try ContentLens yourself at contentlens-production.up.railway.app or check out the code on GitHub.
I welcome any feedback or suggestions for improvement!
Conclusion
Building ContentLens taught me a lot about integrating AI APIs and designing a clean architecture. Next on my list are PDF support and batch processing.
Questions for you:
- What other file formats would you find useful in a tool like this?
- Have you worked with the Gemini API? How does it compare to other LLMs?
- What challenges have you faced when deploying Python web apps?
I'd love to hear your thoughts and suggestions in the comments!