I Built an Open Source AI Workspace for Paperless-ngx Because I Wanted Document Intelligence
A few months ago, I started running into a problem that many developers and small teams probably already know:
We all accumulate huge numbers of documents everywhere.
PDFs. Notes. Contracts. Scanned files. Technical documentation. Personal archives. Random folders named final-final-v2.
And even after organizing everything with tools like Paperless-ngx, I still felt like something was missing.
I didn’t just want document storage.
I wanted document intelligence.
So I started building Taan Mind.
The idea
The main goal behind Taan Mind was simple:
What if your document archive could actually understand your files?
Instead of manually searching through folders or reading entire PDFs, I wanted a workspace where AI could:
- understand documents
- extract metadata
- enrich OCR content
- connect conversations with files
- work with local-first AI models
- stay self-hosted and privacy-focused
The result became an open-source AI workspace built around Paperless-ngx.
Why I built it
Most AI document tools today have one big issue:
They require uploading sensitive files to external cloud services.
For many people that’s fine.
But for developers, self-hosters, researchers, or small companies handling private data, that’s not always acceptable.
I wanted something that could work locally with tools like:
- Ollama
- local LLMs
- Docker
- self-hosted infrastructure
while still feeling modern and easy to use.
What Taan Mind does
Currently, the project includes:
- AI-powered chat with document context
- OCR processing pipelines
- Metadata enrichment
- KPI dashboards
- Paperless-ngx integration
- Local-first model support with Ollama
- Multi-provider AI support
- Docker-ready deployment
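To make "chat with document context" concrete, here is a minimal TypeScript sketch of how OCR text from Paperless-ngx could be injected into a chat request for a local Ollama model. The function name, prompt wording, and truncation limit are illustrative assumptions, not Taan Mind's actual code; the Ollama `/api/chat` endpoint mentioned in the comment is the real public API.

```typescript
// Hypothetical sketch: ground a local model in document context.
// DocSnippet and buildContextMessages are illustrative names, not Taan Mind's API.

interface DocSnippet {
  title: string;
  content: string; // OCR text pulled from Paperless-ngx
}

// Build a chat message list that injects document context into the system prompt.
// Content is truncated per document so the prompt stays inside a small local
// model's context window.
function buildContextMessages(
  docs: DocSnippet[],
  question: string,
  maxCharsPerDoc = 2000,
): { role: string; content: string }[] {
  const context = docs
    .map((d) => `## ${d.title}\n${d.content.slice(0, maxCharsPerDoc)}`)
    .join("\n\n");
  return [
    {
      role: "system",
      content: `Answer using only the documents below.\n\n${context}`,
    },
    { role: "user", content: question },
  ];
}

// The resulting messages can then be sent to a local Ollama instance, e.g.:
// await fetch("http://localhost:11434/api/chat", {
//   method: "POST",
//   body: JSON.stringify({ model: "llama3.2", messages, stream: false }),
// });
```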
The frontend is built with:
- Nuxt 4
- Nuxt UI
- Tailwind CSS
- AI SDK
The rest of the stack uses:
- Ollama
- MuPDF
- SQLite
- Drizzle ORM
- Docker Compose
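A minimal Docker Compose sketch of how these pieces could be wired together. Only the paperless-ngx, redis, and ollama images are real upstream ones; the taan-mind service, ports, and environment variable names are illustrative assumptions, not the project's actual deployment file.

```yaml
# Hypothetical compose sketch; not Taan Mind's shipped compose file.
services:
  ollama:
    image: ollama/ollama
    volumes:
      - ollama-data:/root/.ollama
  redis:
    image: redis:7
  paperless:
    image: ghcr.io/paperless-ngx/paperless-ngx
    environment:
      PAPERLESS_REDIS: redis://redis:6379
    depends_on:
      - redis
  taan-mind:
    image: taan-mind:local # hypothetical image, built from the repo
    environment:
      OLLAMA_URL: http://ollama:11434       # assumed variable names
      PAPERLESS_URL: http://paperless:8000
    ports:
      - "3000:3000"
volumes:
  ollama-data:
```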
The idea is to create a practical AI workspace instead of just another chatbot UI.
Website:
https://taan-mind.com/
One of the hardest parts
One of the most difficult things was balancing:
- privacy
- performance
- OCR quality
- AI context injection
- local model execution
especially when trying to keep the project simple enough for contributors to run locally.
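Metadata enrichment is a good example of that balancing act: small local models often wrap JSON in prose or code fences, so the pipeline has to parse their output defensively instead of crashing. A hypothetical TypeScript sketch (the field names, prompt, and parsing strategy are my assumptions, not Taan Mind's actual schema):

```typescript
// Hypothetical sketch of a metadata-enrichment step for scanned documents.
// DocMetadata and parseMetadata are illustrative names.

interface DocMetadata {
  title: string | null;
  date: string | null; // ISO date string if the model finds one
  tags: string[];
}

// Prompt asking a local model for strict JSON (assumed wording).
const enrichPrompt = (ocrText: string) =>
  `Extract metadata from this document as JSON with keys ` +
  `"title", "date", "tags". Return JSON only.\n\n${ocrText}`;

// Local models sometimes surround JSON with prose, so extract the first
// JSON object and fall back to empty metadata on any parse failure.
function parseMetadata(modelReply: string): DocMetadata {
  const empty: DocMetadata = { title: null, date: null, tags: [] };
  const match = modelReply.match(/\{[\s\S]*\}/);
  if (!match) return empty;
  try {
    const raw = JSON.parse(match[0]);
    return {
      title: typeof raw.title === "string" ? raw.title : null,
      date: typeof raw.date === "string" ? raw.date : null,
      tags: Array.isArray(raw.tags)
        ? raw.tags.filter((t: unknown) => typeof t === "string")
        : [],
    };
  } catch {
    return empty;
  }
}
```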
I also wanted the architecture to stay modular so future integrations like MCP servers, new AI providers, workflows, and automation can be added later without rewriting everything.
Why open source
I decided to make the project open source because I genuinely believe document AI should be transparent.
People should be able to:
- inspect the code
- run it locally
- customize providers
- choose their own models
- own their data
I also think the self-hosted AI ecosystem is becoming incredibly important right now.
Projects like:
- Ollama
- Open WebUI
- Paperless-ngx
have shown how strong community-driven tools can become.
What surprised me the most
The biggest surprise wasn’t the AI part.
It was how many people are searching for better ways to interact with their own documents.
Not just “chat with PDF” demos.
But real workflows like:
- document organization
- searchable archives
- OCR pipelines
- metadata automation
- AI-assisted knowledge retrieval
That space still feels very early, and there’s still a lot left to build.
What’s next
Some upcoming ideas include:
- better RAG pipelines
- MCP integrations
- workflow automation
- semantic search improvements
- more local model optimizations
- collaborative document workflows
Final thoughts
This is actually my first post on Dev.to, so I wanted to share something I’ve genuinely been excited to build.
If you’re interested in:
- self-hosted AI
- open source
- document workflows
- local LLMs
- Paperless-ngx
- Nuxt
- AI tooling
I’d love feedback from the community.