I was paying $20/month for ChatGPT Plus to help with support workflows—until I hit another rate limit during a printer ticket crisis.
So I built my own GPT-powered assistant on a dusty Windows server.
No tokens. No API keys. No cloud.
What I Built
- Self-hosted chatbot using Ollama and Mistral 7B
- Flask backend and React frontend
- Custom JSON knowledge base that updates from closed helpdesk tickets
- Hosted entirely offline on local hardware
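The post doesn't show how the knowledge base is updated, but the idea of appending closed helpdesk tickets to a JSON file can be sketched in a few lines. Everything here (the file path, the ticket fields `id`, `subject`, `resolution`) is a hypothetical shape, not the author's actual schema:

```python
import json
from pathlib import Path

KB_PATH = Path("knowledge_base.json")  # hypothetical location for the KB file

def update_knowledge_base(closed_tickets, kb_path=KB_PATH):
    """Merge resolved tickets into a JSON knowledge base, keyed by ticket ID.

    Each ticket is assumed to be a dict with 'id', 'subject', and
    'resolution' fields -- an illustrative schema, not the real one.
    """
    kb = json.loads(kb_path.read_text()) if kb_path.exists() else {}
    for ticket in closed_tickets:
        kb[ticket["id"]] = {
            "issue": ticket["subject"],
            "resolution": ticket["resolution"],
        }
    kb_path.write_text(json.dumps(kb, indent=2))
    return kb
```

Keying by ticket ID makes the update idempotent: re-running it on the same closed tickets just overwrites entries rather than duplicating them.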
Stack Overview
- Flask – backend routing and LLM integration
- React – UI for support prompts and feedback
- Ollama – local LLM runner with REST API
- Windows 10 server with 32GB RAM and GTX 1660 Super
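Tying the stack together: Ollama exposes a REST endpoint at `http://localhost:11434/api/generate` that accepts a model name and prompt, so the backend only needs to wrap the user's question with knowledge-base context and POST it. A minimal stdlib-only sketch (the prompt format and function names are my assumptions, not the author's actual wrapper):

```python
import json
import urllib.request

# Ollama's default local REST endpoint for one-shot generation
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_prompt(question, kb_snippets):
    """Wrap the user's question with knowledge-base context (hypothetical format)."""
    context = "\n".join(f"- {s}" for s in kb_snippets)
    return (
        "You are an IT helpdesk assistant. Use this context if relevant:\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )

def ask_mistral(question, kb_snippets):
    """Send a non-streaming generate request to the local Ollama server."""
    payload = json.dumps({
        "model": "mistral",
        "prompt": build_prompt(question, kb_snippets),
        "stream": False,  # get one JSON object back instead of a token stream
    }).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

In the real setup a Flask route would sit in front of `ask_mistral` and the React UI would POST questions to it; with `"stream": False` Ollama returns the whole completion in the `response` field of a single JSON object.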
Why I Did It
- No more rate limits or usage caps
- Complete control over data and prompts
- Avoid vendor lock-in
- And honestly? I just wanted to prove it could be done
The full write-up drops on Medium this Tuesday:
Read the full article on Medium
If you're curious about the setup, prompt wrappers, or how the JSON knowledge base works, I'm happy to share.