I've been frustrated with AI coding tools that load a 15K-28K-token system prompt before you can even ask a question. The model spends most of its attention reading the manual instead of working on your code.
So I built Huiyu Pi, a self-hosted AI coding agent whose system prompt starts at ~80 tokens.
What it does:
- Browser-based IDE (no heavy Electron app)
- ~80-token system prompt (not 20K)
- ~0.3s to first token
- 90%+ cheaper per request than with a 20K-token system prompt
- 100% local — your code, API keys, conversations never leave your machine
- Multi-LLM: Claude, GPT, DeepSeek, Gemini, Mistral, Groq, xAI, OpenRouter
- Built-in terminal, file editor, Git integration
- PWA support (works on mobile)
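To make the token and cost bullets above a bit more concrete, here is a rough back-of-envelope sketch. The price per million input tokens and the user message size are made-up assumptions for illustration, not real pricing or measurements from Huiyu Pi, and the sketch only counts input tokens (no output tokens, no prompt caching).

```python
# Back-of-envelope: how system-prompt size affects input cost per request.
# PRICE_PER_MILLION_INPUT_TOKENS and USER_MESSAGE_TOKENS are hypothetical
# values chosen for illustration only.

PRICE_PER_MILLION_INPUT_TOKENS = 3.00  # USD, assumed
USER_MESSAGE_TOKENS = 500              # assumed size of one question


def input_cost(system_prompt_tokens: int,
               user_tokens: int = USER_MESSAGE_TOKENS,
               price_per_million: float = PRICE_PER_MILLION_INPUT_TOKENS) -> float:
    """Input-side cost in USD for one request."""
    total_tokens = system_prompt_tokens + user_tokens
    return total_tokens * price_per_million / 1_000_000


def savings_fraction(large_prompt: int, small_prompt: int,
                     user_tokens: int = USER_MESSAGE_TOKENS) -> float:
    """Fraction of input cost saved by using the smaller system prompt."""
    return 1 - input_cost(small_prompt, user_tokens) / input_cost(large_prompt, user_tokens)


if __name__ == "__main__":
    # 20K-token prompt vs. an ~80-token prompt, same user message.
    print(f"large prompt: ${input_cost(20_000):.4f} per request")
    print(f"small prompt: ${input_cost(80):.4f} per request")
    print(f"input savings: {savings_fraction(20_000, 80):.1%}")
```

Under these assumptions the input-side saving is above 90% for short questions. It shrinks as the user message or conversation history grows, since the system prompt becomes a smaller share of each request.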
How to try:
npx huiyu-pi
Then open http://localhost:9144
Tech stack:
- Frontend: React 19, TypeScript, Vite, Tailwind CSS
- Backend: Fastify, WebSocket, SSE
- Terminal: xterm.js + node-pty
- License: MIT
GitHub: https://github.com/huiyu9144/Huiyu-Pi
Would love feedback from the self-hosted community!