I Built an Open-Source AI Coding Agent That Talks to 600+ Models — Here's How
A solo developer's journey building a terminal-first AI agent backed by a self-hosted model orchestration platform.
Every AI coding tool I tried had the same problem: vendor lock-in. Copilot ties you to GitHub. Cursor ties you to their cloud. Claude Code ties you to Anthropic. I wanted something that works with any model — cloud or local — and runs entirely on my infrastructure.
So I built two things:
- **ai-coder** — a terminal-based AI coding agent with autonomous task execution
- **TriForce** — the self-hosted backend that powers it, orchestrating 600+ LLMs across 9 providers
Both are open source. This post walks through what they do, how they connect, and how you can try them.
## The Problem
Most AI coding tools give you one model, one provider, one way to work. Switch providers? Rebuild your workflow. Want to use a local Ollama model for quick tasks and Claude for complex reasoning? Not supported. Need MCP tools for file operations, web search, or system management? Add another tool.
I wanted a single terminal agent that:
- Connects to any LLM (Gemini, Claude, GPT, Mistral, Llama, DeepSeek, Qwen — local or cloud)
- Executes code and commands locally on my machine (not on a remote server)
- Uses MCP tools for extended capabilities
- Works without a subscription to any single vendor
## What ai-coder Does
ai-coder is a Python CLI tool. You open your terminal, point it at a project, and start talking.
```bash
# Install from PyPI
pip install aicoder

# Or on Debian/Ubuntu
curl -s https://repo.ailinux.me/add-ailinux-repo.sh | sudo bash
sudo apt install aicoder

# Or on Arch (AUR)
yay -S aicoder
```
Once running, it enters an autonomous agent loop: you describe a task, it plans, executes, and iterates until done. It can read your codebase, edit files, run tests, and fix errors — all from the terminal.
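Conceptually, the loop is small. Here is a minimal sketch of the plan-execute-iterate cycle — not the actual ai-coder implementation; `plan` and `execute` are hypothetical stand-ins for the LLM call and the local subprocess step:

```python
def agent_loop(task, plan, execute, max_iters=10):
    """Minimal agent loop: the LLM proposes a step, the step runs
    locally, and the result feeds back into the next plan."""
    history = [{"role": "user", "content": task}]
    for _ in range(max_iters):
        step = plan(history)        # LLM call (stubbed below)
        result = execute(step)      # local subprocess in the real tool
        history.append({"role": "tool", "content": result})
        if step.get("done"):
            break
    return history

# Toy stubs so the sketch runs end to end:
def fake_plan(history):
    return {"action": "run_tests", "done": len(history) >= 3}

def fake_execute(step):
    return f"executed {step['action']}"
```

The cap on iterations matters in practice: an agent that cannot converge should stop and report, not loop forever.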
### Key design decisions
**Local execution only.** Every file operation, every shell command runs as a subprocess on your machine. This was a deliberate security choice. The ai-coder CLI never sends your code to a remote server for execution. The LLM sees your code to reason about it, but all actions happen locally.

**Model-agnostic.** ai-coder connects to the TriForce backend, which routes requests to whichever model fits. You can switch between a local 7B model for quick refactors and GPT-5 for complex architecture decisions — mid-conversation.

**MCP tools.** Through the TriForce backend, ai-coder has access to 134+ MCP tools:
- `code_read`, `code_edit`, `code_search` — codebase navigation
- `shell` — command execution
- `search` — web search via SearXNG
- `git` — version control operations
- `dev_analyze`, `dev_debug`, `dev_lint` — AI-powered code analysis
- `memory_store`, `memory_search` — persistent context across sessions
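On the wire, an MCP tool invocation is a JSON-RPC 2.0 `tools/call` request, per the Model Context Protocol spec. A sketch of what a `code_read` call might look like — the argument names are an assumption, not taken from the TriForce source:

```python
import json

def mcp_tool_call(tool_name, arguments, request_id=1):
    """Build a JSON-RPC 2.0 tools/call request as defined by MCP."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Hypothetical argument shape for the code_read tool:
payload = mcp_tool_call("code_read", {"path": "app/main.py"})
print(json.dumps(payload, indent=2))
```

Because every tool speaks this one envelope, any MCP-capable client can call any of the 134 tools without bespoke integration code.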
## The Backend: TriForce
This is where it gets interesting. ai-coder is the frontend; TriForce is the brain.
TriForce is a FastAPI application that acts as a unified gateway to multiple LLM providers:
| Provider | Models | Type |
|---|---|---|
| Ollama | 20+ local models | Self-hosted |
| Gemini | Gemini 2.5 Pro/Flash | Cloud API |
| Anthropic | Claude Sonnet/Opus | Cloud API |
| OpenAI | GPT-4.1, GPT-5 | Cloud API |
| Mistral | Mistral Large, Codestral | Cloud API |
| Groq | Llama, Mixtral (fast inference) | Cloud API |
| OpenRouter | 200+ models | Cloud API |
| GitHub Models | GPT-5, Llama 4, Grok-3 | Cloud API |
| Cloudflare AI | Workers AI models | Cloud API |
The backend handles:
- **Smart routing** — picks the best available model based on task type
- **Tier-based access** — free tier gets 90+ models, paid tiers unlock everything
- **MCP server** — exposes 134 tools via the Model Context Protocol standard
- **Federation** — distributes load across multiple nodes via a WireGuard mesh
- **4-layer memory** — Redis + persistent storage for context that survives restarts
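The routing idea is easy to sketch: classify the task, then walk a preference list until a reachable provider is found. The task categories and model names below are illustrative, not TriForce's actual routing tables:

```python
# Illustrative routing table: task type -> models in preference order.
ROUTES = {
    "quick_refactor": ["ollama/llama3:8b", "groq/llama-3.1-70b"],
    "architecture":   ["openai/gpt-5", "anthropic/claude-opus"],
    "offline":        ["ollama/llama3:8b"],
}

def pick_model(task_type, available):
    """Return the first preferred model that is currently reachable."""
    for model in ROUTES.get(task_type, []):
        if model in available:
            return model
    raise LookupError(f"no model available for task type {task_type!r}")
```

The real router also has to weigh tier limits and provider health, but the core is exactly this: a preference order plus a fallback.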
### Architecture in 30 seconds
```text
┌─────────────┐      HTTPS/WSS      ┌──────────────────┐
│  ai-coder   │◄───────────────────►│  TriForce API    │
│  (your PC)  │                     │   (FastAPI)      │
└─────────────┘                     └────────┬─────────┘
                                             │
                               ┌─────────────┼─────────────┐
                               │             │             │
                          ┌────▼───┐   ┌─────▼────┐   ┌────▼────┐
                          │ Ollama │   │  Cloud   │   │   MCP   │
                          │ Local  │   │  APIs    │   │  Tools  │
                          └────────┘   └──────────┘   └─────────┘
```
Your prompts go to TriForce, which routes them to the right model. Tool calls come back through MCP. File edits happen locally on your machine. No code leaves your computer for execution.
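That split — reasoning remote, execution local — comes down to one step: the client, not the server, runs the command. A minimal sketch of that local-execution step (illustrative only; `run_tool_locally` is a hypothetical name, not ai-coder's API):

```python
import shlex
import subprocess

def run_tool_locally(command, timeout=60):
    """Run a requested tool command as a local subprocess.

    The model only *requests* the command; nothing executes
    server-side, and only the captured output goes back as context.
    """
    result = subprocess.run(
        shlex.split(command), capture_output=True, text=True, timeout=timeout
    )
    return {
        "stdout": result.stdout,
        "stderr": result.stderr,
        "returncode": result.returncode,
    }
```

The returned dict becomes the tool result the model sees on the next planning step.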
## Quick Demo: Debugging with ai-coder
Here's a real workflow. I have a Python service that's throwing a `ConnectionResetError`. Instead of manually digging through logs:

```text
You: debug the ConnectionResetError in app/services/federation_websocket.py

ai-coder: [reads file] [analyzes code] [searches for related issues]

  I found the issue. The WebSocket ping interval (30s) is longer than
  the server's idle timeout (25s), so the connection is dropped as idle
  before the next ping arrives.

  Fix: Set ping_interval=15 and ping_timeout=10 so a keepalive always
  lands inside the idle window.

[applies fix] [runs tests] ✓ All tests pass.
```
Three things happened here that matter:
1. ai-coder read the actual file from my codebase (MCP `code_read`)
2. It reasoned about the timing relationship (LLM)
3. It applied the fix directly (local subprocess)
No copy-pasting. No context-switching. No browser tabs.
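The bug in this demo reduces to a simple timing invariant: the client must ping more often than the server is willing to wait. A tiny checker for that relationship (a generic sketch with a safety margin I chose, not ai-coder's code):

```python
def keepalive_ok(ping_interval, server_idle_timeout, margin=0.8):
    """True if a ping always arrives before the server declares the
    connection idle, keeping a safety margin for network jitter."""
    return ping_interval < server_idle_timeout * margin

print(keepalive_ok(30, 25))  # the broken configuration
print(keepalive_ok(15, 25))  # a safe configuration
```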
## How to Try It
### Option 1: Use the hosted backend (free tier)
The easiest way. Sign up at login.ailinux.me, get an API key, and point ai-coder at it:
```bash
pip install aicoder
aicoder --setup   # walks you through configuration
```
The free tier gives you access to 90+ models including Gemini, Groq, and Cloudflare Workers AI. No credit card needed. Beta code: AILINUX2026.
### Option 2: Use the web playground
Don't want to install anything? Try the AI chat at ailinux.me — it uses the same backend and model selection. Not a full coding agent, but gives you a feel for the multi-model approach.
### Option 3: Self-host everything
TriForce is open source. Clone it, bring your own API keys, run it on your server:
```bash
git clone https://github.com/derleiti/triforce
cd triforce
pip install -r requirements.txt
cp config/triforce.env.example config/triforce.env
# Add your API keys to config/triforce.env
python -m app.main
```
You get the full stack: 134 MCP tools, model routing, agent coordination, the works.
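For the API-keys step, the env file is just provider credentials plus local endpoints. The variable names below are hypothetical placeholders — check the shipped `config/triforce.env.example` for the real ones:

```shell
# config/triforce.env — hypothetical variable names; see the example file
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GEMINI_API_KEY=...
OLLAMA_BASE_URL=http://localhost:11434   # Ollama's default local endpoint
```

Providers without a key are simply skipped by the router, so you can start with a single provider and add more later.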
## What I Learned Building This Solo
I've been working on this ecosystem for over a year. A few things I wish someone had told me:
**MCP is the right abstraction.** Before MCP, I was building custom tool integrations for every feature. Now I define a tool once and it works everywhere — in ai-coder, in the web playground, in the desktop client. The protocol is young but the idea is solid.

**Local execution is non-negotiable for trust.** Early versions sent commands to the server for execution. Users (rightfully) didn't trust it. Moving all execution to local subprocesses was the single best architecture decision I made.

**Model routing beats model loyalty.** Different models are better at different things. Gemini Flash for quick answers. Claude for complex reasoning. Local Llama for offline work. Having a router that picks the right tool for the job is more powerful than any single model.

**Solo doesn't mean slow.** This project has 134 MCP tools, a 3-node federated infrastructure, clients for Linux/Windows/Android, a web frontend, CLI agents, and a package repository. One person, one year. The trick is building systems that compose, not monoliths that collapse.
## Links
- Website: ailinux.me
- GitHub: github.com/derleiti
- Login / free account: login.ailinux.me
- APT repository: repo.ailinux.me
- AUR packages: aicoder / ailinux-client
Built by a solo developer in Bavaria. Star the repo if this is useful. Feedback welcome — I'm @derleiti on GitHub.