The Problem
You know what's funny about "free" AI coding tools?
They're not free.
Codex CLI runs on your OpenAI credits — and those run out fast. Claude Code bills against your Anthropic plan — and suddenly you're staring at "rate limited" at 2 PM on a Tuesday.
So much for "just works."
What If I Told You...
What if I told you there's a way to make Codex, Claude Code, Gemini CLI, and OpenClaw actually free?
Not "free tier" free. Not "limited to 50 requests" free. I'm talking free like WiFi at a coffee shop free.
The Secret: Free Model Routing
ProxyPool Hub now intercepts lightweight AI requests — the kind you'd use for quick code lookups, autocompletes, or "what does this error mean" — and routes them to actual free providers.
No API key. No credits. No billing.
When you need the heavy stuff (Sonnet 4, Gemini 2.5 Pro, the full-size frontier models), it uses your paid accounts. When you just need a quick autocomplete? Free.
What Counts as "Free"?
- Any model with "haiku", "mini", "fast", "lite" in the name gets auto-detected as a "fast" tier model
- ProxyPool Hub maps those to free alternatives such as DeepSeek R1, Qwen3, or MiniMax
- You can also manually specify which models route to free providers
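The auto-detection described above can be sketched in a few lines. This is a hypothetical illustration; the function name and matching logic are assumptions for clarity, not ProxyPool Hub's actual code:

```javascript
// Keywords the post says mark a model as "fast" tier.
const FAST_HINTS = ["haiku", "mini", "fast", "lite"];

// Hypothetical helper: does the model name contain any fast-tier hint?
function isFastTier(modelName) {
  const name = modelName.toLowerCase();
  return FAST_HINTS.some((hint) => name.includes(hint));
}

console.log(isFastTier("claude-3-5-haiku")); // fast tier -> routed free
console.log(isFastTier("claude-sonnet-4")); // heavy tier -> your paid account
```

Anything that doesn't match falls through to your paid accounts, exactly as the toggle section below describes.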
The Toggle
One switch in the dashboard: "Enable Free Models." Flip it on. Done.
When it's on, fast-tier requests go to free providers. When it's off, everything goes through your accounts. It's that simple.
The Other Secret: Load Balancing Across API Keys
Here's something nobody talks about: API key load balancing.
You have 3 OpenAI keys? They're not all created equal. Key #1 might hit its rate limit at 2 PM. Key #2 at 5 PM. Key #3 is barely used.
ProxyPool Hub now tracks:
- Total requests per key
- Rate limit cooldown status
- Error history
And automatically routes to the least-used available key.
The Algorithm
```javascript
// Simplified: drop keys in rate-limit cooldown, then pick the least-used one
// (the `keys` array and `inCooldown` flag are illustrative field names)
const availableKeys = keys.filter((key) => !key.inCooldown);
availableKeys.sort((a, b) => a.totalRequests - b.totalRequests);
return availableKeys[0];
```
That's it. When one key hits rate limit, it automatically tries the next. Your code doesn't know — or care — which key it's using.
But Wait, There's More
App-Specific Routing
You can now bind:
- Codex → your Azure OpenAI endpoint (fastest)
- Claude Code → your Claude Pro account
- OpenClaw → any available key
Each app gets its own priority list of credentials. If the first choice is unavailable, it tries the second. If all fail and you enabled fallback, it falls back to the automatic pool.
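The priority-then-fallback behavior above can be sketched like this. The object shapes and names are illustrative assumptions, not ProxyPool Hub's actual API:

```javascript
// Hypothetical sketch of per-app credential resolution with fallback.
function resolveCredential(appRoute, pool) {
  // Walk the app's priority list; first available credential wins.
  for (const cred of appRoute.priority) {
    if (cred.available) return cred;
  }
  // Everything in the list failed: fall back to the automatic pool if enabled.
  return appRoute.fallbackToPool ? pool.pickLeastUsed() : null;
}
```

So if Codex is bound to [Azure, OpenAI] and the Azure endpoint is down, the OpenAI key gets used; if both are down and fallback is on, the shared pool takes over.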
8 Provider Support
We added 4 new providers since last time:
- MiniMax
- Moonshot (Kimi)
- ZhipuAI (GLM)
- Vertex AI
You can now mix and match across 8 different AI providers in a single dashboard.
The Real Cost
Let's do the math.
Without ProxyPool Hub:
- Claude Code → ~$20/month (Anthropic API)
- Codex → ~$15/month (OpenAI)
- Gemini CLI → $0 (free tier, limited)
- Total: ~$35/month
With ProxyPool Hub:
- Fast requests → Free (routed to free models)
- Heavy requests → Your existing keys (shared across tools)
- Total: $0/month (if you already have accounts)
The only "cost" is running a local Node.js process. That's it.
Setup (30 Seconds)
```shell
npx proxypool-hub@latest start
```
Open localhost:8081. Add your accounts. Flip the "Free Models" switch. Start coding.
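If your tool lets you override its API base URL, pointing it at the proxy is usually one environment variable. A hedged example for OpenAI-compatible clients (the `/v1` path is an assumption; check the dashboard for the exact base URL ProxyPool Hub exposes):

```shell
# Point an OpenAI-compatible client at the local proxy.
# The /v1 path is an assumption; confirm the base URL in the dashboard.
export OPENAI_BASE_URL="http://localhost:8081/v1"
```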
The Catch
There isn't one.
- It's 100% local — nothing leaves your machine
- No telemetry, no tracking, no data collection
- Credentials stored with restrictive file permissions
- Runs on localhost — no cloud relay
Try It
Star it if it saves you money. Or don't. But at least try the free model routing — I dare you.
ProxyPool Hub: open-source under AGPL-3.0. Not affiliated with Anthropic, OpenAI, or Google.