Lakshmi Sravya Vedantham
I Built a Skill So Claude Automatically Routes Tasks to Free-Tier AI Providers


Here's a problem I kept running into: I have free-tier access to Groq, OpenAI, Gemini, and MiniMax, but managing them manually is painful. I'd pick the wrong tool for the job, accidentally burn through monthly limits, and have no visibility into what had already been used.

I built agent-hub to fix this. It's a Claude Code skill that makes Claude the orchestrator — every task is automatically classified, routed to the best provider, tracked against free limits, and shown in a live status bar.


How it works

Claude classifies every incoming message into a task type and routes it:

| Task type | Signals | Provider |
|---|---|---|
| code | write / fix / debug / refactor | Codex (gpt-4o-mini) |
| research | explain / summarize / compare | Gemini (gemini-2.0-flash) |
| creative | story / dialogue / narrative | MiniMax |
| fast | yes/no questions, quick lookups | Groq |
| general | everything else | Groq |

Classification happens before calling any API — router.py checks token budgets in usage.json, picks the best available provider, calls its API, and returns the response.
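As a sketch, keyword-based classification plus a routing table might look like the following (the keyword lists, function names, and matching logic here are illustrative assumptions, not router.py's actual internals):

```python
# Illustrative sketch of keyword-based task classification.
# Keyword lists and names are assumptions, not router.py's real code.
TASK_KEYWORDS = {
    "code": ["write", "fix", "debug", "refactor"],
    "research": ["explain", "summarize", "compare"],
    "creative": ["story", "dialogue", "narrative"],
    "fast": ["yes or no", "quick"],
}

ROUTES = {
    "code": "codex",
    "research": "gemini",
    "creative": "minimax",
    "fast": "groq",
    "general": "groq",
}

def classify(message: str) -> str:
    """Return the task type for a message, defaulting to 'general'."""
    text = message.lower()
    for task_type, keywords in TASK_KEYWORDS.items():
        if any(kw in text for kw in keywords):
            return task_type
    return "general"

def route(message: str) -> str:
    """Map a message to its primary provider."""
    return ROUTES[classify(message)]
```

In practice the classifier runs first, then the budget check decides whether the primary provider or its fallback actually receives the call.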

Auto-fallback: when a provider drops below 10% of its free tier, traffic automatically shifts to its fallback (Groq→Gemini, Codex→Groq, Gemini→MiniMax, MiniMax→Gemini).

Hard stop: if both primary and fallback are exhausted, the router surfaces it to the user and stops. No silent failures.
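The fallback chain from the two paragraphs above can be sketched as a small lookup plus a threshold check (the function name and the `remaining` fractions are assumptions; router.py tracks raw request and token counts):

```python
# Fallback chain as described: Groq->Gemini, Codex->Groq,
# Gemini->MiniMax, MiniMax->Gemini. Names are illustrative.
FALLBACK = {"groq": "gemini", "codex": "groq", "gemini": "minimax", "minimax": "gemini"}
LOW_WATER = 0.10  # below 10% of the free tier remaining triggers fallback

def pick_provider(primary: str, remaining: dict) -> str:
    """Return the primary if it has budget, else its fallback;
    raise so the caller can surface exhaustion instead of failing silently."""
    if remaining[primary] >= LOW_WATER:
        return primary
    fallback = FALLBACK[primary]
    if remaining[fallback] >= LOW_WATER:
        return fallback
    raise RuntimeError(f"{primary} and {fallback} are both exhausted")
```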


The status bar

Every response starts with a live usage bar:

```
[GROQ ●] Groq: 1,240/14,400 · Codex: 77/500 · Gemini: 108K/1M · MiniMax: 220K/1M
```
  • green — above 50% remaining
  • yellow — 10–50% remaining
  • red — below 10% (fallback active)
  • gray — exhausted

The active provider appears first in brackets. All four counts are always shown.
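The color legend maps directly to a small threshold function; this is a sketch of the legend above, not router.py's actual rendering code:

```python
def bar_color(remaining_fraction: float) -> str:
    """Map the remaining free-tier fraction (0.0-1.0) to a status color,
    following the legend: green >50%, yellow 10-50%, red <10%, gray at 0."""
    if remaining_fraction <= 0:
        return "gray"    # exhausted
    if remaining_fraction < 0.10:
        return "red"     # fallback active
    if remaining_fraction <= 0.50:
        return "yellow"
    return "green"
```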


Free tier limits

| Provider | Model | Free limit |
|---|---|---|
| Groq | llama-3.3-70b-versatile | 14,400 requests/day |
| Codex | gpt-4o-mini | 500 requests/month |
| Gemini | gemini-2.0-flash | 1M tokens/day |
| MiniMax | abab6.5s-chat | 1M tokens/month |

Install

```bash
SKILL_DIR="$HOME/.claude/plugins/cache/claude-plugins-official/superpowers/5.0.5/skills/agent-hub"
mkdir -p "$SKILL_DIR/tests"
curl -o "$SKILL_DIR/router.py" https://raw.githubusercontent.com/LakshmiSravyaVedantham/agent-hub/main/router.py
curl -o "$SKILL_DIR/SKILL.md" https://raw.githubusercontent.com/LakshmiSravyaVedantham/agent-hub/main/SKILL.md
```

Then set your keys:

```bash
ROUTER="$SKILL_DIR/router.py"
python3 "$ROUTER" set-key groq gsk_...
python3 "$ROUTER" set-key codex sk-...
python3 "$ROUTER" set-key gemini AIza...
python3 "$ROUTER" set-key minimax ...
python3 "$ROUTER" set-key minimax-group-id ...
```

Keys are stored in ~/.claude/agent-hub/.env with chmod 600 — never in code or skill files.
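A minimal version of that pattern (write the key to the .env file, then restrict it to owner read/write) could look like the following; the function signature and the uppercase `NAME=value` line format are assumptions for illustration:

```python
import os
from pathlib import Path

# Default path matches the article; the rest is an illustrative sketch.
ENV_PATH = Path.home() / ".claude" / "agent-hub" / ".env"

def set_key(name: str, value: str, env_path: Path = ENV_PATH) -> None:
    """Add or replace a NAME=value line in the .env file, keeping it chmod 600."""
    env_path.parent.mkdir(parents=True, exist_ok=True)
    lines = []
    if env_path.exists():
        # Drop any existing line for this key before re-adding it.
        lines = [line for line in env_path.read_text().splitlines()
                 if not line.startswith(f"{name.upper()}=")]
    lines.append(f"{name.upper()}={value}")
    env_path.write_text("\n".join(lines) + "\n")
    os.chmod(env_path, 0o600)  # owner read/write only
```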


Usage

In any Claude Code session:

```
superpowers:agent-hub
```

Claude validates your keys, shows the initial token bar, and from that point every message is classified and routed automatically. You don't think about providers — Claude does.

Or call router.py directly:

```bash
python3 "$ROUTER" route "explain how transformers work" --type research
python3 "$ROUTER" status
python3 "$ROUTER" reset groq
```

Under the hood

router.py is ~530 lines of Python 3.8+, with two dependencies (requests, python-dotenv) and 61 tests. Auto-reset logic zeros daily counters at UTC midnight and monthly counters on the 1st. Writes to usage.json are atomic (write to a .tmp file, then rename), so there are never partial writes.
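The .tmp + rename pattern is small enough to show; a minimal sketch, assuming `usage.json` holds a plain dict (the function name is illustrative):

```python
import json
import os
import tempfile

def save_usage(path: str, usage: dict) -> None:
    """Write usage data atomically: write a temp file in the same
    directory, then rename over the target. os.replace is atomic on
    POSIX, so readers never observe a half-written usage.json."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(usage, f)
        os.replace(tmp, path)  # atomic swap
    except BaseException:
        os.unlink(tmp)  # clean up the temp file on any failure
        raise
```

Writing the temp file in the same directory matters: rename is only atomic within one filesystem.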


What's next

  • Three-agent coordination — tested with one session, curious how routing behaves when multiple agents compete for the same provider budget
  • Claude Code hooks integration — auto-register sessions via hooks so the status bar appears without manually invoking the skill
  • File section locks — paired with agent-comms for per-file, per-section coordination

GitHub: LakshmiSravyaVedantham/agent-hub — Python 3.8+, MIT license.
