Lakshmi Sravya Vedantham
I Built a Skill So Claude Automatically Routes Tasks to Free-Tier AI Providers


Here's a problem I kept running into: I have free-tier access to Groq, OpenAI, Gemini, and MiniMax, but managing them manually is painful. I'd pick the wrong tool for the job, accidentally burn through monthly limits, and have no visibility into what had already been used.

I built agent-hub to fix this. It's a Claude Code skill that makes Claude the orchestrator — every task is automatically classified, routed to the best provider, tracked against free limits, and shown in a live status bar.


How it works

Claude classifies every incoming message into a task type and routes it:

| Task type | Signals | Provider |
|---|---|---|
| code | write / fix / debug / refactor | Codex (gpt-4o-mini) |
| research | explain / summarize / compare | Gemini (gemini-2.0-flash) |
| creative | story / dialogue / narrative | MiniMax |
| fast | yes/no questions, quick lookups | Groq |
| general | everything else | Groq |

Classification happens before calling any API — router.py checks token budgets in usage.json, picks the best available provider, calls its API, and returns the response.
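As a sketch, keyword-based classification plus a routing table might look like the following (the keyword lists, function names, and matching logic here are illustrative assumptions, not router.py's actual internals):

```python
# Illustrative sketch of keyword-based task classification.
# Keyword lists and names are assumptions, not router.py's real code.
TASK_KEYWORDS = {
    "code": ["write", "fix", "debug", "refactor"],
    "research": ["explain", "summarize", "compare"],
    "creative": ["story", "dialogue", "narrative"],
    "fast": ["yes or no", "quick"],
}

ROUTES = {
    "code": "codex",
    "research": "gemini",
    "creative": "minimax",
    "fast": "groq",
    "general": "groq",
}

def classify(message: str) -> str:
    """Return the task type for a message, defaulting to 'general'."""
    text = message.lower()
    for task_type, keywords in TASK_KEYWORDS.items():
        if any(kw in text for kw in keywords):
            return task_type
    return "general"

def route(message: str) -> str:
    """Map a message to its primary provider."""
    return ROUTES[classify(message)]
```

In practice the classifier runs first, then the budget check decides whether the primary provider or its fallback actually receives the call.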

Auto-fallback: when a provider drops below 10% of its free tier, traffic automatically shifts to its fallback (Groq→Gemini, Codex→Groq, Gemini→MiniMax, MiniMax→Gemini).

Hard stop: if both primary and fallback are exhausted, the router surfaces it to the user and stops. No silent failures.
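The fallback chain from the two paragraphs above can be sketched as a small lookup plus a threshold check (the function name and the `remaining` fractions are assumptions; router.py tracks raw request and token counts):

```python
# Fallback chain as described: Groq->Gemini, Codex->Groq,
# Gemini->MiniMax, MiniMax->Gemini. Names are illustrative.
FALLBACK = {"groq": "gemini", "codex": "groq", "gemini": "minimax", "minimax": "gemini"}
LOW_WATER = 0.10  # below 10% of the free tier remaining triggers fallback

def pick_provider(primary: str, remaining: dict) -> str:
    """Return the primary if it has budget, else its fallback;
    raise so the caller can surface exhaustion instead of failing silently."""
    if remaining[primary] >= LOW_WATER:
        return primary
    fallback = FALLBACK[primary]
    if remaining[fallback] >= LOW_WATER:
        return fallback
    raise RuntimeError(f"{primary} and {fallback} are both exhausted")
```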


The status bar

Every response starts with a live usage bar:

```
[GROQ ●] Groq: 1,240/14,400 · Codex: 77/500 · Gemini: 108K/1M · MiniMax: 220K/1M
```
  • green — above 50% remaining
  • yellow — 10–50% remaining
  • red — below 10% (fallback active)
  • gray — exhausted

The active provider appears first in brackets. All four counts are always shown.
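The color legend maps directly to a small threshold function; this is a sketch of the legend above, not router.py's actual rendering code:

```python
def bar_color(remaining_fraction: float) -> str:
    """Map the remaining free-tier fraction (0.0-1.0) to a status color,
    following the legend: green >50%, yellow 10-50%, red <10%, gray at 0."""
    if remaining_fraction <= 0:
        return "gray"    # exhausted
    if remaining_fraction < 0.10:
        return "red"     # fallback active
    if remaining_fraction <= 0.50:
        return "yellow"
    return "green"
```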


Free tier limits

| Provider | Model | Free limit |
|---|---|---|
| Groq | llama-3.3-70b-versatile | 14,400 requests/day |
| Codex | gpt-4o-mini | 500 requests/month |
| Gemini | gemini-2.0-flash | 1M tokens/day |
| MiniMax | abab6.5s-chat | 1M tokens/month |

Install

```bash
SKILL_DIR="$HOME/.claude/plugins/cache/claude-plugins-official/superpowers/5.0.5/skills/agent-hub"
mkdir -p "$SKILL_DIR/tests"
curl -o "$SKILL_DIR/router.py" https://raw.githubusercontent.com/LakshmiSravyaVedantham/agent-hub/main/router.py
curl -o "$SKILL_DIR/SKILL.md" https://raw.githubusercontent.com/LakshmiSravyaVedantham/agent-hub/main/SKILL.md
```

Then set your keys:

```bash
ROUTER="$SKILL_DIR/router.py"
python3 "$ROUTER" set-key groq gsk_...
python3 "$ROUTER" set-key codex sk-...
python3 "$ROUTER" set-key gemini AIza...
python3 "$ROUTER" set-key minimax ...
python3 "$ROUTER" set-key minimax-group-id ...
```

Keys are stored in ~/.claude/agent-hub/.env with chmod 600 — never in code or skill files.
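A minimal version of that pattern (write the key to the .env file, then restrict it to owner read/write) could look like the following; the function signature and the uppercase `NAME=value` line format are assumptions for illustration:

```python
import os
from pathlib import Path

# Default path matches the article; the rest is an illustrative sketch.
ENV_PATH = Path.home() / ".claude" / "agent-hub" / ".env"

def set_key(name: str, value: str, env_path: Path = ENV_PATH) -> None:
    """Add or replace a NAME=value line in the .env file, keeping it chmod 600."""
    env_path.parent.mkdir(parents=True, exist_ok=True)
    lines = []
    if env_path.exists():
        # Drop any existing line for this key before re-adding it.
        lines = [line for line in env_path.read_text().splitlines()
                 if not line.startswith(f"{name.upper()}=")]
    lines.append(f"{name.upper()}={value}")
    env_path.write_text("\n".join(lines) + "\n")
    os.chmod(env_path, 0o600)  # owner read/write only
```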


Usage

In any Claude Code session:

```
superpowers:agent-hub
```

Claude validates your keys, shows the initial token bar, and from that point every message is classified and routed automatically. You don't think about providers — Claude does.

Or call router.py directly:

```bash
python3 "$ROUTER" route "explain how transformers work" --type research
python3 "$ROUTER" status
python3 "$ROUTER" reset groq
```

Under the hood

router.py is ~530 lines of Python 3.8+, with two dependencies (requests, python-dotenv) and 61 tests. Auto-reset logic zeros daily counters at UTC midnight and monthly counters on the 1st. Writes to usage.json are atomic (write to a .tmp file, then rename), so there are never partial writes.
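The .tmp + rename pattern is small enough to show; a minimal sketch, assuming `usage.json` holds a plain dict (the function name is illustrative):

```python
import json
import os
import tempfile

def save_usage(path: str, usage: dict) -> None:
    """Write usage data atomically: write a temp file in the same
    directory, then rename over the target. os.replace is atomic on
    POSIX, so readers never observe a half-written usage.json."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(usage, f)
        os.replace(tmp, path)  # atomic swap
    except BaseException:
        os.unlink(tmp)  # clean up the temp file on any failure
        raise
```

Writing the temp file in the same directory matters: rename is only atomic within one filesystem.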


What's next

  • Three-agent coordination — tested with one session, curious how routing behaves when multiple agents compete for the same provider budget
  • Claude Code hooks integration — auto-register sessions via hooks so the status bar appears without manually invoking the skill
  • File section locks — paired with agent-comms for per-file, per-section coordination

GitHub: LakshmiSravyaVedantham/agent-hub — Python 3.8+, MIT license.
