DEV Community

sophiaashi

I Tracked Every API Call for 30 Days. 60% Were Wasting Money on the Wrong Model.

Decided to actually log every API call for a month. The results were embarrassing.

Out of roughly 3000 calls:

  • 1800 (60%) were simple: file reads, grep, reformatting, basic Q&A
  • 750 (25%) were medium: code refactors, test generation, summarization
  • 450 (15%) were complex: architecture decisions, multi-file debugging

I was sending ALL of them to Claude Sonnet at $15/million tokens.

The simple 60% runs identically on DeepSeek-V3 at $1.80/million. The medium 25% works fine on GPT-4o at $5/million. Only the complex 15% actually needs Sonnet.
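If you want to run the same logging exercise, even a crude keyword bucketer gives a first cut at the simple/medium/complex split. This heuristic is purely illustrative (a toy of mine, not how any production router actually classifies):

```python
# Toy classifier for bucketing prompts before routing.
# The keyword lists are illustrative guesses, not a tested taxonomy.
def classify(prompt: str) -> str:
    p = prompt.lower()
    if any(k in p for k in ("read file", "grep", "reformat", "what is")):
        return "simple"   # file reads, grep, reformatting, basic Q&A
    if any(k in p for k in ("refactor", "write tests", "summarize")):
        return "medium"   # refactors, test generation, summarization
    return "complex"      # architecture, multi-file debugging, anything unsure
```

Defaulting the unknown bucket to "complex" is deliberate: misrouting a hard task to a cheap model costs more in rework than the tokens save.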

Before and After

  • Before routing: ~$240/month (everything on Sonnet)
  • After routing by task type: ~$140/month
  • Saved: ~$100/month with zero quality loss
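Worth noting: the savings track token share, not call counts, because complex calls burn far more tokens each. A back-of-envelope sanity check (the token shares below are my assumptions for illustration, not logged numbers):

```python
# Back-of-envelope cost model. Prices are $/million tokens from above;
# the token shares are ASSUMED (complex calls use many more tokens per call).
PRICE = {"deepseek-v3": 1.80, "gpt-4o": 5.00, "claude-sonnet": 15.00}
TOKEN_SHARE = {"deepseek-v3": 0.35, "gpt-4o": 0.25, "claude-sonnet": 0.40}

monthly_tokens_m = 240 / PRICE["claude-sonnet"]   # ~16M tokens/month on Sonnet
before = monthly_tokens_m * PRICE["claude-sonnet"]
after = monthly_tokens_m * sum(PRICE[m] * s for m, s in TOKEN_SHARE.items())
# before = 240.0; after ≈ 126 under these assumed shares
```

The point isn't the exact number, it's that 60% of calls being cheap does not mean 60% of spend is cheap.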

The Surprising Part

I had no idea 60% of my daily work was basically file reads and simple edits until I actually logged it. We all think we are doing complex reasoning all day, but the data says otherwise.

How I Automated It

Manually switching models was annoying. I use TeamoRouter to auto-pick the cheapest model per task. One API key, installs in OpenClaw in 2 seconds.

Routing modes:

  • teamo-balanced — auto-picks best value per task (my default)
  • teamo-best — always highest quality
  • teamo-eco — always cheapest
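I won't paste real endpoint details here, but assuming the router speaks the standard OpenAI-style chat-completions schema (most routing proxies do; check TeamoRouter's docs for the actual base URL), switching modes is just a model-name swap. This payload builder is a sketch under that assumption:

```python
# Build a chat-completions payload for a routing proxy.
# ASSUMPTION: the endpoint accepts the standard OpenAI-style request body;
# only the mode names come from the list above.
MODES = {"teamo-balanced", "teamo-best", "teamo-eco"}

def make_request(prompt: str, mode: str = "teamo-balanced") -> dict:
    if mode not in MODES:
        raise ValueError(f"unknown routing mode: {mode}")
    return {
        "model": mode,
        "messages": [{"role": "user", "content": prompt}],
    }
```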

My Routing Config

I shared my full task-to-model routing table (with exact per-task costs) in our Discord. Too detailed to format in a blog post, but the short version:

Task Type            Model           Cost/1K tokens
File reads, grep     DeepSeek-V3     $0.0014
Simple refactors     DeepSeek-V3     $0.0014
Code review          GPT-4o          $0.005
Summarization        Gemini Flash    $0.0005
Architecture         Claude Sonnet   $0.015
Complex debugging    Claude Sonnet   $0.015
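For anyone who'd rather copy a config than a table, here's the same mapping as a lookup. The task keys and the fall-through default are my choices; the prices are the per-1K figures from the table:

```python
# Task-to-model routing table ($/1K tokens, from the table above).
ROUTES = {
    "file_reads":        ("deepseek-v3",   0.0014),
    "simple_refactors":  ("deepseek-v3",   0.0014),
    "code_review":       ("gpt-4o",        0.005),
    "summarization":     ("gemini-flash",  0.0005),
    "architecture":      ("claude-sonnet", 0.015),
    "complex_debugging": ("claude-sonnet", 0.015),
}

def route(task_type: str) -> str:
    # Unknown tasks fall through to the strongest model, not the cheapest.
    model, _cost = ROUTES.get(task_type, ("claude-sonnet", 0.015))
    return model
```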

Curious if anyone else has done this exercise. Does the 60/25/15 split hold for you, or is your workflow different?
