After months of manually deciding which model to use for each task, I wrote down my actual decision process. Sharing because it might save you time.
## The Decision Tree
**Is this task reading or writing?**
- Reading (file reads, grep, search) → cheapest model (DeepSeek-V3 or free MiniMax M2.7)
**Is it modifying existing code?**
- Simple modification (rename, format, extract function) → cheap model (DeepSeek-V3)
- Complex modification (refactor across 3+ files) → expensive model (Claude Sonnet)
**Is it generating new code?**
- Boilerplate (tests, docs, type definitions) → cheap model
- Architecture or design → expensive model
**Is it debugging?**
- Simple error (typo, missing import, syntax) → cheap model
- Complex (race condition, state management, async flow) → expensive model
**Is it analysis or review?**
- Summarization → Gemini Flash (fastest for this)
- Code review → GPT-4o (catches different things than Claude)
- Security audit → Claude Sonnet (most thorough)
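The tree above is simple enough to express in a few lines of code. Here's a minimal sketch in Python — the `pick_model` function, the tier names, and the model identifiers are illustrative, not part of any real routing library:

```python
# Sketch of the decision tree above. Task types and model names are
# illustrative; adapt them to whatever API identifiers you actually use.
def pick_model(task_type: str, complexity: str = "simple") -> str:
    """Route a task to a model tier per the decision tree."""
    cheap, premium = "deepseek-v3", "claude-sonnet"

    if task_type == "read":        # file reads, grep, search
        return cheap
    if task_type in ("modify", "generate", "debug"):
        # simple edits/boilerplate/typos -> cheap; refactors,
        # architecture, race conditions -> premium
        return cheap if complexity == "simple" else premium
    if task_type == "summarize":
        return "gemini-flash"      # fastest for summarization
    if task_type == "review":
        return "gpt-4o"            # catches different things than Claude
    if task_type == "security":
        return premium             # most thorough
    return premium                 # unknown task: fail toward quality
```

Failing toward the expensive model on unknown task types is deliberate: misrouting a hard task to a cheap model costs you a retry, which usually erases the savings.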
## The Numbers
Applying this tree to my actual usage:
- 60% of tasks → cheap model ($0.0014/1K tokens)
- 25% of tasks → mid-tier ($0.005/1K tokens)
- 15% of tasks → premium ($0.015/1K tokens)
Monthly cost: $240 → $140. Same output quality on every task that matters.
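For a sanity check, you can compute the blended per-token price implied by that split. This assumes tokens are distributed the same way as tasks, which is a simplification — complex tasks usually burn more tokens than simple ones, so real savings will be smaller than the blended rate suggests:

```python
# Blended price per 1K tokens implied by the 60/25/15 task split,
# assuming token volume tracks task count (a rough simplification).
tiers = [
    ("cheap",   0.0014, 0.60),
    ("mid",     0.005,  0.25),
    ("premium", 0.015,  0.15),
]
blended = sum(price * share for _, price, share in tiers)
print(f"${blended:.5f} per 1K tokens")  # $0.00434 vs $0.01500 all-premium
```

That gap between the blended rate and the all-premium rate is where the monthly savings come from.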
## Automating the Decision
I lasted about a week applying this tree manually before giving up. Now I use TeamoRouter to automate it — its teamo-balanced mode does roughly what this decision tree describes.
It also has a free tier (teamo-free) with unlimited MiniMax M2.7 calls if you just want to try offloading simple tasks.
There's also a Discord where we compare routing strategies and share configs.