sophiaashi

My Decision Tree for Picking the Right LLM Per Task (Saves 40%)

After months of manually deciding which model to use for each task, I wrote down my actual decision process. Sharing because it might save you time.

The Decision Tree

Is this task reading or writing?

  • Reading (file reads, grep, search) → cheapest model (DeepSeek-V3 or free MiniMax M2.7)
  • Writing → keep going down the tree

Is it modifying existing code?

  • Simple modification (rename, format, extract function) → cheap model (DeepSeek-V3)
  • Complex modification (refactor across 3+ files) → expensive model (Claude Sonnet)

Is it generating new code?

  • Boilerplate (tests, docs, type definitions) → cheap model
  • Architecture or design → expensive model

Is it debugging?

  • Simple error (typo, missing import, syntax) → cheap model
  • Complex (race condition, state management, async flow) → expensive model

Is it analysis or review?

  • Summarization → Gemini Flash (fastest for this)
  • Code review → GPT-4o (catches different things than Claude)
  • Security audit → Claude Sonnet (most thorough)
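The whole tree boils down to a classification followed by a lookup. Here's a minimal sketch of it as code; the `TaskKind` labels and model identifier strings are illustrative placeholders I chose, not any router's actual API:

```python
from enum import Enum, auto

class TaskKind(Enum):
    READ = auto()          # file reads, grep, search
    SIMPLE_EDIT = auto()   # rename, format, extract function
    COMPLEX_EDIT = auto()  # refactor across 3+ files
    BOILERPLATE = auto()   # tests, docs, type definitions
    DESIGN = auto()        # architecture or design work
    SIMPLE_BUG = auto()    # typo, missing import, syntax error
    COMPLEX_BUG = auto()   # race condition, state, async flow
    SUMMARIZE = auto()
    CODE_REVIEW = auto()
    SECURITY_AUDIT = auto()

def pick_model(kind: TaskKind) -> str:
    """Map a classified task to a model tier per the decision tree."""
    cheap = "deepseek-v3"
    premium = "claude-sonnet"
    routes = {
        TaskKind.READ: cheap,
        TaskKind.SIMPLE_EDIT: cheap,
        TaskKind.COMPLEX_EDIT: premium,
        TaskKind.BOILERPLATE: cheap,
        TaskKind.DESIGN: premium,
        TaskKind.SIMPLE_BUG: cheap,
        TaskKind.COMPLEX_BUG: premium,
        TaskKind.SUMMARIZE: "gemini-flash",
        TaskKind.CODE_REVIEW: "gpt-4o",
        TaskKind.SECURITY_AUDIT: premium,
    }
    return routes[kind]
```

The hard part in practice is the classification step (is this refactor "simple" or "3+ files"?), which is exactly why doing it by hand gets old fast.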

The Numbers

Applying this tree to my actual usage:

  • 60% of tasks → cheap model ($0.0014/1K tokens)
  • 25% of tasks → mid-tier ($0.005/1K tokens)
  • 15% of tasks → premium ($0.015/1K tokens)

Monthly cost: $240 → $140, a roughly 40% saving, with the same output quality on every task that matters.
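As a sanity check, the blended per-token cost under this split works out like this (percentages and per-1K prices are the ones from the list above; this is back-of-envelope arithmetic, not a billing calculation):

```python
# (tier, share of tasks, price per 1K tokens) from the breakdown above
tiers = [
    ("cheap",   0.60, 0.0014),
    ("mid",     0.25, 0.005),
    ("premium", 0.15, 0.015),
]

blended = sum(share * price for _, share, price in tiers)
print(f"blended: ${blended:.5f}/1K tokens")   # → blended: $0.00434/1K tokens

# versus sending everything to the premium model
print(f"ratio vs all-premium: {blended / 0.015:.0%}")
```

About $0.00434/1K blended versus $0.015/1K all-premium, i.e. under a third of the per-token price; actual dollar savings depend on how tokens (not just tasks) distribute across tiers, which is why the monthly bill drops ~40% rather than ~70%.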

Automating the Decision

Following this tree manually lasted about a week before I gave up. I use TeamoRouter to automate it — teamo-balanced mode does roughly what this decision tree describes.

It also has a free tier (teamo-free) with unlimited MiniMax M2.7 calls if you just want to try offloading simple tasks.


There's also a Discord where we compare routing strategies and share configs.
