Picking a model is mostly an economics + risk decision.
If you always default to the "best" model, you'll burn money.
If you always default to the cheapest, you'll burn time.
Here's a practical way to choose between GPT, Claude, and local open-source models without getting religious about it.
## The only question that matters
What's more expensive: tokens or your time?
- If you're doing a low-stakes task (rewrite an email, summarize notes), latency + cost dominate.
- If you're doing a high-stakes task (security review, architecture decision), correctness dominates.
So we'll start with a decision tree.
## A simple decision tree
1) Is the output going straight to a user/customer?
- Yes → use your most reliable model (often Claude or top-tier GPT) and add a verification pass.
- No → go cheaper/faster.
2) Is there a deterministic verifier?
If you can verify automatically (tests, typecheck, lint, schema validation), you can use a cheaper model because mistakes get caught.
- Yes (verifier exists) → cheaper model is fine.
- No (human review only) → pay for reliability.
3) How "long-context" is the task?
- Short context (one function, one page) → any decent model.
- Long context (multi-file refactor, big doc, many constraints) → pick the model that handles long inputs well and stays consistent.
4) Is privacy/compliance a constraint?
- Yes (PII, internal code you can't upload) → local model or approved enterprise setup.
- No → cloud models are fine.
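The four questions above can be flattened into a tiny routing function. This is one possible flattening, not the only one; `Task`, `pick_tier`, and the tier names are my own labels for illustration:

```python
from dataclasses import dataclass

@dataclass
class Task:
    user_facing: bool    # question 1: output goes straight to a user?
    has_verifier: bool   # question 2: tests/lint/schema catch mistakes?
    long_context: bool   # question 3: multi-file / big-doc task?
    private_data: bool   # question 4: data can't leave the machine?

def pick_tier(task: Task) -> str:
    """Map the decision tree to a model tier."""
    if task.private_data:
        return "local"      # compliance wins over everything else
    if task.user_facing:
        return "premium"    # customer-facing: pay for reliability
    if not task.has_verifier:
        return "premium"    # human review only: mistakes are expensive
    if task.long_context:
        return "premium"    # long inputs need consistency
    return "cheap"          # verified, internal, short: iterate cheaply
```

The ordering encodes the priorities: privacy is a hard constraint, then risk, then context length.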
## My default mapping (works surprisingly well)
Use GPT when:
- you want speed and decent quality
- you have a verifier (unit tests, linter, schema)
- you're iterating quickly (lots of small prompts)
Examples:
- generating unit tests (then running them)
- writing boilerplate code
- converting JSON ↔ YAML
- drafting a README section
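That "you have a verifier" bullet can be as simple as a JSON check on the model's output. A minimal sketch, stdlib only; the function name and required-key convention are mine, not from any library:

```python
import json

def verify_output(raw: str, required_keys: set[str]) -> tuple[bool, str]:
    """Deterministic verifier: cheap-model output must parse as JSON
    and contain every required key, or we retry / escalate to a
    stronger model."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        return False, f"invalid JSON: {e}"
    missing = required_keys - data.keys()
    if missing:
        return False, f"missing keys: {sorted(missing)}"
    return True, "ok"
```

With a check like this in the loop, a cheap model plus a retry is often as reliable as a premium model without one.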
Use Claude when:
- you need consistency across many constraints
- the task is "soft" (writing, reasoning, tradeoffs)
- you want output that's less brittle, with fewer missed edge cases
Examples:
- architecture reviews
- "read this long incident report and propose fixes"
- multi-step refactor plan with migration steps
Use a local model when:
- the data can't leave your machine
- you want cheap, always-on "autocomplete" style help
- you're okay with rougher output, but you can iterate
Examples:
- internal code search + summarization
- drafting notes from private documents
- quick transformations that you'll manually validate
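As a concrete local setup, here's a sketch that talks to an Ollama server on its default port. The URL, payload fields, and `"response"` key follow Ollama's `/api/generate` endpoint as I understand it; adjust the model name and URL for your own local server:

```python
import json
import urllib.request

# Assumes an Ollama server on the default port; change to match your setup.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> bytes:
    """JSON payload for Ollama's /api/generate (non-streaming)."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()

def ask_local(model: str, prompt: str) -> str:
    """Send the prompt to the local server and return the completion text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_request(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Nothing leaves your machine, which is the whole point of this branch of the tree.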
## The underrated trick: two-model workflows
You don't have to pick one model.
Here are two workflows I use a lot:
1) Cheap draft → expensive review
Step 1 (cheap model): draft solution + diff
Step 2 (strong model): review diff, find risks, propose minimal fixes
Step 3 (cheap model): implement fixes
2) Strong planner → cheap executor
Step 1 (strong model): create a detailed step-by-step implementation plan + acceptance criteria
Step 2 (cheap model): implement one step at a time, with tests
This is how you keep quality high without paying top-tier tokens for everything.
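Both workflows can be sketched as a few function calls. `call_model` is stubbed here so the sketch runs; in practice you'd replace it with your real client, mapping each tier to a concrete model:

```python
def call_model(tier: str, prompt: str) -> str:
    """Stub standing in for a real LLM client (OpenAI, Anthropic, local).
    'tier' would map to a concrete model name in your setup."""
    return f"<{tier} model output for {len(prompt)} chars>"

def draft_review_fix(task: str) -> str:
    """Workflow 1: cheap draft -> expensive review -> cheap fix."""
    draft = call_model("cheap", f"Draft a solution as a diff:\n{task}")
    review = call_model("premium",
                        f"Review this diff; list risks and minimal fixes:\n{draft}")
    return call_model("cheap",
                      f"Apply these fixes:\nDiff:\n{draft}\nFixes:\n{review}")

def plan_then_execute(task: str) -> list[str]:
    """Workflow 2: strong planner -> cheap executor, one step at a time."""
    plan = call_model("premium",
                      f"Write a step-by-step plan with acceptance criteria:\n{task}")
    steps = [plan]  # in practice, parse the plan into individual steps
    return [call_model("cheap", f"Implement this step, with tests:\n{s}")
            for s in steps]
```

Notice the premium model is called exactly once in each workflow; the cheap model does all the token-heavy work.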
## A prompt you can reuse: "model selection as code"
When I'm unsure, I literally ask the model to choose.
```
You are my AI workflow engineer.
Given the task below, choose:
- Model tier: cheap | balanced | premium
- Why
- What verifier I should use (tests/lint/schema/human)
- Risks if I go cheaper

Task:
<PASTE>
```
It's a meta-prompt, but it helps you think in terms of risk + verification.
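If you want that meta-prompt handy in scripts, here's a minimal template wrapper; `selection_prompt` is my own naming, and you'd feed the result to whatever client you already use:

```python
META_PROMPT = """You are my AI workflow engineer.
Given the task below, choose:
- Model tier: cheap | balanced | premium
- Why
- What verifier I should use (tests/lint/schema/human)
- Risks if I go cheaper

Task:
{task}"""

def selection_prompt(task: str) -> str:
    """Fill the meta-prompt template with a concrete task description."""
    return META_PROMPT.format(task=task)
```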
## Common mistakes
- Using premium models for throwaway drafts → use cheap + verify.
- Using cheap models for irreversible decisions → pay for reliability.
- No verifier → add one (schema, tests, lint, even a checklist).
- Huge prompts → break into chains; big prompts are expensive and fragile.
If you want more practical templates for building AI workflows (prompt chains, review prompts, debugging playbooks), I'm building a Prompt Engineering Cheatsheet at Nova Press.
Free sample: https://getnovapress.gumroad.com