Discussion on: "I Pointed Claude Code at My Local Ollama Models — Here's the 3-Minute Setup"

View post

Replies for: The per-app routing config is a smart middle ground. One thing I'd push further: within the same app, the gap between "rename this variable" and "r...

You're right that per-app routing is a coarse approximation. Per-request classification is more powerful but harder to
get right — prompt length is a decent proxy but misses context (a short "refactor this function" can be surprisingly
hard if it touches 10 files).

One heuristic I've been thinking about: context token count as the routing signal. If the request includes more than N
context tokens, route to cloud. Simpler than NLP classification and correlates reasonably well with actual task
complexity.

Curious what signals you'd trust most in practice — keyword patterns or something else?