Anthropic just released Claude Opus 4.7. It's their best coding model yet — 13% better than Opus 4.6 on their internal 93-task benchmark, better vision, stronger at long-running agentic tasks.
It's also $5 per million input tokens and $25 per million output tokens. API only. Every character you type goes through Anthropic's servers.
Let's talk about what you can do locally for $0.
## what opus 4.7 actually brings
Based on Anthropic's announcement:
- 13% improvement over Opus 4.6 on a 93-task coding benchmark, including 4 tasks neither Opus 4.6 nor Sonnet 4.6 could solve
- Better vision — higher resolution image understanding
- Stronger agentic workflows — handles complex, multi-step tasks without losing context or stopping early
- Self-verification — the model checks its own outputs before reporting back
- Available on Claude API, Amazon Bedrock, Google Vertex AI, Microsoft Foundry
These are real improvements. Opus has been the go-to for serious coding work, and 4.7 makes it better.
But here's the thing.
## the cost of frontier cloud AI
At $5/$25 per million tokens, a heavy coding session with Opus 4.7 can easily run $2-5/day. A team of developers using it as their primary coding agent? That's hundreds per month.
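The arithmetic is easy to check yourself. A quick back-of-envelope sketch, where the token volumes are illustrative assumptions (not measurements of any real session):

```python
# Back-of-envelope cost for a heavy coding session at Opus 4.7 list prices.
# The token volumes below are assumed for illustration.
INPUT_PRICE = 5 / 1_000_000    # $ per input token
OUTPUT_PRICE = 25 / 1_000_000  # $ per output token

def session_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost for one day's API usage."""
    return input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE

# e.g. 400K input tokens (codebase context resent across many turns)
# and 60K generated tokens in a day:
daily = session_cost(400_000, 60_000)
print(f"${daily:.2f}/day")          # $3.50/day
print(f"${daily * 22:.2f}/month")   # $77.00/month per developer
```

Agentic workflows are input-heavy because the same context gets resent on every turn, which is why input volume dominates even at a fifth of the output price.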
And every line of your proprietary code flows through someone else's infrastructure. Every prompt, every codebase context, every business logic snippet — stored, processed, potentially used for training (even with opt-outs, you're trusting the provider).
For hobby projects, fine. For anything sensitive — financial code, healthcare logic, proprietary algorithms — that's a real risk.
## what runs locally right now
The local model landscape has changed dramatically in the last few months. Here's what's available today at $0/month:
### Qwen3.6-35B-A3B (released this week)
- 35B total parameters, 3B active (MoE architecture)
- 73.4 on SWE-bench Verified — autonomous bug fixing on real GitHub repos
- 51.5 on Terminal-Bench 2.0 — agentic terminal coding
- Built-in vision, 262K context
- Runs on 8 GB VRAM with Q4_K_M quantization
- Apache 2.0 license
Is it as good as Opus 4.7? On raw capability, probably not — Anthropic has massive compute advantages. But on the tasks most developers actually do daily (fixing bugs, writing functions, understanding codebases, code review), Qwen3.6 is genuinely competitive. And it runs on hardware you already own.
## the real comparison isn't benchmarks
It's this:
| | Claude Opus 4.7 | Qwen3.6-35B-A3B |
|---|---|---|
| Cost | $5/$25 per million tokens | $0 forever |
| Privacy | Cloud-processed | Never leaves your machine |
| Speed | Subject to API congestion | As fast as your GPU |
| Availability | Depends on Anthropic's uptime | Runs offline |
| Rate limits | Yes | No |
| Data retention | Anthropic's policy | You control everything |
| License | Proprietary | Apache 2.0 |
| Vision | Yes | Yes |
| Agentic coding | Yes (strong) | Yes (73.4 SWE-bench) |
| Setup | API key + credit card | Ollama + 10 minutes |
## how to set up the local alternative
```shell
ollama run qwen3.6:35b-a3b
```
That's it. Or if you want a full desktop experience with a coding agent, vision support, and model management:
Locally Uncensored just shipped v2.3.3 with Qwen3.6 day-0 support. It wraps Ollama into a desktop app with a built-in coding agent that streams output live between tool calls, an agent mode with 13 tools and MCP integration, and remote access from your phone. Open source, AGPL-3.0.
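Under the hood, everything local is just an HTTP API on localhost. A minimal sketch of calling Ollama's `/api/chat` endpoint with only the standard library; the model tag assumes the install command above, so swap in whatever `ollama list` shows on your machine:

```python
import json
import urllib.request

# Build a request against Ollama's local HTTP API (default port 11434).
def build_chat_request(prompt: str, model: str = "qwen3.6:35b-a3b"):
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # return one JSON object instead of a token stream
    }
    return urllib.request.Request(
        "http://localhost:11434/api/chat",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

req = build_chat_request("Write a Python function that reverses a linked list.")
# With Ollama running locally, send it like this:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["message"]["content"])
```

No API key, no credit card, no rate limit: the request never leaves your machine.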
## when cloud still makes sense
Being honest: there are cases where Opus 4.7 is worth the money.
- You need the absolute frontier of capability and $25/M output tokens is pocket change for your use case
- You're doing something that requires Anthropic's specific safety features
- You need the model to handle tasks that are genuinely beyond what open models can do today
- You don't have a GPU (though even a laptop with 8GB VRAM works for Qwen3.6)
For everyone else — the gap between cloud and local is closing fast. A model that scores 73.4 on SWE-bench running on a gaming laptop would have been science fiction two years ago.
## the trajectory matters more than today's snapshot
Every few months, a new open model drops that would have been frontier-class the year before. The pricing gap between cloud and local is structural: cloud will always bill per token, while local costs nothing beyond the hardware you already own and the electricity to run it.
Opus 4.7 is impressive. But the question isn't whether it's good — it's whether it's $5/$25 per million tokens better than what you can run yourself.
For a growing number of developers, the answer is no.
Locally Uncensored — open-source desktop app for local AI. Chat, coding agents, image gen, video gen. Qwen3.6 day-0 support. AGPL-3.0.