Anthropic just released Claude Opus 4.7. It's their best coding model yet — 13% better than Opus 4.6 on their internal 93-task benchmark, better vision, stronger at long-running agentic tasks.
It's also $5 per million input tokens and $25 per million output tokens. API only. Every character you type goes through Anthropic's servers.
Let's talk about what you can do locally for $0.
## what opus 4.7 actually brings
Based on Anthropic's announcement:
- 13% improvement over Opus 4.6 on a 93-task coding benchmark, including 4 tasks neither Opus 4.6 nor Sonnet 4.6 could solve
- Better vision — higher resolution image understanding
- Stronger agentic workflows — handles complex, multi-step tasks without losing context or stopping early
- Self-verification — the model checks its own outputs before reporting back
- Available on Claude API, Amazon Bedrock, Google Vertex AI, Microsoft Foundry
These are real improvements. Opus has been the go-to for serious coding work, and 4.7 makes it better.
But here's the thing.
## the cost of frontier cloud AI
At $5/$25 per million tokens, a heavy coding session with Opus 4.7 can easily run $2-5/day. A team of developers using it as their primary coding agent? That's hundreds per month.
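The arithmetic is easy to check yourself. A quick back-of-envelope sketch, where the token volumes are illustrative assumptions (not measurements of any real session):

```python
# Back-of-envelope cost for a heavy coding session at Opus 4.7 list prices.
# The token volumes below are assumed for illustration.
INPUT_PRICE = 5 / 1_000_000    # $ per input token
OUTPUT_PRICE = 25 / 1_000_000  # $ per output token

def session_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost for one day's API usage."""
    return input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE

# e.g. 400K input tokens (codebase context resent across many turns)
# and 60K generated tokens in a day:
daily = session_cost(400_000, 60_000)
print(f"${daily:.2f}/day")          # $3.50/day
print(f"${daily * 22:.2f}/month")   # $77.00/month per developer
```

Agentic workflows are input-heavy because the same context gets resent on every turn, which is why input volume dominates even at a fifth of the output price.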
And every line of your proprietary code flows through someone else's infrastructure. Every prompt, every codebase context, every business logic snippet — stored, processed, potentially used for training (even with opt-outs, you're trusting the provider).
For hobby projects, fine. For anything sensitive — financial code, healthcare logic, proprietary algorithms — that's a real risk.
## what runs locally right now
The local model landscape has changed dramatically in the last few months. Here's what's available today at $0/month:
### Qwen3.6-35B-A3B (released this week)
- 35B total parameters, 3B active (MoE architecture)
- 73.4 on SWE-bench Verified — autonomous bug fixing on real GitHub repos
- 51.5 on Terminal-Bench 2.0 — agentic terminal coding
- Built-in vision, 262K context
- Runs on 8 GB VRAM with Q4_K_M quantization
- Apache 2.0 license
Is it as good as Opus 4.7? On raw capability, probably not — Anthropic has massive compute advantages. But on the tasks most developers actually do daily (fixing bugs, writing functions, understanding codebases, code review), Qwen3.6 is genuinely competitive. And it runs on hardware you already own.
## the real comparison isn't benchmarks
It's this:
| | Claude Opus 4.7 | Qwen3.6-35B-A3B |
|---|---|---|
| Cost | $5/$25 per million tokens | $0 forever |
| Privacy | Cloud-processed | Never leaves your machine |
| Speed | Subject to API congestion | As fast as your GPU |
| Availability | Depends on Anthropic's uptime | Runs offline |
| Rate limits | Yes | No |
| Data retention | Anthropic's policy | You control everything |
| License | Proprietary | Apache 2.0 |
| Vision | Yes | Yes |
| Agentic coding | Yes (strong) | Yes (73.4 SWE-bench) |
| Setup | API key + credit card | Ollama + 10 minutes |
## how to set up the local alternative
```shell
ollama run qwen3.6:35b-a3b
```
That's it. Or if you want a full desktop experience with a coding agent, vision support, and model management:
Locally Uncensored just shipped v2.3.3 with Qwen3.6 day-0 support. It wraps Ollama into a desktop app with a built-in coding agent that streams output live between tool calls, an agent mode with 13 tools and MCP integration, and remote access from your phone. Open source, AGPL-3.0.
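Under the hood, everything local is just an HTTP API on localhost. A minimal sketch of calling Ollama's `/api/chat` endpoint with only the standard library; the model tag assumes the install command above, so swap in whatever `ollama list` shows on your machine:

```python
import json
import urllib.request

# Build a request against Ollama's local HTTP API (default port 11434).
def build_chat_request(prompt: str, model: str = "qwen3.6:35b-a3b"):
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # return one JSON object instead of a token stream
    }
    return urllib.request.Request(
        "http://localhost:11434/api/chat",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

req = build_chat_request("Write a Python function that reverses a linked list.")
# With Ollama running locally, send it like this:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["message"]["content"])
```

No API key, no credit card, no rate limit: the request never leaves your machine.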
## when cloud still makes sense
Being honest: there are cases where Opus 4.7 is worth the money.
- You need the absolute frontier of capability and $25/M output tokens is pocket change for your use case
- You're doing something that requires Anthropic's specific safety features
- You need the model to handle tasks that are genuinely beyond what open models can do today
- You don't have a GPU (though even a laptop with 8GB VRAM works for Qwen3.6)
For everyone else — the gap between cloud and local is closing fast. A model that scores 73.4 on SWE-bench running on a gaming laptop would have been science fiction two years ago.
## the trajectory matters more than today's snapshot
Every few months, a new open model drops that would have been frontier-class the year before. The pricing gap between cloud and local is structural: cloud will always bill per token, while local costs nothing beyond the hardware you already own and the electricity to run it.
Opus 4.7 is impressive. But the question isn't whether it's good — it's whether it's $5/$25 per million tokens better than what you can run yourself.
For a growing number of developers, the answer is no.
Locally Uncensored — open-source desktop app for local AI. Chat, coding agents, image gen, video gen. Qwen3.6 day-0 support. AGPL-3.0.