DEV Community

David

claude opus 4.7 just dropped. here's what runs locally for free.

Anthropic just released Claude Opus 4.7. It's their best coding model yet — 13% better than Opus 4.6 on their internal 93-task benchmark, better vision, stronger at long-running agentic tasks.

It's also $5 per million input tokens and $25 per million output tokens. API only. Every character you type goes through Anthropic's servers.

Let's talk about what you can do locally for $0.

what opus 4.7 actually brings

Based on Anthropic's announcement:

  • 13% improvement over Opus 4.6 on a 93-task coding benchmark, including 4 tasks neither Opus 4.6 nor Sonnet 4.6 could solve
  • Better vision — higher resolution image understanding
  • Stronger agentic workflows — handles complex, multi-step tasks without losing context or stopping early
  • Self-verification — the model checks its own outputs before reporting back
  • Available on Claude API, Amazon Bedrock, Google Vertex AI, Microsoft Foundry

These are real improvements. Opus has been the go-to for serious coding work, and 4.7 makes it better.

But here's the thing.

the cost of frontier cloud AI

At $5/$25 per million tokens, a heavy coding session with Opus 4.7 can easily run $2–5 a day. A team of developers using it as their primary coding agent? That's hundreds of dollars a month.
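To make that concrete, here's a rough back-of-envelope sketch. The daily token volumes are illustrative assumptions, not measured usage:

```python
# Rough cost sketch for Opus 4.7 at $5/M input and $25/M output tokens.
# Token volumes below are illustrative assumptions, not measured usage.

INPUT_PRICE_PER_TOKEN = 5 / 1_000_000    # $5 per million input tokens
OUTPUT_PRICE_PER_TOKEN = 25 / 1_000_000  # $25 per million output tokens

def daily_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one day's API usage."""
    return (input_tokens * INPUT_PRICE_PER_TOKEN
            + output_tokens * OUTPUT_PRICE_PER_TOKEN)

# A heavy session: ~400K input tokens (codebase context, retries)
# and ~80K generated output tokens.
print(f"${daily_cost(400_000, 80_000):.2f}/day")                       # ~$4.00
print(f"${daily_cost(400_000, 80_000) * 22:.2f}/month per developer")  # ~22 workdays
```

Input tokens dominate in agentic workflows because the model re-reads large chunks of your codebase on every turn, so real sessions can land well above this sketch.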

And every line of your proprietary code flows through someone else's infrastructure. Every prompt, every codebase context, every business logic snippet — stored, processed, potentially used for training (even with opt-outs, you're trusting the provider).

For hobby projects, fine. For anything sensitive — financial code, healthcare logic, proprietary algorithms — that's a real risk.

what runs locally right now

The local model landscape has changed dramatically in the last few months. Here's what's available today at $0/month:

Qwen3.6-35B-A3B (released this week)

  • 35B total parameters, 3B active (MoE architecture)
  • 73.4 on SWE-bench Verified — autonomous bug fixing on real GitHub repos
  • 51.5 on Terminal-Bench 2.0 — agentic terminal coding
  • Built-in vision, 262K context
  • Runs on 8 GB VRAM with Q4_K_M quantization
  • Apache 2.0 license
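The 8 GB figure makes more sense once you do the quantization arithmetic. A rough sketch, assuming ~4.5 bits per weight for Q4_K_M (an approximation; the exact bits-per-weight varies by tensor):

```python
# Rough quantized-size arithmetic for a 35B-total / 3B-active MoE model.
# Assumes ~4.5 bits per weight for Q4_K_M -- an approximation, since the
# exact bits/weight differs per tensor type.

BITS_PER_WEIGHT = 4.5
GB = 1e9  # decimal gigabytes

def quantized_gb(params: float) -> float:
    """Approximate quantized weight size in GB for a given parameter count."""
    return params * BITS_PER_WEIGHT / 8 / GB

total = quantized_gb(35e9)   # full weight file
active = quantized_gb(3e9)   # weights actually touched per token

print(f"full model:   ~{total:.1f} GB")   # ~19.7 GB
print(f"active/token: ~{active:.1f} GB")  # ~1.7 GB
```

The full file doesn't fit in 8 GB of VRAM, so runtimes like Ollama/llama.cpp keep part of the weights in system RAM. Because only ~3B parameters are active per token, the per-token compute stays light, which is presumably why 8 GB cards remain usable.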

Is it as good as Opus 4.7? On raw capability, probably not — Anthropic has massive compute advantages. But on the tasks most developers actually do daily (fixing bugs, writing functions, understanding codebases, code review), Qwen3.6 is genuinely competitive. And it runs on hardware you already own.

the real comparison isn't benchmarks

It's this:

| | Claude Opus 4.7 | Qwen3.6-35B-A3B |
| --- | --- | --- |
| Cost | $5/$25 per million tokens | $0 forever |
| Privacy | Cloud-processed | Never leaves your machine |
| Speed | Subject to API congestion | As fast as your GPU |
| Availability | Depends on Anthropic's uptime | Runs offline |
| Rate limits | Yes | No |
| Data retention | Anthropic's policy | You control everything |
| License | Proprietary | Apache 2.0 |
| Vision | Yes | Yes |
| Agentic coding | Yes (strong) | Yes (73.4 SWE-bench) |
| Setup | API key + credit card | Ollama + 10 minutes |

how to set up the local alternative

```shell
ollama run qwen3.6:35b-a3b
```

That's it. Or if you want a full desktop experience with a coding agent, vision support, and model management:

Locally Uncensored just shipped v2.3.3 with Qwen3.6 day-0 support. It wraps Ollama into a desktop app with a built-in coding agent that streams live between tool calls, agent mode with 13 tools and MCP integration, and remote access from your phone. Open source, AGPL-3.0.
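If you'd rather script against the model than chat with it, Ollama also exposes a local HTTP API on port 11434. A minimal sketch using only the standard library (assumes the `qwen3.6:35b-a3b` tag from above is pulled and the Ollama server is running):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_payload(model: str, prompt: str) -> bytes:
    """JSON body for Ollama's /api/generate (stream=False -> single JSON reply)."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()

def ask(model: str, prompt: str) -> str:
    """Send a prompt to the local Ollama server and return the model's text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Usage, with the Ollama server running:
#   print(ask("qwen3.6:35b-a3b", "Explain what this regex does: ^\\d{4}-\\d{2}$"))
```

Nothing here leaves your machine: the request never goes past localhost.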

when cloud still makes sense

Being honest: there are cases where Opus 4.7 is worth the money.

  • You need the absolute frontier of capability and $25/M output tokens is pocket change for your use case
  • You're doing something that requires Anthropic's specific safety features
  • You need the model to handle tasks that are genuinely beyond what open models can do today
  • You don't have a GPU (though even a laptop with 8GB VRAM works for Qwen3.6)

For everyone else — the gap between cloud and local is closing fast. A model that scores 73.4 on SWE-bench running on a gaming laptop would have been science fiction two years ago.

the trajectory matters more than today's snapshot

Every few months, a new open model drops that would have been frontier-class a year earlier. And the pricing gap between cloud and local is structural: cloud will always bill per token, while local costs nothing beyond hardware you already own.

Opus 4.7 is impressive. But the question isn't whether it's good — it's whether it's $5/$25 per million tokens better than what you can run yourself.

For a growing number of developers, the answer is no.


Locally Uncensored — open-source desktop app for local AI. Chat, coding agents, image gen, video gen. Qwen3.6 day-0 support. AGPL-3.0.
