This morning, DeepClaude — a four-line shim that points Claude Code at DeepSeek V4 Pro — became the #1 story on Hacker News with 606 points and 257 comments. The repo's tagline is "same UX, 17× cheaper." The agent-loop-vs-model decoupling is now sitting on the front page of the developer internet, and it can be enabled with four export statements.
Pair this with two other moves over the last 30 days — Codex Pro's $200/mo plan now offering 20× the rate limits of ChatGPT Plus, and Anthropic's continued lock on the "Best Coding AI" Polymarket at 90%+ — and the question coding-agent buyers are now asking is no longer which model. It's which substrate.
This piece compares three: DeepClaude (Claude Code on DeepSeek V4 Pro), Claude Code on Anthropic's API, Codex Pro on GPT-5.4.
The Three Substrates
DeepClaude
Claude Code's harness, swapped to talk to DeepSeek's Anthropic-compatible endpoint. ~50-line Node proxy plus four export statements. Every feature of the Claude Code harness — sub-agents, MCP, /resume, hooks, IDE plugins — running on V4 Pro at $0.27/M input, $1.10/M output, $0.014/M on cache hits. (DeepSeek pricing.)
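The article doesn't reproduce the four exports, but the swap typically looks like the following. The variable names are Claude Code's documented overrides; the base URL, token source, and model id here are placeholder assumptions, not values from the repo:

```shell
# Placeholder values — the exact model id and auth setup depend on your
# DeepSeek account and on whether you route through the Node proxy.
export ANTHROPIC_BASE_URL="https://api.deepseek.com/anthropic"
export ANTHROPIC_AUTH_TOKEN="$DEEPSEEK_API_KEY"
export ANTHROPIC_MODEL="deepseek-v4-pro"
export ANTHROPIC_SMALL_FAST_MODEL="deepseek-v4-pro"
```

After these four lines, the unmodified `claude` binary sends its API traffic to the DeepSeek endpoint instead of api.anthropic.com.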
Claude Code (Anthropic API)
Same harness; model calls go to api.anthropic.com. Sonnet 4.6: $3/M input, $15/M output. Opus 4.7: $15/M input, $75/M output. Polymarket has Anthropic at 98.6% to win April's "Best Coding AI".
Codex Pro (GPT-5.4)
OpenAI's substrate. Codex CLI defaults to GPT-5.4. $200/mo for 20× ChatGPT Plus limits with token-based metering. Cloud-orchestrated long-running agents, not local pair-programming.
The Cost Stack
| Workload | DeepClaude | Sonnet 4.6 | Opus 4.7 | Codex Pro |
|---|---|---|---|---|
| Light (5 tasks/day) | $0.18 | $1.50 | $7.25 | $200/mo flat |
| Medium (15 tasks/day) | $0.55 | $4.50 | $22 | $200/mo flat |
| Heavy (40 tasks/day) | $1.50–3 | $20–40 | $100–200 | ~$30 API |
| 8h agent loop | $4–8 | $80–150 | $400–700 | hits 20× cap |
DeepClaude is 10–25× cheaper across every workload. The cache-hit price ($0.014/M) does most of the work — agent loops re-send system prompts every turn, so after the first call you spend almost all your input budget at cache rates.
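The cache math can be made concrete. Prices are the article's DeepSeek V4 Pro numbers; the per-turn token counts below are illustrative assumptions, not measurements:

```python
# Back-of-envelope: why the $0.014/M cache-hit rate dominates agent-loop cost.
INPUT, CACHE_HIT, OUTPUT = 0.27, 0.014, 1.10  # $/M tokens, DeepSeek V4 Pro

def loop_cost(turns, fresh_in=2_000, resent_ctx=50_000, out=1_500, cached=True):
    """Dollar cost of an agent loop. `resent_ctx` is the system prompt plus
    history re-sent every turn — billed at the cache-hit rate if cached."""
    ctx_rate = CACHE_HIT if cached else INPUT
    per_turn = (fresh_in * INPUT + resent_ctx * ctx_rate + out * OUTPUT) / 1e6
    return turns * per_turn

print(f"200-turn loop, cached:   ${loop_cost(200):.2f}")                  # ≈ $0.58
print(f"200-turn loop, uncached: ${loop_cost(200, cached=False):.2f}")    # ≈ $3.14
```

Under these assumptions the cache-hit rate alone cuts the loop's cost by roughly 5×, which is why the re-sent context, not the fresh input, is where the savings live.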
The Quality Stack
| Benchmark | DeepClaude (V4 Pro) | Sonnet 4.6 | Opus 4.7 | Codex (GPT-5.4) |
|---|---|---|---|---|
| SWE-bench Verified | 80.6 | 76.8 | 80.8 | 79.1 |
| SWE-bench Pro | 55.4 | 58.1 | 64.3 | 60.2 |
| Terminal-Bench 2.0 | 67.9 | 65.4 | 71.2 | 77.3 |
| LiveCodeBench | 93.5 | 88.8 | 91.4 | 90.6 |
(Sources: buildfastwithai, benchlm.ai, Builder.io.)
- 80% of normal feature work: DeepClaude is indistinguishable from Sonnet 4.6, within a hair of Opus 4.7.
- Multi-file architectural reasoning: Opus 4.7 still has a real edge.
- Terminal/CLI workflows: Codex Pro leads, 77.3 on Terminal-Bench 2.0.
- Computer Use: Claude is most mature; DeepSeek's tool-call recovery is noticeably worse.
The Lock-In Stack
- DeepClaude: lowest. Migrating off is also four lines. Risk: DeepSeek can reprice the Anthropic-compatible endpoint anytime.
- Claude Code: medium. Anthropic owns harness + model; product surface is mature.
- Codex Pro: highest. OpenAI owns harness, model, billing model, and Apps SDK distribution.
The Polymarket Contrarian Read
Polymarket prices Anthropic at 98.6% on "Best Coding AI." But the question developers actually answer with their wallets isn't "which model is best on SWE-bench" — it's "which substrate gives the best quality-per-dollar-per-month?"
On that metric, Anthropic might still hold the leaderboard while losing 80% of agent-loop volume to DeepClaude-style swaps. The market is asking last year's question.
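The wallet-side framing can be sketched from the article's own tables. The "points per blended dollar" metric and the 3:1 input:output token mix below are assumptions for illustration, not an established measure:

```python
# Toy quality-per-dollar comparison: SWE-bench Verified score divided by a
# blended $/M price (assumed 3:1 input:output token mix).
PRICES = {  # (input $/M, output $/M), from the article
    "DeepClaude (V4 Pro)": (0.27, 1.10),
    "Sonnet 4.6": (3.0, 15.0),
    "Opus 4.7": (15.0, 75.0),
}
SWE_VERIFIED = {"DeepClaude (V4 Pro)": 80.6, "Sonnet 4.6": 76.8, "Opus 4.7": 80.8}

def blended(inp, out):
    return (3 * inp + out) / 4  # assumed 3:1 mix

for name, (inp, out) in PRICES.items():
    print(f"{name}: {SWE_VERIFIED[name] / blended(inp, out):.1f} points per $/M")
```

On this toy metric DeepClaude lands roughly an order of magnitude above Sonnet 4.6 and nearly two above Opus 4.7, despite near-identical benchmark scores — which is the gap the prediction market isn't pricing.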
Which Substrate Should You Pick?
| You are... | Substrate |
|---|---|
| Indie / startup, cost-sensitive, feature work | DeepClaude |
| Enterprise with PRC-data-residency concerns | Claude Code on API |
| Browser/Computer Use is core | Claude Code on API (most mature Computer Use) |
| Terminal/CLI workflows are core | Codex Pro |
| Architecture-heavy, multi-file refactors | Claude Code on Opus 4.7 |
| Burst usage (irregular peaks) | Codex Pro ($200 flat is great for spikes) |
| Steady high-volume production | DeepClaude (no rate-limit cliff) |
| Want optionality | DeepClaude default, fall back to Anthropic for hard tasks |
The decoupling means it doesn't have to be one substrate. Run DeepClaude as default; route the hard 20% to Anthropic. The harness is constant; the brain is variable.
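That routing can be sketched in a few lines, assuming the env-var swap described earlier and that Claude Code is invocable as `claude -p`. The endpoint URLs, key names, and the keyword heuristic are all illustrative, not part of any shipped tool:

```python
import os
import subprocess

# Hypothetical substrate configs — adjust URLs and key names to your setup.
SUBSTRATES = {
    "deepseek": {"ANTHROPIC_BASE_URL": "https://api.deepseek.com/anthropic",
                 "ANTHROPIC_AUTH_TOKEN": os.environ.get("DEEPSEEK_API_KEY", "")},
    "anthropic": {"ANTHROPIC_BASE_URL": "https://api.anthropic.com",
                  "ANTHROPIC_AUTH_TOKEN": os.environ.get("ANTHROPIC_API_KEY", "")},
}

def pick_substrate(task: str) -> str:
    """Crude heuristic: send architecture-heavy work to Anthropic,
    everything else to the cheap default."""
    hard = ("refactor", "architecture", "migrate", "multi-file")
    return "anthropic" if any(k in task.lower() for k in hard) else "deepseek"

def run_claude_code(task: str) -> None:
    # Same harness every time; only the env vars (the "brain") change.
    env = {**os.environ, **SUBSTRATES[pick_substrate(task)]}
    subprocess.run(["claude", "-p", task], env=env)
```

A real router would classify tasks with something better than keywords — but the structural point stands: the switch is an environment dictionary, not a product migration.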
What Anthropic Does Next
Anthropic now has a product (Claude Code) that runs on a competitor's brain and bill, with Anthropic making nothing on inference. Three options:
- Tighten the protocol — break the env-var swap. Politically expensive.
- Cut Sonnet's price by 5–10×. Margin-destructive.
- Reframe the moat as the harness. Sub-agents, MCP, hooks, plugins — what DeepSeek can't clone.
We'd bet on option 3. The model becomes a swappable component, but the harness is where value accrues. "Best coding AI" becomes irrelevant; what matters is "best coding agent."
Bottom Line
The harness is the product, the model is a component, and the price discovery on inference has barely begun. Test DeepClaude this week. Cost of being wrong: one Saturday afternoon and $5. Cost of not testing: being on the wrong side of an industry repricing.
Originally published at AgentConn



