DEV Community

WDSEGA
WDSEGA

Posted on • Originally published at wdsega.github.io

Claude 4 Tops Code Benchmarks - Sonnet Now Beats GPT-4o at 40% Lower Price

Anthropic released Claude 4 in late June 2026. First time Claude comprehensively surpassed GPT-4o and Gemini 1.5 Pro on coding benchmarks.

Key Numbers

Metric Claude 4 Sonnet GPT-4o
HumanEval 96.4% 90.2%
SWE-bench 72.3% 65.1%
Price per 1M tokens $3 $5

What Changed

  • Cross-file dependency tracking in large codebases is noticeably better
  • Function calling failure rate dropped from 8% to under 2%
  • Context window expanded to 300K and actually works reliably

Developer Consensus

Sonnet is the sweet spot for daily coding. Some teams switching from GPT-4o for code while keeping GPT for long-form content generation.

Sonnet pricing is 40% below GPT-4o with better performance - a direct shot at market share.

Full article on Deskless Daily

Top comments (0)