DeepSeek V4 Pro Hits 80.6% on SWE-Bench: The New Open-Weight King

#modelrelease #opensource #ai #machinelearning

Originally published on AI Tech Connect.

The headline number and why it matters

On the SWE-Bench Verified leaderboard, DeepSeek V4 Pro now sits at 80.6%, clear of every other open-weight model and ahead of all the closed-source frontier coders shipping in May 2026. Claude Sonnet 4 sits at roughly 77.2%, GPT-5 at about 74.9%, and Gemini 2.5 at around 71.8%. On GPQA Diamond, V4 Pro scores 90.1, putting it within striking distance of the top closed-source reasoning models. And it runs on a 1M-token context window.

You can download the weights today, point them at your own GPU cluster, and never send a single byte of source code to a third-party API. That last sentence is the whole article, really. The benchmark lead will move within weeks: open-weight models leapfrog each other constantly, and Llama 4, Qwen 3.5, Gemma 4 and Mistral…

Read the full article on AI Tech Connect →
