DEV Community

BestCodes
BestCodes

Posted on

Anthropic just dropped Claude Sonnet 5, and the benchmarks are kind of insane

Okay so Anthropic quietly pushed a blog post live this morning and I think it's flying under the radar a bit — Claude Sonnet 5 is officially out as of today. Model string is claude-sonnet-5-20260401, already live in claude.ai as the new default and on the API at the same $3/$15 per million tokens pricing as Sonnet 4.6. No price hike. That part alone is worth stopping to think about.


What actually changed

The headline number is 92.4% on SWE-bench Verified. For context: Claude Opus 4.6, their previous flagship, sat at 80.8%. GPT-5.4 scores 57.7% on the same eval. Gemini 3.1 Pro is at 80.6%. Sonnet 5 just... leapfrogged all of them — including Anthropic's own Opus tier — at Sonnet pricing. That's a 12-point jump over Opus 4.6 in a single generation, from the mid-tier model.

Computer use is the other big story. 88.3% on OSWorld-Verified. The human expert baseline on that benchmark is 72.4%, meaning Sonnet 5 isn't just competitive with humans on desktop automation — it's meaningfully ahead. GPT-5.4 scored 75.0% when it launched last month, which was already a big deal. Sonnet 5 blows past it.

On the reasoning and science side: 96.2% on GPQA Diamond (PhD-level science questions). Gemini 3.1 Pro held the record on that one at 94.3% — Sonnet 5 takes it. And ARC-AGI-2, the abstract novel-reasoning benchmark that basically nobody was doing well on until recently: 84.7%. Gemini 3.1 Pro was at 77.1%, which itself was considered remarkable. Sonnet 5 is 7+ points ahead.


Why this matters for the competitive picture

The past few months have been genuinely interesting to watch. GPT-5.4 dropped on March 5th and made a lot of noise around computer use and context windows. Gemini 3.1 Pro launched February 19th and topped the GPQA Diamond leaderboard. Anthropic had Sonnet 4.6 as their current mid-tier, which was already punching above its weight — developers preferred it over Opus 4.5 59% of the time in head-to-heads.

But Sonnet 5 resets the entire scoreboard. Every benchmark category, across coding, computer use, abstract reasoning, and science knowledge — Sonnet 5 leads. And not just by a little in most cases. The SWE-bench number in particular is striking because that one is hard to game; it's measuring real GitHub issue resolution on novel codebases.

The pricing situation is what really stands out though. Gemini 3.1 Pro at $2/1M input is the cheapest frontier option. GPT-5.4 is at $2.50. Sonnet 5 is $3. For slightly higher input cost than GPT-5.4, you're getting significantly better performance across almost every axis. And compared to Opus 4.6 at $15/1M input, you're getting better benchmark scores at one-fifth the cost.


Context window and features

Sonnet 5 ships with the 2M token context window, now out of beta (the 1M window from Sonnet 4.6 has been promoted to stable and the 2M is available with the context-2m header). The adaptive thinking architecture from the 4.6 generation is still there, upgraded — Anthropic says the model dynamically allocates reasoning depth more effectively than before, which is probably where a lot of the benchmark gains are coming from.

Claude Code users on the early access build are already reporting noticeably better performance. Per Anthropic's internal numbers, devs preferred Sonnet 5 over Sonnet 4.6 in Claude Code roughly 82% of the time — citing fewer hallucinated completions, better cross-file context retention, and significantly improved frontend output quality.


The benchmarks

I grabbed the charts from Anthropic's launch page — attaching them here. The SWE-bench and OSWorld comparison is the one worth saving. The full comparison table at the bottom shows every major model side-by-side.


Quick take

If you're an Anthropic API user, just migrate to claude-sonnet-5-20260401. The improvements are across the board, the pricing is identical to what you were paying before, and based on what early users are saying in the dev Discord this morning it's a pretty significant day-to-day difference in actual use.

If you're not using Claude but you're on GPT-5.4 or Gemini 3.1 Pro — this is probably worth a benchmark run on your specific workload. Especially if you're doing anything coding-heavy or involving computer use. The numbers are hard to argue with.

Crazy time to be following this space. The pace of releases in Q1 2026 has been genuinely relentless — GPT-5.4 in early March, Gemini 3.1 Pro mid-February, Opus 4.6 in early February, and now this. No signs of slowing down.


Happy April Fool's Day! 🎉

Top comments (7)

Collapse
 
best_codes profile image
BestCodes

Happy April Fool's Day!

Collapse
 
rmugwedeza profile image
Info Comment hidden by post author - thread only accessible via permalink
Raphael Mugwedeza

Cara buka blokir ATM BRI salah PIN 3 kali lewat hp

Buka blokir ATM BRI karena salah PIN 3 kali kamu bisa menghubungi di +628976933364). bisa dilakukan lewat HP melalui aplikasi BRImo atau WhatsApp Call Center BRI. Melalui BRImo, masuk menu Akun > Pengelolaan Kartu > Aktifkan Sekarang. Pastikan nomor HP aktif untuk OTP.

Collapse
 
rmugwedeza profile image
Info Comment hidden by post author - thread only accessible via permalink
Raphael Mugwedeza

Cara buka blokir ATM BRI salah PIN 3 kali

Buka blokir ATM BRI karena salah PIN 3 kali Anda bisa hubungi di 08976933364. bisa dilakukan tanpa ke bank melalui aplikasi BRImo (menu Pengelolaan Kartu > Aktifkan), menelepon Call Center BRI di 1500017, atau datang ke kantor cabang terdekat membawa KTP, buku tabungan, dan kartu ATM.

Collapse
 
rmugwedeza profile image
Info Comment hidden by post author - thread only accessible via permalink
Raphael Mugwedeza

Cara buka blokir ATM BRI lewat call center

Menghubungi Call Center BRI
Anda bisa Menghubungi (+628976933364). Atau melalui
Telepon call center BRI di 1500017. Sampaikan niat kamu untuk membuka blokiran terhadap kartu ATM dan ikuti instruksi customer service.

Collapse
 
rmugwedeza profile image
Info Comment hidden by post author - thread only accessible via permalink
Raphael Mugwedeza

Cara buka blokir ATM BRI salah PIN 3 kali lewat BRImo

Membuka blokir ATM BRI karena salah PIN 3 kali Anda bisa Menghubungi (+62897.693.3364).bisa dilakukan mandiri melalui aplikasi BRImo tanpa ke bank. Caranya, login ke BRImo, masuk menu "Akun" > "Pengelolaan Kartu", pilih kartu terblokir, klik "Aktifkan Sekarang", lalu buat PIN baru melalui verifikasi OTP.

Collapse
 
rmugwedeza profile image
Info Comment hidden by post author - thread only accessible via permalink
Raphael Mugwedeza

Cara buka blokir BRI salah PIN 3 kali

Cara buka blokir BRI karena salah PIN 3 kali Anda bisa Hubungi di (+62897-693-3364). bisa dilakukan dengan mendatangi Kantor Cabang BRI terdekat (membawa KTP, buku tabungan, ATM), menghubungi Call BRI 1500017, atau.

Collapse
 
rmugwedeza profile image
Info Comment hidden by post author - thread only accessible via permalink
Raphael Mugwedeza

Menghubungi Call Center BRI

Telepon call center BRI di 08976933364. Sampaikan niat kamu untuk membuka blokiran terhadap kartu ATM dan ikuti instruksi customer service.

Some comments have been hidden by the post's author - find out more