DEV Community

ANIRUDDHA  ADAK


🥊 Grok 4 vs. Grok 4 Heavy

TL;DR

Grok 4 = single brain, fast, cheap

Grok 4 Heavy = committee of brains, slower, $300 / mo, but record-breaking scores


1️⃣ 30-Second Visual Cheat-Sheet

| | Grok 4 | Grok 4 Heavy |
|---|---|---|
| Architecture | 1 agent | Multi-agent (committee of 4-6 copies) |
| Speed | ⚡️ 2–3 s / 1 k tokens | 🐌 15–25 s / 1 k tokens |
| Price | X Premium+ ($16 / mo) | SuperGrok Heavy ($300 / mo) |
| Humanity’s Last Exam | 38.6 % | 44.4 % |
| USAMO | 37.5 % | 61.9 % |
| AIME | 91.7 % | 100 % |
| Context Window | 128 k (app) / 256 k (API) | same |
| Best for | Daily dev work, chat | Research, theorem proving, PhD-level reasoning |
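
To get a feel for what the Speed row means in practice, here is a minimal sketch that turns the per-1k-token ranges into a wait-time estimate. It assumes latency scales linearly with token count, which is a simplification and not something the figures above guarantee; the function name `estimate_latency` is made up for illustration.

```python
# Per-1k-token latency ranges (seconds), copied from the "Speed" row above.
SPEED_S_PER_1K = {
    "Grok 4": (2.0, 3.0),
    "Grok 4 Heavy": (15.0, 25.0),
}


def estimate_latency(model: str, tokens: int) -> tuple[float, float]:
    """Return (best, worst) wait in seconds, ASSUMING linear scaling with tokens."""
    low, high = SPEED_S_PER_1K[model]
    scale = tokens / 1000
    return (low * scale, high * scale)


if __name__ == "__main__":
    for model in SPEED_S_PER_1K:
        best, worst = estimate_latency(model, 4000)
        print(f"{model}: {best:.0f}-{worst:.0f} s for a 4 k-token answer")
```

Under that assumption, a 4 k-token answer lands around 8-12 s on Grok 4 and 60-100 s on Grok 4 Heavy.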

2️⃣ Benchmark Heat-Map

(Higher = greener)

| Benchmark | Grok 4 | Grok 4 Heavy |
|---|---|---|
| GPQA Science | 87.5 % | 88.4 % |
| LiveCodeBench | 79.0 % | 79.4 % |
| USAMO | 37.5 % | 🔥 61.9 % |
| AIME | 91.7 % | 🥇 100 % |
| ARC-AGI | 15.9 % | 15.9 % (same model core) |
| Humanity’s Last Exam | 38.6 % | 44.4 % |

Source: xAI livestream & independent evals
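
To see where Heavy actually pulls ahead, a quick sketch that ranks the benchmarks by the gap between the two columns. The scores are copied from the table above; the helper name `heavy_gains` is just for illustration.

```python
# (Grok 4, Grok 4 Heavy) scores in percent, copied from the benchmark table.
SCORES = {
    "GPQA Science": (87.5, 88.4),
    "LiveCodeBench": (79.0, 79.4),
    "USAMO": (37.5, 61.9),
    "AIME": (91.7, 100.0),
    "ARC-AGI": (15.9, 15.9),
    "Humanity's Last Exam": (38.6, 44.4),
}


def heavy_gains(scores: dict) -> list:
    """Return (benchmark, gain in percentage points), largest gain first."""
    gains = {name: round(heavy - base, 1) for name, (base, heavy) in scores.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, gain in heavy_gains(SCORES):
        print(f"{name:<22} +{gain:.1f} pts")
```

The ranking puts the math benchmarks (USAMO, then AIME) at the top, while GPQA, LiveCodeBench and ARC-AGI move by less than one point.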


4️⃣ When Should You Pick Which?

| Use-case | Pick | Reason |
|---|---|---|
| Casual chat / general code | Grok 4 | Fast & cheap |
| Deep math proofs, PhD questions | Grok 4 Heavy | Higher USAMO, AIME and Humanity’s Last Exam scores |
| Large-scale document analysis | Grok 4 Heavy | Multi-agent cross-checking |
| API on a budget | Grok 4 | $0.15 vs $0.30 per 1 k tokens |
| Enterprise research labs | Grok 4 Heavy | Accuracy > cost |
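
For the "API on a budget" row, here is a rough sketch of how the per-1k-token gap adds up. The two rates are taken as quoted in the table and may not match xAI's current price list, so treat them as placeholders; `api_cost` is a hypothetical helper, not part of any SDK.

```python
# Per-1k-token prices in USD, AS QUOTED in the table above (verify before relying on them).
PRICE_PER_1K_USD = {"Grok 4": 0.15, "Grok 4 Heavy": 0.30}


def api_cost(model: str, tokens: int) -> float:
    """Pay-per-token cost in USD for a given number of tokens."""
    return round(PRICE_PER_1K_USD[model] * tokens / 1000, 2)


if __name__ == "__main__":
    tokens = 2_000_000  # example workload, chosen arbitrarily
    for model in PRICE_PER_1K_USD:
        print(f"{model}: ${api_cost(model, tokens):,.2f} for {tokens:,} tokens")
```

At these quoted rates the Heavy bill is exactly double at any volume, so the decision comes down to whether the extra accuracy is worth 2x.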

5️⃣ Cost Reality Check

Usage: 500 k tokens/day, flat monthly subscription
├── Grok 4 (X Premium+)              ≈ $ 16
└── Grok 4 Heavy (SuperGrok Heavy)   ≈ $ 300
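
Because both plans are flat subscriptions, the effective price per token depends entirely on how much you use them. A minimal sketch, assuming a 30-day month and that the 500 k tokens/day from the example are not cut short by usage caps (both are assumptions, and `effective_rate_per_1k` is an illustrative name):

```python
# Monthly subscription prices in USD, from the cost breakdown above.
PLANS_USD_PER_MONTH = {
    "Grok 4 (X Premium+)": 16,
    "Grok 4 Heavy (SuperGrok Heavy)": 300,
}


def effective_rate_per_1k(monthly_price: float,
                          tokens_per_day: int = 500_000,
                          days: int = 30) -> float:
    """Effective USD per 1k tokens for a flat plan (ASSUMES a 30-day month, no caps)."""
    monthly_tokens = tokens_per_day * days
    return monthly_price / (monthly_tokens / 1000)


if __name__ == "__main__":
    for plan, price in PLANS_USD_PER_MONTH.items():
        print(f"{plan}: ${effective_rate_per_1k(price):.4f} per 1 k tokens")
```

At 500 k tokens/day that works out to roughly $0.0011 per 1 k tokens on X Premium+ and $0.02 per 1 k tokens on the Heavy tier; lighter usage pushes both numbers up proportionally.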

6️⃣ One-Sentence Summary

Grok 4 is your everyday sports car; Grok 4 Heavy is the F1 racer you rent when the podium is the only acceptable outcome.


🔗 Try them right now

• Grok 4: grok.com with any X Premium+ account

• Grok 4 Heavy: toggle “Heavy” after subscribing to the SuperGrok Heavy tier

