TL;DR
Grok 4 = single brain, fast, cheap
Grok 4 Heavy = committee of brains, slower, $300 / mo, but record-breaking scores
1️⃣ 30-Second Visual Cheat-Sheet
Feature | Grok 4 | Grok 4 Heavy |
---|---|---|
Architecture | 1 agent | Multi-agent (committee of 4-6 copies) |
Speed | ⚡️ 2–3 s / 1 k tokens | 🐌 15–25 s / 1 k tokens |
Price | X Premium+ ($16 / mo) | SuperGrok Heavy ($300 / mo) |
Humanity’s Last Exam | 38.6 % | 44.4 % |
USAMO | 37.5 % | 61.9 % |
AIME | 91.7 % | 100 % |
Context Window | 128 k (app) / 256 k (API) | same |
Best for | Daily dev work, chat | Research, theorem proving, PhD-level reasoning |
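To get a feel for what the Speed row means in practice, here is a minimal sketch that scales the per-1 k-token ranges from the table to a longer response. The model keys and the `estimate_latency` helper are made up for illustration, and linear scaling is an assumption; real latency depends on load, prompt size, and how long the model reasons.

```python
# Latency ranges in seconds per 1k tokens, copied from the cheat-sheet table.
# The dictionary keys are illustrative labels, not official API model names.
LATENCY_S_PER_1K = {
    "grok-4": (2, 3),
    "grok-4-heavy": (15, 25),
}


def estimate_latency(model: str, tokens: int) -> tuple[float, float]:
    """Return (best, worst) seconds for `tokens` tokens, assuming linear scaling."""
    low, high = LATENCY_S_PER_1K[model]
    scale = tokens / 1000
    return (low * scale, high * scale)


if __name__ == "__main__":
    for name in LATENCY_S_PER_1K:
        best, worst = estimate_latency(name, 4000)
        print(f"{name}: {best:.0f}-{worst:.0f} s for a 4k-token answer")
```

Under those assumptions, a 4 k-token answer takes roughly 8–12 s on Grok 4 and 60–100 s on Grok 4 Heavy.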
2️⃣ Benchmark Heat-Map
(Higher is better)
Benchmark | Grok 4 | Grok 4 Heavy |
---|---|---|
GPQA Science | 87.5 % | 88.4 % |
LiveCodeBench | 79.0 % | 79.4 % |
USAMO | 37.5 % | 🔥 61.9 % |
AIME | 91.7 % | 🥇 100 % |
ARC-AGI | 15.9 % | 15.9 % (same model core) |
Humanity’s Last Exam | 38.6 % | 44.4 % |
Source: xAI livestream & independent evals
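A quick way to see where Heavy actually pulls ahead is to rank the benchmarks by the gap between the two columns. The sketch below uses only the numbers from the table above; the `gains` helper is a hypothetical name for illustration.

```python
# Scores (percent) copied from the benchmark table: (Grok 4, Grok 4 Heavy).
SCORES = {
    "GPQA Science": (87.5, 88.4),
    "LiveCodeBench": (79.0, 79.4),
    "USAMO": (37.5, 61.9),
    "AIME": (91.7, 100.0),
    "ARC-AGI": (15.9, 15.9),
    "Humanity's Last Exam": (38.6, 44.4),
}


def gains(scores: dict[str, tuple[float, float]]) -> list[tuple[str, float]]:
    """Return (benchmark, Heavy minus Grok 4) pairs, largest gain first."""
    deltas = [(name, round(heavy - base, 1)) for name, (base, heavy) in scores.items()]
    return sorted(deltas, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, delta in gains(SCORES):
        print(f"{name:<22} +{delta:.1f} pts")
```

The ranking puts the two math benchmarks (USAMO, AIME) at the top, followed by Humanity's Last Exam, while GPQA, LiveCodeBench, and ARC-AGI move by less than one point.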
4️⃣ When Should You Pick Which?
Use-case | Pick | Reason |
---|---|---|
Casual chat / general code | Grok 4 | Fast & cheap |
Deep math proofs, PhD questions | Grok 4 Heavy | Highest math scores in the tables above (USAMO 61.9 %, AIME 100 %) |
Large-scale document analysis | Grok 4 Heavy | Multi-agent cross-checking |
API on a budget | Grok 4 | $0.15 vs $0.30 per 1 k tokens |
Enterprise research labs | Grok 4 Heavy | Accuracy > cost |
5️⃣ Cost Reality Check
Monthly subscription cost at 500 k tokens/day:
├── Grok 4 (X Premium+) ≈ $ 16
└── Grok 4 Heavy (SuperGrok Heavy) ≈ $ 300
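For comparison, here is a rough sketch of what the same 500 k tokens/day would cost if billed per token at the API rates quoted in the use-case table ($0.15 and $0.30 per 1 k tokens). Those rates and the subscription prices are this article's figures, not verified pricing, and the 30-day month is an assumption; check xAI's current price list before budgeting.

```python
from decimal import Decimal

TOKENS_PER_DAY = 500_000
DAYS_PER_MONTH = 30  # assumption: a 30-day billing month

# Per-1k-token API rates as quoted in this article (not verified pricing).
API_RATE_PER_1K = {
    "grok-4": Decimal("0.15"),
    "grok-4-heavy": Decimal("0.30"),
}

# Flat monthly subscription prices as quoted in this article.
SUBSCRIPTION = {
    "grok-4": Decimal("16"),
    "grok-4-heavy": Decimal("300"),
}


def monthly_api_cost(model: str) -> Decimal:
    """Per-token cost in USD for a month of usage at the quoted rate."""
    tokens = TOKENS_PER_DAY * DAYS_PER_MONTH
    return API_RATE_PER_1K[model] * tokens / 1000


if __name__ == "__main__":
    for model in API_RATE_PER_1K:
        print(
            f"{model}: subscription ${SUBSCRIPTION[model]}/mo, "
            f"per-token ${monthly_api_cost(model):,.2f}/mo"
        )
```

At the quoted rates, 15 M tokens a month works out to $2,250 for Grok 4 and $4,500 for Grok 4 Heavy, which is why the flat subscriptions look cheap at this volume.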
6️⃣ One-Sentence Summary
Grok 4 is your everyday sports car; Grok 4 Heavy is the F1 racer you rent when the podium is the only acceptable outcome.
🔗 Try them right now
• Grok 4: grok.com with any X Premium+ account
• Grok 4 Heavy: toggle “Heavy” after subscribing to the SuperGrok Heavy tier