GPT-4 Held ECI Lead for 18 Months, Epoch AI Data Shows

#ai #machinelearning #research #deeplearning

GPT-4 led the Epoch AI Compute Index for 18 months. No other model has held the top spot that long.

Key facts

GPT-4 led the ECI for 18 months, starting in March 2023
GPT-4 estimated at 1.8 trillion parameters
GPT-4o surpassed it in September 2024
Claude 3.5 Sonnet also surpassed it in September 2024
ECI measures effective training compute only

GPT-4 held the top position on the Epoch AI Compute Index (ECI) for 18 months, from its March 2023 launch until September 2024. That is the longest stretch any model has ever topped the metric, which measures the effective compute used to train a large language model.
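
The 18-month figure follows directly from the two dates given above. A minimal sketch of that arithmetic, where the helper name `months_between` is illustrative and the day of the month is ignored because the article gives only months:

```python
from datetime import date


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end, ignoring the day of month."""
    return (end.year - start.year) * 12 + (end.month - start.month)


# Dates as stated in the article: GPT-4 launched in March 2023
# and was overtaken in September 2024.
reign_months = months_between(date(2023, 3, 1), date(2024, 9, 1))
print(reign_months)  # 18
```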

ECI normalizes for hardware and algorithmic efficiency, giving a single number that represents the total FLOP-equivalent of a training run. GPT-4's dominance reflects not just its scale — estimated at 1.8 trillion parameters and trained on roughly 13 trillion tokens — but also the slow pace of competitive releases during that window.
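
For a rough sense of scale, the parameter and token estimates above can be plugged into the common rule of thumb that training a dense transformer costs about 6 FLOP per parameter per token. This is only a sketch: it assumes every parameter is used for every token, which would overstate the cost of a sparse model, and it does not reproduce whatever hardware or algorithmic normalization the index itself applies. The function name is illustrative.

```python
def dense_training_flop(params: float, tokens: float) -> float:
    """Approximate training FLOP for a dense transformer: C ~= 6 * N * D.

    Assumes all parameters are active for every token (a dense model).
    """
    return 6.0 * params * tokens


# Unofficial estimates quoted in the article, not confirmed figures.
params = 1.8e12   # 1.8 trillion parameters
tokens = 13e12    # roughly 13 trillion training tokens

raw_flop = dense_training_flop(params, tokens)
print(f"{raw_flop:.2e}")  # 1.40e+26
```

The result is an upper-end, back-of-the-envelope number derived from the article's estimates, not an ECI value.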

Key Takeaways

GPT-4 led the ECI for 18 months, the longest reign.
GPT-4o and Claude 3.5 Sonnet broke the streak in September 2024.

What broke the streak

OpenAI's own GPT-4o and Anthropic's Claude 3.5 Sonnet both surpassed GPT-4 on the ECI in September 2024. Since then, the index has seen more churn: models including Gemini 3 Pro and GPT-5.3-Codex-Spark have traded the top spot, and none has held the lead for more than a few months.

The 18-month reign is notable because it spans a period when the broader AI field moved from pure scaling toward mixture-of-experts architectures, chain-of-thought reasoning, and multimodal training. OpenAI has not disclosed GPT-4's architecture in detail; its successors increasingly use sparse activation and post-training compute scaling.

What ECI misses

The index measures only training compute, not inference compute, data quality, or post-training techniques such as reinforcement learning from human feedback. A model that uses less training compute but achieves comparable results, through better data curation or algorithmic advances, would score lower on ECI even if it performs as well in practice. Epoch AI has acknowledged this limitation in past methodology papers.
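
The gap between a compute-only ranking and real-world performance can be shown with a toy comparison. Everything here is hypothetical: the two models, their compute figures, and their benchmark scores are invented for illustration and do not correspond to any real system or published ECI value.

```python
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    training_flop: float    # hypothetical effective training compute
    benchmark_score: float  # hypothetical downstream score, 0-100


# Invented models: same score, very different training compute.
models = [
    Model("big-run", training_flop=2.0e25, benchmark_score=80.0),
    Model("lean-run", training_flop=8.0e24, benchmark_score=80.0),
]

# A compute-only index ranks purely by training_flop.
by_compute = sorted(models, key=lambda m: m.training_flop, reverse=True)

for rank, m in enumerate(by_compute, start=1):
    print(rank, m.name, f"{m.training_flop:.1e}", m.benchmark_score)
```

Under these made-up numbers, "lean-run" ranks second on compute even though it matches "big-run" on the benchmark, which is the blind spot described above.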

Still, the metric remains a widely cited proxy for the scale of AI training runs. GPT-4's long tenure at the top underscores how much the field's center of gravity has shifted from training scale to inference-time reasoning and data strategy.

What to watch

Watch for the next ECI update from Epoch AI, expected within weeks. If Gemini 3 Pro or GPT-5.3-Codex-Spark holds the top spot for more than three months, that would signal a return to training-scale competition over inference-time methods.

Source: news.google.com

Originally published on gentic.news