For years, AI evaluation has been stuck in a loop — models acing short-term tasks, then forgetting everything the next day.
We built the AGCI Benchmark to measure something deeper:
how well an AI learns, remembers, and adapts over time.
It’s not about solving puzzles anymore — it’s about testing cognitive continuity.
How much human-like intelligence can a system accumulate and retain through experience?
In its first public run, Dropstone, our self-learning IDE, scored 37.8% of the human baseline on the AGCI Benchmark, the highest result of any system evaluated to date.
The framework measures seven cognitive dimensions, from perception and memory persistence to adaptive reasoning, and is designed to capture the intelligence that unfolds over time, not in a single prompt.
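To make the headline number concrete, here is a minimal sketch of how a composite score like this could be computed, assuming each dimension is scored against a human baseline and the composite is a weighted mean across the seven. The four unnamed dimensions, the equal weights, and the normalization are placeholder assumptions for illustration, not the published methodology.

```python
# Illustrative sketch only: NOT the published AGCI methodology.
# Assumes per-dimension scores normalized against a human baseline,
# combined as a weighted mean into a single percentage.

# Three dimensions are named in the post; the remaining four are placeholders.
DIMENSIONS = [
    "perception",
    "memory_persistence",
    "adaptive_reasoning",
    "dimension_4",  # placeholder name (assumption)
    "dimension_5",  # placeholder name (assumption)
    "dimension_6",  # placeholder name (assumption)
    "dimension_7",  # placeholder name (assumption)
]

def composite_score(system: dict, human_baseline: dict,
                    weights: dict | None = None) -> float:
    """Weighted mean of per-dimension scores, each expressed as a
    fraction of the human baseline, returned as a percentage."""
    weights = weights or {d: 1.0 for d in DIMENSIONS}  # equal weights assumed
    total_weight = sum(weights[d] for d in DIMENSIONS)
    weighted_sum = sum(
        weights[d] * (system[d] / human_baseline[d]) for d in DIMENSIONS
    )
    return 100.0 * weighted_sum / total_weight

# Example: a system at half the human baseline on every dimension scores 50.0.
system = {d: 0.5 for d in DIMENSIONS}
baseline = {d: 1.0 for d in DIMENSIONS}
print(composite_score(system, baseline))  # -> 50.0
```

Under a scheme like this, a figure such as 37.8% would simply mean weighted per-dimension performance averaging to roughly a third of human level; how the benchmark actually weights and normalizes the dimensions is specified in the methodology linked below.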
📖 Read the benchmark and methodology here:
👉 https://www.dropstone.io/research/agci-benchmark
The AGCI Benchmark is open for replication and critique.
If you believe intelligence is more than one-shot reasoning, this might be the conversation that redefines how we measure it.