keeper

Posted on Jun 19

Stop Asking 'Is GAI Here' — Ask 'At What Layer'

#ai #gai #framework #evaluation

The GAI debate has a structural problem.

Someone says "passing this benchmark means GAI." A model passes it. Then they say "that benchmark wasn't hard enough." The goalpost moves.

Someone says "passing the Turing test means GAI." Models pass it. Then they say "the Turing test is too easy." The goalpost moves again.

Someone says "inventing new mathematics means GAI." Models do it. Then they say "that's just pattern matching in disguise." The goalpost moves a third time.

This isn't bad faith. It's a missing layer definition.

We never agreed on what "general" means. Without that, every achievement gets reclassified as "not really general."

I've been working on a framework that might fix this. It started as a capability map. Then I realized: this isn't just a map. It's a GAI maturity model.

The Five Layers

Layer	Name	Definition
L0	Embodied	Perceive and operate in the physical world
L1	Application	Complete single-domain tasks using tools
L2	Engineering	Build and maintain systems
L3	Meta-Domain	Abstract and transfer between unrelated domains
L4	Meta-Cognition	Perceive and control your own thinking process

The rule: layers cannot be skipped. It's a maturity sequence, not a checklist.

This immediately explains the goalpost problem: some people define GAI as L1. Others define it as L4. They're using different layers for the same word.
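Here is a minimal sketch of the no-skipping rule, assuming each layer can be reduced to a simple pass/fail. The names `Layer` and `maturity_level`, and the `start` parameter (for systems that begin above L0), are my own illustrative choices, not part of any existing library.

```python
from enum import IntEnum
from typing import Optional, Set


class Layer(IntEnum):
    EMBODIED = 0        # L0
    APPLICATION = 1     # L1
    ENGINEERING = 2     # L2
    META_DOMAIN = 3     # L3
    META_COGNITION = 4  # L4


def maturity_level(passed: Set[Layer],
                   start: Layer = Layer.EMBODIED) -> Optional[Layer]:
    """Highest layer reached without skipping, counting up from `start`.

    Returns None if the `start` layer itself is not passed.
    `start` is a hypothetical knob for systems that begin above L0.
    """
    reached = None
    for layer in Layer:          # iterates L0..L4 in order
        if layer < start:
            continue
        if layer not in passed:
            break                # first missing layer caps the level
        reached = layer
    return reached


# A system that passes L0, L1, L2 and L4 is still only at L2,
# because L3 is missing and layers cannot be skipped.
claimed = {Layer.EMBODIED, Layer.APPLICATION,
           Layer.ENGINEERING, Layer.META_COGNITION}
print(maturity_level(claimed).name)  # ENGINEERING
```

The point of the early `break` is that a pass at a higher layer counts for nothing until every layer beneath it is in place. That is what separates a maturity sequence from a checklist.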

What About Models Without Bodies?

L0 requires embodiment. Text-only models don't have bodies.

The cleanest answer: text-only LLMs have no L0. They start at L1, cognition without embodiment. This isn't a defect. It's an architectural difference, and for these models the no-skipping rule applies from L1 upward.

Humans build up from L0 (a baby senses the world before understanding it). LLMs start at L1 (they learn about the world from text, with no physical experience underneath). The result: humans can "feel" when something is wrong. That's L0 feeding signals up to L4. LLMs don't have this channel.

The framework forced me to face something uncomfortable: human intelligence cannot exist without a body.

Six Models, Five Layers

L0 — Embodied

Model	Verdict
Gemini 3.1 Pro	✅ Pass
GPT-5.5	✅ Pass
Claude Fable 5 / Mythos 5	✅ Pass
Claude Opus 4.8	✅ Pass
DeepSeek V4 Pro	❌ Fail
GLM-5.2	❌ Fail

L1 — Application

Every frontier model is solid at L1. Gaps are within 5% on AIME, GPQA, HLE. This is not where differentiation lives anymore.

L2 — Engineering

Model	SWE-bench Pro (%)	Verdict
Fable 5 / Mythos 5	80.3	Dominant
Claude Opus 4.8	69.2	Leading
GLM-5.2	62.1	Strong
GPT-5.5	58.6	Strong
DeepSeek V4 Pro	55.4	Good
Gemini 3.1 Pro	54.2	Good

Fable 5's 80.3% is 11.1 points ahead of Opus 4.8's 69.2%. That's not an optimization gap. It's a generation gap.
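The arithmetic behind that comparison, as a small sketch that uses only the scores from the table above. The helper name `rank_gaps` is my own. It sorts the models by score and reports the point gap between each pair of neighbours.

```python
# SWE-bench Pro scores (%) copied from the L2 table.
scores = {
    "Fable 5 / Mythos 5": 80.3,
    "Claude Opus 4.8": 69.2,
    "GLM-5.2": 62.1,
    "GPT-5.5": 58.6,
    "DeepSeek V4 Pro": 55.4,
    "Gemini 3.1 Pro": 54.2,
}


def rank_gaps(scores):
    """Return (higher, lower, gap_in_points) for each adjacent pair by rank."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [
        (hi_name, lo_name, round(hi - lo, 1))
        for (hi_name, hi), (lo_name, lo) in zip(ranked, ranked[1:])
    ]


gaps = rank_gaps(scores)
print(gaps[0])                         # ('Fable 5 / Mythos 5', 'Claude Opus 4.8', 11.1)
print(max(g[2] for g in gaps[1:]))     # 7.1
```

The gap at the top (11.1 points) is larger than any other gap between neighbours in the table. The next largest is 7.1 points.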

L3 — Meta-Domain

There is no benchmark for L3. Mythos 5 shows the strongest signal: protein design, genomics, cybersecurity — three unrelated domains — with autonomous work. Its genomics result outperformed a Science-published model despite being 100x smaller.

The biggest gap isn't model capability. It's that nobody has built a benchmark for L3.

L4 — Meta-Cognition

All models: no evidence. No model can accurately describe its own reasoning process in real time, and the industry as a whole isn't targeting this capability.

What This Means

If GAI = L1 or L2, we're already there.
If GAI = L3, we don't know — no benchmark exists to verify it.
If GAI = L4, we're not close — and nobody is aiming for it.

The GAI debate isn't one debate. It's people arguing at different layers using the same word.
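To make that concrete, here is a rough sketch of how one fixed assessment yields three different answers depending on the layer you pick as the bar. The evidence values restate this post's own verdicts for L1 to L4. The `Evidence` enum, the answer strings, and the function name `is_gai_here` are illustrative assumptions. L0 is left out on the assumption that we are talking about text-only models that start at L1.

```python
from enum import IntEnum


class Evidence(IntEnum):
    MET = 0          # frontier models already clear this layer
    UNVERIFIED = 1   # no benchmark exists to check it
    NO_EVIDENCE = 2  # nothing suggests any model is there


# This post's verdicts, keyed by layer number (L1..L4).
ASSESSMENT = {
    1: Evidence.MET,
    2: Evidence.MET,
    3: Evidence.UNVERIFIED,
    4: Evidence.NO_EVIDENCE,
}

ANSWERS = {
    Evidence.MET: "already there",
    Evidence.UNVERIFIED: "unknown, no benchmark exists",
    Evidence.NO_EVIDENCE: "not close",
}


def is_gai_here(threshold: int) -> str:
    """Answer 'is GAI here?' for someone who defines GAI as layer `threshold`.

    Because layers cannot be skipped, the answer is set by the weakest
    layer between L1 and the threshold.
    """
    weakest = max(ASSESSMENT[layer] for layer in range(1, threshold + 1))
    return ANSWERS[weakest]


for layer in (1, 2, 3, 4):
    print(f"GAI = L{layer}: {is_gai_here(layer)}")
```

Same evidence, different threshold, different answer. The function never changes its inputs. Only the definition of the word does.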

Next time someone says "GAI is here" or "GAI is nowhere," ask them one question:

At what layer?
