Jung Sungwoo

World-Centric Agent Architecture: Why Your AI Agent Keeps Failing (And It's Not the Model's Fault)

World-Centric Agent Architecture

Deterministic Runtime and Hierarchical Intelligence Orchestration


TL;DR

Your AI agent's failures aren't because the model isn't smart enough. They're because the World was never designed.

This post introduces World-Centric Architecture, a paradigm where:

  • The World is explicit, immutable, and verifiable
  • Intelligence proposes changes but never executes them directly
  • System stability is independent of model size
  • Every value can answer "Why?"

If you're expecting another agent framework tutorial, this is not it.
This post is about why agent systems fail structurally, even with strong models.


The Problem Nobody's Talking About

LLM-based agents have made remarkable progress. Yet they keep failing in predictable ways:

| Failure Mode | What Happens |
| --- | --- |
| Impossible Action Loops | Agent repeatedly tries actions that can't work |
| State Opacity | No one knows when or why state changed |
| Irreproducibility | Same input, different results |
| Unrecoverable Failures | Can't rollback, can't try alternatives |
| Unexplainability | "Why did it do that?" → 🤷 |

Sound familiar?


The Misdiagnosis

The industry's response has been consistent:

"The model isn't smart enough. Let's use a bigger one."

Problem → "Model too dumb"
        → Bigger model / Longer reasoning / Complex planning
        → Higher cost / More complexity / Problem persists

We've been treating symptoms, not the disease.


The Real Problem

Here's the uncomfortable truth:

These failures aren't intelligence failures. They're world modeling failures.

In most agent architectures, the "world" exists only inside the model's head:

┌─────────────────────────────────────────────────────────────┐
│                    THE MODEL'S HEAD                         │
│                                                             │
│  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐    │
│  │ World         │  │ Action        │  │ State         │    │
│  │ Understanding │  │ Selection     │  │ Transition    │    │
│  └───────────────┘  └───────────────┘  └───────────────┘    │
│                                                             │
│  Everything implicit. Nothing verifiable.                   │
└─────────────────────────────────────────────────────────────┘

The model must:

  1. Understand the world (from context)
  2. Know what's possible (by reasoning)
  3. Pick the right action (while hallucinating)
  4. Track state changes (in its attention weights)
  5. Explain failures (good luck)

That's too much responsibility for a non-deterministic text predictor.


The Inversion: World-Centric Design

What if we flipped the script?

┌──────────────────────────────────────────────────────────────┐
│                      THE WORLD                               │
│                    (First-Class Citizen)                     │
│                                                              │
│       ┌─────────────┐  ┌─────────────┐  ┌──────────────┐     │
│       │   Rules     │  │   State     │  │  Actions     │     │
│       │  (Schema)   │  │ (Snapshot)  │  │(Availability)│     │
│       └─────────────┘  └─────────────┘  └──────────────┘     │
│                              │                               │
│                              ▼                               │
│                    ┌─────────────────┐                       │
│                    │   Intelligence  │                       │
│                    │   (Proposer)    │                       │
│                    └─────────────────┘                       │
│                                                              │
│  Intelligence is OUTSIDE. World is explicit.                 │
└──────────────────────────────────────────────────────────────┘

Key shifts:

  • World exists outside the model
  • State is explicit and immutable
  • Actions have verifiable availability
  • Intelligence proposes, doesn't execute
  • Everything can explain itself

The Constitution

World-Centric Architecture follows six principles:

┌─────────────────────────────────────────────────────────────┐
│  THE CONSTITUTION                                           │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  1. COMPILER defines possibility.                           │
│     What CAN happen is defined at design time.              │
│                                                             │
│  2. CORE computes truth.                                    │
│     What IS true is computed deterministically.             │
│                                                             │
│  3. ACTOR proposes change.                                  │
│     Intelligence suggests, never executes.                  │
│                                                             │
│  4. AUTHORITY judges proposals.                             │
│     Independent verification before any change.             │
│                                                             │
│  5. ORCHESTRATOR manages worlds.                            │
│     Multiple world branches for exploration.                │
│                                                             │
│  6. PROJECTION transforms I/O.                              │
│     Presentation is separate from meaning.                  │
│                                                             │
└─────────────────────────────────────────────────────────────┘

The Six Pillars

Here's how these principles become architecture:

┌─────────────────────────────────────────────────────────────┐
│                   WORLD-CENTRIC SYSTEM                      │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌───────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐       │
│  │ COMPILER  │ │  ACTOR   │ │AUTHORITY │ │PROJECTION│       │
│  │           │ │          │ │          │ │          │       │
│  │Possibility│ │ Propose  │ │  Judge   │ │ Encode/  │       │
│  │Definition │ │ Change   │ │ Proposal │ │ Render   │       │
│  └────┬──────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘       │
│       │             │            │            │             │
│       │    ┌────────┴────────────┴────────────┘             │
│       │    │                                                │
│       ▼    ▼                                                │
│  ┌────────────────────────────────────────┐                 │
│  │            ORCHESTRATOR                │                 │
│  │   Runtime Lifecycle │ Fork Management  │                 │
│  └─────────────────────┬──────────────────┘                 │
│                        │                                    │
│                        ▼                                    │
│  ┌────────────────────────────────────────┐                 │
│  │               CORE                     │                 │
│  │  Expression │ Snapshot │ Patch/Apply   │                 │
│  │  Action │ Effect │ Explain             │                 │
│  └────────────────────────────────────────┘                 │
│                                                             │
└─────────────────────────────────────────────────────────────┘

| Pillar | Role | Owns State? | Makes Decisions? |
| --- | --- | --- | --- |
| Core | Truth Engine | ✅ Snapshot | ❌ |
| Actor | Change Proposer | ❌ | ❌ |
| Authority | Proposal Judge | ❌ | ✅ |
| Projection | I/O Transformer | ❌ | ❌ |
| Orchestrator | World Manager | ✅ Runtimes | ❌ |
| Compiler | Possibility Definer | ✅ Schema | ❌ |

Core: The Deterministic Truth Engine

Core is the foundation, a seven-layer engine that computes truth without side effects:

┌─────────────────────────────────────────────────────────────┐
│  Layer 7: EXPLAIN                                           │
│  "Why this value?" - Structural interpretation              │
├─────────────────────────────────────────────────────────────┤
│  Layer 6: PATCH / APPLY                                     │
│  The ONLY way to change truth                               │
├─────────────────────────────────────────────────────────────┤
│  Layer 5: EFFECT RUNTIME                                    │
│  External execution via handlers                            │
├─────────────────────────────────────────────────────────────┤
│  Layer 4: ACTION                                            │
│  Availability gate + Effect declaration                     │
├─────────────────────────────────────────────────────────────┤
│  Layer 3: SNAPSHOT                                          │
│  Immutable state at a point in time                         │
├─────────────────────────────────────────────────────────────┤
│  Layer 2: DAG                                               │
│  Dependency graph for incremental recompute                 │
├─────────────────────────────────────────────────────────────┤
│  Layer 1: EXPRESSION                                        │
│  Pure computation (no side-effects)                         │
└─────────────────────────────────────────────────────────────┘
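
Layer 2 is the part that keeps recomputation cheap, so it is worth a small illustration. The TypeScript below is a rough sketch of the idea, not Manifesto's actual engine: computed values are nodes with declared dependencies, and only the nodes downstream of a change are rerun. The names `ComputedNode`, `nodes`, and `recompute` are made up for this example, and the nodes are listed in dependency order so a single forward pass is enough.

```typescript
// Hypothetical model: a computed node names its inputs and a pure function.
type Values = Record<string, unknown>;
type ComputedNode = {
  readonly key: string;
  readonly deps: readonly string[];
  readonly compute: (v: Values) => unknown;
};

// Declared in dependency order, so a single forward pass is enough here.
// A fuller engine would topologically sort the graph instead.
const nodes: readonly ComputedNode[] = [
  { key: "emailOk", deps: ["email"], compute: (v) => String(v["email"]).includes("@") },
  { key: "passwordOk", deps: ["password"], compute: (v) => String(v["password"]).length >= 8 },
  {
    key: "isValid",
    deps: ["emailOk", "passwordOk"],
    compute: (v) => v["emailOk"] === true && v["passwordOk"] === true,
  },
];

// Recompute only the nodes downstream of `changed`; `ran` records which ones.
function recompute(
  values: Values,
  changed: readonly string[],
): { values: Values; ran: string[] } {
  const dirty = new Set<string>(changed);
  const next: Values = { ...values };
  const ran: string[] = [];
  for (const node of nodes) {
    if (!node.deps.some((d) => dirty.has(d))) continue; // inputs untouched: skip
    const fresh = node.compute(next);
    ran.push(node.key);
    if (fresh !== next[node.key]) {
      next[node.key] = fresh;
      dirty.add(node.key); // only real changes propagate further
    }
  }
  return { values: next, ran };
}

const initial = recompute(
  { email: "a@example.com", password: "short" },
  ["email", "password"],
).values;

// Changing only the password leaves emailOk untouched.
const update = recompute({ ...initial, password: "long-enough-pw" }, ["password"]);
```

In this run `update.ran` contains `passwordOk` and `isValid` but not `emailOk`, which is the whole point of tracking dependencies explicitly.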

Snapshots Are Immutable

type Snapshot<T> = {
  readonly data: T;
  readonly computed: Record<string, unknown>;
  readonly validity: ConstraintResult;
  readonly version: number;
};

// WRONG: direct mutation
snapshot.data.user.email = 'new@example.com';

// RIGHT: Patch/Apply produces a NEW snapshot
const newSnapshot = applyPatches(snapshot, [
  set('user.email', 'new@example.com')
]);
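
To make the Patch/Apply idea concrete, here is a minimal, self-contained sketch of what `applyPatches` and `set` could look like. Treat it as a toy under stated assumptions: the `Patch` shape and the `setAt` helper are invented here, the snapshot is trimmed to `data` and `version`, and it only handles plain nested objects (no arrays).

```typescript
// Hypothetical shapes: only Snapshot, applyPatches and set are named in the post.
type Patch = { readonly path: string; readonly value: unknown };

type Snapshot<T> = {
  readonly data: T;
  readonly version: number;
};

// Build a "set" patch for a dot-separated path.
const set = (path: string, value: unknown): Patch => ({ path, value });

// Return a copy of `node` with `value` written at `keys`. Only the objects
// along the path are copied; sibling branches are reused as-is.
function setAt(node: unknown, keys: readonly string[], value: unknown): unknown {
  const [head, ...rest] = keys;
  if (head === undefined) return value;
  const base: Record<string, unknown> =
    typeof node === "object" && node !== null ? (node as Record<string, unknown>) : {};
  return { ...base, [head]: setAt(base[head], rest, value) };
}

// Fold the patches over the data and bump the version; the input is untouched.
function applyPatches<T>(snapshot: Snapshot<T>, patches: readonly Patch[]): Snapshot<T> {
  const data = patches.reduce<unknown>(
    (acc, p) => setAt(acc, p.path.split("."), p.value),
    snapshot.data,
  );
  return { data: data as T, version: snapshot.version + 1 };
}

type AppData = { user: { email: string; name: string } };

const before: Snapshot<AppData> = {
  data: { user: { email: "old@example.com", name: "Ada" } },
  version: 1,
};
const after = applyPatches(before, [set("user.email", "new@example.com")]);
```

After the call, `before` still holds the old email and `after` is a separate snapshot at version 2, so any earlier version can be kept around for rollback or comparison.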

Actions Gate on Availability

const SubmitAction = defineAction(AppState, () => ({
  // Availability is a COMPUTED FACT
  availability: get('user.computed.isValid'),

  // Effect only runs IF available
  effect: effect('api.submit', {
    email: get('user.email'),
  }),
}));

Impossible actions never reach execution. The system prevents them structurally, not through model reasoning.
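
A stripped-down version of that gate might look like the sketch below. It does not use the post's `defineAction`, `get`, or `effect` helpers; the `Action` and `ExecResult` shapes and the `execute` function are assumptions made for illustration. What matters is where the check lives: inside the runtime, in front of the effect.

```typescript
// Hypothetical, simplified shapes; the post's defineAction/get/effect helpers
// are not reproduced here.
type State = { readonly email: string; readonly password: string };

type Action = {
  readonly name: string;
  // Availability is a pure function of state: a computed fact, not a guess.
  readonly availability: (s: State) => boolean;
  // The effect only describes what to run; it is never built if the gate is shut.
  readonly effect: (s: State) => { handler: string; params: Record<string, string> };
};

type ExecResult =
  | { ok: true; handler: string; params: Record<string, string> }
  | { ok: false; reason: "UNAVAILABLE" };

const submit: Action = {
  name: "submit",
  availability: (s) => s.email.includes("@") && s.password.length >= 8,
  effect: (s) => ({ handler: "api.submit", params: { email: s.email } }),
};

// The gate lives in the runtime, so no caller (LLM or otherwise) can skip it.
function execute(action: Action, state: State): ExecResult {
  if (!action.availability(state)) return { ok: false, reason: "UNAVAILABLE" };
  const { handler, params } = action.effect(state);
  return { ok: true, handler, params };
}

const rejected = execute(submit, { email: "", password: "hunter2hunter2" });
const accepted = execute(submit, { email: "a@example.com", password: "hunter2hunter2" });
```

With an empty email, `rejected` comes back as `{ ok: false, reason: "UNAVAILABLE" }` and the effect description is never produced.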

Everything Explains Itself

const result = explainAvailability(SubmitAction, snapshot);

// Returns structured explanation:
{
  available: false,
  tree: {
    operator: 'and',
    contribution: 'false',
    children: [
      { path: 'user.computed.emailOk', value: false },  // ← ROOT CAUSE
      { path: 'user.computed.passwordOk', value: true },
    ]
  }
}

No guessing. No "maybe the email was empty?" Just structural facts.
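
One way such an explanation could be produced is to evaluate the availability expression as a tree and keep every intermediate value. The sketch below is an assumption about the mechanism, not the real Explain layer: `Expr`, `Explanation`, `explain`, and `rootCauses` are hypothetical names, and only `and` plus boolean facts are supported.

```typescript
// Hypothetical expression tree: leaves read a named boolean fact, "and" nodes
// combine children. The real Explain layer is presumably richer.
type Expr =
  | { kind: "fact"; path: string }
  | { kind: "and"; children: Expr[] };

type Explanation =
  | { path: string; value: boolean }
  | { operator: "and"; value: boolean; children: Explanation[] };

type Facts = Record<string, boolean>;

// Evaluate the tree and keep every intermediate value, so the result doubles
// as its own justification. Unknown facts are treated as false.
function explain(expr: Expr, facts: Facts): Explanation {
  if (expr.kind === "fact") {
    return { path: expr.path, value: facts[expr.path] ?? false };
  }
  const children = expr.children.map((c) => explain(c, facts));
  return { operator: "and", value: children.every((c) => c.value), children };
}

// Collect the leaf facts that are false: the structural root causes.
function rootCauses(e: Explanation): string[] {
  if ("path" in e) return e.value ? [] : [e.path];
  return e.children.reduce<string[]>((acc, c) => acc.concat(rootCauses(c)), []);
}

const canSubmit: Expr = {
  kind: "and",
  children: [
    { kind: "fact", path: "user.computed.emailOk" },
    { kind: "fact", path: "user.computed.passwordOk" },
  ],
};

const explanation = explain(canSubmit, {
  "user.computed.emailOk": false,
  "user.computed.passwordOk": true,
});
const causes = rootCauses(explanation);
```

Here `causes` holds exactly one entry, `user.computed.emailOk`, which matches the root cause flagged in the structured output above.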


Hierarchical Intelligence: Student / Teacher / Orchestrator

Intelligence isn't monolithic. It's layered:

┌─────────────────────────────────────────────────────────────┐
│  LEVEL 3: ORCHESTRATOR                                      │
│  ─────────────────────                                      │
│  • Manages multiple world branches (Fork tree)              │
│  • Preserves failed worlds (no deletion)                    │
│  • Selects successful path                                  │
│  • NO intelligence required (deterministic)                 │
├─────────────────────────────────────────────────────────────┤
│  LEVEL 2: TEACHER (Optional)                                │
│  ───────────────────────────                                │
│  • Observes failures and execution traces                   │
│  • Proposes world hypotheses                                │
│  • Does NOT execute, has NO authority                       │
│  • Can be wrong (not an oracle)                             │
├─────────────────────────────────────────────────────────────┤
│  LEVEL 1: STUDENT                                           │
│  ────────────────                                           │
│  • Operates WITHIN current world                            │
│  • Selects from AVAILABLE actions only                      │
│  • Doesn't need to be smart (random works!)                 │
│  • Proposes, never applies                                  │
└─────────────────────────────────────────────────────────────┘

The Minimal Student

Here's the key insight: Student can be incredibly simple.

// This is a valid Student implementation
class RandomStudent {
  propose(view: SnapshotView): Proposal | null {
    const available = view.availableActions;
    // Nothing is available, so there is nothing to propose
    if (available.length === 0) return null;
    const picked = available[Math.floor(Math.random() * available.length)];
    return createProposal(picked);
  }
}

Why does this work? Because:

  • Impossible actions are already filtered out
  • State transitions are verified by Core
  • Failures are handled by Orchestrator

A random selector produces valid execution traces: every step it takes is legal, even if the route it finds is inefficient. Model intelligence is optional.
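
If you want to try this claim yourself, the sketch below is a runnable variant of the random Student. Everything beyond the post's `propose(view)` idea is an assumption: the `Student` interface, the `proposedBy` field, and the injected `rng` parameter (added so runs can be replayed with a seed) are illustrative choices, not Manifesto's API.

```typescript
// Hypothetical shapes; the rng parameter is added so runs can be replayed.
type ActionId = string;
type SnapshotView = { readonly availableActions: readonly ActionId[] };
type Proposal = { readonly action: ActionId; readonly proposedBy: string };

interface Student {
  propose(view: SnapshotView): Proposal | null;
}

class RandomStudent implements Student {
  private readonly rng: () => number;

  constructor(rng: () => number = Math.random) {
    this.rng = rng;
  }

  propose(view: SnapshotView): Proposal | null {
    const available = view.availableActions;
    if (available.length === 0) return null; // nothing is possible right now
    const index = Math.floor(this.rng() * available.length);
    // Clamp defensively in case a custom rng ever returns exactly 1.
    const picked = available[Math.min(index, available.length - 1)];
    return picked === undefined ? null : { action: picked, proposedBy: "random-student" };
  }
}

// A tiny linear congruential generator, purely for repeatable demos.
function seeded(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296;
  };
}

const view: SnapshotView = { availableActions: ["editEmail", "editPassword"] };
const student = new RandomStudent(seeded(42));
const proposals = Array.from({ length: 20 }, () => student.propose(view));
```

Every entry in `proposals` names an action from `view.availableActions`, because the Student never sees anything else. Swapping in an LLM-backed Student changes which legal action is chosen, not whether the choice is legal.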

Teacher is NOT an Oracle

ORACLE (Not our Teacher):
• Has access to ground truth
• Guarantees correct answers
• Unfalsifiable

TEACHER (Our design):
• Observes only execution traces
• Proposes hypotheses that MAY BE WRONG
• All proposals verified through execution
• Falsifiable and auditable

Teacher = "A modeler who can be wrong"


Trust Boundaries

Not everything is trusted equally:

┌─────────────────────────────────────────────────────────────┐
│  TRUSTED ZONE                                               │
│  ────────────                                               │
│  Core │ Orchestrator │ Authority │ Projection               │
│                                                             │
│  • Deterministic                                            │
│  • Auditable                                                │
│  • Structurally verified                                    │
└─────────────────────────────────────────────────────────────┘
                        ▲
                        │ Verification Boundary
                        ▼
┌─────────────────────────────────────────────────────────────┐
│  UNTRUSTED ZONE                                             │
│  ──────────────                                             │
│  LLM Actor │ External I/O                                   │
│                                                             │
│  • Non-deterministic                                        │
│  • May hallucinate                                          │
│  • All outputs verified before trust                        │
└─────────────────────────────────────────────────────────────┘

The Firewall Principle:

LLM → Proposal → Authority → Approval → Core.apply()

NOT:
LLM → State  // FORBIDDEN

LLM outputs never directly mutate state. Ever.
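
The firewall can be pushed partly into the type system. In the hedged sketch below, `apply` accepts only an approved `Decision`, so passing a raw `Proposal` fails to compile. The counter domain, the 0..10 rule, and the names `judge`, `apply`, and `step` are all invented for illustration. Note that TypeScript's structural typing means someone could still hand-build an approved object, so this is a guardrail rather than a security boundary.

```typescript
// Hypothetical domain: a counter that must stay within 0..10.
type State = { readonly count: number };
type Proposal = { readonly actor: string; readonly delta: number };

type Decision =
  | { readonly status: "approved"; readonly proposal: Proposal }
  | { readonly status: "rejected"; readonly proposal: Proposal; readonly reason: string };
type Approved = Extract<Decision, { status: "approved" }>;

// Authority: a deterministic judge. It reads state but never changes it.
function judge(proposal: Proposal, state: State): Decision {
  if (!Number.isInteger(proposal.delta)) {
    return { status: "rejected", proposal, reason: "delta must be an integer" };
  }
  const next = state.count + proposal.delta;
  if (next < 0 || next > 10) {
    return { status: "rejected", proposal, reason: "count must stay within 0..10" };
  }
  return { status: "approved", proposal };
}

// Core: the only function that produces a new State, and it only accepts an
// Approved decision. A raw Proposal does not type-check here.
function apply(state: State, approved: Approved): State {
  return { count: state.count + approved.proposal.delta };
}

// LLM -> Proposal -> Authority -> Approval -> Core.apply()
function step(state: State, proposal: Proposal): { state: State; decision: Decision } {
  const decision = judge(proposal, state);
  return decision.status === "approved"
    ? { state: apply(state, decision), decision }
    : { state, decision };
}

const start: State = { count: 9 };
const overshoot = step(start, { actor: "llm", delta: 5 });
const fine = step(start, { actor: "llm", delta: 1 });
```

The overshooting proposal leaves the state at 9 and carries a rejection reason, while the valid one yields a new state with `count` 10. In both cases `start` itself is never modified.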


The Runtime Flow

Here's how execution actually works:

1. PROJECTION renders Snapshot → View
   (Full state filtered to what Actor needs)

2. ACTOR (Student) observes View
   (Sees only available actions)

3. ACTOR proposes ChangeSet
   (Selects action, does NOT execute)

4. Wrap ChangeSet → Proposal
   (Attach identity for accountability)

5. AUTHORITY decides
   → approved / rejected / changes_requested

6. If approved, ORCHESTRATOR forwards to Core

7. CORE.executeAction()
   ├── Check availability (again!)
   ├── Resolve effect parameters
   ├── Execute handler → Patch[]
   ├── Apply patches → New Snapshot
   └── Recompute affected values

8. Loop
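
The loop above can be compressed into a few dozen lines. This is a sketch of the control flow only, under heavy simplification: the toy world (add up to three items, then submit), the boolean `Authority`, and the `run` function are assumptions, and effects, patches, and `changes_requested` are left out. It does show the double check: Authority first, then Core's availability gate again.

```typescript
// Hypothetical toy world: add up to three items, then submit.
type State = { readonly items: number; readonly submitted: boolean };
type ActionId = "addItem" | "submit";
type View = { readonly availableActions: readonly ActionId[] };
type Outcome = "applied" | "rejected" | "unavailable";
type TraceEntry = { readonly step: number; readonly action: ActionId; readonly outcome: Outcome };
type Actor = (view: View) => ActionId | null;
type Authority = (action: ActionId, state: State) => boolean;

const ACTIONS: readonly ActionId[] = ["addItem", "submit"];

// Availability rules and pure transitions, both owned by the world.
const isAvailable: Record<ActionId, (s: State) => boolean> = {
  addItem: (s) => !s.submitted && s.items < 3,
  submit: (s) => !s.submitted && s.items > 0,
};
const transition: Record<ActionId, (s: State) => State> = {
  addItem: (s) => ({ ...s, items: s.items + 1 }),
  submit: (s) => ({ ...s, submitted: true }),
};

// Step 1: Projection renders the snapshot into what the Actor may see.
const project = (s: State): View => ({
  availableActions: ACTIONS.filter((a) => isAvailable[a](s)),
});

function run(initial: State, actor: Actor, authority: Authority, maxSteps: number) {
  let state = initial;
  const trace: TraceEntry[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const action = actor(project(state)); // steps 2-4: observe and propose
    if (action === null) break; // nothing left to propose
    let outcome: Outcome;
    if (!authority(action, state)) {
      outcome = "rejected"; // step 5: Authority said no
    } else if (!isAvailable[action](state)) {
      outcome = "unavailable"; // step 7: Core checks availability again
    } else {
      state = transition[action](state); // new snapshot, old one untouched
      outcome = "applied";
    }
    trace.push({ step, action, outcome });
  }
  return { state, trace };
}

// A well-behaved actor picks from the view; a rogue one ignores it.
const firstAvailable: Actor = (v) => v.availableActions[0] ?? null;
const rogue: Actor = () => "submit";
const needsTwoItems: Authority = (a, s) => a !== "submit" || s.items >= 2;
const approveAll: Authority = () => true;

const empty: State = { items: 0, submitted: false };
const happy = run(empty, firstAvailable, needsTwoItems, 10);
const blockedByAuthority = run(empty, rogue, needsTwoItems, 3);
const blockedByCore = run(empty, rogue, approveAll, 3);
```

The well-behaved run applies three `addItem` steps and one `submit`, then stops because the view is empty. The rogue actor gets nowhere in either configuration: with a strict Authority its proposals are rejected, and with a permissive one Core still reports them as unavailable.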

Failure & Fork

When things go wrong:

1. Core returns { ok: false, reason: 'UNAVAILABLE' }

2. Orchestrator captures failure
   ├── Preserve current Snapshot (no deletion!)
   └── Generate ExplainGraph

3. Teacher (if present) analyzes
   ├── Input: Trace + ExplainGraph
   └── Output: World Hypothesis

4. Authority decides on Fork
   → approve / reject

5. Orchestrator creates Fork
   ├── Parent Runtime preserved
   └── Child Runtime with hypothesis applied

6. Execution continues in Child

7. On success, Authority selects winner

Failed worlds are preserved, not deleted. You can always go back.
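
A fork tree that never deletes anything can be modeled as an append-only list of runtimes with parent pointers. The sketch below is one possible shape, assumed for illustration: `Runtime`, `ForkTree`, `createTree`, `setStatus`, `fork`, and `lineage` are hypothetical names, and the "hypothesis" is reduced to a function that revises the parent's snapshot.

```typescript
// Hypothetical shapes: a runtime is a node in an append-only fork tree.
type RuntimeStatus = "active" | "failed" | "succeeded";
type Runtime<S> = {
  readonly id: number;
  readonly parentId: number | null;
  readonly hypothesis: string | null;
  readonly snapshot: S;
  readonly status: RuntimeStatus;
};
type ForkTree<S> = readonly Runtime<S>[];

function createTree<S>(snapshot: S): ForkTree<S> {
  return [{ id: 0, parentId: null, hypothesis: null, snapshot, status: "active" }];
}

// Mark a runtime without removing it: failed worlds stay in the tree.
function setStatus<S>(tree: ForkTree<S>, id: number, status: RuntimeStatus): ForkTree<S> {
  return tree.map((r) => (r.id === id ? { ...r, status } : r));
}

// Fork a child whose snapshot is the parent's with a hypothesis applied.
function fork<S>(
  tree: ForkTree<S>,
  parentId: number,
  hypothesis: string,
  revise: (s: S) => S,
): ForkTree<S> {
  const source = tree.find((r) => r.id === parentId);
  if (source === undefined) throw new Error(`unknown runtime ${parentId}`);
  const child: Runtime<S> = {
    id: tree.length,
    parentId,
    hypothesis,
    snapshot: revise(source.snapshot),
    status: "active",
  };
  return [...tree, child];
}

// Walk back to the root, which answers "how did we get to this world?"
function lineage<S>(tree: ForkTree<S>, id: number): number[] {
  const node = tree.find((r) => r.id === id);
  if (node === undefined) return [];
  return node.parentId === null ? [id] : [...lineage(tree, node.parentId), id];
}

type Rules = { readonly maxRetries: number };

let tree = createTree<Rules>({ maxRetries: 0 });
tree = setStatus(tree, 0, "failed");
tree = fork(tree, 0, "allow one retry", (rules) => ({ ...rules, maxRetries: 1 }));
```

After these three calls the tree holds two runtimes: the failed parent with its original snapshot, and an active child carrying the revised rules. Going back is a lookup, not a reconstruction.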


Why This Works

Intelligence Redistribution

BEFORE (Intelligence-Centric):
────────────────────────────
Intelligence = World Understanding
             + Rule Enforcement
             + Action Selection
             + State Management
             + Failure Diagnosis

AFTER (World-Centric):
──────────────────────
Intelligence = Action Selection (minimal)

World Understanding  → Projection
Rule Enforcement     → Core (availability)
State Management     → Core (Patch/Apply)
Failure Diagnosis    → Core (Explain)

Four of the five responsibilities, roughly 80% of the work, moved out of the model.

Model Size Independence

Intelligence-Centric:

Stability
   ▲
   β”‚                    ╭────
   β”‚               ╭────╯
   β”‚          ╭────╯
   β”‚     ╭────╯
   └─────┴────────────────────► Model Size

(Bigger model = More stable)

─────────────────────────────────────────

World-Centric:

Stability
   ▲
   β”‚  ════════════════════════
   β”‚
   β”‚
   β”‚
   └──────────────────────────► Model Size

(Stability independent of model size)

A random selector and a smart-enough model (e.g., GPT-5 or Claude Opus) both produce valid execution traces, because validity is enforced by the World (availability gates + Patch/Apply), not by the model's reasoning.

A stronger model tends to be more efficient, but correctness does not depend on model capability.


When NOT to Use This

World-Centric isn't for everything:

| Scenario | Why It Might Not Fit |
| --- | --- |
| Pure creative writing | No "correct" state |
| Open-ended chat | No action structure |
| Real-time streaming | Snapshot overhead |
| Trivial single-turn QA | Overkill |

This architecture is designed for structured, multi-step, verifiable tasks.


Key Takeaways

  1. Agent failures are often world modeling failures, not intelligence failures

  2. Make the World explicit: Immutable snapshots, declarative patches, computed availability

  3. Intelligence proposes, never executes: The firewall principle

  4. Stability is achievable without scaling: A random selector can produce valid execution

  5. Everything should explain itself: If it can't answer "Why?", it's a black box

  6. Preserve failures: Fork trees enable recovery and exploration


The Philosophy

"A system that cannot explain 'Why' is a dead system."

Most agent architectures are black boxes. You pump in tokens, hope for the best, and debug through prayer.

World-Centric Architecture is a white box. Every value has provenance. Every action has availability. Every failure has explanation.

That's not just good engineering. That's the difference between a demo and a production system.


See It In Action

Theory is nice. Working software is better.

TaskFlow is a playable proof-of-concept built before the full 7-layer architecture, designed to demonstrate a simpler but critical idea:

LLMs can be reduced to intent interpreters and semantic carriers, while the system itself remains fully deterministic.

TaskFlow shows that even without the complete World-Centric runtime,
a deterministic machine driven by explicit state and rules can already work.

👉 Try the demo: https://taskflow.manifesto-ai.dev


What's Next?

This architecture is implemented in Manifesto, an open-source framework. Future posts will cover:

  • Deep dive into Core's 7 layers
  • Building custom Actors with LLMs
  • Authority patterns for different governance needs
  • Fork strategies for exploration

Questions? Disagreements? Let me know in the comments.


Thanks for reading. If you're tired of debugging agent failures at 3 AM, maybe it's time to stop blaming the model and start designing the world.
