Herrington Darkholme

Agent Calculus: A Unified Framework for AI Agent Design

Abstract

This document presents a unified formal framework for understanding AI agents through a calculus-like model. We introduce the concept of entities as the fundamental unit of agent computation, and define a harness that manages the flow of entities between the LLM's context and the external world. This framework elegantly unifies disparate concepts such as skills, tools, memory, subagents, and dynamic context loading under a single coherent model.

1. Introduction & Motivation

The Problem

Modern AI agent systems involve numerous distinct concepts:

  • Tools: Functions the agent can call to interact with the world
  • Skills: Reusable prompt templates and workflows
  • Memory: Persistent state from previous interactions
  • Subagents: Spawned agents for subtasks
  • Dynamic context loading: Just-in-time injection of relevant information
  • System prompts: Static instructions and behavior definitions

These concepts are typically treated as separate mechanisms, leading to:

  • Conceptual fragmentation in agent design
  • Difficulty reasoning about agent behavior holistically
  • Lack of composability between different agent patterns

The Solution

We propose treating all inputs to an agent as entities that flow through a harness which manages:

  1. Loading: Filtering and packing entities into the LLM's limited context
  2. Execution: Handling LLM actions and returning new entities

This abstraction enables:

  • Unified reasoning about agent behavior
  • Compositional design patterns
  • Systematic approaches to context management
  • Formal analysis of multi-agent systems

2. Core Assumptions

To simplify our calculus, we make three foundational assumptions:

Assumption 1: Limited Context

LLM context windows are finite and constrained. Context is the primary scarce resource in agent systems.

Note: LLM context windows may grow larger in the future; even so, this calculus treats context as finite and scarce.

Assumption 2: Static Capabilities

LLMs do not perform continual learning during inference. Their capabilities are fixed at deployment.

Note: this may change in the future; with continual learning, some routine context loading could instead be done by updating model weights.

Assumption 3: LLM Homogeneity

For the purposes of this calculus, we treat different LLMs as interchangeable.

Note: In practice they differ, but this simplifies our model.

3. Fundamental Definitions

3.1 LLM as Pure Function

We model an LLM as a pure function:

LLM: Context → (Reasoning, Actions)

Inputs:

  • Context: The text and structured data directly accessible to the LLM

Outputs:

  • Reasoning: Internal thought process, chain-of-thought, analysis
  • Actions: Structured requests to interact with the world (tool calls, responses, queries)

The LLM has no direct access to anything outside its context. It cannot see files, networks, databases, or any other state unless that information is explicitly loaded into its context.
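To make the pure-function view concrete, here is a minimal Python sketch. The `Action` shape and the `stub_llm` behavior are illustrative placeholders, not a real model interface:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Action:
    name: str   # e.g. "read_file", "respond"
    args: dict  # structured arguments for the harness to execute

# The LLM maps a context string to reasoning text plus a list of actions.
LLM = Callable[[str], tuple[str, list[Action]]]

def stub_llm(context: str) -> tuple[str, list[Action]]:
    # A stand-in model: it can only react to what is in its context argument,
    # illustrating that nothing outside the context is visible to it.
    if "config.json" in context:
        return ("Need to read the config file.",
                [Action("read_file", {"path": "config.json"})])
    return ("Nothing to do.", [Action("respond", {"text": "Done!"})])
```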

3.2 Context vs World

Context: The observable, directly accessible information within the LLM's attention window.

  • Limited in size (e.g., 128K tokens)
  • Directly influences LLM outputs
  • Managed by the harness

World: Everything outside the context that the agent might need.

  • File systems
  • Databases
  • APIs
  • Previous conversation history (not currently in context)
  • External knowledge bases

The harness acts as the bridge between Context and World.

3.3 Agent Decomposition

An agent is the composition of two components:

Agent = LLM + Harness

The LLM performs reasoning and generates actions.

The Harness manages the agent loop:

  • Loads entities into context
  • Executes actions in the world
  • Handles context window constraints
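The decomposition can be sketched in a few lines of Python; the `Harness` callables and the single `step` method are simplifications (contexts modelled as strings, the world as a dict), not a full implementation:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Harness:
    # load packs information into the context; execute acts on the world
    load: Callable[[str, str], str]
    execute: Callable[[str, dict], tuple[str, dict]]

@dataclass
class Agent:
    llm: Callable[[str], tuple[str, str]]  # context -> (reasoning, action)
    harness: Harness

    def step(self, ctx: str, entity: str, world: dict) -> tuple[str, str, dict]:
        ctx = self.harness.load(ctx, entity)          # load entity into context
        reasoning, action = self.llm(ctx)             # LLM reasons and acts
        result, world = self.harness.execute(action, world)  # act on the world
        return ctx + "\n" + reasoning, result, world
```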

4. The Entity Abstraction

Core Insight: Everything that can be loaded into an LLM's context is an entity.

4.1 Entity Definition

An entity is a unit of information with:

  • Content: The actual data or text
  • Metadata: How it should be loaded, when it's relevant, its size

4.2 Entity Types by Loading Strategy

| Entity Type | Content Nature | Loading Strategy | Example |
| --- | --- | --- | --- |
| System Prompt | Static | Preloaded | Agent role definition, rules |
| Tool Description | Static | Dynamic or preloaded | Function signature, usage docs |
| Skill | Static | Dynamic | Reusable prompt templates |
| Memory | Dynamic | Preloaded | Conversation summaries |
| User Input | Dynamic | Preloaded | Current user message |
| Tool Result | Dynamic | Dynamic | Data returned from actions |

4.3 Entity Dimensions

Entities can be characterized along multiple dimensions:

1. Content Mutability

  • Static: Content doesn't change (tool definitions, skills)
  • Dynamic: Content changes during execution (memory, tool results)

2. Loading Time

  • Preloaded: Always in context (system prompt, current memory)
  • Dynamic: Loaded on-demand (skills when invoked, tool descriptions when relevant)

3. Verbosity Levels

  • Full: Complete content loaded
  • Summary: Condensed version (e.g., skill titles only)
  • Digest: Compressed representation (e.g., large tool results summarized)
  • Reference: Pointer only (content remains in world, accessed via actions)
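These dimensions suggest a minimal `Entity` data structure. The field names and the fallback rule below are one possible encoding, not a prescribed schema:

```python
from dataclasses import dataclass

# Verbosity levels ordered from most to least detailed.
VERBOSITY = ("full", "summary", "digest", "reference")

@dataclass
class Entity:
    kind: str               # "tool_description", "skill", "memory", ...
    renderings: dict        # verbosity level -> text
    static: bool = True     # content mutability
    loading: str = "dynamic"  # "preloaded" or "dynamic"

    def render(self, verbosity: str) -> str:
        # Prefer the requested level; otherwise fall back to the most
        # compressed rendering that exists.
        if verbosity in self.renderings:
            return self.renderings[verbosity]
        for level in reversed(VERBOSITY):
            if level in self.renderings:
                return self.renderings[level]
        return ""
```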

4.4 Examples

Example: Tool as Entity

Entity: FileReadTool
  Content (full):
    Name: read_file
    Description: Reads contents of a file from disk
    Parameters:
      - path: string (absolute path)
      - offset: int (optional, start line)
      - limit: int (optional, number of lines)
    Returns: string (file contents)

  Content (summary):
    read_file: Read file contents

  Metadata:
    type: tool_description
    static: true
    loading: dynamic (loaded when file operations relevant)

Example: Memory as Entity

Entity: ConversationMemory
  Content (full):
    [Last 50 conversation turns with full context]

  Content (digest):
    Summary: User is implementing auth system for web app.
    Currently debugging JWT token validation.
    Tech stack: Node.js, Express, PostgreSQL.

  Metadata:
    type: memory
    static: false (updated after each turn)
    loading: preloaded
    compression: digest after 10 turns

Example: Skill as Entity

Entity: GitCommitSkill
  Content (full):
    # Git Commit Workflow
    1. Run git status to see changes
    2. Review git diff for staged changes
    3. Draft commit message following repo conventions
    4. Execute git commit with message
    5. Verify with git log

  Content (summary):
    GitCommitSkill: Create git commits following best practices

  Metadata:
    type: skill
    static: true
    loading: dynamic (loaded when user requests git commit)

5. The Harness

The harness is the orchestration layer that manages the agent loop. It has two core responsibilities:

5.1 Load Function

load: (Context, Entity, List[Entity]) → Context'

Purpose: Intelligently pack entities into the limited context window.

Inputs:

  • Context: Current context state
  • Entity: New entity to incorporate (e.g., fresh user input, tool result)
  • List[Entity]: Available entities that could be loaded

Output:

  • Context': Updated context ready for LLM consumption

Responsibilities:

  1. Filtering: Select only relevant entities from available pool
  2. Compression: Choose appropriate verbosity level for each entity
  3. Ordering: Arrange entities for optimal LLM performance
  4. Eviction: Remove or compress old entities if context is full
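A toy `load` exercising all four responsibilities might look like this, assuming each available entity carries a precomputed `relevance` score and pre-rendered `full`/`summary` strings, with the budget counted in characters (all simplifications):

```python
def load(ctx, new_entity, available, budget=1000):
    """Pack the new entity plus the most relevant available entities
    into a character budget, degrading verbosity before dropping."""
    parts = [new_entity["full"]]
    used = len(new_entity["full"])
    # Filtering + ordering: consider the most relevant entities first.
    for e in sorted(available, key=lambda e: -e["relevance"]):
        # Compression: load full content if it fits, else the summary.
        for text in (e["full"], e["summary"]):
            if used + len(text) <= budget:
                parts.append(text)
                used += len(text)
                break
        # Eviction: entities that fit at no verbosity level are omitted.
    return ctx + parts
```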

5.2 Execute Function

execute: (Action, World) → (Entity, World')

Purpose: Perform actions in the world and return results as entities.

Inputs:

  • Action: LLM-generated action (tool call, query, response)
  • World: Current world state

Outputs:

  • Entity: Result data packaged as an entity
  • World': Updated world state after action

Examples:

  • Action = read_file("config.json") → Entity = {type: "tool_result", content: "{...json...}"}
  • Action = spawn_subagent("research task") → Entity = {type: "subagent_result", content: "..."}
  • Action = respond("Done!") → Entity = {type: "agent_response", content: "Done!"}
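A minimal `execute` dispatcher over a dict-based world; the action names and result shapes are illustrative, not a fixed protocol:

```python
def execute(action, world):
    """Perform one action against the world; return (entity, new_world)."""
    name, args = action
    if name == "read_file":
        # Read-only action: the world is unchanged.
        content = world["files"].get(args["path"], "<missing>")
        return {"type": "tool_result", "content": content}, world
    if name == "respond":
        # Responding updates world state (the delivered response).
        new_world = {**world, "last_response": args["text"]}
        return {"type": "agent_response", "content": args["text"]}, new_world
    return {"type": "error", "content": f"unknown action {name}"}, world
```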

5.3 Context Window Management

The load function implements sophisticated strategies to handle context constraints:

Strategy 1: Relevance Filtering

Only load entities relevant to current task:

if task involves file operations:
  load file-related tool descriptions
else:
  omit file tools (even if available)

Strategy 2: Progressive Compression

# Fresh tool result: load full content
load(entity=tool_result, verbosity=full)

# After LLM reasoning: compress it
load(entity=tool_result, verbosity=digest)

# After several turns: remove entirely if no longer relevant
omit(entity=tool_result)

Strategy 3: Hierarchical Summarization

# If 50 tools available:
Group into categories: [file_ops, network, database, ...]
Load only category summaries initially
Load full descriptions only when category selected

Strategy 4: Swap Full↔Digest

Context before LLM call:
  [system_prompt] [memory_digest] [tool_result_FULL] [user_input]

Context after LLM reasoning:
  [system_prompt] [memory_digest] [tool_result_DIGEST] [llm_reasoning] [new_action]

5.4 Atomic Load Operations

The load function is composed of atomic operations:

| Operation | Purpose | Example |
| --- | --- | --- |
| summarize | Reduce entity size | Compress 50 messages → 1 paragraph |
| elaborate | Add more context | Expand terse user input with clarifications |
| omit | Remove entity completely | Drop tool result from 10 turns ago |
| paraphrase | Rewrite for clarity | Standardize user input phrasing |
| group | Combine related entities | Bundle related tool descriptions |
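Three of these operations can be sketched over a context modelled as a list of `(id, text)` pairs. This is a toy encoding: a real `summarize` would call an LLM rather than truncate:

```python
def summarize(ctx, entity_id, max_len=40):
    # Shrink one entity's text in place (truncation stands in for an LLM summary).
    return [(i, t[:max_len] + "..." if i == entity_id and len(t) > max_len else t)
            for i, t in ctx]

def omit(ctx, entity_id):
    # Drop an entity from the context entirely.
    return [(i, t) for i, t in ctx if i != entity_id]

def group(ctx, ids, group_id):
    # Merge several related entities into a single bundled entity.
    merged = " | ".join(t for i, t in ctx if i in ids)
    kept = [(i, t) for i, t in ctx if i not in ids]
    return kept + [(group_id, merged)]
```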

6. The Agent Loop

Now we can express the complete agent execution as a simple loop:

6.1 Pseudocode

def agent_loop(user_input, world):
    ctx = Context()
    entity = Entity(content=user_input, type="user_input")
    entities = discover_available_entities(world)  # tools, skills, memory, etc.

    while not should_stop(ctx):
        # LOAD PHASE
        # Pack relevant entities into context
        ctx = harness.load(ctx, entity, entities)

        # REASONING PHASE
        # LLM processes context and generates action
        reasoning, action = llm(ctx)

        # Add reasoning to context for continuity
        ctx = ctx.append(reasoning)

        # EXECUTION PHASE
        # Execute action in world, get result as new entity
        entity, world = harness.execute(action, world)

        # New entity becomes input for next iteration

    return ctx, world

6.2 Example Execution Trace

User: "Read config.json and fix the database port"

Turn 1

Load Phase:
  ctx = [system_prompt, memory_summary, file_tools, user_input]

LLM Phase:
  reasoning: "Need to read config.json first"
  action: read_file("config.json")

Execute Phase:
  entity: {type: "tool_result", content: "{\n  \"db_port\": 3306\n  ...\n}"}
  world: unchanged

Turn 2

Load Phase:
  ctx = [system_prompt, memory_summary, file_tools,
         tool_result_FULL, previous_reasoning, edit_tools]
  # Note: file tool result loaded in full for LLM to analyze

LLM Phase:
  reasoning: "Database port is 3306 but should be 5432 for PostgreSQL"
  action: edit_file("config.json", old="3306", new="5432")

Execute Phase:
  entity: {type: "tool_result", content: "Successfully edited config.json"}
  world: config.json modified

Turn 3

Load Phase:
  ctx = [system_prompt, memory_summary,
         previous_tool_result_DIGEST,  # compressed now!
         current_tool_result,
         previous_reasoning]
  # Note: first tool result compressed to save space

LLM Phase:
  reasoning: "Task complete. Config fixed."
  action: respond("Fixed! Changed database port from 3306 to 5432")

Execute Phase:
  entity: {type: "agent_response"}
  should_stop: true

7. Multi-Agent Design Patterns

Using the entity calculus, we can formally describe common agent patterns.

7.1 Pattern: Tool-Use Agent

Description: Agent with access to external tools.

Implementation:

entities = [
    Entity(system_prompt),
    Entity(memory),
    Entity(user_input),
    *[Entity(tool.description) for tool in available_tools]
]

# Tools are just entities with:
# 1. Description (for LLM to understand)
# 2. Execution handler (for harness.execute)

Key Insight: Tools are simultaneously entities (their descriptions load into context) and actions (their implementations execute in world).

7.2 Pattern: Skill-Enhanced Agent

Description: Agent that can load predefined workflows on-demand.

Implementation:

# Skills available but not preloaded
skill_entities = [
    Entity(GitCommitSkill, loading="dynamic"),
    Entity(DebugWorkflow, loading="dynamic"),
    Entity(RefactorPattern, loading="dynamic")
]

# Harness.load uses semantic search:
if user_input mentions "commit" or "git":
    load(GitCommitSkill, verbosity=full)
else:
    load(GitCommitSkill, verbosity=summary)  # just title

Key Insight: Skills are entities loaded at different verbosity levels based on relevance.

7.3 Pattern: Subagent Spawning

Description: Agent delegates subtasks to other agents.

Implementation:

# Define subagent spawn as a tool
def spawn_subagent(prompt, world):
    # Run a new agent loop with a fresh context until completion
    result_ctx, new_world = agent_loop(prompt, world)

    # Return subagent's result as entity
    return Entity(
        type="subagent_result",
        content=extract_result(result_ctx)
    ), new_world

SubAgentTool = Tool(
    name="spawn_subagent",
    description="Create a new agent for a subtask",
    execute=spawn_subagent
)

Key Insight: Subagents are just recursive invocations of the agent loop. The result is returned as an entity to the parent agent.

Example Flow:

Parent Agent:
  User: "Research React hooks and write a summary"

  Turn 1:
    action: spawn_subagent("Research React hooks from documentation")

    Subagent Loop:
      Turn 1: search("React hooks documentation")
      Turn 2: read(url)
      Turn 3: summarize(content)
      Turn 4: respond(summary)

    entity: {type: "subagent_result", content: "React hooks are..."}

  Turn 2:
    Context: [subagent_result, user_input]
    action: write_file("react-hooks-summary.md", content=subagent_result)

7.4 Pattern: RAG (Retrieval-Augmented Generation)

Description: Agent retrieves relevant documents before generating responses.

Implementation:

# RAG is just a special load strategy
def rag_load(ctx, entity, entities, world):
    # Extract query from latest entity (user input or reasoning)
    query = extract_query(entity)

    # Search knowledge base (world operation)
    relevant_docs = semantic_search(world.knowledge_base, query, top_k=5)

    # Convert docs to entities
    doc_entities = [Entity(doc, type="retrieved_doc") for doc in relevant_docs]

    # Standard load with doc entities included
    return load(ctx, entity, entities + doc_entities)

Key Insight: RAG is just a sophisticated entity discovery mechanism. Retrieved documents are entities loaded into context.

7.5 Pattern: ReAct (Reasoning + Acting)

Description: Agent alternates between reasoning and tool use.

Implementation:

# ReAct is the default agent loop!
# The loop naturally alternates:

Turn 1:
  LLM: reasoning → "Need to check database status"
  Action: execute(check_db_status)

Turn 2:
  LLM: reasoning → "Database is down, need to restart"
  Action: execute(restart_db)

Turn 3:
  LLM: reasoning → "Restart successful, task complete"
  Action: respond("Done!")

Key Insight: ReAct emerges naturally from the agent loop structure. No special implementation needed.

7.6 Pattern: Reflection

Description: Agent reviews and critiques its own work.

Implementation:

# Reflection as a tool that spawns a critic subagent
def reflect(work, world):
    critique_prompt = f"""
    Review this work: {work}

    Identify:
    1. Errors or bugs
    2. Missed requirements
    3. Potential improvements
    """

    # Spawn critic agent (with a specialized critic system prompt)
    # and run its loop until completion
    result_ctx, new_world = agent_loop(critique_prompt, world)

    return Entity(type="reflection", content=extract_result(result_ctx)), new_world

ReflectionTool = Tool(
    name="reflect",
    description="Review your work for errors and improvements",
    execute=reflect
)

Usage:

Agent:
  Turn 1: write_code(feature)
  Turn 2: reflect(code)
  Turn 3: revise(code, based_on=reflection)

Key Insight: Reflection is subagent spawning with a specialized critic prompt.

7.7 Pattern: Multi-Agent Collaboration

Description: Multiple agents work in parallel on different aspects of a task.

Implementation:

def parallel_agents(task, world):
    # Decompose task
    subtasks = decompose(task)

    # Spawn agent for each subtask
    results = []
    for subtask in subtasks:
        # Each agent runs independently
        result_ctx, world = agent_loop(subtask, world)
        results.append(extract_result(result_ctx))

    # Coordinator agent synthesizes results
    synthesis_prompt = f"Combine these results: {results}"
    final_ctx, world = agent_loop(synthesis_prompt, world)

    return final_ctx, world

Key Insight: Multi-agent systems are orchestrated by spawning multiple independent agent loops and synthesizing their results.

8. Advanced Topics

8.1 Dynamic Tool Loading

Problem: Loading descriptions of 100+ tools wastes context on irrelevant tools.

Solution: Treat tool discovery as a two-phase load.

Implementation:

# Phase 1: Load tool categories only
tool_categories = [
    Entity("File Operations: read, write, delete, ..."),
    Entity("Network Operations: fetch, post, websocket, ..."),
    Entity("Database Operations: query, insert, update, ..."),
]

ctx = load(ctx, user_input, tool_categories)
reasoning, action = llm(ctx)  # LLM might say "I need file operations"

# Phase 2: Load full tool descriptions only for selected category
if "file" in reasoning.lower():
    file_tools = [Entity(read_tool), Entity(write_tool), ...]
    ctx = load(ctx, Entity(reasoning), file_tools)

Result: Context usage reduced from O(all_tools) to O(relevant_tools).

8.2 Entity Discovery Mechanisms

Entities can specify how they should be discovered:

Entity.metadata = {
    "discovery": {
        "keywords": ["git", "commit", "version control"],  # Keyword match
        "semantic": "Create version control commits",      # Semantic search
        "dependencies": ["read_file", "write_file"],       # Linked entities
        "context_requirements": ["user_in_git_repo"],      # Conditional
    }
}

Use case: GitCommitSkill specifies it should be loaded when:

  • User mentions "commit" or "git" (keyword)
  • Current directory is a git repo (context requirement)
  • If loaded, also load file reading tools (dependencies)

8.3 Tool Data Handling Strategies

Problem: Tool returns 10MB JSON response. Loading fully into context is wasteful.

Strategy 1: Digest on Return

entity, world = execute(action)

# Immediate digest
if entity.size > 10 * 1024:  # 10KB threshold
    entity.content_full = entity.content
    entity.content = llm_summarize(entity.content, max_tokens=500)
    entity.verbosity = "digest"

    # Store full content in world for re-access if needed
    world.store(entity.id, entity.content_full)

Strategy 2: Streaming / Pagination

# Don't load entire result
entity = Entity(
    type="tool_result",
    content="Large file detected. Use read_file(offset=N, limit=M) to page through.",
    metadata={"file_size": "10MB", "total_lines": 50000}
)

Strategy 3: Structured Filtering

# Tool returns structured data with filtering instructions
entity = Entity(
    type="tool_result",
    content={"users": [...]},  # 1000 users
    filter_instructions="Use filter_tool_result(key='users', condition='age > 30') to refine"
)

8.4 Context Compression Strategies

The load function can employ LLM-based compression:

Summarization Compression:

# When context is 80% full
old_messages = ctx.messages[:-10]  # All but last 10
summary = llm_summarize(old_messages)
ctx.messages = [summary] + ctx.messages[-10:]

Semantic Deduplication:

# Remove redundant information
if semantic_similarity(new_entity, existing_entity) > 0.9:
    # Information already in context
    omit(new_entity)

Importance-Based Eviction:

# Score each entity by relevance to current task
scores = [score_relevance(e, current_task) for e in ctx.entities]

# Remove lowest-scored entities when context is full
if ctx.is_full():
    ctx.entities = [e for e, s in sorted(zip(ctx.entities, scores), key=lambda x: -x[1])][:max_entities]

8.5 JIT (Just-In-Time) Context Loading

Concept: Tool results can include instructions for processing their data.

Example:

# Tool execution
entity = execute(search_codebase("authentication"))

# Entity includes loading instructions
entity.content = {
    "results": [
        {"file": "auth.py", "line": 45, "snippet": "..."},
        {"file": "login.py", "line": 12, "snippet": "..."},
        # ... 50 more results
    ],
    "loading_instructions": {
        "default_verbosity": "summary",  # Show only file names + line counts
        "expand_on_request": True,       # Allow LLM to request full snippets
        "expansion_tool": "expand_search_result(index=N)"
    }
}

# Initial load: summary only
load(entity, verbosity="summary")
# Context: "Found authentication code in 52 files. Use expand_search_result(index=N) for details."

# LLM can then selectively expand:
action = expand_search_result(index=0)
# Returns full snippet from auth.py

Benefit: LLM gets overview first, then drills down only where needed.

8.6 Multi-Level Tool Descriptions

Tools can define descriptions at multiple granularities:

Tool.descriptions = {
    "title": "read_file",

    "one_liner": "Read file contents",

    "summary": """
        read_file(path) → string
        Reads and returns file contents from disk.
    """,

    "detailed": """
        read_file(path, offset=0, limit=None) → string

        Reads file contents from the filesystem.

        Parameters:
        - path: Absolute path to file
        - offset: Starting line number (0-indexed)
        - limit: Maximum number of lines to read

        Returns: File contents as string

        Errors: FileNotFoundError, PermissionError

        Example:
            content = read_file("/home/user/config.json")
    """,
}

Load Strategy:

# When context is spacious: load detailed
# When context is tight: load summary
# When very tight: load one_liner only
# When nearly full: load title only (just function name)

verbosity = choose_verbosity_level(ctx.available_space)
tool_entity.content = tool.descriptions[verbosity]

9. Integration with Existing Systems

9.1 Mapping to Real Agent Frameworks

LangChain:

# LangChain concepts → Entity Calculus
Agent = LLM + Harness
Tools = List[Entity(tool_description)] + execute handlers
Memory = Entity(conversation_buffer, type="memory", loading="preloaded")
Chains = Predefined entity sequences
Callbacks = Instrumentation hooks in harness.load and harness.execute

AutoGen:

# AutoGen concepts → Entity Calculus
ConversableAgent = Agent (LLM + Harness)
UserProxyAgent = Agent with no LLM (execute-only harness)
GroupChat = Multi-agent with shared entity pool
Human-in-loop = User input injected as entity during loop

Cursor / Copilot:

# IDE agent concepts → Entity Calculus
Codebase Context = List[Entity] from semantic search over code
Active File = Entity(current_file, loading="preloaded", verbosity="full")
Related Files = List[Entity(file, loading="dynamic", verbosity="summary")]
LSP Information = Entity(type_info + references, loading="on-demand")

9.2 RAG Systems

Traditional RAG:

query = user_input
docs = vector_db.search(query)
context = [system_prompt, docs, query]
response = llm(context)

Entity Calculus RAG:

# Documents are just entities!
entities = [
    Entity(system_prompt, loading="preloaded"),
    Entity(user_input, loading="preloaded"),
    *[Entity(doc, loading="dynamic", discovered_by="semantic_search")
      for doc in vector_db.search(user_input)]
]

# Standard agent loop
ctx = harness.load(Context(), Entity(user_input), entities)
reasoning, action = llm(ctx)

Key Insight: RAG is entity discovery via semantic search.

10. Future Directions & Open Questions

10.1 Entity Discovery as a Graph

Idea: Entities can link to related entities, forming a graph.

Entity(GitCommitSkill).links = [
    Link(Entity(ReadFileTool), relation="requires"),
    Link(Entity(WriteFileTool), relation="requires"),
    Link(Entity(GitStatusTool), relation="uses"),
    Link(Entity(PRCreationSkill), relation="related_to"),
]

Use Case: When GitCommitSkill is loaded, harness can automatically load linked tools.

Question: How to prevent exponential blow-up of linked entities?

10.2 Entity-Provided Processing Instructions

Idea: Entities specify how they should be processed after use.

Entity(tool_result).processing_hints = {
    "after_llm_reads": "compress_to_digest",
    "after_N_turns": "omit_if_not_referenced",
    "if_context_full": "move_to_world_storage"
}

Benefit: Declarative context management instead of imperative harness logic.

Question: How to balance entity autonomy with global context optimization?

10.3 Learned Context Management

Idea: Use ML to learn optimal loading strategies.

# Train a model to predict:
# - Which entities to load given task
# - What verbosity level to use
# - When to compress/omit entities

load_policy = train(
    inputs=[task_embedding, available_entities, context_state],
    outputs=[entities_to_load, verbosity_levels],
    objective=maximize(task_success_rate) - penalize(context_usage)
)

Question: How to collect training data? What are the right features?

10.4 Hierarchical Entities

Idea: Entities can contain sub-entities.

Entity(Codebase) = {
    "type": "collection",
    "children": [
        Entity(Module1),
        Entity(Module2),
        ...
    ]
}

# Load strategy:
# - Initially load: Entity(Codebase, verbosity="summary")
#   → "Codebase contains 50 modules in 3 categories"
# - On demand: Entity(Module1, verbosity="full")

Use Case: Representing complex structured knowledge (codebases, documentation sites, databases).

10.5 Entity Lifecycle Hooks

Idea: Entities can define callbacks for lifecycle events.

Entity.hooks = {
    "on_load": lambda ctx: validate_dependencies(ctx),
    "on_compress": lambda content: custom_summarize(content),
    "on_evict": lambda: persist_to_world(),
    "on_access": lambda: log_usage_analytics(),
}

Benefit: Entities become active components, not passive data.

10.6 Cross-Agent Entity Sharing

Idea: Multiple agents share a common entity pool.

# Shared world with entity store
world.entity_store = {
    "current_task": Entity(...),
    "research_results": Entity(...),
    "code_changes": Entity(...),
}

# Agent A updates entity
agent_a.execute(update_entity("research_results"))

# Agent B reads updated entity
ctx_b = load(ctx_b, world.entity_store["research_results"])

Use Case: Multi-agent collaboration with shared knowledge.

Question: How to handle conflicts? Consistency guarantees?

10.7 Speculative Entity Loading

Idea: Preemptively load entities the LLM might need.

# Predict future entity needs
predicted_entities = predict_next_entities(
    ctx.current_state,
    llm.recent_actions
)

# Load them in background (if context space available)
for e in predicted_entities:
    if ctx.has_space():
        ctx = load(ctx, e, verbosity="summary")

Benefit: Reduced latency for multi-turn interactions.

Question: How to predict accurately without wasting context?

10.8 Differential Context Updates

Idea: Instead of reloading full context, send only diffs.

# Current approach: send full context every turn
llm(full_context) → action

# Differential approach:
llm.update(
    add=[new_entity],
    remove=[old_entity_id],
    modify=[entity_id, new_content]
)

Benefit: Reduced token usage, faster inference.

Challenge: Requires stateful LLM API (not common yet).

10.9 Entity Versioning

Idea: Track entity changes over time.

Entity.versions = [
    (timestamp=0, content="Initial user query"),
    (timestamp=5, content="Elaborated with clarifications"),
    (timestamp=10, content="Further refined based on context"),
]

# Load appropriate version based on temporal context
ctx = load(ctx, entity.version_at(time=5))

Use Case: Understanding how agent's understanding evolved. Debugging. Time-travel debugging.

10.10 Meta-Entities

Idea: Entities that describe other entities.

MetaEntity(
    target=Entity(large_tool_result),
    metadata={
        "summary": "Database query returned 10K rows",
        "schema": {columns: [...], types: [...]},
        "relevance_to_task": 0.85,
        "recommended_verbosity": "digest",
    }
)

Benefit: Richer metadata for smarter loading decisions.

11. Conclusion

The Entity Calculus provides a unified lens for understanding AI agents:

  1. Everything is an entity: Skills, tools, memory, data—all flow through the same abstraction.

  2. Harness manages entity flow: The load and execute functions orchestrate entity movement between context and world.

  3. Context is the bottleneck: All optimizations revolve around the limited context window.

  4. Patterns emerge naturally: Common agent patterns (ReAct, RAG, multi-agent) are special cases of entity flow.

  5. Composability: Because everything is an entity, components compose cleanly.

Key Insights

  • Agent = LLM + Harness cleanly separates reasoning (LLM) from orchestration (harness).
  • Entity abstraction unifies disparate concepts under one model.
  • Load/Execute duality captures the full agent loop: load entities into context, execute actions in world.
  • Multi-agent systems are recursive applications of the same calculus.

Practical Value

For researchers, this framework provides:

  • Formal vocabulary for discussing agent architectures
  • Basis for systematic analysis of agent behaviors
  • Foundation for developing new context management techniques

For engineers, this framework provides:

  • Clear mental model for designing agents
  • Reusable patterns for common agent tasks
  • Principled approach to context optimization

Next Steps

We invite the community to:

  1. Implement reference harnesses following this calculus
  2. Develop benchmarks for entity loading strategies
  3. Explore the open questions outlined in Section 10
  4. Extend the calculus to new domains (e.g., multimodal agents, embodied agents)

Appendix: Notation Reference

| Symbol | Meaning |
| --- | --- |
| LLM: Context → (Reasoning, Actions) | LLM as pure function |
| Agent = LLM + Harness | Agent decomposition |
| load: (Context, Entity, List[Entity]) → Context' | Harness load function |
| execute: (Action, World) → (Entity, World') | Harness execute function |
| Entity | Unit of information loadable into context |
| Context | LLM's directly accessible information |
| World | External state outside context |
| verbosity ∈ {full, summary, digest, reference} | Entity loading granularity |
| loading ∈ {preloaded, dynamic} | Entity loading strategy |

Document Version: 1.0
Last Updated: January 15, 2026
