Introduction
This article is my point of view on AI agents, with a technical deep dive into how they work. I'll share how I built a working AI agent from scratch, decomposing every component and discussing the trade-offs in latency, cost, and reliability along the way.
My goal is to make explicit the deterministic system that's wrapped around a probabilistic core.
Defining an AI Agent
Before building anything, we need to draw hard boundaries between three commonly conflated systems: scripts, chatbots, and agents.
Assumptions
You should be comfortable with JavaScript (async/await, APIs), basic HTTP concepts, and JSON data structures.
Scripts (Deterministic Program)
A script is just a fixed mapping:
y=f(x)
- Same input → same output
- No adaptation
- No internal state beyond the execution context
for example:
function classify(input) {
  if (input.includes("error")) return "bug";
  return "general";
}
There is no notion of iteration, decision-making under uncertainty, or external tool usage.
Chatbots (Single-Step LLM System)
A chatbot introduces probabilistic behavior:
y∼p(y∣x)
The output is sampled from a probability distribution, but it is still a single step: there is no iterative reasoning loop and no explicit action execution.
for example:
const response = await llm("Explain recursion simply");
Even with conversation history, this remains a mapping, not a system: there is no persistent goal tracking and no structured interaction with the environment.
Agent (Iterative, Stateful System)
An agent is fundamentally different:
at∼π(a∣st),st+1=f(st,at,rt)
| Mathematical Term | Meaning | Code Representation |
|---|---|---|
| st | Current state | state object |
| at | Chosen action | action JSON |
| rt | Tool execution result | result |
| π | Policy (decision model) | llm() function |
| f | State transition | updateState() |
async function step(state, memory){
  const action = await policy(state, memory); // at
  const result = await execute(action);       // rt
  const nextState = updateState(state, result); // s_{t+1}
  return { nextState, action, result };
}
It's iterative (multi-step execution), stateful (maintaining memory across steps), and action-oriented (interacting with tools and the environment).
for example:
while (!done) {
  const action = decide(state);
  const result = act(action);
  state = update(state, result);
}
This loop is the defining feature; without it, you just have a wrapper around an API.
Overview
I've only just begun learning about AI agents myself, and this is my very first small one, tinybot:
import { createGroq } from '@ai-sdk/groq';
import { generateText } from 'ai';

const groq = createGroq({
  apiKey: process.env.GROQ_API_KEY, // never hardcode your API key
});

const model = groq('llama-3.3-70b-versatile');

const { text } = await generateText({
  model,
  system: 'Answer everything in exactly 3 words.',
  prompt: 'What is the meaning of life?',
});

console.log(text);
tinybot response:
taha@192 tinybot % node tinybot.js
Find True Happiness
Why Most AI Agent Tutorials Fall Short
I think most tutorials treat AI agents as black boxes, creating over-abstraction through reliance on frameworks that hide the core mechanics.
Like this:
const agent = new Agent({...});
await agent.run();
As a result, many developers cannot debug failures, extend functionality, or reason about performance.
A More Precise View
At its core, an AI agent can be modeled as a discrete-time control system.
At each time step t, the agent:
- Observes a state st
- Chooses an action at
- Receives a result rt
- Transitions to a new state st+1
We can express this formally:
st+1=f(st,at,rt)
Where:
- st = current state (input + memory)
- at = action chosen by the agent
- rt = result of executing the action
- f = state transition function
State Representation (st)
State is the most underexplained part of agent systems.
Formally, it is everything the agent conditions on:
st=(x,mt,ht)
Where:
- x = current input
- mt= memory (retrieved knowledge)
- ht= interaction history
example:
const state = {
input: "Find a good fishing rod under $1000",
memory: [...retrievedDocs],
history: [...previousSteps]
};
Key insight:
- The LLM never “sees” your system, only the serialized state you provide.
- Bad state design = bad decisions.
Deterministic System, Probabilistic Core
An important distinction: the agent system (loop, tools, memory) is deterministic, while the policy (LLM) is probabilistic.
We can think of the full system as:
Deterministic Runtime + Probabilistic Policy = AI Agent
Or more formally:
Agent=Runtime(π,T,M)
Where:
- π = policy (LLM)
- T = set of tools
- M = memory system
Why This Matters
This framing is not academic; it directly impacts how you build systems:
- If you don’t control st, the agent behaves unpredictably
- If you don’t constrain at, the agent may hallucinate actions
- If f is poorly designed, the system becomes unstable
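One concrete way to constrain at is to validate every decision against a whitelist of tool schemas before executing it. This is only a sketch; `toolSchemas` and `validateAction` are hypothetical names, not part of any framework:

```javascript
// A minimal sketch of constraining the action space (a_t).
// `toolSchemas` is an assumed registry of allowed actions and their arguments.
const toolSchemas = {
  search_products: { required: ["query"] },
  final: { required: ["output"] }
};

function validateAction(action) {
  const schema = toolSchemas[action.action];
  if (!schema) {
    throw new Error(`Unknown action: ${action.action}`);
  }
  for (const key of schema.required) {
    if (!(key in (action.args ?? {}))) {
      throw new Error(`Missing argument "${key}" for ${action.action}`);
    }
  }
  return action;
}
```

Rejecting unknown or malformed actions at this boundary is what keeps a hallucinated tool call from ever reaching the executor.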
Stateless vs Stateful Systems
Stateless
Stateless means each decision is independent:
at∼π(a∣x)
- No memory
- No accumulation of knowledge
- Limited reasoning depth
Stateful
Decisions depend on history:
at∼π(a∣st)
- Enables multi-step reasoning
- Allows correction and refinement
- Introduces complexity (memory growth, noise)
Code Comparison
- Stateless:
await llm("Summarize this article");
- Stateful:
await llm(buildPrompt({
input,
history,
retrievedMemory
}));
From Theory to Execution: Full Step Trace
Let’s walk one iteration concretely:
Step 1: Initial state
state = {
input: "Find a good fishing rod under $1000",
history: [],
memory: []
};
Step 2: Policy decision
{
"action": "search_products",
"args": { "query": "fishing rod under 1500" }
}
Step 3: Tool execution
result = [
{ name: "Rod 1", price: 800 },
{ name: "Rod 2", price: 650 }
];
Step 4: Policy decision (the policy corrects its query to match the $1000 budget)
{
"action": "search_products",
"args": { "query": "fishing rod under 1000" }
}
Step 5: State transition
state = {
...state,
history: [
{
action: "search_products",
result
}
]
};
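The transition in Step 5 can be written as a small pure function implementing f(st, at, rt); a minimal sketch consistent with the trace above:

```javascript
// f(s_t, a_t, r_t): append the step to history without mutating the old state.
function updateState(state, action, result) {
  return {
    ...state,
    history: [...state.history, { action, result }]
  };
}
```

Returning a fresh object keeps every intermediate st inspectable, which makes debugging traces like this one much easier.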
Key Takeaways
- An agent is defined by its loop, not its model
- State design directly determines decision quality
- The LLM is just a policy function, not the system itself
- Determinism is a configuration choice, not a default
Core Architecture
After defining what an agent is, we need to look at the structure of the system. The question we need to answer is:
how do we decompose an agent into components that are modular, testable, and scalable?
The answer is that an agent can be represented as a composition of interacting modules:
Agent=(π,M,T,E)
Where:
- π = policy (LLM decision function)
- M = memory system
- T = toolset
- E = execution runtime (loop + orchestration)
Conceptual Architecture
User Input
↓
State Builder (input + memory + history)
↓
Policy (LLM)
↓
Action (JSON)
↓
Tool Executor
↓
Result
↓
Memory Update
↓
Loop (repeat or terminate)
The way I see it, the conceptual architecture above is more of a feedback system than a pipeline.
Data Flow
- State → Policy
Serialize state into a prompt
- Policy → Action
LLM outputs structured decision
- Action → Tool
System executes external function
- Tool → Result
Returns data to agent
- Result → State Update
Incorporated into next iteration
Concrete representation:
async function agentStep(state, memory){
const prompt = buildPrompt(state, memory);
const action = await llm(prompt); // π(st)
const parsed = parseAction(action); // structured at
const result = await execute(parsed); // T(at)
const nextState = updateState(state, parsed, result); // f(...)
return { nextState, parsed, result };
}
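`parseAction` is used above but never shown. Here is one hedged way to implement it, assuming the model returns a JSON object somewhere in its raw text (production systems would prefer a structured-output mode instead):

```javascript
// Extract and validate the first JSON-looking span in raw LLM output.
// Naive sketch: grabs from the first "{" to the last "}".
function parseAction(text) {
  const match = text.match(/\{[\s\S]*\}/);
  if (!match) throw new Error("No JSON action found in model output");
  const parsed = JSON.parse(match[0]);
  if (typeof parsed.action !== "string") {
    throw new Error('Parsed object is missing an "action" field');
  }
  return parsed;
}
```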
Serialization Boundary
A serialization boundary is the checkpoint where an agent "packs its bags": for state to travel across a network or wait in storage, it needs to take a formal format, like JSON, YAML, or TOON.
At the end of the day, the key point to remember is that the LLM cannot operate on objects; it operates on text.
So we define a serialization function:
function buildPrompt(state) {
return `
You are an agent.
User goal:
${state.input}
History:
${JSON.stringify(state.history)}
Available tools:
${JSON.stringify(toolSchemas)}
`;
}
Final verdict: the serialization function is the encoding half of the process; the decoding half happens inside the LLM's "brain" when it parses your prompt to understand the context.
Memory Systems
Without memory, an agent reduces to a stateless function. Memory is what turns an agent from a reactive loop into a system capable of contextual reasoning and personalization.
Short Term Memory
Short term memory is what you pass directly into the model.
Implementation
const history = [
{
action: { name: "search_products", args: { query: "fishing rod" } },
result: [{ name: "Rod 1", price: 850 }]
}
];
Injecting into prompt
function buildPrompt(state) {
return `
User goal:
${state.input}
History:
${JSON.stringify(state.history, null, 2)}
`;
}
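Because short term memory is re-serialized into every prompt, it grows with each step. A simple sliding window keeps the prompt bounded; `MAX_HISTORY` is an assumed tuning knob, and real systems might summarize older steps instead of dropping them:

```javascript
// Keep only the most recent steps (naive truncation).
const MAX_HISTORY = 5;

function pruneHistory(history) {
  return history.slice(-MAX_HISTORY);
}
```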
Long Term Memory
Short term memory is insufficient when the agent needs to remember large documents, user preferences, or cross-session knowledge; this is where persistent memory comes in.
Storage Options
- Database (PostgreSQL, MongoDB)
- Vector database (for semantic search)
- File based storage (simple file systems)
for example:
await db.insert({
userId: "123",
text: "User prefers Scorpion fishing rods",
createdAt: Date.now()
});
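Storage is only half of long term memory; retrieval is the other half. A naive keyword match is sketched below, with `memories` as an assumed in-memory stand-in for a real database (a vector database would match by semantic similarity instead of substring overlap):

```javascript
// Naive keyword retrieval over stored memories.
const memories = [
  { userId: "123", text: "User prefers Scorpion fishing rods" },
  { userId: "123", text: "User budget is usually under $1000" }
];

function retrieve(query) {
  const terms = query.toLowerCase().split(/\s+/);
  return memories.filter(m =>
    terms.some(t => m.text.toLowerCase().includes(t))
  );
}
```

Whatever `retrieve` returns is what gets injected into `state.memory` and, eventually, the prompt.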
Tooling and Action Execution
Memory allows an agent to think with context but tools allow an agent to act on the world.
With tools, it becomes an interactive system capable of retrieving data, triggering workflows, and producing side effects; and without these tools, an agent is limited to text generation.
What Makes a “Tool”
A tool is any callable function that:
- Accepts structured input
- Performs an operation (internal or external)
- Returns a result to the agent
Examples of Tools
- API calls (weather, search, payments)
- File system operations
- Computation utilities
for example:
const tools = {
getWeather: async ({ city }) => {
const res = await fetch(`https://api.weather.com/${city}`);
return res.json();
}
};
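Dispatching an action to a tool can then be a simple lookup into that registry. Sketched here with a toy `add` tool so the example stays self-contained; in practice the registry would hold entries like getWeather above:

```javascript
// Generic executor: dispatch a structured action to the matching tool.
const tools = {
  add: async ({ a, b }) => a + b // toy tool, stands in for real API calls
};

async function execute(action) {
  const tool = tools[action.name];
  if (!tool) throw new Error(`Unknown tool: ${action.name}`);
  return tool(action.args);
}
```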
Bear in mind that for tools, timeouts matter a lot: without constraints, latency can grow endlessly, and one slow tool can block the entire agent loop.
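One hedged way to enforce that constraint is wrapping every tool call in a timeout via Promise.race; `TIMEOUT_MS` is an assumed default you would tune per tool:

```javascript
// Reject any tool promise that takes longer than `ms` milliseconds.
const TIMEOUT_MS = 5000;

function withTimeout(promise, ms = TIMEOUT_MS) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Tool timed out after ${ms}ms`)), ms);
  });
  // Clear the timer either way so a won race doesn't leave a pending rejection.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

Usage would look like `await withTimeout(tools.getWeather({ city }), 3000)`.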
Core Runtime
The center of the entire system is the agent loop. Everything we’ve built so far, from policy and memory to tools, only becomes meaningful when orchestrated through a controlled execution loop.
Minimal loop
async function runAgent(input) {
let state = {
input,
history: [],
memory: []
};
for (let step = 0; step < 10; step++) {
const action = await policy(state);
const result = await execute(action);
state = updateState(state, action, result);
if (isDone(state, action)) break;
}
return state.output;
}
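To see the control flow of that loop without an LLM, here is the same runtime driven by a scripted mock policy. Every name here (mockPolicy, mockExecute, runMockAgent) is hypothetical, purely to make the loop observable:

```javascript
// Deterministic stand-in for the LLM policy: search once, then finish.
function mockPolicy(state) {
  if (state.history.length === 0) {
    return { type: "search_products", args: { query: "fishing rod" } };
  }
  return { type: "final", output: "Rod 1 looks like the best fit." };
}

// Deterministic stand-in for tool execution.
function mockExecute(action) {
  if (action.type === "search_products") {
    return [{ name: "Rod 1", price: 800 }];
  }
  return null;
}

function runMockAgent(input) {
  let state = { input, history: [], output: null };
  for (let step = 0; step < 10; step++) {
    const action = mockPolicy(state);
    if (action.type === "final") {
      state.output = action.output; // explicit final action terminates the loop
      break;
    }
    const result = mockExecute(action);
    state = { ...state, history: [...state.history, { action, result }] };
  }
  return state;
}
```

Swapping mockPolicy for a real LLM call is the only change needed to go from this toy to the loop above.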
Termination Conditions
Without termination logic, the loop is unbounded.
Practical Conditions
1. Explicit Final Action
if (action.type === "final"){
return action.output;
}
2. Max Step Limit
if (step >= MAX_STEPS){
throw new Error("Max steps exceeded");
}
3. Heuristic Completion
function isDone(state){
return state.history.length > 0 &&
state.history[state.history.length - 1].action.type === "final";
}
Why This Matters
Without termination, we will have:
- Infinite loops
- Unbounded cost
- API rate issues
Conclusion
This article walked through what an AI agent actually looks like under the hood, from the control loop to memory and tools, with a small, minimal JavaScript implementation. Keep in mind that this is not a deep or complete system, just a minimal, educational one: basically what I learned while exploring AI agents, and there’s still a lot missing.
If you’re trying to learn this too, my advice is: don’t start with frameworks; just try to build a small agent yourself. Even a basic version will force you to understand a lot, and that’s where the real learning happens.