Boussaden Taha

Agents

Introduction

This article is my take on AI agents, with a technical deep dive into how they work. I'll share how I built a working AI agent from scratch, decomposing every component and discussing the trade-offs in latency, cost, and reliability along the way.
My goal is to make explicit the deterministic system that wraps a probabilistic core.

Defining an AI Agent

Before building anything, we need to draw hard boundaries between three commonly conflated systems: scripts, chatbots, and agents.

Assumptions

You should be comfortable with JavaScript (async/await, APIs), basic HTTP concepts, and JSON data structures.

Scripts (Deterministic Program)

A script is just a fixed mapping:

    y=f(x)
  • Same input → same output
  • No adaptation
  • No internal state beyond the execution context

For example:
    function classify(input){
      if(input.includes("error")) return "bug";
      return "general";
    }

There is no notion of iteration, decision-making under uncertainty, or external tool usage.

Chatbots (Single-Step LLM System)

A chatbot introduces probabilistic behavior:

    y ∼ p(y | x)

The output is sampled from a probability distribution, but this is still a single step: no iterative reasoning loop and no explicit action execution.

For example:

    const response = await llm("Explain recursion simply");

Even with conversation history, this remains a mapping, not a system, with no persistent goal tracking and no structured interaction with the environment.
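To make this concrete, here is a minimal sketch showing that even a history-aware chatbot performs exactly one model call per turn, with no loop and no tools. The `llm` function here is a stub standing in for a real API call; its body is an assumption for illustration only:

```javascript
let callCount = 0;

// Stubbed llm: stands in for a real completion API call (assumption for illustration).
async function llm(messages) {
  callCount++;
  return `reply to: ${messages[messages.length - 1].content}`;
}

// A "chatbot with history" is still a single mapping per turn:
// one prompt in, one completion out. No loop, no actions, no state transitions.
async function chat(history, userInput) {
  const messages = [...history, { role: "user", content: userInput }];
  const reply = await llm(messages);
  return [...messages, { role: "assistant", content: reply }];
}
```

However long the history grows, each turn is one sample from p(y | x): the history only enlarges x, it does not add a loop.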

Agent (Iterative, Stateful System)

An agent is fundamentally different:

    a_t ∼ π(a | s_t),   s_{t+1} = f(s_t, a_t, r_t)
Mathematical Term    Meaning                     Code Representation
s_t                  Current state               state object
a_t                  Chosen action               action JSON
r_t                  Tool execution result       result
π                    Policy (decision model)     llm() function
f                    State transition            updateState()
    async function step(state, memory){
      const action = await policy(state, memory);   // a_t
      const result = await execute(action);         // r_t
      const nextState = updateState(state, result); // s_{t+1}

      return { nextState, action, result };
    }

It's iterative (multi-step execution), stateful (maintaining memory across steps), and action-oriented (interacting with tools and the environment).

For example:

    while(!done){
        const action = decide(state);
        const result = act(action);
        state = update(state, result);
    }

This loop is the defining feature, without it you just have a wrapper around an API.

Overview

I've just begun learning about AI agents myself, and this is my very first small one, tinybot:

    import { createGroq } from '@ai-sdk/groq';
    import { generateText } from 'ai';

    // read the key from the environment instead of hardcoding it
    const groq = createGroq({
      apiKey: process.env.GROQ_API_KEY,
    });

    const model = groq('llama-3.3-70b-versatile');

    const { text } = await generateText({
      model,
      system: 'Answer everything in exactly 3 words.',
      prompt: 'What is the meaning of life?',
    });

    console.log(text);

tinybot response:

    taha@192 tinybot % node tinybot.js                                                               
    Find True Happiness

Why Most AI Agent Tutorials Fall Short

I think most tutorials treat AI agents as black boxes, creating over-abstraction through reliance on frameworks that hide the core mechanics.
Like this:

    const agent = new Agent({...});
    await agent.run();

As a result, many developers cannot debug failures, extend functionality, or reason about performance.

A More Precise View

At its core, an AI agent can be modeled as a discrete-time control system.

At each time step t, the agent:

  • Observes a state s_t
  • Chooses an action a_t
  • Receives a result r_t
  • Transitions to a new state s_{t+1}

We can express this formally:

    s_{t+1} = f(s_t, a_t, r_t)

Where:

  • s_t = current state (input + memory)
  • a_t = action chosen by the agent
  • r_t = result of executing the action
  • f = state transition function
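The transition function f can be sketched directly, assuming the state shape used throughout this article (input, history, memory). This is a minimal illustration, not a complete implementation:

```javascript
// f: (s_t, a_t, r_t) -> s_{t+1}
// A pure transition: returns a new state and never mutates the old one,
// so every step is replayable from its inputs.
function updateState(state, action, result) {
  return {
    ...state,
    history: [...state.history, { action, result }]
  };
}
```

Keeping f pure means the same (state, action, result) triple always yields the same next state, which is exactly what keeps the deterministic half of the system deterministic.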

State Representation (s_t)

State is the most underexplained part of agent systems.
Formally, it is everything the agent conditions on:

    s_t = (x, m_t, h_t)

Where:

  • x = current input
  • m_t = memory (retrieved knowledge)
  • h_t = interaction history

For example:

    const state = {
      input: "Find a good fishing rod under $1000",
      memory: [...retrievedDocs],
      history: [...previousSteps]
    };

Key insight:

  • The LLM never “sees” your system, only the serialized state you provide; bad state design means bad decisions.

Deterministic System, Probabilistic Core

An important distinction: the agent system (loop, tools, memory) is deterministic, while the policy (LLM) is probabilistic.
We can think of the full system as:

    Deterministic Runtime + Probabilistic Policy = AI Agent

Or more formally:

    Agent = Runtime(π, T, M)

Where:

  • π = policy (LLM)
  • T = set of tools
  • M = memory system

Why This Matters

This framing is not academic; it directly impacts how you build systems. If you don’t control s_t, the agent behaves unpredictably. If you don’t constrain a_t, the agent may hallucinate actions. And if f is poorly designed, the system becomes unstable.
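Constraining a_t in practice means validating the model's output before executing it. Here is a minimal sketch of that idea; the tool names and action shape are illustrative assumptions, not a fixed API:

```javascript
// Registered toolset T. These tool names are illustrative assumptions.
const tools = {
  search_products: async (args) => [],
  get_weather: async (args) => ({})
};

// Reject malformed or hallucinated actions before they reach the executor.
function validateAction(action) {
  if (!action || typeof action.action !== "string") {
    return { ok: false, error: "Malformed action" };
  }
  if (!(action.action in tools)) {
    return { ok: false, error: `Unknown tool: ${action.action}` };
  }
  return { ok: true };
}
```

Any action whose name is not in the registered toolset is rejected rather than executed, which turns "the LLM might hallucinate a tool" from a runtime failure into a handled error.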

Stateless vs Stateful Systems

Stateless

Stateless means each decision is independent:

    a_t ∼ π(a | x)
  • No memory
  • No accumulation of knowledge
  • Limited reasoning depth

Stateful

Decisions depend on history:

    a_t ∼ π(a | s_t)
  • Enables multi-step reasoning
  • Allows correction and refinement
  • Introduces complexity (memory growth, noise)

Code Comparison

  • Stateless:

    await llm("Summarize this article");
  • Stateful:
    await llm(buildPrompt({
      input,
      history,
      retrievedMemory
    }));

From Theory to Execution: Full Step Trace

Let’s walk one iteration concretely:

Step 1: Initial state

    state = {
        input: "Find a good fishing rod under $1000",
        history: [],
        memory: []
    };

Step 2: Policy decision

    {
        "action": "search_products",
        "args": { "query": "fishing rod under 1500" }
    }

Step 3: Tool execution

    result = [
      { name: "Rod 1", price: 800 },
      { name: "Rod 2", price: 650 }
    ];

Step 4: Policy decision (corrected query)

    {
        "action": "search_products",
        "args": { "query": "fishing rod under 1000" }
    }

Step 5: State transition

    state = {
        ...state,
        history: [
          {
            action: "search_products",
            result
          }
        ]
    };

Key Takeaways

  • An agent is defined by its loop, not its model
  • State design directly determines decision quality
  • The LLM is just a policy function, not the system itself
  • Determinism is a configuration choice, not a default

Core Architecture

After defining what an agent is, we need to look at the structure of the system. The question we need to answer is:
how do we decompose an agent into components that are modular, testable, and scalable?
The answer is that an agent can be represented as a composition of interacting modules:

    Agent = (π, M, T, E)

Where:

  • π = policy (LLM decision function)
  • M = memory system
  • T = toolset
  • E = execution runtime (loop + orchestration)

Conceptual Architecture

    User Input
        ↓
    State Builder (input + memory + history)
        ↓
    Policy (LLM)
        ↓
    Action (JSON)
        ↓
    Tool Executor
        ↓
    Result
        ↓
    Memory Update
        ↓
    Loop (repeat or terminate)

The way I see it, the conceptual architecture above is more of a feedback system than a pipeline.

Data Flow

  • State → Policy

Serialize state into a prompt

  • Policy → Action

LLM outputs structured decision

  • Action → Tool

System executes external function

  • Tool → Result

Returns data to agent

  • Result → State Update

Incorporated into next iteration

Concrete representation:

    async function agentStep(state, memory){
      const prompt = buildPrompt(state, memory);

      const action = await llm(prompt);       // π(s_t)
      const parsed = parseAction(action);     // structured a_t

      const result = await execute(parsed);   // T(a_t)

      const nextState = updateState(state, parsed, result); // f(...)

      return { nextState, parsed, result };
    }

Serialization Boundary

A serialization boundary is the checkpoint where the agent "packs its bags": for its state to travel across a network or wait in storage, it needs to take a formal format such as JSON, YAML, or TOON.

At the end of the day, the key point to remember is that the LLM cannot operate on objects; it operates on text.

So we define a serialization function:

    function buildPrompt(state) {
      return `
        You are an agent.

        User goal:
        ${state.input}

        History:
        ${JSON.stringify(state.history)}

        Available tools:
        ${JSON.stringify(toolSchemas)}
      `;
    }
Final verdict: the serialization function is the encoding half of the process; the decoding half happens inside the LLM's "brain" when it parses your prompt to understand the context.

Memory Systems

Without memory, the agent reduces to a stateless function. Memory is what turns an agent from a reactive loop into a system capable of contextual reasoning and personalization.

Short Term Memory

Short term memory is what you pass directly into the model.

Implementation

    const history = [
      {
        action: { name: "search_products", args: { query: "fishing rod" } },
        result: [{ name: "Rod 1", price: 850 }]
      }
    ];

Injecting into prompt

    function buildPrompt(state) {
      return `
        User goal:
        ${state.input}

        History:
        ${JSON.stringify(state.history, null, 2)}
      `;
    }
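Since history grows on every step (the memory-growth complexity noted earlier), a common mitigation is a sliding window: keep only the last N entries when serializing. A minimal sketch, assuming the history shape above:

```javascript
// Keep only the most recent `maxEntries` steps when building the prompt.
// Older steps are simply dropped here; a fancier version could summarize them.
function truncateHistory(history, maxEntries = 5) {
  return history.slice(-maxEntries);
}
```

You would call this inside buildPrompt, e.g. `JSON.stringify(truncateHistory(state.history))`, to bound prompt size and cost per step.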

Long Term Memory

Short-term memory is insufficient when the agent needs to remember large documents, user preferences, and cross-session knowledge; this is where persistent memory comes in.

Storage Options

  • Database (PostgreSQL, MongoDB)
  • Vector database (for semantic search)
  • File based storage (simple file systems)

For example:

    await db.insert({
      userId: "123",
      text: "User prefers Scorpion fishing rods",
      createdAt: Date.now()
    });
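Retrieval is the other half of long-term memory: before each step, pull back the stored entries relevant to the current input. A real system would use embeddings and a vector database; as a sketch of the idea only, here is naive keyword scoring over an in-memory store (all names are illustrative assumptions):

```javascript
// Naive keyword retrieval over an in-memory store.
// A production system would use embeddings + a vector DB for semantic search.
function retrieve(store, query, topK = 2) {
  const words = query.toLowerCase().split(/\s+/);
  return store
    .map(entry => ({
      entry,
      // score = how many query words appear in the stored text
      score: words.filter(w => entry.text.toLowerCase().includes(w)).length
    }))
    .filter(({ score }) => score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map(({ entry }) => entry);
}
```

The retrieved entries become m_t in the state, so the policy conditions on them in the next prompt.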

Tooling and Action Execution

Memory allows an agent to think with context, but tools allow it to act on the world.
With tools, it becomes an interactive system capable of retrieving data, triggering workflows, and producing side effects; without them, an agent is limited to text generation.

What Makes a “Tool”

A tool is any callable function that:

  • Accepts structured input
  • Performs an operation (internal or external)
  • Returns a result to the agent

Examples of Tools

  • API calls (weather, search, payments)
  • File system operations
  • Computation utilities

For example:

    const tools = {
      getWeather: async ({ city }) => {
        const res = await fetch(`https://api.weather.com/${city}`);
        return res.json();
      }
    };

Bear in mind that for tools, timeouts matter a lot; without constraints, latency can grow endlessly.
One slow tool can block the entire agent loop.
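One way to enforce this, assuming plain promises and no particular library, is to race each tool call against a timer so a hung tool fails fast instead of stalling the loop:

```javascript
// Race a tool call against a timer; whichever settles first wins.
// The timer is cleared either way so it doesn't keep the process alive.
function withTimeout(promise, ms, label = "tool") {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms}ms`)),
      ms
    );
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

Usage in the executor might look like `await withTimeout(tools.getWeather({ city }), 5000, "getWeather")`, turning a hung HTTP call into a normal error the loop can record in history and recover from.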

Core Runtime

The center of the entire system is the agent loop. Everything we’ve built so far, from the policy and memory to the tools, only becomes meaningful when orchestrated through a controlled execution loop.

Minimal loop

    async function runAgent(input) {
      let state = {
        input,
        history: [],
        memory: []
      };

      for (let step = 0; step < 10; step++) {
        const action = await policy(state);
        const result = await execute(action);

        state = updateState(state, action, result);

        if (isDone(state, action)) break;
      }

      return state.output; // assumes updateState records the final answer on state.output
    }

Termination Conditions

Without termination logic, the loop is unbounded.

Practical Conditions

1. Explicit Final Action
    if (action.type === "final"){
      return action.output;
    }

2. Max Step Limit
    if (step >= MAX_STEPS){
      throw new Error("Max steps exceeded");
    }

3. Heuristic Completion
    function isDone(state){
      return state.history.length > 0 &&
             state.history[state.history.length - 1].action.type === "final";
    }

Why This Matters

Without termination, we will have:

  • Infinite loops
  • Unbounded cost
  • API rate issues

Conclusion

This article walked through what an AI agent actually looks like under the hood, from the control loop to memory and tools, with small, minimal JavaScript implementations. Keep in mind that this is not a deep or complete system, just a minimal educational one; it's basically what I learned while exploring AI agents, and there’s still a lot missing.
If you’re trying to learn this too, my advice is: don’t start with frameworks. Try to build a small agent yourself; even a basic version will force you to understand a lot, and that’s where the real learning happens.
