Ayi NEDJIMI

Posted on May 22

Building an AI agent with tool use in Python (from scratch, no framework)

#ai #tutorial #agents #python

LangChain, AutoGen, CrewAI — the framework ecosystem for AI agents is crowded. Most tutorials jump straight into one of these, which is fine for getting something running fast. It is not fine for understanding what is actually happening.

This tutorial builds a minimal ReAct-style agent from scratch: no framework dependencies, no magic, ~150 lines of Python. Once you have built it, you will understand exactly what any framework is abstracting — and when that abstraction is worth its cost.

What is a ReAct agent?

ReAct (Reason + Act) is a prompting pattern where the model alternates between:

Thinking — reasoning about the current state and what to do next
Acting — calling a tool and observing the result
Repeating — until the task is complete or a step limit is hit

The loop looks like:

Thought: I need to know the current time to answer this.
Action: get_current_time({})
Observation: 2025-11-14T09:32:00Z
Thought: Now I can answer.
Final Answer: It is 9:32 AM UTC on November 14, 2025.

The key insight: the model is not "executing" anything. It is generating text that describes what it wants to do. Your code parses that text, runs the actual tool, and feeds the result back as context.
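To make that division of labour concrete, here is a minimal sketch of one parse-and-execute step for the text format shown above. The regex, the `TOOLS_DEMO` table, and the hard-coded timestamp are illustrative assumptions; the agent built in this tutorial uses JSON rather than `Action:` lines.

```python
import json
import re

# Hypothetical one-tool table; the timestamp is hard-coded so the example is deterministic.
TOOLS_DEMO = {"get_current_time": lambda: "2025-11-14T09:32:00Z"}

# Matches lines such as: Action: get_current_time({})
ACTION_RE = re.compile(r"^Action:\s*(\w+)\((.*)\)\s*$", re.MULTILINE)

def run_one_step(model_text: str) -> str:
    """Parse an 'Action:' line from model output, run the tool, return an Observation line."""
    match = ACTION_RE.search(model_text)
    if match is None:
        return "Observation: no action found"
    name, raw_args = match.group(1), match.group(2).strip() or "{}"
    if name not in TOOLS_DEMO:
        return f"Observation: unknown tool '{name}'"
    try:
        kwargs = json.loads(raw_args)  # the arguments are only text until we parse them
    except json.JSONDecodeError:
        kwargs = None
    if not isinstance(kwargs, dict):
        return f"Observation: could not parse arguments for '{name}'"
    return f"Observation: {TOOLS_DEMO[name](**kwargs)}"

model_text = "Thought: I need to know the current time to answer this.\nAction: get_current_time({})"
print(run_one_step(model_text))  # Observation: 2025-11-14T09:32:00Z
```

The returned string is what gets appended to the conversation before the next model call.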

Setup

pip install openai

import os
import json
import re
import math
import datetime
from typing import Any, Callable
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

Step 1: Define the tools

Each tool is a plain Python function with a schema describing its interface. The schema is what the model sees; the function is what your code calls.

def web_search(query: str) -> str:
    """Mock web search — replace with a real search API in production."""
    results = {
        "python asyncio tutorial": "asyncio is Python's built-in library for writing concurrent code...",
        "latest python version": "Python 3.13 was released in October 2024...",
        "what is a buffer overflow": "A buffer overflow occurs when a program writes more data to a buffer...",
    }
    for key in results:
        if key.lower() in query.lower():
            return results[key]
    return f"No results found for '{query}'. Try a more specific query."

def calculator(expression: str) -> str:
    """Evaluate a safe mathematical expression."""
    # Allow only safe characters
    if not re.match(r'^[\d\s\+\-\*\/\.\(\)\%\^]+$', expression):
        return "Error: expression contains invalid characters"
    try:
        # Replace ^ with ** for Python exponentiation
        safe_expr = expression.replace("^", "**")
        result = eval(safe_expr, {"__builtins__": {}}, {"math": math})
        return str(result)
    except Exception as e:
        return f"Error evaluating expression: {e}"

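def get_current_time(timezone: str = "UTC") -> str:
    """Return the current date and time in UTC (the timezone argument is accepted but ignored)."""
    # datetime.utcnow() is deprecated since Python 3.12; use a timezone-aware datetime instead
    now = datetime.datetime.now(datetime.timezone.utc)
    return now.strftime("%Y-%m-%dT%H:%M:%SZ") + " (UTC)"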

# Tool registry: maps tool name → (function, schema)
TOOLS: dict[str, tuple[Callable, dict]] = {
    "web_search": (
        web_search,
        {
            "name": "web_search",
            "description": "Search the web for information. Use for factual questions or recent events.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The search query"
                    }
                },
                "required": ["query"]
            }
        }
    ),
    "calculator": (
        calculator,
        {
            "name": "calculator",
            "description": "Evaluate a mathematical expression. Supports +, -, *, /, %, ^.",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {
                        "type": "string",
                        "description": "Mathematical expression to evaluate, e.g. '2 * (3 + 4)'"
                    }
                },
                "required": ["expression"]
            }
        }
    ),
    "get_current_time": (
        get_current_time,
        {
            "name": "get_current_time",
            "description": "Get the current date and time in UTC.",
            "parameters": {
                "type": "object",
                "properties": {
                    "timezone": {
                        "type": "string",
                        "description": "Timezone name (currently only UTC is supported)",
                        "default": "UTC"
                    }
                },
                "required": []
            }
        }
    )
}
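The calculator's character whitelist keeps names out of eval, but an input such as 9^9^9^9 still passes the regex and can tie up the process. One possible alternative, sketched here under assumptions, is to parse the expression and walk the syntax tree with an explicit operator whitelist. The names `safe_calculator`, `MAX_EXPONENT`, and `MAX_BASE` are hypothetical, and the caps are arbitrary values rather than a complete resource limit.

```python
import ast
import operator

# Whitelisted operators; anything else in the parsed tree is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}
MAX_EXPONENT = 100     # arbitrary cap (an assumption) on the right-hand side of a power
MAX_BASE = 1_000_000   # arbitrary cap (an assumption) on the left-hand side of a power

def _eval_node(node: ast.AST):
    """Recursively evaluate a node, allowing only numbers and whitelisted operators."""
    if isinstance(node, ast.Constant):
        if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
            return node.value
    elif isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        left, right = _eval_node(node.left), _eval_node(node.right)
        if isinstance(node.op, ast.Pow) and (abs(right) > MAX_EXPONENT or abs(left) > MAX_BASE):
            raise ValueError("power operands too large")
        return _BIN_OPS[type(node.op)](left, right)
    elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval_node(node.operand))
    raise ValueError(f"unsupported expression element: {type(node).__name__}")

def safe_calculator(expression: str) -> str:
    """Evaluate an arithmetic expression without calling eval()."""
    try:
        tree = ast.parse(expression.replace("^", "**"), mode="eval")
        return str(_eval_node(tree.body))
    except (SyntaxError, ValueError, ZeroDivisionError, OverflowError) as e:
        return f"Error evaluating expression: {e}"

print(safe_calculator("2 * (3 + 4)"))  # 14
print(safe_calculator("9^9^9^9"))      # Error evaluating expression: power operands too large
```

Because the function has the same signature as calculator, it could be swapped into the registry without changing the schema.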

Step 2: The tool dispatcher

The dispatcher takes the model's tool call request, validates it, runs the function, and returns the result as a string.

def dispatch_tool(name: str, arguments: dict) -> str:
    if name not in TOOLS:
        return f"Error: unknown tool '{name}'. Available tools: {', '.join(TOOLS.keys())}"

    func, schema = TOOLS[name]

    # Validate required parameters
    required = schema["parameters"].get("required", [])
    missing = [r for r in required if r not in arguments]
    if missing:
        return f"Error: missing required parameters: {', '.join(missing)}"

    try:
        result = func(**arguments)
        return str(result)
    except TypeError as e:
        return f"Error calling {name}: {e}"
    except Exception as e:
        return f"Unexpected error in {name}: {e}"
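The dispatcher above checks only that required names are present. A slightly stricter check, sketched below, also rejects unknown parameters and mismatched primitive types before the function is called. `validate_arguments` and the `_JSON_TYPES` mapping are hypothetical helpers, and this is far from a full JSON Schema validator.

```python
# Maps JSON Schema primitive type names to Python types (a deliberately partial mapping).
_JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}

def validate_arguments(schema: dict, arguments: object) -> list[str]:
    """Return a list of problems; an empty list means the arguments look usable."""
    if not isinstance(arguments, dict):
        return ["arguments must be a JSON object"]
    params = schema.get("parameters", {})
    properties = params.get("properties", {})
    problems = []
    for name in params.get("required", []):
        if name not in arguments:
            problems.append(f"missing required parameter '{name}'")
    for name, value in arguments.items():
        if name not in properties:
            problems.append(f"unknown parameter '{name}'")
            continue
        expected = _JSON_TYPES.get(properties[name].get("type"))
        # bool is a subclass of int in Python, so reject it explicitly for numeric types
        is_bool_as_number = isinstance(value, bool) and expected in (int, (int, float))
        if expected is not None and (not isinstance(value, expected) or is_bool_as_number):
            problems.append(f"parameter '{name}' should be of type {properties[name]['type']}")
    return problems

# Same shape as the web_search schema in the registry.
search_schema = {
    "name": "web_search",
    "description": "Search the web for information.",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string", "description": "The search query"}},
        "required": ["query"],
    },
}

print(validate_arguments(search_schema, {"query": 42}))
# ["parameter 'query' should be of type string"]
```

Returning the problems as text matters here: the message goes back to the model as an observation, which gives it a chance to correct the call on the next iteration.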

Step 3: The agent system prompt

The system prompt teaches the model the ReAct format and tells it about available tools.

def build_system_prompt() -> str:
    tool_descriptions = "\n".join(
        f"- {name}: {schema['description']}"
        for name, (_, schema) in TOOLS.items()
    )

    return f"""You are a helpful assistant with access to the following tools:

{tool_descriptions}

To use a tool, respond with a JSON object in this exact format:
{{"action": "tool_name", "arguments": {{"param": "value"}}}}

When you have gathered enough information and are ready to give the final answer,
respond with:
{{"action": "final_answer", "answer": "your complete answer here"}}

Think step by step. Use tools when you need external information or computation.
Only use one tool per response. After observing the result, decide whether to use
another tool or provide the final answer.
"""

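One limitation of this prompt is that it lists each tool's name and description but not its parameter names, so the model has to guess keys such as expression or query. A possible refinement is to render the parameters from the schema as well. The sketch below assumes a hypothetical helper named `describe_tool`; the output layout is one choice among many.

```python
def describe_tool(schema: dict) -> str:
    """Render one tool as prompt text, including its parameter names and types."""
    params = schema.get("parameters", {})
    required = set(params.get("required", []))
    parts = []
    for name, spec in params.get("properties", {}).items():
        flag = "required" if name in required else "optional"
        parts.append(f"{name} ({spec.get('type', 'any')}, {flag}): {spec.get('description', '')}")
    signature = "; ".join(parts) if parts else "no parameters"
    return f"- {schema['name']}: {schema['description']}\n  Parameters: {signature}"

# Same shape as the calculator schema in the registry.
calculator_schema = {
    "name": "calculator",
    "description": "Evaluate a mathematical expression. Supports +, -, *, /, %, ^.",
    "parameters": {
        "type": "object",
        "properties": {
            "expression": {
                "type": "string",
                "description": "Mathematical expression to evaluate, e.g. '2 * (3 + 4)'",
            }
        },
        "required": ["expression"],
    },
}

print(describe_tool(calculator_schema))
```

Joining `describe_tool(schema)` over the registry would replace the one-line descriptions in build_system_prompt.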
Step 4: The agent loop

This is the heart of the agent. It runs the think/act/observe cycle with a maximum iteration guard.

class AgentResult:
    def __init__(self, answer: str, steps: list[dict], iterations: int):
        self.answer = answer
        self.steps = steps
        self.iterations = iterations

def run_agent(user_query: str, max_iterations: int = 8, verbose: bool = True) -> AgentResult:
    messages = [
        {"role": "system", "content": build_system_prompt()},
        {"role": "user", "content": user_query}
    ]

    steps = []

    for iteration in range(1, max_iterations + 1):
        if verbose:
            print(f"\n--- Iteration {iteration} ---")

        # Call the model
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            temperature=0
        )

        # message.content can be None (for example on a refusal), so fall back to an empty string
        content = (response.choices[0].message.content or "").strip()

        if verbose:
            print(f"Model: {content}")

        # Parse the model's response
        try:
            # Strip markdown code fences if present
            clean = re.sub(r'^```(?:json)?\n?', '', content)
            clean = re.sub(r'\n?```$', '', clean)
            parsed = json.loads(clean)
        except json.JSONDecodeError:
            parsed = None

        if not isinstance(parsed, dict):
            # Model did not return a JSON object; treat the raw text as the final answer
            return AgentResult(
                answer=content,
                steps=steps,
                iterations=iteration
            )

        action = parsed.get("action")

        # Check for final answer
        if action == "final_answer":
            answer = parsed.get("answer", content)
            steps.append({"iteration": iteration, "action": "final_answer", "result": answer})
            return AgentResult(answer=answer, steps=steps, iterations=iteration)

        # Tool call
        if action not in TOOLS:
            observation = f"Error: '{action}' is not a valid action. Use a tool name or 'final_answer'."
        else:
            arguments = parsed.get("arguments", {})
            observation = dispatch_tool(action, arguments)

            if verbose:
                print(f"Tool '{action}' returned: {observation}")

        steps.append({
            "iteration": iteration,
            "action": action,
            "arguments": parsed.get("arguments", {}),
            "observation": observation
        })

        # Add model response and observation to conversation
        messages.append({"role": "assistant", "content": content})
        messages.append({
            "role": "user",
            "content": f"Observation: {observation}\n\nContinue."
        })

    # Exceeded max iterations
    return AgentResult(
        answer="I could not complete this task within the iteration limit.",
        steps=steps,
        iterations=max_iterations
    )
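The loop treats any reply that is not pure JSON as a final answer, so a reply like "Sure, here is the call: {...}" ends the run early. One way to be more tolerant, sketched here, is to scan the reply for the first embedded JSON object with `json.JSONDecoder.raw_decode`. The helper name `extract_json_object` is hypothetical.

```python
import json

def extract_json_object(text: str):
    """Return the first JSON object embedded in text, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores trailing text
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        start = text.find("{", start + 1)
    return None

reply = 'Sure, here is the call: {"action": "calculator", "arguments": {"expression": "2+2"}} Let me know.'
print(extract_json_object(reply))
# {'action': 'calculator', 'arguments': {'expression': '2+2'}}
```

In run_agent, a None result would take the place of the JSONDecodeError branch.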

Step 5: Running the agent

def main():
    queries = [
        "What is 15% of 847, and then multiply that by 3.14?",
        "What time is it right now, and what year is Python 3.13 from?",
        "Search for information about buffer overflow vulnerabilities and summarize it briefly.",
    ]

    for query in queries:
        print(f"\n{'='*60}")
        print(f"Query: {query}")
        print('='*60)

        result = run_agent(query, max_iterations=6, verbose=True)

        print(f"\nFinal Answer: {result.answer}")
        print(f"Completed in {result.iterations} iteration(s)")

if __name__ == "__main__":
    main()

Sample output for the first query:

--- Iteration 1 ---
Model: {"action": "calculator", "arguments": {"expression": "847 * 0.15"}}
Tool 'calculator' returned: 127.05000000000001

--- Iteration 2 ---
Model: {"action": "calculator", "arguments": {"expression": "127.05 * 3.14"}}
Tool 'calculator' returned: 398.937

--- Iteration 3 ---
Model: {"action": "final_answer", "answer": "15% of 847 is approximately 127.05. Multiplied by 3.14, the result is approximately 398.94."}

Final Answer: 15% of 847 is approximately 127.05. Multiplied by 3.14, the result is approximately 398.94.
Completed in 3 iteration(s)

What frameworks actually do

Now that you have built this, you can see what frameworks like LangChain abstract:

Tool registration — LangChain's @tool decorator is your TOOLS registry
Message history management — the messages list in the loop
Output parsing — the JSON extraction with markdown fence stripping
Retry and error handling — the tenacity wrapping that LangChain does internally
Memory — persisting the conversation across sessions (not shown here)
Streaming — yielding observations as they arrive

None of this is magic. The framework saves you 150 lines. The trade-off is opacity: when something breaks, you are debugging someone else's abstraction layer.
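To illustrate the first item, a registry like TOOLS can be filled by a decorator that derives the schema from a function's signature and docstring. This is a sketch of the general idea only, not LangChain's implementation; the names `tool`, `REGISTRY`, and `word_count` are hypothetical.

```python
import inspect
from typing import Callable

REGISTRY: dict[str, tuple[Callable, dict]] = {}

# Partial mapping from Python annotations to JSON Schema type names (an assumption).
_PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}

def tool(func: Callable) -> Callable:
    """Register func under its own name, deriving a schema from its signature."""
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        # Unannotated or unrecognised parameters fall back to "string"
        properties[name] = {"type": _PY_TO_JSON.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    REGISTRY[func.__name__] = (func, {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {"type": "object", "properties": properties, "required": required},
    })
    return func

@tool
def word_count(text: str) -> str:
    """Count the words in a piece of text."""
    return str(len(text.split()))

print(REGISTRY["word_count"][1]["parameters"])
# {'type': 'object', 'properties': {'text': {'type': 'string'}}, 'required': ['text']}
```

The convenience is real, but so is the cost: the schema the model sees is now generated rather than written, which is one more layer to inspect when a call goes wrong.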

Adding a real tool

Replacing the mock web search with a real one is straightforward:

import urllib.parse
import urllib.request

def web_search(query: str) -> str:
    # Placeholder endpoint: substitute your search provider's URL, auth scheme, and response shape
    api_key = os.environ.get("SEARCH_API_KEY", "")
    encoded = urllib.parse.quote(query)
    url = f"https://api.search-provider.com/search?q={encoded}&key={api_key}"

    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    # Network errors raised here are caught by dispatch_tool and returned as an observation
    with urllib.request.urlopen(req, timeout=5) as resp:
        data = json.loads(resp.read())

    # Keep the top three snippets so the observation stays short
    snippets = [r.get("snippet", "") for r in data.get("results", [])[:3]]
    return " | ".join(snippets) if snippets else "No results."
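Network-backed tools fail in ways the mock never did, and a timeout is often worth one or two retries before the error is reported to the model. The wrapper below is a minimal sketch of retrying with exponential backoff. `with_retries` and `flaky_search` are hypothetical names, and the attempt count and delays are arbitrary.

```python
import time
from typing import Callable

def with_retries(func: Callable[..., str], attempts: int = 3, base_delay: float = 0.5) -> Callable[..., str]:
    """Wrap a tool so transient OSError failures are retried with exponential backoff."""
    def wrapper(*args, **kwargs) -> str:
        for attempt in range(1, attempts + 1):
            try:
                return func(*args, **kwargs)
            except OSError as e:  # urllib.error.URLError and socket timeouts subclass OSError
                if attempt == attempts:
                    return f"Error: search failed after {attempts} attempts: {e}"
                time.sleep(base_delay * 2 ** (attempt - 1))
        return "Error: no attempts were made"  # only reachable if attempts < 1
    return wrapper

# Simulated tool that times out twice and then succeeds.
calls = {"n": 0}

def flaky_search(query: str) -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("simulated timeout")
    return f"results for {query}"

reliable_search = with_retries(flaky_search, attempts=3, base_delay=0)
print(reliable_search("python"))  # results for python
```

Returning an error string on the last attempt, rather than raising, keeps the behaviour consistent with the other tools: the model sees the failure as an observation.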

What to add next

Streaming output: use stream=True and yield content chunks for real-time UX
Parallel tool calls: some models support calling multiple tools in one turn — handle tool_calls array
Persistent memory: serialize and restore messages between sessions
Tool result caching: cache deterministic tool results (like calculator) to avoid redundant calls
Human-in-the-loop: pause before executing high-stakes tools and request confirmation
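As an example of one of these extensions, tool result caching can be layered on top of a dispatcher without touching the tools themselves. The sketch below assumes a dispatcher with the same (name, arguments) signature as dispatch_tool; `make_cached_dispatcher` and `fake_dispatch` are hypothetical names. Note that it caches error strings too, which may or may not be what you want.

```python
import json
from typing import Callable

Dispatcher = Callable[[str, dict], str]

def make_cached_dispatcher(dispatch: Dispatcher, cacheable: set[str]) -> Dispatcher:
    """Wrap a dispatcher so repeated calls to deterministic tools reuse earlier results."""
    cache: dict[tuple[str, str], str] = {}

    def cached_dispatch(name: str, arguments: dict) -> str:
        if name not in cacheable:
            return dispatch(name, arguments)
        # sort_keys makes {"a": 1, "b": 2} and {"b": 2, "a": 1} share one cache entry
        key = (name, json.dumps(arguments, sort_keys=True))
        if key not in cache:
            cache[key] = dispatch(name, arguments)
        return cache[key]

    return cached_dispatch

# Stand-in dispatcher that records which tools actually ran.
calls = []

def fake_dispatch(name: str, arguments: dict) -> str:
    calls.append(name)
    return f"{name} ran"

dispatch = make_cached_dispatcher(fake_dispatch, cacheable={"calculator"})
dispatch("calculator", {"expression": "2+2"})
dispatch("calculator", {"expression": "2+2"})  # served from the cache
dispatch("get_current_time", {})
dispatch("get_current_time", {})               # not cacheable, runs again
print(calls)  # ['calculator', 'get_current_time', 'get_current_time']
```

Keeping the set of cacheable tools explicit avoids serving stale results from tools such as get_current_time or a live web search.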

Building from scratch, even once, is the best way to develop a reliable mental model of what agents actually do. After that, using a framework is a conscious trade-off — not a black box.

DEV Community