LangChain, AutoGen, CrewAI — the framework ecosystem for AI agents is crowded. Most tutorials jump straight into one of these, which is fine for getting something running fast. It is not fine for understanding what is actually happening.
This tutorial builds a minimal ReAct-style agent from scratch: no framework dependencies, no magic, ~150 lines of Python. Once you have built it, you will understand exactly what any framework is abstracting — and when that abstraction is worth its cost.
What is a ReAct agent?
ReAct (Reason + Act) is a prompting pattern in which the model works through a loop:
- Thinking: reasoning about the current state and what to do next
- Acting: calling a tool and observing the result
- Repeating: until the task is complete or a step limit is hit
The loop looks like:
Thought: I need to know the current time to answer this.
Action: get_current_time({})
Observation: 2025-11-14T09:32:00Z
Thought: Now I can answer.
Final Answer: It is 9:32 AM UTC on November 14, 2025.
The key insight: the model is not "executing" anything. It is generating text that describes what it wants to do. Your code parses that text, runs the actual tool, and feeds the result back as context.
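To make that division of labour concrete, here is a minimal sketch of the parsing step for the text format shown above. The name parse_action and the regular expression are illustrative assumptions, not part of the agent built later in this tutorial, which asks the model for JSON actions instead.

```python
import json
import re

# Matches a line such as: Action: get_current_time({})
ACTION_RE = re.compile(r"^Action:\s*(\w+)\((.*)\)\s*$", re.MULTILINE)


def parse_action(model_text: str):
    """Return (tool_name, arguments) for the first Action line, or None."""
    match = ACTION_RE.search(model_text)
    if match is None:
        return None  # no tool requested, e.g. the model gave a final answer
    name, raw_args = match.group(1), match.group(2).strip()
    # The arguments are expected to be a JSON object; empty parentheses mean no arguments.
    arguments = json.loads(raw_args) if raw_args else {}
    return name, arguments


text = "Thought: I need to know the current time to answer this.\nAction: get_current_time({})"
print(parse_action(text))  # ('get_current_time', {})
```

The model only produced the string; everything after parse_action returns (looking up the function, calling it, formatting the Observation line) is ordinary Python that you control.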
Setup
pip install openai python-dotenv
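import os
import json
import re
import math
import datetime
from typing import Any, Callable
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()  # read OPENAI_API_KEY from a local .env file, if one exists
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])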
Step 1: Define the tools
Each tool is a plain Python function with a schema describing its interface. The schema is what the model sees; the function is what your code calls.
def web_search(query: str) -> str:
    """Mock web search. Replace with a real search API in production."""
    results = {
        "python asyncio tutorial": "asyncio is Python's built-in library for writing concurrent code...",
        "latest python version": "Python 3.13 was released in October 2024...",
        "what is a buffer overflow": "A buffer overflow occurs when a program writes more data to a buffer...",
    }
    for key in results:
        if key.lower() in query.lower():
            return results[key]
    return f"No results found for '{query}'. Try a more specific query."
def calculator(expression: str) -> str:
    """Evaluate an arithmetic expression restricted to digits and basic operators."""
    # Whitelist digits, whitespace, and arithmetic operators. Letters are rejected,
    # so names such as math.sqrt or __import__ can never reach eval().
    if not re.fullmatch(r'[\d\s+\-*/.()%^]+', expression):
        return "Error: expression contains invalid characters"
    try:
        # Replace ^ with ** for Python exponentiation
        safe_expr = expression.replace("^", "**")
        # Caveat: the whitelist does not bound cost; an input like "9^9^9^9" can still hang.
        result = eval(safe_expr, {"__builtins__": {}}, {})
        return str(result)
    except Exception as e:
        return f"Error evaluating expression: {e}"
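def get_current_time(timezone: str = "UTC") -> str:
    """Return the current date and time in UTC (the only supported timezone)."""
    if timezone.upper() != "UTC":
        return f"Error: unsupported timezone '{timezone}'. Only UTC is supported."
    # datetime.utcnow() is deprecated since Python 3.12; use a timezone-aware datetime.
    now = datetime.datetime.now(datetime.timezone.utc)
    return f"{now.strftime('%Y-%m-%dT%H:%M:%SZ')} (UTC)"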
# Tool registry: maps tool name → (function, schema)
TOOLS: dict[str, tuple[Callable, dict]] = {
    "web_search": (
        web_search,
        {
            "name": "web_search",
            "description": "Search the web for information. Use for factual questions or recent events.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The search query"
                    }
                },
                "required": ["query"]
            }
        }
    ),
    "calculator": (
        calculator,
        {
            "name": "calculator",
            "description": "Evaluate a mathematical expression. Supports +, -, *, /, %, ^.",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {
                        "type": "string",
                        "description": "Mathematical expression to evaluate, e.g. '2 * (3 + 4)'"
                    }
                },
                "required": ["expression"]
            }
        }
    ),
    "get_current_time": (
        get_current_time,
        {
            "name": "get_current_time",
            "description": "Get the current date and time in UTC.",
            "parameters": {
                "type": "object",
                "properties": {
                    "timezone": {
                        "type": "string",
                        "description": "Timezone name (currently only UTC is supported)",
                        "default": "UTC"
                    }
                },
                "required": []
            }
        }
    )
}
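The calculator tool above relies on a character whitelist in front of eval. If you would rather not call eval at all, one alternative is to parse the expression and walk the syntax tree, allowing only arithmetic nodes. The sketch below is one way to do that; safe_eval and MAX_EXPONENT are hypothetical names, and the exponent cap is an arbitrary choice that limits, but does not fully eliminate, expensive inputs.

```python
import ast
import operator

# Whitelisted operators; any other syntax is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}
MAX_EXPONENT = 1000  # assumption: an arbitrary cap so ** cannot run away


def safe_eval(expression: str) -> float:
    """Evaluate +, -, *, /, %, ^ over numeric literals without calling eval()."""
    tree = ast.parse(expression.replace("^", "**"), mode="eval")

    def visit(node):
        # Numeric literals only (bool, str, etc. are rejected)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            left, right = visit(node.left), visit(node.right)
            if isinstance(node.op, ast.Pow) and abs(right) > MAX_EXPONENT:
                raise ValueError("exponent too large")
            return _BINARY_OPS[type(node.op)](left, right)
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](visit(node.operand))
        raise ValueError(f"unsupported syntax: {type(node).__name__}")

    return visit(tree.body)


print(safe_eval("2 * (3 + 4)"))  # 14
print(safe_eval("2 ^ 10"))       # 1024
```

Because names, calls, and attribute access are never evaluated, an input such as __import__('os') raises ValueError instead of running. A tool wrapper would still catch exceptions (including SyntaxError and ZeroDivisionError) and return them as an error string, as calculator does.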
Step 2: The tool dispatcher
The dispatcher takes the model's tool call request, validates it, runs the function, and returns the result as a string.
def dispatch_tool(name: str, arguments: dict) -> str:
    if name not in TOOLS:
        return f"Error: unknown tool '{name}'. Available tools: {', '.join(TOOLS.keys())}"
    func, schema = TOOLS[name]
    # Validate required parameters
    required = schema["parameters"].get("required", [])
    missing = [r for r in required if r not in arguments]
    if missing:
        return f"Error: missing required parameters: {', '.join(missing)}"
    try:
        result = func(**arguments)
        return str(result)
    except TypeError as e:
        return f"Error calling {name}: {e}"
    except Exception as e:
        return f"Unexpected error in {name}: {e}"
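The dispatcher above checks only that required parameters are present. A slightly stricter check, sketched here under the assumption that your schemas follow the same shape as the TOOLS entries, also flags unexpected parameter names and wrong value types. validate_arguments and JSON_TYPES are hypothetical names introduced for illustration; this is not a full JSON Schema validator.

```python
# Maps JSON Schema type names to the Python types json.loads produces.
# Note: bool is a subclass of int in Python, so True passes an "integer" check.
JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "object": dict,
    "array": list,
}


def validate_arguments(schema: dict, arguments: dict) -> list[str]:
    """Return a list of problems; an empty list means the call looks valid."""
    params = schema.get("parameters", {})
    properties = params.get("properties", {})
    problems = []
    for name in params.get("required", []):
        if name not in arguments:
            problems.append(f"missing required parameter '{name}'")
    for name, value in arguments.items():
        if name not in properties:
            problems.append(f"unexpected parameter '{name}'")
            continue
        expected = JSON_TYPES.get(properties[name].get("type"))
        if expected is not None and not isinstance(value, expected):
            problems.append(f"parameter '{name}' should be of type {properties[name]['type']}")
    return problems


calculator_schema = {
    "name": "calculator",
    "parameters": {
        "type": "object",
        "properties": {"expression": {"type": "string"}},
        "required": ["expression"],
    },
}
print(validate_arguments(calculator_schema, {"expression": 42}))
# ["parameter 'expression' should be of type string"]
```

Returning the problems as text matters more than it might seem: the string goes back to the model as the observation, and a specific message gives it a chance to correct the call on the next iteration.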
Step 3: The agent system prompt
The system prompt teaches the model the ReAct format and tells it about available tools.
def build_system_prompt() -> str:
    # Include each tool's parameter schema so the model knows the argument names.
    tool_descriptions = "\n".join(
        f"- {name}: {schema['description']} "
        f"Parameters: {json.dumps(schema['parameters']['properties'])}"
        for name, (_, schema) in TOOLS.items()
    )
    return f"""You are a helpful assistant with access to the following tools:

{tool_descriptions}

To use a tool, respond with a JSON object in this exact format:
{{"action": "tool_name", "arguments": {{"param": "value"}}}}

When you have gathered enough information and are ready to give the final answer,
respond with:
{{"action": "final_answer", "answer": "your complete answer here"}}

Think step by step. Use tools when you need external information or computation.
Only use one tool per response. After observing the result, decide whether to use
another tool or provide the final answer.
"""
Step 4: The agent loop
This is the heart of the agent. It runs the think/act/observe cycle with a maximum iteration guard.
class AgentResult:
    def __init__(self, answer: str, steps: list[dict], iterations: int):
        self.answer = answer
        self.steps = steps
        self.iterations = iterations

def run_agent(user_query: str, max_iterations: int = 8, verbose: bool = True) -> AgentResult:
    messages = [
        {"role": "system", "content": build_system_prompt()},
        {"role": "user", "content": user_query}
    ]
    steps = []
    for iteration in range(1, max_iterations + 1):
        if verbose:
            print(f"\n--- Iteration {iteration} ---")
        # Call the model
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            temperature=0
        )
        # message.content is Optional in the SDK, so guard against None
        content = (response.choices[0].message.content or "").strip()
        if verbose:
            print(f"Model: {content}")
        # Parse the model's response
        try:
            # Strip markdown code fences if present; `{3} matches a run of three backticks
            clean = re.sub(r'^`{3}(?:json)?\n?', '', content)
            clean = re.sub(r'\n?`{3}$', '', clean)
            parsed = json.loads(clean)
        except json.JSONDecodeError:
            parsed = None
        if not isinstance(parsed, dict):
            # Model did not follow the format (invalid JSON, or JSON that is
            # not an object): treat the raw text as the final answer
            return AgentResult(
                answer=content,
                steps=steps,
                iterations=iteration
            )
        action = parsed.get("action")
        # Check for final answer
        if action == "final_answer":
            answer = parsed.get("answer", content)
            steps.append({"iteration": iteration, "action": "final_answer", "result": answer})
            return AgentResult(answer=answer, steps=steps, iterations=iteration)
        # Tool call
        if action not in TOOLS:
            observation = f"Error: '{action}' is not a valid action. Use a tool name or 'final_answer'."
        else:
            arguments = parsed.get("arguments", {})
            observation = dispatch_tool(action, arguments)
        if verbose:
            print(f"Tool '{action}' returned: {observation}")
        steps.append({
            "iteration": iteration,
            "action": action,
            "arguments": parsed.get("arguments", {}),
            "observation": observation
        })
        # Add model response and observation to conversation
        messages.append({"role": "assistant", "content": content})
        messages.append({
            "role": "user",
            "content": f"Observation: {observation}\n\nContinue."
        })
    # Exceeded max iterations
    return AgentResult(
        answer="I could not complete this task within the iteration limit.",
        steps=steps,
        iterations=max_iterations
    )
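It can help to watch the think/act/observe cycle without any network calls. The sketch below is a stripped-down version of the same loop with the model call injected as a function, driven by a scripted fake model. run_loop, make_scripted_model, and the add tool are hypothetical names used only for this illustration; error handling is omitted to keep the control flow visible.

```python
import json


def run_loop(call_model, tools, user_query, max_iterations=5):
    """Minimal think/act/observe cycle with the model call injected."""
    messages = [{"role": "user", "content": user_query}]
    for _ in range(max_iterations):
        content = call_model(messages)          # "think": the model emits text
        parsed = json.loads(content)
        if parsed["action"] == "final_answer":
            return parsed["answer"], messages
        # "act": our code, not the model, runs the tool
        observation = tools[parsed["action"]](**parsed.get("arguments", {}))
        # "observe": feed the result back as context for the next turn
        messages.append({"role": "assistant", "content": content})
        messages.append({"role": "user", "content": f"Observation: {observation}"})
    return None, messages


def make_scripted_model(replies):
    """Return a fake model that plays back canned replies in order."""
    remaining = iter(replies)
    return lambda messages: next(remaining)


fake_model = make_scripted_model([
    '{"action": "add", "arguments": {"a": 2, "b": 3}}',
    '{"action": "final_answer", "answer": "2 + 3 = 5"}',
])
answer, transcript = run_loop(fake_model, {"add": lambda a, b: a + b}, "What is 2 + 3?")
print(answer)
for message in transcript:
    print(message["role"], "->", message["content"])
```

Injecting the model call this way also makes the loop easy to unit test: the transcript shows exactly which messages the model would have seen on each turn.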
Step 5: Running the agent
def main():
    queries = [
        "What is 15% of 847, and then multiply that by 3.14?",
        "What time is it right now, and what year is Python 3.13 from?",
        "Search for information about buffer overflow vulnerabilities and summarize it briefly.",
    ]
    for query in queries:
        print(f"\n{'='*60}")
        print(f"Query: {query}")
        print('='*60)
        result = run_agent(query, max_iterations=6, verbose=True)
        print(f"\nFinal Answer: {result.answer}")
        print(f"Completed in {result.iterations} iteration(s)")

if __name__ == "__main__":
    main()
Sample output for the first query:
--- Iteration 1 ---
Model: {"action": "calculator", "arguments": {"expression": "847 * 0.15"}}
Tool 'calculator' returned: 127.05000000000001
--- Iteration 2 ---
Model: {"action": "calculator", "arguments": {"expression": "127.05 * 3.14"}}
Tool 'calculator' returned: 398.937
--- Iteration 3 ---
Model: {"action": "final_answer", "answer": "15% of 847 is approximately 127.05. Multiplied by 3.14, the result is approximately 398.94."}
Final Answer: 15% of 847 is approximately 127.05. Multiplied by 3.14, the result is approximately 398.94.
Completed in 3 iteration(s)
What frameworks actually do
Now that you have built this, you can see what frameworks like LangChain abstract:
- Tool registration: LangChain's @tool decorator is your TOOLS registry
- Message history management: the messages list in the loop
- Output parsing: the JSON extraction with markdown fence stripping
- Retry and error handling: the tenacity wrapping that LangChain does internally
- Memory: persisting the conversation across sessions (not shown here)
- Streaming: yielding observations as they arrive
None of this is magic. The framework saves you 150 lines. The trade-off is opacity: when something breaks, you are debugging someone else's abstraction layer.
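Of the items in that list, retry handling is the one the loop in this tutorial does not implement at all. As a rough idea of what it involves, here is a minimal exponential-backoff sketch in plain Python. with_retries is a hypothetical helper, not how LangChain or tenacity implement retries, and a real version would usually retry only on specific error types such as timeouts or rate limits.

```python
import time


def with_retries(func, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call func(); on an exception, wait and retry with exponential backoff."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)


# Demonstration with a function that fails twice before succeeding.
calls = {"n": 0}
delays = []


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"


print(with_retries(flaky, attempts=3, base_delay=1, sleep=delays.append))  # ok
print(delays)  # [1, 2]
```

In the agent loop, the natural place for this is around the model call, for example with_retries(lambda: client.chat.completions.create(...)), so that a transient API error does not end the whole run.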
Adding a real tool
Replacing the mock web search with a real one is straightforward:
import urllib.parse
import urllib.request

def web_search(query: str) -> str:
    # The endpoint below is a placeholder; substitute your provider's URL,
    # authentication scheme, and response shape.
    api_key = os.environ.get("SEARCH_API_KEY", "")
    encoded = urllib.parse.quote(query)
    url = f"https://api.search-provider.com/search?q={encoded}&key={api_key}"
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    # Network errors raised here are caught by dispatch_tool and returned as an observation
    with urllib.request.urlopen(req, timeout=5) as resp:
        data = json.loads(resp.read())
    snippets = [r.get("snippet", "") for r in data.get("results", [])[:3]]
    return " | ".join(snippets) if snippets else "No results."
What to add next
- Streaming output: use stream=True and yield content chunks for real-time UX
- Parallel tool calls: some models support calling multiple tools in one turn; handle the tool_calls array
- Persistent memory: serialize and restore messages between sessions
- Tool result caching: cache deterministic tool results (like calculator) to avoid redundant calls
- Human-in-the-loop: pause before executing high-stakes tools and request confirmation
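For the parallel tool calls item, the schemas in the TOOLS registry already have the name/description/parameters shape that OpenAI's Chat Completions API accepts for native tool calling, so a first step is simply wrapping each one in a function envelope. The sketch below shows only that conversion; to_openai_tools and example_registry are hypothetical names, and the one-entry registry stands in for the full TOOLS dict.

```python
def to_openai_tools(registry: dict) -> list[dict]:
    """Wrap each (function, schema) entry in a {"type": "function", ...} envelope."""
    return [
        {"type": "function", "function": schema}
        for _func, schema in registry.values()
    ]


# A one-entry stand-in for the TOOLS registry defined earlier.
example_registry = {
    "get_current_time": (
        lambda timezone="UTC": "2025-11-14T09:32:00Z",
        {
            "name": "get_current_time",
            "description": "Get the current date and time in UTC.",
            "parameters": {"type": "object", "properties": {}, "required": []},
        },
    )
}

tools_payload = to_openai_tools(example_registry)
print(tools_payload[0]["function"]["name"])  # get_current_time
```

The resulting list would be passed as the tools argument to client.chat.completions.create. The reply can then carry a message.tool_calls list, where each entry has an id plus function.name and function.arguments (a JSON string), and each result goes back as a message with role "tool" and the matching tool_call_id. Check the API reference for your SDK version before relying on the exact field names.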
Building from scratch, even once, is the best way to develop a reliable mental model of what agents actually do. After that, using a framework is a conscious trade-off — not a black box.