I've built a lot of AI demos that looked impressive in a notebook and fell apart in production. The usual culprit? Treating an LLM like a search engine, one prompt in, one answer out, instead of what it actually is: a reasoning engine you can wire into real workflows.
This tutorial is about doing it properly. We're going to build a functional AI agent using Anthropic's Claude API from the ground up, not a wrapper around a framework, but the actual mechanics: a ReAct loop, custom tool use, and a structure you can actually deploy. By the end you'll have running code and a mental model that makes every agent tutorial after this one make sense.
Let's get into it.
## What We're Actually Building
The agent we're building will:
- Accept a user query
- Decide which tools it needs to answer
- Call those tools, observe the results
- Reason over the results and either call more tools or return a final answer
This pattern is called ReAct (Reasoning + Acting). It's the backbone of most production agents and it maps cleanly onto how Claude's tool use API works.
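Before any API code, it helps to see that loop as plain Python. Here's a sketch with stand-in `llm` and `run_tool` callables (illustrative names, not the Anthropic SDK); the real version comes in Step 3:

```python
def react_loop(query, llm, run_tool, max_steps: int = 10):
    """Skeleton of the ReAct pattern: reason, act, observe, repeat."""
    history = [{"role": "user", "content": query}]
    for _ in range(max_steps):
        decision = llm(history)                      # reason
        if decision["type"] == "answer":
            return decision["text"]                  # final answer
        observation = run_tool(decision["tool"], decision["input"])  # act
        history.append({"role": "assistant", "content": decision})
        history.append({"role": "user", "content": observation})    # observe
    return "Max steps reached."
```

Everything that follows is this skeleton with Claude as the `llm` and real functions behind `run_tool`.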
## Prerequisites
```bash
pip install anthropic python-dotenv
```
You'll need a Claude API key from console.anthropic.com. Store it safely:
```bash
# .env
ANTHROPIC_API_KEY=your_key_here
```
## Step 1: Basic Claude API Setup
Before building the agent, let's confirm you can talk to Claude.
```python
import os
import anthropic
from dotenv import load_dotenv

load_dotenv()

client = anthropic.Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))

def ask_claude(prompt: str) -> str:
    message = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=1024,
        messages=[
            {"role": "user", "content": prompt}
        ]
    )
    return message.content[0].text

# Quick test
print(ask_claude("What is 2 + 2? Answer in one word."))
```
This is the foundation. If this runs cleanly, you're ready to build on it. For a deeper breakdown of model selection and API parameters, the how to use Claude API tutorial from Dextra Labs is worth reading before you go further.
## Step 2: Define Your Tools
Tools are the agent's hands. Without them, Claude can only reason; it can't act. We'll define three tools our agent can use: a calculator, a web search simulator, and a file writer.
In Claude's API, tools are defined as JSON schemas. Claude reads these schemas and decides when and how to call them.
```python
tools = [
    {
        "name": "calculator",
        "description": "Performs basic arithmetic. Use this for any math operations.",
        "input_schema": {
            "type": "object",
            "properties": {
                "expression": {
                    "type": "string",
                    "description": "Math expression to evaluate, e.g. '15 * 24 + 100'"
                }
            },
            "required": ["expression"]
        }
    },
    {
        "name": "web_search",
        "description": "Searches the web for current information on a topic.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "The search query"
                }
            },
            "required": ["query"]
        }
    },
    {
        "name": "save_to_file",
        "description": "Saves text content to a local file.",
        "input_schema": {
            "type": "object",
            "properties": {
                "filename": {"type": "string"},
                "content": {"type": "string"}
            },
            "required": ["filename", "content"]
        }
    }
]
```
Now let's write the actual Python functions that execute when Claude calls these tools:
```python
import math

def calculator(expression: str) -> str:
    try:
        # Safer eval for math expressions: expose only math functions, no builtins
        allowed = {k: v for k, v in math.__dict__.items()
                   if not k.startswith("__")}
        result = eval(expression, {"__builtins__": {}}, allowed)
        return f"Result: {result}"
    except Exception as e:
        return f"Error: {str(e)}"

def web_search(query: str) -> str:
    # In production, wire this to SerpAPI, Tavily, or Brave Search
    # Simulated response for tutorial purposes
    return (f"Search results for '{query}': "
            f"[Simulated] Top result: Relevant information about {query} "
            f"from authoritative sources. Published 2025.")

def save_to_file(filename: str, content: str) -> str:
    try:
        with open(filename, 'w') as f:
            f.write(content)
        return f"Successfully saved to {filename}"
    except Exception as e:
        return f"Error saving file: {str(e)}"
```
The tool dispatcher routes Claude's calls to these functions:

```python
def execute_tool(tool_name: str, tool_input: dict) -> str:
    if tool_name == "calculator":
        return calculator(tool_input["expression"])
    elif tool_name == "web_search":
        return web_search(tool_input["query"])
    elif tool_name == "save_to_file":
        return save_to_file(tool_input["filename"], tool_input["content"])
    else:
        return f"Unknown tool: {tool_name}"
```
The dispatcher is intentionally simple here. In production you'd want a registry pattern, but for learning, explicit is better than clever.
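If you're curious what that registry pattern might look like, here's one minimal sketch. The `register` decorator and `TOOLS` dict are illustrative names of my own, not anything from the Anthropic SDK:

```python
from typing import Callable

# Registry mapping tool names to their implementations
TOOLS: dict[str, Callable[..., str]] = {}

def register(name: str):
    """Decorator that adds a function to the tool registry under `name`."""
    def wrapper(fn: Callable[..., str]) -> Callable[..., str]:
        TOOLS[name] = fn
        return fn
    return wrapper

@register("echo")
def echo(text: str) -> str:
    # Stand-in tool; calculator, web_search, and save_to_file
    # would be registered the same way
    return f"Echo: {text}"

def execute_tool(tool_name: str, tool_input: dict) -> str:
    """Dispatch by dictionary lookup instead of an if/elif chain."""
    fn = TOOLS.get(tool_name)
    if fn is None:
        return f"Unknown tool: {tool_name}"
    return fn(**tool_input)
```

The payoff is that adding a tool becomes one decorated function plus its schema, with no dispatcher edits.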
## Step 3: Build the ReAct Agent Loop
This is the core of the tutorial. The ReAct loop works like this:
- Send the user query + available tools to Claude
- Claude either returns a final answer OR a tool call request
- If tool call → execute it, send result back to Claude
- Repeat until Claude returns a final answer
```python
def run_agent(user_query: str, max_iterations: int = 10) -> str:
    print(f"\n{'='*50}")
    print(f"User: {user_query}")
    print(f"{'='*50}")

    messages = [
        {"role": "user", "content": user_query}
    ]

    system_prompt = """You are a helpful AI agent with access to tools.
Think step by step. Use tools when you need real data or calculations.
When you have enough information, provide a clear final answer."""

    for iteration in range(max_iterations):
        print(f"\n[Iteration {iteration + 1}]")

        # Call Claude with tools
        response = client.messages.create(
            model="claude-sonnet-4-5",
            max_tokens=4096,
            system=system_prompt,
            tools=tools,
            messages=messages
        )

        print(f"Stop reason: {response.stop_reason}")

        # If Claude is done reasoning, return the final answer
        if response.stop_reason == "end_turn":
            final_answer = ""
            for block in response.content:
                if hasattr(block, 'text'):
                    final_answer += block.text
            print(f"\nFinal Answer: {final_answer}")
            return final_answer

        # If Claude wants to use tools
        if response.stop_reason == "tool_use":
            # Add Claude's response to message history
            messages.append({
                "role": "assistant",
                "content": response.content
            })

            # Process each tool call
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    print(f"  Tool: {block.name}")
                    print(f"  Input: {block.input}")

                    # Execute the tool
                    result = execute_tool(block.name, block.input)
                    print(f"  Result: {result}")

                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result
                    })

            # Send tool results back to Claude
            messages.append({
                "role": "user",
                "content": tool_results
            })

    return "Max iterations reached without a final answer."
```
The key insight here is the message history. Every tool call and result gets appended to messages, so Claude always has full context of what it's already tried. This is what separates a stateful agent from a stateless chatbot.
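To make that concrete, here is roughly what `messages` contains after a single tool round. The IDs and values below are illustrative; the SDK actually returns typed objects carrying these same fields:

```python
# Illustrative snapshot of the messages list after one tool round
messages = [
    {"role": "user", "content": "What is 15 * 24?"},
    # Claude's turn: a tool_use content block (shown here as a plain dict)
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_abc123",
         "name": "calculator", "input": {"expression": "15 * 24"}},
    ]},
    # Our turn: the tool_result, matched to the request by tool_use_id
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_abc123",
         "content": "Result: 360"},
    ]},
]
```

Note that the tool result goes back in a `user` role message; from the API's perspective, we are the ones reporting what the tool returned.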
## Step 4: Run It
```python
if __name__ == "__main__":
    # Test 1: Math + file output
    result = run_agent(
        "Calculate compound interest on $10,000 at 7% for 10 years, "
        "then save the result to 'investment.txt'"
    )

    # Test 2: Research + synthesis
    result = run_agent(
        "Search for information about RAG architecture "
        "and summarize the key components."
    )

    # Test 3: Multi-step reasoning
    result = run_agent(
        "What is the square root of 144 multiplied by the number of days in a leap year?"
    )
```
Run this and watch the agent reason through each step in your terminal. The iteration logs show you exactly how Claude decides which tool to call and when to stop.
## Step 5: Adding Memory (The Production Upgrade)
The agent above is stateless: each `run_agent` call starts fresh. For real applications you need conversation memory. Here's a minimal implementation:
```python
class AgentWithMemory:
    def __init__(self):
        self.conversation_history = []
        self.client = anthropic.Anthropic(
            api_key=os.getenv("ANTHROPIC_API_KEY")
        )

    def chat(self, user_message: str) -> str:
        # Add user message to history (skip when re-entering after a tool call,
        # since the tool results are already the latest user turn)
        if user_message:
            self.conversation_history.append({
                "role": "user",
                "content": user_message
            })

        response = self.client.messages.create(
            model="claude-sonnet-4-5",
            max_tokens=4096,
            system="You are a helpful assistant with memory of our conversation.",
            tools=tools,
            messages=self.conversation_history
        )

        # Handle tool use within persistent history
        if response.stop_reason == "tool_use":
            self.conversation_history.append({
                "role": "assistant",
                "content": response.content
            })

            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result
                    })

            self.conversation_history.append({
                "role": "user",
                "content": tool_results
            })

            # Recursive call to get the final answer
            return self.chat("")

        assistant_message = response.content[0].text
        self.conversation_history.append({
            "role": "assistant",
            "content": assistant_message
        })
        return assistant_message
```
```python
# Usage
agent = AgentWithMemory()
print(agent.chat("My budget is $50,000. Calculate 7% annual return over 5 years."))
print(agent.chat("Now do the same calculation but for 10 years."))
# Claude remembers the $50,000 and 7% from the first message
```
The conversation_history list is doing all the heavy lifting here. In production you'd persist this to Redis or a database between sessions.
## What to Build Next
Once this is running, the natural next steps are:
- **Streaming responses**: use `client.messages.stream()` for real-time output in web apps.
- **Error handling and retries**: wrap tool calls in try/except with exponential backoff.
- **Async execution**: parallel tool calls with `asyncio` cut latency significantly on multi-tool queries.
- **Structured outputs**: use Pydantic models to enforce tool input/output schemas.
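As a starting point for the retry item above, here's a generic backoff wrapper (a sketch; the attempt count and delay constants are arbitrary choices):

```python
import time

def with_retries(fn, max_attempts: int = 4, base_delay: float = 1.0,
                 retryable: tuple = (Exception,)):
    """Call fn(); on a retryable error, wait 1s, 2s, 4s, ... then try again."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            time.sleep(base_delay * (2 ** attempt))
```

You'd wrap the API call as something like `with_retries(lambda: client.messages.create(...), retryable=(anthropic.RateLimitError,))`, choosing which SDK exceptions count as transient.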
For the full architecture patterns and production deployment strategies, Dextra Labs published an in-depth guide on Claude AI agents architecture and deployment covering containerization, monitoring, and scaling patterns beyond what fits in a single tutorial.
The full repo for this tutorial is available at: github.com/dextralabs/claude-agent-tutorial
## Quick Recap
What you just built is a genuine ReAct agent, not a chatbot with a system prompt, but a reasoning loop that can call real functions, observe results, and chain multiple steps together. The same pattern powers production agents handling customer support, code review, document analysis, and research workflows at scale.
The code here is intentionally minimal. Strip away the frameworks and this is what's underneath all of them.