Jamie Cole


Building a Self-Correcting Python Agent with Claude's Tool Use

Most Claude agent tutorials show the happy path: model calls a tool, tool returns a result, model uses it. That's fine for demos. Production is messier.

In production, tools fail. The API you're calling is down. The file doesn't exist. The regex didn't match. What does your agent do then?

Bad answer: crash with an unhandled exception.

Worse answer: the model silently ignores the error and hallucinates a response.

Better answer: the agent detects the failure and tries something different.

Here's how to build that.


The Basic Agent Loop

First, the minimal scaffolding. We're using the anthropic Python library with tool use:

import anthropic

client = anthropic.Anthropic()

def run_agent(user_task: str, tools: list, tool_handler) -> str:
    messages = [{"role": "user", "content": user_task}]

    while True:
        response = client.messages.create(
            model="claude-sonnet-4-5",
            max_tokens=1024,
            tools=tools,
            messages=messages
        )

        if response.stop_reason == "end_turn":
            # Extract text from final response
            return next(
                block.text for block in response.content
                if hasattr(block, "text")
            )

        if response.stop_reason == "tool_use":
            # Process all tool calls in this turn
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = tool_handler(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result
                    })

            # Append assistant response + tool results to history
            messages.append({"role": "assistant", "content": response.content})
            messages.append({"role": "user", "content": tool_results})
        else:
            # e.g. "max_tokens" — fail loudly instead of looping forever
            raise RuntimeError(f"Unexpected stop_reason: {response.stop_reason}")

This is the core. The loop runs until stop_reason == "end_turn", which means the model is done and ready to give a final answer.
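Concretely, here's the shape of `messages` after one tool round — a sketch with hypothetical IDs and values, shown as plain data so you can see what the model actually reads:

```python
# Shape of the conversation history after one tool-use round.
# The ID and file path are hypothetical.
messages = [
    {"role": "user", "content": "Read /app/config.json"},
    # The assistant's turn contains a tool_use content block:
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_01", "name": "read_file",
         "input": {"path": "/app/config.json"}},
    ]},
    # Tool results go back to the model as a *user* message:
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_01",
         "content": "ERROR: File '/app/config.json' not found."},
    ]},
]

# Each tool_result must reference the id of the tool_use it answers.
assert (messages[2]["content"][0]["tool_use_id"]
        == messages[1]["content"][0]["id"])
```

Note that tool results arrive in a `user` message — from the model's point of view, a tool result is just more conversation input.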


Adding Self-Correction

The self-correction comes from what you return in tool_handler. The model reads your tool result the same way it reads anything else — it's just text in the conversation.

If your tool fails, return a description of the failure, not an exception:

def tool_handler(tool_name: str, tool_input: dict) -> str:
    if tool_name == "read_file":
        path = tool_input.get("path", "")
        try:
            with open(path) as f:
                content = f.read()
            return content
        except FileNotFoundError:
            return f"ERROR: File '{path}' not found. Check the path and try again."
        except PermissionError:
            return f"ERROR: No read permission for '{path}'."
        except Exception as e:
            return f"ERROR: {type(e).__name__}: {str(e)}"

    return f"ERROR: Unknown tool '{tool_name}'"

The key is the error format: ERROR: [what failed]. [what to try instead]. The hint at the end is optional but helps. Claude will read it, understand the tool failed, and either retry with corrected parameters or try a different approach — automatically.
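One way to keep that format consistent across every handler is a tiny helper — the function name here is my own, not part of any library:

```python
def tool_error(what_failed: str, hint: str = "") -> str:
    """Format a tool failure as 'ERROR: [what failed]. [hint]'."""
    msg = f"ERROR: {what_failed}."
    if hint:
        msg += f" {hint}"
    return msg

# Example:
# tool_error("File '/app/cfg' not found",
#            "Use list_directory to find the correct path.")
```

Centralizing the format also means that if you later decide to change how errors look (say, adding an error code), you change one function instead of hunting through every handler.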


A Real Example: File Search Agent

Here's a minimal agent that searches for a file by name, self-corrects when the initial path is wrong:

tools = [
    {
        "name": "read_file",
        "description": "Read the contents of a file at the given path",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Full path to the file"}
            },
            "required": ["path"]
        }
    },
    {
        "name": "list_directory",
        "description": "List files in a directory",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Directory path"}
            },
            "required": ["path"]
        }
    }
]

import os

def tool_handler(name, inputs):
    if name == "read_file":
        try:
            with open(inputs["path"]) as f:
                return f.read()
        except FileNotFoundError:
            return f"ERROR: '{inputs['path']}' not found. Use list_directory to find the correct path."

    elif name == "list_directory":
        try:
            files = os.listdir(inputs["path"])
            return "\n".join(files)
        except Exception as e:
            return f"ERROR: {e}"

    # Always return a string — a None tool result would break the loop
    return f"ERROR: Unknown tool '{name}'"

result = run_agent(
    "Read the config file from the /app directory and tell me what database host is configured.",
    tools,
    tool_handler
)
print(result)

If the model tries read_file("/app/config.json") and gets a FileNotFoundError, it'll call list_directory("/app") next, see what's actually there, and try again with the right filename. No retry logic in your code. The model handles it.
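Because failures are just strings, you can unit-test the correction path without ever calling the API. This sketch repeats the handler from above so it runs standalone:

```python
import os

# Same read_file/list_directory handler as above, repeated so this
# snippet is self-contained.
def tool_handler(name, inputs):
    if name == "read_file":
        try:
            with open(inputs["path"]) as f:
                return f.read()
        except FileNotFoundError:
            return (f"ERROR: '{inputs['path']}' not found. "
                    "Use list_directory to find the correct path.")
    elif name == "list_directory":
        try:
            return "\n".join(os.listdir(inputs["path"]))
        except Exception as e:
            return f"ERROR: {e}"
    return f"ERROR: Unknown tool '{name}'"

# A missing file yields an error string with a recovery hint,
# not an exception:
result = tool_handler("read_file", {"path": "/no/such/file.json"})
assert result.startswith("ERROR:")
assert "list_directory" in result
```

If those assertions pass, you know the model will receive a readable error with a hint pointing at the recovery tool — the rest is up to Claude.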


Two Things That Break This

1. Returning exceptions as exceptions. If your tool raises instead of returning a string, the whole loop crashes. Always catch exceptions and convert them to descriptive strings.

2. Infinite loops. If a tool consistently fails in the same way, the model will keep retrying. Cap your loop:

MAX_TURNS = 10

for turn in range(MAX_TURNS):
    response = client.messages.create(...)
    # ... rest of loop (returns when the model finishes)

return "Agent reached turn limit without completing task."

Ten turns is usually plenty, even for non-trivial tasks. If the agent keeps hitting the limit, the problem is usually a tool description that's misleading the model, not a task that's too hard.
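A tool description that documents both success and failure output gives the model much more to work with. Here's a sketch of the `read_file` schema from earlier, rewritten that way:

```python
# read_file schema with the error contract documented in the description.
read_file_tool = {
    "name": "read_file",
    "description": (
        "Read the contents of a file at the given path. "
        "Returns the file contents as text on success. "
        "On failure, returns a string starting with 'ERROR:' explaining "
        "what went wrong and what to try instead (e.g. use list_directory "
        "to locate the file)."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "Full path to the file"}
        },
        "required": ["path"],
    },
}
```

The description is the model's only manual for the tool — spelling out the error contract there makes the self-correction behavior far more reliable.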


Production Checklist

For any agent going into production:

  • [ ] All tool handlers return strings, never raise
  • [ ] Turn limit in place (10 is a reasonable default)
  • [ ] Tool descriptions include what the tool returns on success AND what errors look like
  • [ ] Errors include hints about what to try next
  • [ ] Logging on every tool call (what was called, what was returned, how long it took)
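For the logging item, a minimal wrapper around the handler covers what was called, what came back, and how long it took — the names here are my own, not from any library:

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent.tools")

def logged(handler):
    """Wrap a tool handler to log name, inputs, result size, and duration."""
    def wrapper(name, inputs):
        start = time.monotonic()
        result = handler(name, inputs)
        elapsed = time.monotonic() - start
        log.info("tool=%s inputs=%s result_len=%d elapsed=%.3fs",
                 name, inputs, len(result), elapsed)
        return result
    return wrapper

# Usage: run_agent(task, tools, logged(tool_handler))
```

Because `run_agent` just takes a callable, the wrapper slots in without touching the loop.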

If you want the full checklist — I've got a 9-page PDF covering deployment, error handling, cost control, and monitoring: AI Agent Production Checklist ($9).

For a deeper treatment with full agent architecture patterns, state management, memory tiers, and real code: Autonomous AI Agents with Claude (£25).


Jamie Cole — indie dev, UK. Building agents in Python, writing about what actually breaks.
