DEV Community

Prabhat Kumar
Wiring Claude Into Real Systems With Tool Use

Claude isn't just a chat interface. With Tool Use (function calling), you can wire it up to real systems — databases, APIs, queues — and build agents that actually do things autonomously. This article walks you through the mental model and real production-grade examples using Python.


Who This Is For

You're a backend engineer. You know REST APIs, you've seen microservices, you understand distributed systems at some level. You've probably played with ChatGPT or the Anthropic API and thought "okay, it answers questions — but how do I actually build something useful with this?"

This article is exactly for that moment.

No PhD in ML required. Just solid engineering instinct.


The Problem With Chatbots

Most engineers' first encounter with an LLM API looks something like this:

import anthropic

client = anthropic.Anthropic(api_key="your-key")

response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "What is the refund status of order #12345?"}]
)

print(response.content[0].text)
# → "I don't have access to your order system, so I can't check that."

And that's the wall. Claude is smart, but it's stateless and isolated. It knows nothing about your business, your database, your users.

The typical workaround? Stuff everything into the prompt:

# ❌ Bad pattern — brittle, expensive, leaks data
prompt = f"""
Here is the full customer database export:
{entire_database_dump}

Now answer: What is the refund status of order #12345?
"""

This doesn't scale. It's expensive, slow, and a security nightmare.

Tool Use is the real answer.


What Is Tool Use, Really?

Think of it like this: you're the orchestrator. Claude is your smart analyst. Instead of dumping all data at Claude upfront, you give Claude a menu of tools it can call — and Claude decides which tools to invoke, in what order, to answer the question.

The flow looks like this:

User Request
     │
     ▼
┌─────────────┐     "I need order info"    ┌──────────────────┐
│   Claude    │ ─────────────────────────► │  get_order_info  │
│   (Brain)   │ ◄───────────────────────── │  (Your API/DB)   │
└─────────────┘     returns order data     └──────────────────┘
     │
     │ "I need refund status too"
     ▼
┌─────────────┐                            ┌──────────────────┐
│   Claude    │ ─────────────────────────► │ get_refund_status│
│   (Brain)   │ ◄───────────────────────── │  (Your API/DB)   │
└─────────────┘     returns refund data    └──────────────────┘
     │
     ▼
  Final Response to User

Claude doesn't call your functions directly. It tells you which function to call and with what args. You execute it. You hand the result back. This is intentional — you stay in control.
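Concretely, a tool request arrives as a structured `tool_use` content block. A minimal sketch of its shape as a plain dict (the `id`, `name`, and `input` values below are invented for illustration):

```python
# Illustrative shape of a tool_use content block; the values are made up.
tool_use_block = {
    "type": "tool_use",
    "id": "toolu_abc123",
    "name": "get_order_info",
    "input": {"order_id": "ORD-12345"},
}

def describe_block(block: dict) -> str:
    """Inspect the block and report what we *would* execute."""
    if block.get("type") != "tool_use":
        return "no tool requested"
    return f"would call {block['name']} with {block['input']}"

print(describe_block(tool_use_block))
# → would call get_order_info with {'order_id': 'ORD-12345'}
```

Because you sit between the block and the execution, you can log it, validate it, or refuse it before anything touches your systems.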


Real Use Case #1: Customer Support Agent

Let's build a support agent for an e-commerce platform. It should handle questions about orders, refunds, and shipping — without any human in the loop.

Step 1 — Define Your Tools

tools = [
    {
        "name": "get_order_details",
        "description": "Fetches order details including items, total, and current status for a given order ID.",
        "input_schema": {
            "type": "object",
            "properties": {
                "order_id": {
                    "type": "string",
                    "description": "The order ID, e.g. ORD-98765"
                }
            },
            "required": ["order_id"]
        }
    },
    {
        "name": "get_refund_status",
        "description": "Returns the refund status and expected timeline for a given order.",
        "input_schema": {
            "type": "object",
            "properties": {
                "order_id": {
                    "type": "string",
                    "description": "The order ID to check refund status for"
                }
            },
            "required": ["order_id"]
        }
    },
    {
        "name": "initiate_refund",
        "description": "Initiates a refund for an order if it is eligible. Only call this if the customer explicitly requests a refund.",
        "input_schema": {
            "type": "object",
            "properties": {
                "order_id": {
                    "type": "string",
                    "description": "The order ID to refund"
                },
                "reason": {
                    "type": "string",
                    "description": "Reason for the refund"
                }
            },
            "required": ["order_id", "reason"]
        }
    }
]

Notice the descriptions are crisp and behavioral — they tell Claude when to use the tool, not just what it does. This is critical. Claude uses these descriptions to make decisions.

Step 2 — Wire Up Your Actual Implementations

import anthropic
import json
from typing import Any

client = anthropic.Anthropic(api_key="your-key")

# --- Tool implementations — thin wrappers that delegate to service clients ---
# (OrderClient and RefundClient stand for your existing service-layer clients.)

def get_order_details(order_id: str) -> dict:
    return OrderClient.get_order_by_id(order_id)


def get_refund_status(order_id: str) -> dict:
    return RefundClient.get_refund_by_order_id(order_id)


def initiate_refund(order_id: str, reason: str) -> dict:
    return RefundClient.create_refund(order_id, reason)


# --- Tool Executor (dispatcher) ---

def execute_tool(tool_name: str, tool_input: dict) -> Any:
    """Central dispatcher. Claude tells us what to call — we call it."""
    if tool_name == "get_order_details":
        return get_order_details(tool_input["order_id"])
    elif tool_name == "get_refund_status":
        return get_refund_status(tool_input["order_id"])
    elif tool_name == "initiate_refund":
        return initiate_refund(tool_input["order_id"], tool_input["reason"])
    else:
        return {"error": f"Unknown tool: {tool_name}"}
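As the tool list grows, that if/elif chain gets unwieldy. One option is a dict-based registry; here's a sketch with stand-in handlers (the lambdas replace the real client calls from above):

```python
from typing import Any, Callable

# Stand-in handlers; in the real agent these would hit OrderClient/RefundClient.
TOOL_REGISTRY: dict[str, Callable[[dict], Any]] = {
    "get_order_details": lambda i: {"order_id": i["order_id"], "status": "DELIVERED"},
    "get_refund_status": lambda i: {"order_id": i["order_id"], "refund_status": "PENDING"},
}

def execute_tool(tool_name: str, tool_input: dict) -> Any:
    """Dispatch via lookup instead of if/elif; unknown tools return an error dict."""
    handler = TOOL_REGISTRY.get(tool_name)
    if handler is None:
        return {"error": f"Unknown tool: {tool_name}"}
    return handler(tool_input)
```

Registering a new tool is then a one-line change, and `execute_tool` never grows.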

Step 3 — The Agentic Loop

This is the core pattern. It's just a loop — keep calling Claude until it stops asking for tools:

def run_support_agent(user_message: str) -> str:
    """
    The agentic loop:
    1. Send message to Claude with available tools
    2. If Claude wants to call a tool → execute it → send result back
    3. Repeat until stop_reason is anything other than "tool_use"
       (end_turn = final answer, max_tokens = truncation, etc.)
    """
    messages = [{"role": "user", "content": user_message}]

    system_prompt = """You are a helpful customer support agent for an e-commerce platform.
    Use the available tools to fetch real data before answering.
    Be concise, empathetic, and always confirm before initiating any refund.
    Never make up order details — always use tools to fetch accurate information."""

    print(f"\n🧑 User: {user_message}")
    print("-" * 60)

    while True:
        response = client.messages.create(
            model="claude-opus-4-6",
            max_tokens=1024,
            system=system_prompt,
            tools=tools,
            messages=messages
        )

        # Any stop reason other than "tool_use" means Claude is done reasoning.
        # This cleanly handles end_turn, max_tokens, stop_sequence, etc.
        if response.stop_reason != "tool_use":
            final_text = next(
                (block.text for block in response.content if hasattr(block, "text")),
                "",  # fallback: no text block (e.g. truncated at max_tokens)
            )
            print(f"\n🤖 Agent: {final_text}")
            return final_text

        # Claude wants to call tools — execute them and feed results back
        messages.append({"role": "assistant", "content": response.content})

        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                print(f"🔧 Calling tool: {block.name}({block.input})")
                result = execute_tool(block.name, block.input)
                print(f"   ↳ Result: {result}")

                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": json.dumps(result)
                })

        messages.append({"role": "user", "content": tool_results})

Step 4 — Run It

if __name__ == "__main__":
    # Test 1: Order status query
    run_support_agent("Hi, what's the status of my order ORD-12345?")

    # Test 2: Refund request
    run_support_agent("I want a refund for order ORD-12345. The shoes don't fit.")

    # Test 3: Multi-tool chain — Claude calls two tools in one go
    run_support_agent("Can you check order ORD-12345 and also tell me if my refund is on the way?")

Sample Output: Test 3

🧑 User: Can you check order ORD-12345 and also tell me if my refund is on the way?
------------------------------------------------------------
🔧 Calling tool: get_order_details({'order_id': 'ORD-12345'})
   ↳ Result: {'order_id': 'ORD-12345', 'status': 'DELIVERED', ...}
🔧 Calling tool: get_refund_status({'order_id': 'ORD-12345'})
   ↳ Result: {'refund_status': 'PENDING', 'amount': 142.99, ...}

🤖 Agent: Your order ORD-12345 for Nike Air Max and Running Socks was delivered on January 10th.
Regarding your refund — it's currently pending for $142.99, initiated on January 12th,
and you can expect it by January 17th. Let me know if you need anything else!

Claude autonomously decided to call two tools in the same turn because the question warranted it. You wrote zero routing logic for that.


Key Design Principles

Before you go build everything as an agent, here's what to internalize:

1. Tool Descriptions Are Your Decision Logic

The description field is not documentation — it's behavioral guidance. Claude reads it to decide when to invoke the tool. Be precise.

# ❌ Vague — Claude doesn't know when to use this
"description": "Gets order info"

# ✅ Behavioral and specific — tells Claude WHEN to use it
"description": (
    "Fetches live order details including status, items, and delivery date. "
    "Use this whenever the user asks about a specific order or its current state. "
    "Requires a valid order ID."
)

2. You Own the Execution Boundary

Claude never calls your functions directly. It emits a structured tool_use block with the function name and typed arguments. You execute it. This is the trust boundary.
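That boundary is also the right place to validate arguments before executing anything. A minimal sketch that checks a `tool_input` against the schema's required fields and string types (a production system might use a full JSON Schema validator instead):

```python
def validate_tool_input(schema: dict, tool_input: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the input is acceptable."""
    errors = []
    # Every required field must be present.
    for field in schema.get("required", []):
        if field not in tool_input:
            errors.append(f"missing required field: {field}")
    # Fields declared as strings must actually be strings.
    for field, spec in schema.get("properties", {}).items():
        if field in tool_input and spec.get("type") == "string":
            if not isinstance(tool_input[field], str):
                errors.append(f"field {field!r} must be a string")
    return errors

schema = {
    "type": "object",
    "properties": {"order_id": {"type": "string"}},
    "required": ["order_id"],
}
print(validate_tool_input(schema, {}))                     # → ['missing required field: order_id']
print(validate_tool_input(schema, {"order_id": "ORD-1"}))  # → []
```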

3. The Loop Termination Contract

Your while loop must handle these stop_reason values:

stop_reason | Meaning                      | Your Action
------------|------------------------------|----------------------------------
end_turn    | Claude has a final answer    | Extract text, return to user
tool_use    | Claude wants to call tool(s) | Execute tools, append results
max_tokens  | Response was cut off         | Handle gracefully, retry or fail
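The same contract as a small helper (note: `TruncatedResponse` is our own exception for illustration, not an SDK type):

```python
class TruncatedResponse(Exception):
    """Raised when the model ran out of tokens mid-answer."""

def next_action(stop_reason: str) -> str:
    """Map the API's stop_reason to what the agent loop should do next."""
    if stop_reason == "tool_use":
        return "execute_tools"
    if stop_reason == "max_tokens":
        raise TruncatedResponse("response truncated; retry with a larger max_tokens or fail")
    # end_turn, stop_sequence, etc.: Claude is done reasoning.
    return "return_answer"
```

Treating every non-tool_use stop reason as "done" works for a first version, but surfacing max_tokens explicitly keeps truncated answers from silently reaching users.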

4. Message History Is Your Agent's Working Memory

Every tool call and result gets appended to the messages list. Claude uses this full conversation context to reason across multiple steps.
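You can watch that memory grow across one tool round trip. A sketch with illustrative content (the `tool_use_id` and payloads are invented):

```python
# Turn 1: the user's question.
messages = [{"role": "user", "content": "What's the status of ORD-12345?"}]

# Turn 2: Claude's reply requesting a tool, appended verbatim as assistant content.
messages.append({"role": "assistant", "content": [
    {"type": "tool_use", "id": "toolu_1",
     "name": "get_order_details", "input": {"order_id": "ORD-12345"}},
]})

# Turn 3: your tool result, sent back as a user message.
messages.append({"role": "user", "content": [
    {"type": "tool_result", "tool_use_id": "toolu_1",
     "content": '{"status": "DELIVERED"}'},
]})

# The next API call sends all three entries, so Claude reasons with full context.
print(len(messages))  # → 3
```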

5. Parallel vs Sequential Tool Calls

Claude can call multiple tools in a single turn when they're independent (like fetching order details and refund status simultaneously). This reduces round trips significantly. Design your tools to be stateless and parallelizable where possible — it makes the agent faster.
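If your handlers are I/O-bound, you can execute the tool_use blocks from a single turn concurrently on your side as well. A sketch using a thread pool (`execute_tool` here is a stand-in for the real dispatcher):

```python
import json
from concurrent.futures import ThreadPoolExecutor

def execute_tool(name: str, tool_input: dict) -> dict:
    # Stand-in: a real implementation would call your services.
    return {"tool": name, **tool_input}

def run_tools_concurrently(tool_blocks: list[dict]) -> list[dict]:
    """Run independent tool calls in parallel; pool.map preserves input order,
    so each result pairs back up with its tool_use_id."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = pool.map(lambda b: execute_tool(b["name"], b["input"]), tool_blocks)
        return [
            {"type": "tool_result", "tool_use_id": b["id"], "content": json.dumps(r)}
            for b, r in zip(tool_blocks, results)
        ]
```

This only pays off when the tools are genuinely independent; if one tool's output feeds another, let Claude sequence them across turns instead.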


When NOT to Use Agents

Agents aren't always the answer. Know when to reach for them:

Scenario                             | Use Agent? | Why
-------------------------------------|------------|-----------------------------------------------------
Single lookup + format response      | ❌ No      | Overkill — simple prompt + one tool call is enough
Dynamic multi-step investigation     | ✅ Yes     | Core strength
Sub-100ms latency requirement        | ❌ No      | Agents add multiple LLM round trips
Fully deterministic workflow         | ❌ No      | Use a state machine — don't introduce LLM variance
Complex reasoning with unknown steps | ✅ Yes     | Agents shine here

The Mental Model, Summarized

Traditional App:  User → Hardcoded Logic → DB → Response
                  (every decision path handwritten by you)

Agentic App:      User → Claude (reasons about what to do)
                              ↕  tools
                         Your Systems (DB, APIs, queues)
                              ↕  results
                         Claude (synthesizes final answer)
                              ↓
                         Response

The shift is: you define capabilities (tools), Claude defines the execution path.

This is powerful because you don't need to anticipate every question or decision tree in advance. Claude reasons through it dynamically — and as you add more tools, the agent naturally becomes more capable without a single line of routing logic change.


Key Takeaways

  • Tool Use turns Claude from a stateless chat box into an autonomous worker connected to your real systems
  • The agentic loop is simple: send → get tool request → execute → send result → repeat
  • You control execution — Claude plans, never acts directly. Use this boundary for security and observability
  • Tool descriptions are decision logic, not documentation — write them with behavioral precision
  • Design tools to be stateless and parallelizable — Claude will exploit parallelism automatically

Follow me for more practical engineering content.


Tags: #claude #anthropic #ai #backend #python #agentic #llm #softwaredevelopment #devto
