Building a Production Customer Support Agent with Claude

Originally published at claudeguide.io/building-customer-support-agent-claude

Building a Production Customer Support Agent with Claude

A production Claude customer support agent routes queries to the right model, searches a knowledge base via tool use, creates tickets in your helpdesk, and hands off to a human when confidence drops — all in a single agentic loop. The minimum viable stack costs roughly $0.003 per resolved ticket using Claude Haiku 3.5 for tier-1 queries, rising to ~$0.015 for complex cases that require Sonnet. This guide covers every layer from system prompt to deployment. For foundational Agent SDK concepts, see the Claude Agent SDK Guide.

How a Production CS Agent Differs from a Demo

Most demo agents consist of a system prompt and a single messages.create() call. That works for showcasing intent; it fails in production for five reasons:

Layer	Demo gap	Production requirement
Intent detection	None — Claude guesses	Explicit classifier routes query type
Knowledge retrieval	Claude's training data	Tool call to live KB / vector store
Ticket lifecycle	Not modeled	Tool calls to create, update, escalate
Escalation	Never happens	Explicit triggers: low confidence, emotion, policy edge cases
Cost	Single model for everything	Model routing: Haiku → Sonnet → human

A production agent is a stateful loop, not a one-shot prompt. Each turn may call tools, update internal state, and decide whether to continue or exit to a human queue.

Step 1: System Prompt Design

The system prompt for a CS agent does four things: sets the persona, defines scope, lists escalation triggers, and constrains output format. Keep it under 800 tokens — it goes into every request.

You are Aria, a customer support agent for Acme SaaS. You help customers with:
- Account and billing questions
- Product feature usage
- Bug reports and status checks

You are NOT authorized to:
- Approve refunds over $200
- Modify subscription tiers without manager approval
- Access data from other customer accounts

ESCALATION: Transfer to a human agent if any of these are true:
- Customer expresses frustration two or more times in the conversation
- The query requires a policy exception
- Your confidence in the correct answer is low (say "I'm not sure")
- Customer explicitly requests a human

TOOLS: Always search the knowledge base before answering factual product questions.
Create a ticket for every interaction that doesn't resolve in one turn.

TONE: Concise, professional, empathetic. No marketing language.

Why explicit scope limits matter: Claude will try to be helpful by default. Without explicit "you are NOT authorized to" statements, it may offer to do things your backend doesn't support — or worse, things that violate policy.

Step 2: Tool Use Setup

A CS agent needs at minimum three tools: knowledge base lookup, ticket creation, and order/account status. Here is the complete tool schema for all three:

import anthropic
import json

client = anthropic.Anthropic()

CS_TOOLS = [
    {
        "name": "search_knowledge_base",
        "description": (
            "Search the product knowledge base for articles, FAQs, and "
            "troubleshooting guides. Use this before answering any product "
            "or policy question. Returns ranked articles with text excerpts."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Natural language search query"
                },
                "max_results": {
                    "type": "integer",
                    "description": "Number of results to return (default 3)",
                    "default": 3
                }
            },
            "required": ["query"]
        }
    },
    {
        "name": "create_support_ticket",
        "description": (
            "Create a support ticket in the helpdesk system. Call this when "
            "the issue cannot be resolved in one turn, or when the customer "
            "needs follow-up from the team."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "subject": {"type": "string"},
                "description": {"type": "string"},
                "priority": {
                    "type": "string",
                    "enum": ["low", "normal", "high", "urgent"]
                },
                "customer_email": {"type": "string"}
            },
            "required": ["subject", "description", "priority", "customer_email"]
        }
    },
    {
        "name": "get_order_status",
        "description": (
            "Look up the status of a customer order or subscription. "
            "Returns order details, current status, and last updated timestamp."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "order_id": {"type": "string"},
                "customer_email": {"type": "string"}
            },
            "required": ["customer_email"]
        }
    }
]

Tool description quality is critical. Anthropic's documentation notes that Claude uses the description field as the primary signal for deciding when to call a tool. Vague descriptions ("search for things") produce unreliable behavior. Be explicit about what the tool returns, when to use it, and what inputs it expects.

Step 3: Escalation Logic

Escalation is not just a system prompt instruction — it needs to be enforced programmatically. Claude may ignore its own instructions under adversarial input or long context drift. The safest pattern: Claude signals escalation intent via a structured tool call, and your orchestration layer acts on it.

Add an escalation tool to your tool list:

ESCALATION_TOOL = {
    "name": "escalate_to_human",
    "description": (
        "Transfer this conversation to a human support agent. Use when: "
        "customer is frustrated (2+ expressions), policy exception needed, "
        "confidence is low, or customer requests human. "
        "This immediately ends the AI turn."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "reason": {
                "type": "string",
                "enum": [
                    "customer_request",
                    "policy_exception",
                    "low_confidence",
                    "repeated_frustration",
                    "complex_technical"
                ]
            },
            "summary": {
                "type": "string",
                "description": "Brief summary of the issue for the human agent"
            }
        },
        "required": ["reason", "summary"]
    }
}

In your orchestration loop, when you detect escalate_to_human in a tool call, stop the loop and push the conversation to your human queue with the summary attached. The human agent gets context without reading the full transcript.

Step 4: Cost Controls via Model Routing

The biggest CS agent cost mistake is routing every query to Sonnet. Haiku 3.5 handles ~70% of tier-1 support volume (password resets, billing lookups, feature how-tos) at one-tenth the price. Route up:


python
def classify_query_complexity(query: str) -

Complete, runnable Python and TypeScript code throughout.

[→ Get Agent SDK Cookbook — $49](https://shoutfirst.gumroad.com/l/ogxhmy?utm_source=claudeguide&utm_medium=article&utm_campaign=building-customer-support-agent-claude)

*30-day money-back guarantee. Instant download.*