Tool calling is one of those LangChain features that looks simple on the surface — give the LLM a function, it decides when to call it, it returns the result. Most tutorials stop there.
But there's a more powerful way to use it. One that doesn't involve the LLM calling any function at all.
In IhuSale — my AI SaaS for Instagram vendors — I use tool calling not to fetch data, but to classify intent. The LLM reads the schemas, picks a tool, and returns structured data. My code does the routing and execution. The result is a classification system reliable enough to run in production.
Here's how it works.
The problem with plain text classification
The naive approach to intent classification is to ask the LLM a question and read the answer:
response = llm.invoke("What is the intent of this message: 'I want to buy rice'")
# The answer text varies between calls: "order", "ORDER", "place_order", "purchase intent", ...
The LLM is smart enough to understand the message. The problem is the output. You get unpredictable strings — different capitalizations, different phrasings, edge cases you didn't anticipate. Now you're writing brittle string matching logic to interpret the interpreter.
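To make the brittleness concrete, here is a rough sketch of the cleanup code the free-text approach tends to grow. The `normalize_intent` helper and its alias table are hypothetical and exist only for illustration; they are not taken from IhuSale.

```python
# Hypothetical alias table: every new phrasing the model invents needs a new entry.
INTENT_ALIASES = {
    "order": "order",
    "place_order": "order",
    "purchase intent": "order",
}


def normalize_intent(raw: str) -> str:
    """Map a free-text model answer onto a known intent, or 'unknown'."""
    key = raw.strip().strip(".\"'").lower()
    return INTENT_ALIASES.get(key, "unknown")


print(normalize_intent("ORDER"))            # handled by lowercasing
print(normalize_intent("purchase intent"))  # handled only because it was listed
print(normalize_intent("The customer wants to buy rice"))  # falls through
```

Any answer the table does not anticipate ends up as "unknown", even when the model understood the message correctly.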
Tool calling solves this cleanly.
The insight: define intents as tools
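Instead of asking the LLM "what is the intent?", you define each intent as a tool with a schema. With tool choice forced, the LLM has to pick one of the tools and fill in its fields. The intent comes back as one of a fixed set of tool names rather than a free-form string, and the extracted data comes back as named arguments: the structure is the output. The argument values are still model-generated, so validate them before you act on them.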
from langchain_core.tools import tool


@tool
def handle_order(product_name: str, quantity: int) -> dict:
    """Customer wants to place an order for a product."""
    pass


@tool
def handle_inquiry(question: str) -> dict:
    """Customer is asking a general question about a product."""
    pass


@tool
def handle_complaint(issue: str) -> dict:
    """Customer is reporting a problem with an order."""
    pass


@tool
def handle_greeting(message: str) -> dict:
    """Customer is sending a greeting or starting a conversation."""
    pass
The function bodies are empty. The LLM never executes them. It only sees the signatures and docstrings, which LangChain serializes into JSON schema and sends with the API request. Each tool serves two purposes: a schema guide for the LLM, and the source of the tool_name and tool_args you extract from the response.
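To get a feel for what "serialized into JSON schema" means, here is a simplified approximation written with the standard library only. It is not LangChain's actual implementation, and the exact payload differs between providers; `describe_tool` and `JSON_TYPES` are names made up for this sketch.

```python
import inspect
import json

# Rough mapping from Python annotations to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(func) -> dict:
    """Build a simplified tool description from a signature and docstring."""
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown annotations fall back to "string" in this sketch.
        properties[name] = {"type": JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def handle_order(product_name: str, quantity: int) -> dict:
    """Customer wants to place an order for a product."""


schema = describe_tool(handle_order)
print(json.dumps(schema, indent=2))
```

The point is that the docstring becomes the description the model reads, and the parameter names and types become the fields it has to fill in. That is why clear docstrings and precise type hints matter more than the function body.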
The pipeline: model → tools → classifier → handler
Think of it like this. You're a manager. You hire a smart assistant — the LLM — and hand them a sheet of paper listing four intents with descriptions. When a customer message comes in, the assistant reads it, picks the right bucket, and fills in the relevant details. They hand the sheet back to you. You run the code.
In practice, that's four components:
Models — define the shape of data per intent. Two kinds: intent models that describe what fields belong to which intent, and request/response models for the API layer.
Tools — teach the LLM what intents exist. When you call llm.bind_tools(ALL_TOOLS), LangChain serializes those function signatures and docstrings into JSON schema that gets sent to the model in every API request.
Classifier — chains the prompt and LLM together, calls it with the customer message, and reads the response. The response comes back with a tool_calls field — the LLM saying: "I picked handle_order and here are the args: {product_name: 'rice', quantity: 2}." The classifier pulls out tool_name and tool_args. It doesn't execute anything.
from langchain_anthropic import ChatAnthropic
from langchain_core.prompts import ChatPromptTemplate

ALL_TOOLS = [handle_order, handle_inquiry, handle_complaint, handle_greeting]

llm = ChatAnthropic(model="claude-3-5-sonnet-20241022")
# tool_choice="any" forces the model to call one of the bound tools.
classifier_llm = llm.bind_tools(ALL_TOOLS, tool_choice="any")

prompt = ChatPromptTemplate.from_messages([
    ("system", "Classify the customer message and extract relevant data."),
    ("human", "{message}"),
])

chain = prompt | classifier_llm
response = chain.invoke({"message": "I want to buy 2 bags of rice"})

# Read the model's choice; nothing is executed here.
tool_name = response.tool_calls[0]["name"]
tool_args = response.tool_calls[0]["args"]
# tool_name = "handle_order"
# tool_args = {"product_name": "rice", "quantity": 2}
tool_choice="any" is the constraint that makes this work. It forces the LLM to always pick a tool, so it cannot return a plain text response. Without it, the model might occasionally just answer the question instead of classifying it.
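Indexing `tool_calls[0]` directly assumes the list is never empty. A slightly more defensive way to read the response might look like the sketch below. It works on a plain list of dicts shaped like the entries in `response.tool_calls`, so it runs without an API call; `extract_tool_call` is a hypothetical helper and the `id` value is made up.

```python
from typing import Any


def extract_tool_call(tool_calls: list[dict[str, Any]]) -> tuple[str, dict[str, Any]]:
    """Return (tool_name, tool_args) from the first tool call, or raise."""
    if not tool_calls:
        # Should not happen with a forced tool choice, but fail loudly if it does.
        raise ValueError("model returned no tool call")
    first = tool_calls[0]
    return first["name"], first.get("args") or {}


# Stand-in for response.tool_calls.
sample = [
    {"name": "handle_order", "args": {"product_name": "rice", "quantity": 2}, "id": "call_1"}
]
tool_name, tool_args = extract_tool_call(sample)
print(tool_name, tool_args)
```

If the model returns more than one tool call, this sketch keeps only the first. Whether that is the right policy depends on the application.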
Handler — one function per intent. The classifier passes tool_args into the matching handler, and the handler constructs the final response — a suggested reply, an escalation flag, whatever the intent requires.
handlers = {
    "handle_order": handle_order_handler,
    "handle_inquiry": handle_inquiry_handler,
    "handle_complaint": handle_complaint_handler,
    "handle_greeting": handle_greeting_handler,
}

handler = handlers[tool_name]
result = handler(tool_args)
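The dictionary lookup raises a KeyError if the model ever returns a name that is not in the table, and the arguments themselves can be missing or badly typed. One way to harden the dispatch is sketched below. The handler bodies, the `fallback_handler`, and the escalation flag are assumptions made for illustration, not the IhuSale implementation.

```python
from typing import Any, Callable

Handler = Callable[[dict[str, Any]], dict[str, Any]]


def handle_order_handler(args: dict[str, Any]) -> dict[str, Any]:
    # Hypothetical handler body: check the extracted data before using it.
    quantity = int(args["quantity"])
    if quantity < 1:
        raise ValueError("quantity must be at least 1")
    return {"intent": "order", "product": args["product_name"], "quantity": quantity}


def fallback_handler(args: dict[str, Any]) -> dict[str, Any]:
    # Used when the tool name is unknown or the arguments are unusable.
    return {"intent": "unknown", "escalate": True}


HANDLERS: dict[str, Handler] = {"handle_order": handle_order_handler}


def dispatch(tool_name: str, tool_args: dict[str, Any]) -> dict[str, Any]:
    """Route to the matching handler, falling back instead of crashing."""
    handler = HANDLERS.get(tool_name, fallback_handler)
    try:
        return handler(tool_args)
    except (KeyError, TypeError, ValueError):
        return fallback_handler(tool_args)


print(dispatch("handle_order", {"product_name": "rice", "quantity": 2}))
print(dispatch("handle_refund", {}))
```

Routing failures to a fallback that flags the message for a human keeps a bad classification from turning into an unhandled exception.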
The full flow in five steps
- Receive customer message
- Ask LLM: which bucket does this fall into?
- LLM says: "Bucket 3 — handle_order — here's the extracted data"
- Run the handler for bucket 3
- Return response
The LLM never runs any Python function. It reads schemas, makes a decision, and returns a structured JSON object. Your code does the routing and execution.
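Because the classifier only returns a name and a dict, the whole flow can be exercised without calling a model. The sketch below passes the classifier in as a plain function, so a stub can replace the LLM in tests. `process_message`, `stub_classifier`, and the reply wording are all hypothetical.

```python
from typing import Any, Callable

ToolCall = tuple[str, dict[str, Any]]
Handler = Callable[[dict[str, Any]], dict[str, Any]]


def process_message(
    message: str,
    classify: Callable[[str], ToolCall],
    handlers: dict[str, Handler],
) -> dict[str, Any]:
    """Receive a message, classify it, run the matching handler, return its result."""
    tool_name, tool_args = classify(message)
    return handlers[tool_name](tool_args)


def stub_classifier(message: str) -> ToolCall:
    # Hypothetical stand-in for the LLM call, so routing can be tested offline.
    return "handle_order", {"product_name": "rice", "quantity": 2}


def order_handler(args: dict[str, Any]) -> dict[str, Any]:
    # Hypothetical handler; the reply wording is invented for this example.
    return {"reply": f"Got it: {args['quantity']} x {args['product_name']}."}


result = process_message(
    "I want to buy 2 bags of rice",
    stub_classifier,
    {"handle_order": order_handler},
)
print(result)
```

In production the `classify` argument would wrap the chain call and the tool call extraction; the routing code stays the same.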
This pattern is used in production not because tool calling is fancy, but because it makes the LLM predictable. And predictable is what you need when you're routing to handlers, building graphs in LangGraph, or processing Instagram DMs at scale.
Next: LangGraph — what happens when your AI needs to remember where it left off across multiple messages.