AI in Practice, No Fluff — Day 7/10
I use Supabase for a few of my projects, and I regularly ask my AI for help with configuration. At first, the answers kept being subtly wrong. Not hallucinated, just outdated. The API had changed, a config option had moved, or a default had been updated. The AI was confident and technically coherent, but it was working from training data that was six months behind the documentation.
Then I started asking it to look up the current docs before answering. One extra sentence in my prompt, and the answers became accurate. What changed was not the model; it was that the AI made a tool call: it searched the web, read the current documentation, and used that instead of its stale training data.
That is tool use. The AI reaches outside of itself to get information or take action it could not do from memory alone.
The loop
In the first series, we covered agents and MCP. Those posts explained what tools are and how they connect. This post goes one level deeper: how tool use actually works when you are building something.
The mechanism is a loop with four steps:
- You send a message to the AI, along with a list of tools it is allowed to use.
- The AI reads your message, decides it needs to use a tool, and responds with a tool request instead of a final answer. That request includes the tool name and the specific inputs it wants to pass.
- Your code executes the tool (checks the docs, queries the database, calls the API) and sends the result back to the AI.
- The AI reads the result and gives you its answer.
The important part is step 3. The AI never executes the tool itself. It requests, you execute, you return the result. The AI is making the decision about which tool to use and what inputs to pass, but your application controls what actually happens. That separation is the safety model. You decide what the AI can touch.
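Here is what that loop looks like in code. This is a minimal sketch using the Anthropic Python SDK; the tool definition, the model string, and the calendar lookup are placeholders you would swap for your own, and other providers' APIs differ in shape, but the request-execute-return cycle is the same.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# One read tool. The name, description, and schema are illustrative.
tools = [{
    "name": "get_calendar_events",
    "description": (
        "Returns all calendar events for a given date range. "
        "Use this before suggesting meeting times to avoid conflicts."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "start_date": {"type": "string", "description": "ISO date, e.g. 2025-06-02"},
            "end_date": {"type": "string", "description": "ISO date, e.g. 2025-06-06"},
        },
        "required": ["start_date", "end_date"],
    },
}]

def run_my_calendar_lookup(start_date: str, end_date: str) -> list[dict]:
    """Hypothetical stand-in for a real calendar API call."""
    return [{"title": "Team sync", "start": f"{start_date}T10:00", "end": f"{start_date}T11:00"}]

messages = [{"role": "user", "content": "Find me a free hour next Tuesday."}]

# Step 1: send the message along with the tools the model is allowed to use.
response = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder model name
    max_tokens=1024,
    tools=tools,
    messages=messages,
)

# Step 2: instead of a final answer, the model may respond with a tool request.
while response.stop_reason == "tool_use":
    tool_use = next(block for block in response.content if block.type == "tool_use")

    # Step 3: your code runs the tool. The model never executes anything itself.
    result = run_my_calendar_lookup(**tool_use.input)

    # Step 4: send the result back so the model can read it and answer.
    messages.append({"role": "assistant", "content": response.content})
    messages.append({
        "role": "user",
        "content": [{
            "type": "tool_result",
            "tool_use_id": tool_use.id,
            "content": str(result),
        }],
    })
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=tools,
        messages=messages,
    )

print(response.content[0].text)
```

Note the while loop: the AI can chain several tool requests before it settles on a final answer, and your code stays in control of every execution along the way.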
What a tool definition looks like
When you send tools to the API, each one is a JSON object with three parts: a name, a description, and an input schema that defines what parameters the tool accepts.
The name is what the AI uses to request the tool. The input schema describes the parameters using JSON Schema, the same format used for structured output. But the description is the piece that matters most, and it is the one most people put the least effort into.
The AI reads the description to decide whether this tool is relevant to the current request. A tool named check_calendar with a description of "Checks the calendar" gives the AI almost nothing to work with. A description of "Returns all calendar events for a given date range. Use this before suggesting meeting times to avoid conflicts" tells the AI exactly when to reach for it.
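To make that concrete, here is the same tool defined twice, once with the vague description and once with the specific one. The field names follow the format Anthropic's API expects; other providers wrap the schema a little differently, but the three parts are the same.

```python
# The same tool defined twice. Only the description changes, but the
# description is the main signal the model uses to decide when to call it.
weak_tool = {
    "name": "check_calendar",
    "description": "Checks the calendar",  # too vague for the model to act on
    "input_schema": {
        "type": "object",
        "properties": {
            "start_date": {"type": "string"},
            "end_date": {"type": "string"},
        },
        "required": ["start_date", "end_date"],
    },
}

strong_tool = {
    **weak_tool,
    "description": (
        "Returns all calendar events for a given date range. "
        "Use this before suggesting meeting times to avoid conflicts."
    ),
}
```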
Early in my exploration of MCP servers, I had a tool that searched a knowledge base, and the AI wasn't calling it. I couldn't figure out why: the name was clear, the schema was correct, and the tool worked perfectly when called manually. The description said "Searches the knowledge base." I changed it to "Searches internal documentation for answers to technical questions. Use this when the user asks about system behavior, configuration, or troubleshooting steps that would be in the docs." The AI started calling it immediately.
The description is not metadata. It is an instruction.
Common tool patterns
Most tools fall into a handful of categories:
Read tools retrieve information the AI does not have. Calendar lookups, database queries, file reads, API calls that return data. These are the most common and the safest, since they do not change anything.
Write tools create or modify something. Sending an email, creating a task, updating a record, writing a file. These need more careful thought about when the AI should be allowed to act autonomously versus asking for confirmation.
Search tools find relevant information from a larger set. Semantic search over documents, keyword search in a database, web search. The AI decides the query; you execute it and return results.
Compute tools perform calculations or transformations the AI would struggle to do reliably in text. Running code, performing math, converting formats, validating data.
All of these can work together. Give the AI a read tool for your database, a search tool for your documentation, and a write tool for creating support tickets, and it can handle a customer question end to end: search the docs, check the customer's account, and create a ticket if it cannot resolve the issue.
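Here is a sketch of what that tool list might look like, with made-up names and schemas:

```python
# A hypothetical tool list mixing the categories above: one read tool,
# one search tool, one write tool. Names and schemas are illustrative.
support_tools = [
    {
        # Read: fetch data the model does not have.
        "name": "get_customer_account",
        "description": (
            "Returns the account record (plan, status, recent invoices) for a "
            "given customer ID. Use this before answering billing or account questions."
        ),
        "input_schema": {
            "type": "object",
            "properties": {"customer_id": {"type": "string"}},
            "required": ["customer_id"],
        },
    },
    {
        # Search: the model chooses the query, your code runs it.
        "name": "search_docs",
        "description": (
            "Searches internal documentation for answers to technical questions "
            "about system behavior, configuration, or troubleshooting."
        ),
        "input_schema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
    {
        # Write: changes state, so decide whether it needs human confirmation.
        "name": "create_support_ticket",
        "description": (
            "Creates a support ticket. Use this only after checking the docs and "
            "the customer's account without resolving the issue."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "customer_id": {"type": "string"},
                "summary": {"type": "string"},
            },
            "required": ["customer_id", "summary"],
        },
    },
]
```

Nothing here tells the AI what order to use them in. It reads the descriptions and decides, which is why each one spells out when it should be used.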
Where tools get wired up
The tool-use loop works the same way regardless of where you set it up, but the setup itself varies:
The API directly. You define tools in your API request and handle the execution loop in your code. Most flexible, most work.
MCP servers. If you read the MCP post in the first series, this is where it connects. An MCP server wraps a tool (your calendar, your file system, a database) in the standard protocol. AI tools that support MCP can discover and use these tools without custom code for each one.
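For a rough sense of what that wrapping looks like, here is a minimal sketch assuming the FastMCP helper from the official Python MCP SDK; the tool name and the stubbed search are illustrative.

```python
# Minimal MCP server sketch, assuming the FastMCP helper from the official
# Python MCP SDK. The tool name and the fake search result are illustrative.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("docs-search")

@mcp.tool()
def search_docs(query: str) -> str:
    """Searches internal documentation for answers to technical questions.
    Use this when the user asks about system behavior, configuration, or
    troubleshooting steps that would be in the docs."""
    # Hypothetical: swap in a real search backend here.
    return f"No results found for {query!r}."

if __name__ == "__main__":
    mcp.run()  # serves the tool (over stdio by default) so MCP clients can discover it
```

The docstring doubles as the tool's description, so the same advice about writing it carefully applies here too.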
Claude Desktop, ChatGPT, and other products. These wire up tools behind the scenes. When Claude reads a file or ChatGPT browses the web, they are using the same tool-use loop. You just do not see the wiring.
Agent frameworks and SDKs. Frameworks like the Claude Agent SDK, LangChain, or CrewAI manage the loop for you. You define tools, the framework handles the back-and-forth. Less control, faster setup.
When the AI ignores your tool
When the AI does not call a tool you defined, the fix is almost always the description.
The AI is making a judgment call about whether a tool is relevant to the current request, and it is making that call based on the description you wrote. If the description is vague, the AI will not know when to reach for it. If the description is specific about when and why to use the tool, the AI will call it reliably.
This is true across providers. I have seen the same pattern with Claude, with OpenAI's function calling, and with open-source models. The description is the decision-maker, and time spent rewriting it often solves problems that look like they need architectural changes.
Tomorrow: embeddings. How AI knows that two things mean the same thing, even when they use completely different words.