Robert Tidball

Posted on Jun 22

Your MCP Server Doesn't Need 40 Tools

#mcp #api #ai #architecture

MCP (Model Context Protocol) demos make it look like the win is exposing everything.

"Here are all my endpoints. Here are all my database tables. Here are all my internal actions. The agent can call anything now."

That feels powerful for about ten minutes.

Then the model calls the wrong tool, passes the right argument in the wrong shape, asks for a chart from a search endpoint, retries a destructive action, or returns an answer that sounds confident because the tool name was vague enough to mean three different things.

The problem is not MCP. The problem is treating MCP like a magic adapter for your backend.

An MCP server is not just "my API, but agent-accessible." It is a product surface for a very literal, very distractible user.

That user happens to be a language model.

The trap: one endpoint, one tool

The first instinct is obvious:

GET /users/:id        -> get_user
GET /users            -> list_users
GET /invoices/:id     -> get_invoice
GET /invoices         -> list_invoices
GET /events           -> list_events
POST /events/search   -> search_events
POST /reports         -> create_report
GET /reports/:id      -> get_report

This looks clean because it mirrors the API.

It is often bad for agents.

Humans can read docs, understand product context, and choose between similar routes. Models do not really "understand" your product. They pattern-match over names, descriptions, schemas, previous messages, and whatever context still fits in the window.

If you give the model 40 similar tools, you did not give it power. You gave it 40 ways to be almost right.

What changed in production

The lesson became obvious while building the MCP layer for FXMacroData.

The backend has normal API routes for calendar events, FX pairs, rates, COT (Commitments of Traders) positioning, commodities, bond yields, metadata, and docs. Mirroring every route into MCP would have looked comprehensive, but it would have made the agent choose between too many near-matches.

The useful MCP boundary was smaller:

one tool for the market summary;
one for the release calendar;
one for an FX pair snapshot;
one for chartable indicator history;
one for COT positioning;
one for commodities;
one for bond yields;
one for data coverage and metadata.

That is less impressive in a demo.

It is more useful in a real chat.

The agent no longer needs to know whether a question maps to an endpoint, a cached dashboard payload, or a normalized data table. It only needs to pick the user intent.
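Here is a minimal sketch of what that boundary can look like in code. It is not the FXMacroData implementation: the function names, the two backend sources, and the rates are all made up for illustration. The point is only that the tool takes an intent-level argument and the server decides where the data comes from.

```python
from typing import Optional

# Hypothetical backend sources with made-up values, for illustration only.
_CACHED_DASHBOARD = {"EURUSD": {"pair": "EURUSD", "rate": 1.085, "cache_key": "dash:eurusd"}}
_RATES_TABLE = {"USDJPY": 157.2}


def read_cached_snapshot(pair: str) -> Optional[dict]:
    """Pretend lookup in a cached dashboard payload."""
    return _CACHED_DASHBOARD.get(pair)


def query_rates_table(pair: str) -> Optional[dict]:
    """Pretend lookup in a normalized data table."""
    if pair not in _RATES_TABLE:
        return None
    return {"pair": pair, "rate": _RATES_TABLE[pair]}


def get_fx_pair_snapshot(pair: str) -> dict:
    """Intent-level tool: the caller names a pair, the server picks the source."""
    normalized = pair.replace("/", "").upper()
    for source in (read_cached_snapshot, query_rates_table):
        row = source(normalized)
        if row is not None:
            # Only user-facing fields leave the server; cache keys stay internal.
            return {"status": "ok", "pair": row["pair"], "rate": row["rate"]}
    return {
        "status": "no_results",
        "message": f"No snapshot is available for {normalized}.",
    }
```

The model sees one tool and one argument. Whether the answer came from a cache or a table never reaches it.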

A tool is a promise

An API route says:

If you send this request, the server will respond.

An MCP tool should say:

Use me for this exact job, with these exact inputs, and expect this kind of result.

That means a good tool description is not marketing copy. It is routing logic for the model.

Bad:

{
  "name": "get_data",
  "description": "Gets data from the API."
}

Better:

{
  "name": "lookup_release_calendar",
  "description": "Return scheduled economic release events for one currency and date range. Use this before answering questions about upcoming macro events."
}

Even if you do not care about finance, the pattern matters. The second tool tells the model when to use it, what it returns, and what kind of user question it supports.
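The same promise extends to the inputs. MCP tool definitions carry a JSON Schema under `inputSchema`, so a fuller version of the better example might look like the sketch below. The argument names (`currency`, `start_date`, `end_date`) are my assumption, and `check_arguments` is a tiny hand-rolled check for illustration, not a full JSON Schema validator.

```python
import re

LOOKUP_RELEASE_CALENDAR = {
    "name": "lookup_release_calendar",
    "description": (
        "Return scheduled economic release events for one currency and date range. "
        "Use this before answering questions about upcoming macro events."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            # Hypothetical arguments; narrow on purpose.
            "currency": {"type": "string", "pattern": "^[A-Z]{3}$"},
            "start_date": {"type": "string", "format": "date"},
            "end_date": {"type": "string", "format": "date"},
        },
        "required": ["currency", "start_date", "end_date"],
        "additionalProperties": False,
    },
}


def check_arguments(tool: dict, args: dict) -> list[str]:
    """Report missing, unknown, and pattern-violating arguments."""
    schema = tool["inputSchema"]
    problems = []
    for key in schema["required"]:
        if key not in args:
            problems.append(f"missing required argument: {key}")
    for key, value in args.items():
        spec = schema["properties"].get(key)
        if spec is None:
            problems.append(f"unknown argument: {key}")
        elif "pattern" in spec and not re.fullmatch(spec["pattern"], str(value)):
            problems.append(f"bad value for {key}: {value!r}")
    return problems
```

Returning the problem list to the model as a readable message tends to be more useful than a generic validation error, because it tells the model exactly which argument to fix.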

That is the bar.

Fewer tools, sharper edges

I would rather give an agent 8 boring tools than 45 clever ones.

The boring tools should map to user intent, not to your internal route structure.

| User intent | Better tool shape |
| --- | --- |
| "What is coming up?" | `lookup_calendar` |
| "What happened recently?" | `lookup_recent_events` |
| "Show me the chartable history" | `get_time_series` |
| "Explain this result" | not a tool; let the model write from tool output |
| "Export this" | `create_export` only if exports are a real product action |

The important part is that the model does not need to assemble your backend architecture in its head before making a useful call.

That is your job.

Names matter more than you want them to

Developers love compact names.

Models need boring names.

I used to like names like:

query
fetch
resolve
search
get_context

Those names are convenient for us and mushy for a model. They force the model to infer too much from the description.

Prefer names that carry product intent:

search_docs
lookup_account_status
get_time_series
list_recent_errors
create_support_summary
check_deployment_status

Longer is fine. Ambiguous is expensive.
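A naming rule like this is easy to lint. The sketch below is one rough heuristic, not a standard: it flags a name when it is a single word, or when everything after the verb is a filler noun. The `FILLER_NOUNS` list is my own guess at what counts as filler.

```python
# Hypothetical list of nouns that add no product intent to a tool name.
FILLER_NOUNS = {"data", "context", "action", "info", "item", "items", "stuff"}


def is_too_generic(name: str) -> bool:
    """Flag bare verbs and verb-plus-filler names such as get_data."""
    parts = [part for part in name.lower().split("_") if part]
    if len(parts) < 2:
        return True
    return all(part in FILLER_NOUNS for part in parts[1:])


def generic_names(names: list[str]) -> list[str]:
    """Return the subset of names that need a second look."""
    return [name for name in names if is_too_generic(name)]
```

It will miss plenty of bad names, but it catches the lazy ones before a reviewer has to.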

Do not return your whole database row

The response shape matters as much as the input schema.

If a tool returns a giant nested object, the model will happily use the wrong field. It may cite an internal note, confuse created_at with updated_at, or summarize an implementation detail that was never meant for users.

Return the smallest shape that supports the job.

{
  "status": "ok",
  "items": [
    {
      "title": "Release calendar",
      "date": "2026-06-22",
      "url": "https://example.com/calendar"
    }
  ],
  "next_action": "Use these rows to answer the user's calendar question."
}

That next_action field looks silly until you watch models behave better with it.

You are not only returning data. You are returning affordances.
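One way to enforce the smallest shape is an explicit allowlist between the internal row and the tool response. This is a sketch under assumptions: the internal row layout and the `PUBLIC_FIELDS` tuple are hypothetical, chosen to match the example response above.

```python
# Fields the model is allowed to see; everything else stays on the server.
PUBLIC_FIELDS = ("title", "date", "url")


def to_tool_item(row: dict) -> dict:
    """Copy only allowlisted fields out of an internal row."""
    return {field: row[field] for field in PUBLIC_FIELDS if field in row}


def build_calendar_response(rows: list[dict]) -> dict:
    """Wrap projected rows in the response shape used in this article."""
    return {
        "status": "ok",
        "items": [to_tool_item(row) for row in rows],
        "next_action": "Use these rows to answer the user's calendar question.",
    }
```

An allowlist fails safe. When someone adds a new internal column next month, it does not leak into the model's context by default.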

Build for failure, not the happy path

Most tool demos show the successful call.

Production quality comes from boring failure states.

What should the tool return when:

no records match;
the user asked for something outside their permissions;
the input is valid but too broad;
the upstream data is stale;
the operation is risky and needs confirmation;
the model picked the wrong tool?

Do not make the model infer all of that from an HTTP 500 or an empty array.

For example:

{
  "status": "no_results",
  "message": "No matching rows were found for that date range.",
  "next_action": "Ask the user whether they want to widen the date range."
}

This is not just nicer UX. It reduces hallucination pressure. The model has something accurate to say instead of filling the silence.
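A handler that covers a few of those states explicitly might look like the sketch below. The status names, the 31-day limit, and the `can_read` flag are assumptions for illustration, and it reuses the `next_action` field name from the earlier response example. Stale data and confirmation prompts would follow the same pattern with their own status values.

```python
from datetime import date

MAX_RANGE_DAYS = 31  # hypothetical limit; tune per tool


def failure(status: str, message: str, next_action: str) -> dict:
    """Build a failure the model can relay without guessing."""
    return {"status": status, "message": message, "next_action": next_action}


def lookup_events(rows: list[dict], start: date, end: date, *, can_read: bool = True) -> dict:
    """Return matching rows, or a named failure state instead of an empty array."""
    if not can_read:
        return failure(
            "forbidden",
            "This account cannot read calendar events.",
            "Tell the user they lack access. Do not retry.",
        )
    if (end - start).days > MAX_RANGE_DAYS:
        return failure(
            "too_broad",
            f"The date range is longer than {MAX_RANGE_DAYS} days.",
            "Ask the user for a narrower date range.",
        )
    matches = [row for row in rows if start <= row["date"] <= end]
    if not matches:
        return failure(
            "no_results",
            "No matching rows were found for that date range.",
            "Ask the user whether they want to widen the date range.",
        )
    return {
        "status": "ok",
        "items": [{"title": row["title"], "date": row["date"].isoformat()} for row in matches],
        "next_action": "Use these rows to answer the user's question.",
    }
```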

Your OpenAPI schema still matters

MCP does not replace API design.

If anything, it makes sloppy API design more visible.

A clean OpenAPI schema gives you a source of truth for types, descriptions, auth requirements, and examples. An MCP layer can then expose a smaller set of agent-friendly tools on top.

The stack I like looks like this:

Backend API
  -> documented OpenAPI schema
  -> small MCP tool layer
  -> agent instructions and examples
  -> smoke tests that call tools like a model would

The MCP server should not become a second undocumented API. That just creates two surfaces to debug.
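One way to keep the two surfaces in sync is to derive each tool's input schema from the OpenAPI operation it wraps, keeping only the parameters the agent should see. The sketch below assumes OpenAPI 3 style parameter objects (`name`, `description`, `required`, `schema`); the function name and the `keep` argument are my own.

```python
def input_schema_from_operation(operation: dict, keep: set[str]) -> dict:
    """Build a tool input schema from a subset of an operation's parameters."""
    properties: dict = {}
    required: list[str] = []
    for param in operation.get("parameters", []):
        name = param["name"]
        if name not in keep:
            continue  # pagination, debug flags, and similar stay out of the tool
        prop = dict(param.get("schema", {}))
        if "description" in param:
            prop["description"] = param["description"]
        properties[name] = prop
        if param.get("required", False):
            required.append(name)
    return {
        "type": "object",
        "properties": properties,
        "required": required,
        "additionalProperties": False,
    }
```

Types and descriptions then have a single source of truth, and the MCP layer only decides what to leave out.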

Test the tool list like a UI

If a button says "Delete draft", you review the label.

If an MCP tool says run_action, you should review that label too.

I like a simple test that dumps the tool list and asks human questions:

Can I tell when each tool should be used?
Do two tools sound like they do the same thing?
Is any tool name too generic?
Are destructive tools clearly marked?
Are required arguments obvious?
Does each tool return a shape the model can summarize safely?
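Two of those questions can be partly automated. The sketch below flags tools whose names start with a destructive verb but lack the MCP `destructiveHint` annotation, and pairs of tools whose descriptions share most of their words. The verb list and the 0.6 overlap threshold are arbitrary assumptions, and the word-overlap score is a crude stand-in for a human reading the list.

```python
import re
from itertools import combinations

DESTRUCTIVE_VERBS = {"delete", "remove", "drop", "purge", "cancel"}  # assumed list


def description_words(text: str) -> set[str]:
    """Lowercased alphabetic words in a description."""
    return set(re.findall(r"[a-z]+", text.lower()))


def audit_tools(tools: list[dict], overlap_threshold: float = 0.6) -> list[str]:
    """Return human-readable findings for a dumped tool list."""
    findings = []
    for tool in tools:
        verb = tool["name"].split("_")[0]
        hint = tool.get("annotations", {}).get("destructiveHint")
        if verb in DESTRUCTIVE_VERBS and hint is not True:
            findings.append(f"{tool['name']}: destructive verb without destructiveHint")
    for first, second in combinations(tools, 2):
        a = description_words(first["description"])
        b = description_words(second["description"])
        union = a | b
        overlap = len(a & b) / len(union) if union else 0.0
        if overlap >= overlap_threshold:
            findings.append(
                f"{first['name']} / {second['name']}: descriptions overlap ({overlap:.2f})"
            )
    return findings
```

The output is a starting point for review, not a verdict. The remaining questions still need a person.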

Then run actual tool calls with boring prompts:

What changed this week?
Show me recent errors.
Find docs about API keys.
Create a short summary for a new user.

If the model keeps choosing the wrong tool, do not just "improve the prompt." Fix the tool boundary.
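The boring prompts work well as a table-driven smoke test. In the sketch below, the expected tool names are hypothetical pairings I picked from names used earlier in this post, and `keyword_stub` is a stand-in for a real model call so the harness can run offline. In practice you would pass a function that sends the prompt and the tool list to your model and returns the name of the tool it chose.

```python
from typing import Callable

# (prompt, tool the model should pick); pairings are illustrative.
SMOKE_CASES = [
    ("What changed this week?", "lookup_recent_events"),
    ("Show me recent errors.", "list_recent_errors"),
    ("Find docs about API keys.", "search_docs"),
    ("Create a short summary for a new user.", "create_support_summary"),
]


def run_smoke(choose_tool: Callable[[str], str]) -> list[str]:
    """Return one failure message per prompt that routed to the wrong tool."""
    failures = []
    for prompt, expected in SMOKE_CASES:
        actual = choose_tool(prompt)
        if actual != expected:
            failures.append(f"{prompt!r}: expected {expected}, got {actual}")
    return failures


def keyword_stub(prompt: str) -> str:
    """Offline stand-in for a model; replace with a real tool-selection call."""
    text = prompt.lower()
    if "error" in text:
        return "list_recent_errors"
    if "docs" in text:
        return "search_docs"
    if "summary" in text:
        return "create_support_summary"
    return "lookup_recent_events"
```

A failure here is a signal about the tool list first and the prompt second.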

Where this came from

I have been thinking about this while working on the public API, OpenAPI, docs, and MCP surfaces around FXMacroData.

The domain is not the point. The lesson is.

Once you expose product data to agents, your API is no longer only for deterministic callers. It is also for a model that needs strong names, tight schemas, clear failures, and fewer choices than a human developer.

The better the boundary, the less "agentic" magic you need.

The checklist I use now

Before adding a new MCP tool, I ask:

Is this a real user intent, or just an endpoint I happen to have?
Could an existing tool answer the same question?
Is the name boring and specific?
Does the description say when to use it?
Are the arguments narrow enough?
Is the response shape smaller than the internal object?
Does it handle no-results and permission failures clearly?
Would I be comfortable with the model quoting the response directly?
Do I have at least one smoke prompt that should call this tool?

If the answer is fuzzy, I probably do not need a new tool yet.

Closing

MCP is useful because it gives agents a standard way to use external systems.

But standards do not save bad boundaries.

Your MCP server does not need to expose everything your backend can do. It needs to expose the few things an agent should do, with names and schemas that make the right call obvious.

The best agent tool is not the most powerful one.

It is the one the model cannot easily misunderstand.
