DEV Community

Nishant prakash


Stop stuffing tools into your agent 😤

There is a point in almost every agent project where the excitement starts to fade.

At first, it feels magical. You wire an LLM to a few Python functions, wrap them as tools, and suddenly your assistant can calculate, search, transform, and automate. But then the project grows. A few tools become ten. Ten become thirty. Business logic starts mixing with agent logic. Logging becomes messy. Reuse becomes painful. One agent needs the same tools as another, so you copy code. Then you copy it again. And somewhere in that process, your “smart system” quietly turns into a pile of tightly coupled Python.

That is exactly where MCP starts to make sense.


Model Context Protocol (MCP) is an open standard for exposing tools, resources, and prompts to LLM applications in a structured way. The official docs describe it as a standardized way for AI apps to connect to external systems, and even compare it to a “USB-C port for AI applications.” (Model Context Protocol)

And once that clicks, a very important design shift becomes obvious:

Your tools do not have to live inside your agent code anymore.
You can build them once, run them as an MCP server, and let agents consume them cleanly from the outside. (LangChain Docs)

That is the idea this post is about.

I’ll show you how I built a tiny MCP server with math and dice tools, added logging to observe tool calls, and then plugged that server into a LangChain agent exposed via FastAPI. Along the way, the architecture changed from “my agent has tools” to something much cleaner:

my tools live in their own server, and my agent just uses them.


Why this matters more than it looks

Imagine a normal backend team.

One person owns business logic. Another owns APIs. Another owns observability. Now imagine if every API consumer copied that business logic into their own codebase. That would quickly turn into chaos.

That’s essentially what happens when we embed tools directly inside agent code.

The issue isn’t that it breaks; it’s that it doesn’t scale.

With MCP, things get cleaner:

  • the agent focuses on reasoning
  • the tool server handles execution
  • logging and latency stay observable
  • tools can be reused across agents
  • you can evolve tools without touching the agent

That separation is exactly what MCP brings to the table, and libraries like langchain-mcp-adapters make this integration seamless. (LangChain Docs)


So what is MCP, in plain English?

Let’s strip away the buzzwords.

When an LLM needs to do something real, like querying a database, calling an API, or running a workflow, we usually define those as tools inside the agent code.

MCP changes that idea:

What if tools were exposed through a standard protocol instead?

Now, the same tool server can be used by multiple agents, clients, or even frameworks. (FastMCP)

A simple way to think about it:

  • MCP server → where tools live
  • MCP client → connects to the server
  • Agent → decides when to use tools

It sounds like a small shift, but it’s the difference between a quick demo and a system you can actually scale.


Building a tiny MCP server (and making it observable)

To understand MCP, I didn’t start with anything complex.

I built a small server with just two kinds of tools:

  • basic math operations
  • a dice roll

Instead of pasting the full code here, you can check the clean version directly:

Simple MCP server: server_simple.py

At its core, it looks like this:

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("math-server")

@mcp.tool()
def add(a: float, b: float) -> float:
    return a + b

That’s all it takes to expose a function as a tool.
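Conceptually, `@mcp.tool()` registers the function in a name-to-callable map that the protocol then exposes over the wire. Here is a toy, stdlib-only illustration of that mental model (not the real `mcp` internals; `TOOLS` and `tool` are made up for illustration):

```python
# Toy illustration of what tool registration buys you -- NOT the real
# mcp internals, just the mental model: a name-to-callable registry
# that a remote client can invoke by name.
from typing import Callable

TOOLS: dict[str, Callable] = {}

def tool(func: Callable) -> Callable:
    """Register a function under its own name, roughly like @mcp.tool()."""
    TOOLS[func.__name__] = func
    return func

@tool
def add(a: float, b: float) -> float:
    return a + b

# An MCP call like "invoke tool 'add' with {a: 2, b: 3}" resolves to:
result = TOOLS["add"](a=2.0, b=3.0)
```

The point of the indirection: the caller only knows the tool's name and schema, never the Python object, which is what lets the same tools serve many clients.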

Now here’s the important part:

  • These tools are not inside your agent
  • They are running in a separate server

Running the MCP server

You can start the server using:

fastmcp run server.py --transport http --port 9090

This exposes your tools over HTTP at:

http://127.0.0.1:9090/mcp

Now any agent (or client) can connect to it.


The moment things start to feel real

At this point, everything worked.
But something was missing.

I could call tools…
but I couldn’t see what was happening inside them.

And that’s when logging becomes non-negotiable.


Adding logging (turning it into a real service)

Instead of rewriting everything, I added a simple decorator to log:

  • when a tool starts
  • what inputs it received
  • what it returned
  • how long it took

You can check the full version here:

Logged MCP server: server.py

The core idea looks like this:

import functools

def log_tool(func):
    @functools.wraps(func)  # keep the original name/signature so @mcp.tool() registers it correctly
    def wrapper(*args, **kwargs):
        print(f"START {func.__name__} {kwargs}")
        result = func(*args, **kwargs)
        print(f"END {func.__name__} -> {result}")
        return result
    return wrapper

And then:

import random

@mcp.tool()
@log_tool
def roll_dice(faces: int = 6) -> int:
    return random.randint(1, faces)
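The full server.py linked above goes a bit further than bare prints: the traces it produces carry a timestamp, a short per-call ID, and the duration in milliseconds. A stdlib-only sketch of what such a decorator can look like (my reconstruction, not the exact file):

```python
import functools
import logging
import time
import uuid

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s | %(levelname)s | %(message)s",
)
log = logging.getLogger("mcp-tools")

def log_tool(func):
    """Log start/end, arguments, result, and wall-clock duration of a tool call."""
    @functools.wraps(func)  # preserve name/signature for @mcp.tool() registration
    def wrapper(*args, **kwargs):
        call_id = uuid.uuid4().hex[:8]  # short ID to correlate START/END lines
        log.info("[%s] TOOL START | %s | args=%s kwargs=%s",
                 call_id, func.__name__, args, kwargs)
        t0 = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed_ms = (time.perf_counter() - t0) * 1000
        log.info("[%s] TOOL END   | %s | result=%s | %.2fms",
                 call_id, func.__name__, result, elapsed_ms)
        return result
    return wrapper

@log_tool
def add(a: float, b: float) -> float:
    return a + b
```

Because the decorator measures around the actual function call, the reported latency is the tool's own execution time, not the network round trip.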

What this gives you

Now when your agent calls a tool, you don’t just get a result,
you get visibility:

TOOL START | roll_dice | faces=20
TOOL END   | roll_dice | result=15

And this is where MCP starts to click.

Your “tool layer” is no longer hidden inside your agent.
It’s running as a separate, observable service.


Why this step matters

This small setup already gives you:

  • tools defined independently
  • a server that can be reused
  • logging and latency visibility
  • a clean boundary between reasoning and execution

And we haven’t even touched the agent yet.

That’s where things get interesting next.


Now, can I plug this server into an agent?

And the answer is yes.

LangChain now provides an MCP adapter library, langchain-mcp-adapters, which lets agents consume tools directly from MCP servers. Its MultiServerMCPClient can connect to one or more MCP servers, and by default it is stateless: each tool invocation opens a fresh MCP session, executes, and cleans up. (LangChain Docs)

That stateless behavior turned out to match my use case nicely.


The agent side: FastAPI + LangChain + MCP

Now comes the satisfying part.

Instead of embedding tools inside the agent, I made the agent connect to the MCP server over HTTP.

You can check the full working code here:

Agent (FastAPI + MCP): agent.py


What the agent really does

At a high level, the agent is surprisingly simple.

It:

  • connects to the MCP server
  • fetches available tools
  • initializes an LLM
  • lets the agent use those tools when needed

The key piece looks like this:

from langchain_mcp_adapters.client import MultiServerMCPClient

client = MultiServerMCPClient({
    "math": {
        "transport": "http",
        "url": "http://127.0.0.1:9090/mcp",
    }
})

tools = await client.get_tools()  # runs inside an async function

That’s it.

MCP tools → automatically become usable by the agent


Adding the agent on top

Then we plug those tools into a LangChain agent:

from langchain.agents import create_agent

agent = create_agent(
    model=llm,
    tools=tools,
    system_prompt="You are a helpful assistant that can use tools."
)

And expose it via FastAPI:

@app.post("/chat")
async def chat(request: ChatRequest):
    result = await agent.ainvoke({
        "messages": [
            {"role": "user", "content": request.query}
        ]
    })

    return {"response": result["messages"][-1].content}
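For completeness, the `ChatRequest` referenced by the endpoint is just a one-field schema. A plausible definition (the actual agent.py may name or shape it differently):

```python
# Hypothetical request schema matching the /chat endpoint above;
# the real model in agent.py may differ.
from pydantic import BaseModel

class ChatRequest(BaseModel):
    query: str
```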

And here’s the shift

Notice what’s missing.

There is no math logic here.
No add, no divide, no roll_dice.

The FastAPI app simply says:

  • here is my LLM
  • here is my MCP client
  • give me the tools
  • let the agent use them

What the full flow looks like

Let’s say the user sends:

Roll a 20 sided dice and then add 5 to it

The request travels like this:

  1. user hits the FastAPI /chat endpoint
  2. the agent receives the query
  3. the agent decides it needs a tool
  4. LangChain calls the MCP adapter
  5. the MCP adapter calls the MCP server over HTTP
  6. roll_dice(faces=20) executes
  7. result comes back
  8. the agent uses that numeric output in the next tool call
  9. add(a=15, b=5) executes
  10. final answer is returned to the user
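Stripped of the LLM and the protocol, the data flow in steps 6 through 9 is just two function calls chained together. A plain-Python stand-in, purely illustrative:

```python
import random

# Plain-Python stand-ins for the two MCP tools, to make the chaining
# concrete -- no LLM or server involved.
def roll_dice(faces: int = 6) -> int:
    return random.randint(1, faces)

def add(a: float, b: float) -> float:
    return a + b

roll = roll_dice(faces=20)         # step 6: the MCP server executes the tool
total = add(a=float(roll), b=5.0)  # step 9: the agent feeds the result into the next call
```

The agent's real job is deciding that this is the right sequence of calls; the server's job is only to execute each one.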

And because the server has logging, you can actually watch this happen.

Here is the kind of trace I saw:

2026-04-08 12:34:34,822 | INFO | [7cb8650b] TOOL START | roll_dice | args=() kwargs={'faces': 20}
2026-04-08 12:34:34,822 | INFO | [7cb8650b] TOOL END   | roll_dice | result=15 | 0.03ms

2026-04-08 12:34:36,231 | INFO | [149d5f59] TOOL START | add | args=() kwargs={'a': 15.0, 'b': 5.0}
2026-04-08 12:34:36,231 | INFO | [149d5f59] TOOL END   | add | result=20.0 | 0.03ms

This is one of those moments where the architecture suddenly feels real.

You are not just “calling functions from an LLM.”
You are watching an agent orchestrate a tool server.


The part that stayed with me

What stood out wasn’t the dice roll or the API.

It was the separation.

  • the MCP server owns the tools
  • the agent handles reasoning
  • the API manages interaction

Each piece has a clear role and can evolve independently.

MCP challenges the habit of packing everything into one place and instead gives you a cleaner way to separate intelligence from execution.

And once you see that, it’s hard not to think:
Why was all of this in one file to begin with?


What’s next

This is just the starting point.

In upcoming blogs, I’ll go deeper into:

  • making MCP tools more robust
  • improving reliability and performance
  • adding security and access control
  • and turning this into something closer to production-ready

Because building tools is one thing —
building safe, scalable, and reliable tool systems is where things get really interesting.
