Mustafa ERBAY

Posted on Jun 14 • Originally published at mustafaerbay.com.tr

Write Your Own MCP Server in 50 Lines: Real Tools for Your AI Agent

#api #ai

AI agents are one of the areas that have excited me the most recently. They've gained complex reasoning abilities with language models, but often struggle to interact with the real world. They can only "think," not "do." This is precisely where providing agents with real "tools" fundamentally changes their capabilities.

For me, this need arose when AI-powered planning agents in a manufacturing ERP needed to read real stock data or create orders, or when my side product's financial calculators needed to analyze complex data. In this post, I'll explain how I brought my own Minimal Capable Proxy (MCP) tool server to life in 50 lines, enabling an AI agent to connect to real-world tools, and the challenges I faced during this process. My goal is to show a practical and direct way to make your agents not just talk, but also take action.

The Bridge Between AI Agents and the Real World: Why a Tool Server?

Large Language Models (LLMs) are incredibly powerful; they excel in many areas like text understanding, summarization, and code generation. However, by their nature, they are isolated systems: on their own they have no internet access, can't connect to databases, and can't call external APIs. When an agent decides it should "check stock level," the model behind it has no way to carry out that request by itself. This is where we need a "tool" mechanism that allows the agent to interact with the outside world.

While developing an ERP for a manufacturing company, our AI planning agent needed to query the real-time production line status or create a new production order. Simply giving the agent the command "create production order" wasn't enough; there had to be an API endpoint behind this command that could be called with specific parameters. This planted the first seeds of my tool server idea. Instead of a monolithic agent, I chose to set up a server that housed independent and focused tools. This approach increases the scalability of the tools while also providing significant advantages in terms of security and fault isolation.

ℹ️ Monolithic Agent vs. Modular Tool Server

A monolithic agent contains all tool logic within itself. While this might seem easier initially, as the number of tools increases, complexity, maintenance, and security risks rise. A modular tool server, on the other hand, presents each tool as a separate API endpoint, providing a cleaner, more scalable, and manageable architecture. In my own experience, I found the modular structure to be much more sustainable in the long run.

This tool server acts as a bridge between the agent and the real world. When an agent decides which tool it needs to accomplish a task, it sends a request to this server. The server processes the request, runs the relevant tool, and returns the result to the agent. This way, our agents transform from mere conversational entities into action-taking ones. For me, this was a critical step in integrating AI into real operations.

Tool Server Architecture: My Approach Focused on Simplicity and Security

When designing a tool server, my core philosophy has always been simplicity, security, and observability. It's important that the tools an agent uses don't require complex infrastructure, can be developed quickly, and deployed easily. That's why I avoided heavy orchestration solutions for a start and focused on a lightweight web framework.

My preference is usually Python and FastAPI. FastAPI offers high performance while providing automatic documentation via OpenAPI schemas, making it easier for agents to understand the tools. The basic architecture consists of a simple HTTP API that handles incoming requests, determines which tool to call, executes that tool, and returns the result. For the initial MVP, I didn't need a database or complex message queue integration. Direct API calls sufficed.

Security can never be overlooked in such a server. Since agents will often call tools that access sensitive systems, it's crucial to verify who made each call and whether they are authorized, and to prevent malicious use. For my internal tools, I used JWT/OAuth2 token patterns. I ensured that the tokens given to the agent contained specific authorizations for which tools they could use. Additionally, setting up a basic rate limiting mechanism on the tool server itself prevented systems from being overloaded if an agent entered an error state or an infinite loop.

⚠️ Security Is Always a Priority

Letting AI agents call tools also brings potential security risks. Properly authorizing each tool endpoint, validating incoming requests, and avoiding unnecessary privileges are vital. I made sure each tool operated according to the principle of least privilege. For example, a stock query tool should never have the authority to update stock.

Furthermore, when tools had the potential for long-running operations, I opted for asynchronous execution. We don't want an agent calling a tool to wait indefinitely. In such cases, the tool server can start the tool asynchronously, return an operation ID to the agent, and the agent can then query the status of the operation with that ID. This improves the user experience while also allowing us to use server resources more efficiently.

Basic Tool Server Setup with FastAPI: A 50-Line Start

Now for the most practical part: how to write a basic FastAPI tool server in 50 lines. My goal is to create a minimal structure that gives the agent the ability to interact with the outside world. In this example, we'll create a simple "stock query" tool. The agent will be able to call this tool to find out the current stock quantity of a specific product.

First, let's install the necessary libraries: pip install fastapi uvicorn. Then, we can save the following Python code as main.py:

# main.py
from fastapi import FastAPI, HTTPException, Depends, status
from pydantic import BaseModel
from typing import Dict, Any
import uvicorn
import os

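# A very simple token verification function.
# In a real-world scenario, I would use JWT/OAuth2.
# Because `token` is a plain str parameter, FastAPI reads it from the query
# string, so callers use e.g. POST /tools/query_stock?token=<key>.
# The "super_secret_key" fallback is for local testing only; set AGENT_API_KEY.
def verify_token(token: str):
    if token != os.getenv("AGENT_API_KEY", "super_secret_key"):
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Unauthorized access to tools",
        )
    return True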

app = FastAPI(
    title="AI Agent Tool Server",
    description="Server hosting essential tools for AI Agents.",
    version="0.1.0",
)

# Let's define a simple data model
class StockQuery(BaseModel):
    product_id: str

# A simple stock simulation, like in my side product's ERP
# In a real-world scenario, this would be a database or external ERP API call.
mock_stock_data = {
    "PRD001": {"name": "Laptop", "quantity": 150, "location": "Warehouse A"},
    "PRD002": {"name": "Mouse", "quantity": 300, "location": "Warehouse B"},
    "PRD003": {"name": "Keyboard", "quantity": 75, "location": "Warehouse A"},
}

@app.get("/")
async def root():
    return {"message": "AI Agent Tool Server is running. Access /docs for API details."}

@app.post("/tools/query_stock")
async def query_stock_tool(
    query: StockQuery,
    _ = Depends(verify_token) # Token verification on every call
) -> Dict[str, Any]:
    """
    Queries the stock status of a specific product.
    """
    product_id = query.product_id
    if product_id in mock_stock_data:
        stock_info = mock_stock_data[product_id]
        print(f"Agent requested stock for {product_id}: {stock_info}")
        return {
            "tool_name": "query_stock",
            "status": "success",
            "data": stock_info,
            "message": f"Stock information for {product_id} successfully retrieved."
        }
    else:
        print(f"Agent requested stock for unknown product {product_id}")
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail=f"Product not found: {product_id}"
        )

# To run the server
if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)

This code snippet creates a FastAPI server with token verification (verify_token) and a basic stock query tool (query_stock_tool). The mock_stock_data dictionary stands in for a real database or external ERP API call. Authentication is a single shared key read from the AGENT_API_KEY environment variable; because verify_token takes a plain token argument, FastAPI expects the key as a token query parameter on each request. To run the server, type python main.py in your terminal. With roughly 50 lines of code, not counting comments and blank lines, a token-protected tool endpoint for your AI agent is ready. I got this code up and running and connected to my agent in 2 minutes.

On the agent side, I typically define this tool using a function_calling mechanism or a manually created OpenAPI schema. For example, I might give the agent a definition like this:

{
  "name": "query_stock",
  "description": "Queries the current stock quantity and location of a product.",
  "parameters": {
    "type": "object",
    "properties": {
      "product_id": {
        "type": "string",
        "description": "The unique ID of the product to be queried."
      }
    },
    "required": ["product_id"]
  }
}

The agent uses this definition to learn when and with what parameters to call the query_stock tool. Then, it sends an HTTP POST request to the tool server, including the token. Such a simple structure grants our AI agents incredible capabilities.
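As a rough illustration of that last step, the sketch below builds the HTTP POST an agent runtime might send once the model has chosen query_stock. It assumes the server from main.py is listening on http://localhost:8000 and that the key travels as a token query parameter, which is how FastAPI interprets the plain token argument of verify_token. The helper names build_tool_request and call_tool are hypothetical and not part of any agent framework.

```python
import json
import urllib.parse
import urllib.request


def build_tool_request(base_url, tool_name, arguments, token):
    """Build (but do not send) the POST request for one tool call."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{base_url}/tools/{tool_name}?{query}"
    body = json.dumps(arguments).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_tool(request, timeout=5.0):
    """Send the request and decode the JSON reply.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses,
    e.g. the 401 and 404 cases in main.py.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Arguments shaped the way the schema above asks the model to produce them.
request = build_tool_request(
    "http://localhost:8000",      # assumed local address of the tool server
    "query_stock",
    {"product_id": "PRD001"},
    "super_secret_key",           # the local-testing default from main.py
)
```

With the server running, call_tool(request) would return the JSON body produced by query_stock_tool. Keeping request construction separate from sending makes the call easy to inspect or log before it leaves the agent.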

Security and Observability: Must-Haves for a Tool Server

A tool server must not only be functional but also secure and observable. As with every system I develop, these two principles were critical for me here. Especially in situations where semi-autonomous systems like AI agents access real-world resources, security risks can multiply.

Authentication and authorization are the first and most important steps. In the example above, I used a single shared key from an environment variable, but in a production environment, I would definitely use JWT/OAuth2 tokens. The tokens received by agents should be scoped to allow them to call only specific tools with specific parameters. For example, a sales agent might only query stock, while a production agent should also have the authority to create production orders. Incorrect authorization can lead to the agent performing unexpected or undesirable actions. This reminded me of my earlier experience designing a DDoS mitigation layer, where verifying the source and intent of every incoming request was vital.
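To make the scoping idea concrete, here is a minimal sketch of a per-tool scope check. It assumes the token has already been verified and decoded elsewhere; the AgentToken class, the tool:<name> scope format, and the create_production_order tool name are illustrative assumptions, not taken from my production setup or from any JWT library.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentToken:
    """Hypothetical decoded token: who the agent is and which tools it may call."""
    agent_id: str
    scopes: frozenset = field(default_factory=frozenset)


class ToolNotAllowed(Exception):
    """Raised when a token does not carry the scope a tool requires."""


def require_scope(token, tool_name):
    """Allow the call only if the token lists the scope 'tool:<tool_name>'."""
    required = f"tool:{tool_name}"
    if required not in token.scopes:
        raise ToolNotAllowed(f"{token.agent_id} lacks scope {required}")


# A sales agent can only read stock; a production agent can also create orders.
sales_agent = AgentToken("sales-agent", frozenset({"tool:query_stock"}))
production_agent = AgentToken(
    "production-agent",
    frozenset({"tool:query_stock", "tool:create_production_order"}),
)
```

Calling require_scope(sales_agent, "create_production_order") raises ToolNotAllowed, while the same call with production_agent passes silently. In a FastAPI server, a check like this would typically live in a dependency that runs before the tool body.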

Rate limiting is indispensable for protecting the tool server from runaway agents or malicious use. I once saw an agent get into an infinite loop, calling the same tool 500 times per second. This caused unnecessary load on my backend systems and even service outages. I immediately had to set up a fail2ban-like mechanism; I temporarily blocked the IPs or tokens of agents sending too many requests within a certain timeframe. This can be implemented at the Nginx reverse proxy layer or directly within FastAPI as middleware.

💡 Fail2ban-like Protections

Use rate limiting or fail2ban-like tools to protect your systems in case an AI agent unexpectedly sends too many requests. This guards against not only malicious attacks but also erroneous agent behavior. In my own tool server, I used a Limiter-style rate-limiting middleware to cap requests at a certain number per second. Note that FastAPI does not ship a rate limiter of its own; this kind of middleware comes from a third-party add-on such as slowapi.
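The core of such a limiter fits in a few lines. The sketch below is a generic sliding-window counter keyed by agent token, written with the standard library only; it is not the middleware I ran in production, and the class name and parameters are assumptions chosen for illustration.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_calls per token within any window_seconds span."""

    def __init__(self, max_calls, window_seconds, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock                 # injectable so tests can control time
        self._calls = defaultdict(deque)   # token -> timestamps of accepted calls

    def allow(self, token):
        now = self.clock()
        calls = self._calls[token]
        # Forget calls that have slid out of the window.
        while calls and now - calls[0] >= self.window_seconds:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return False                   # caller should answer with HTTP 429
        calls.append(now)
        return True


# Example policy: at most 10 tool calls per second for each agent token.
limiter = SlidingWindowLimiter(max_calls=10, window_seconds=1.0)
```

A runaway agent sending 500 calls per second would see all but the first 10 in each window rejected, while other tokens stay unaffected. A fail2ban-style temporary block could be layered on top by remembering tokens that hit the limit repeatedly.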

Observability, on the other hand, is essential for understanding what agents are doing, which tools they use, how often, and how successful those tools are. I collect tool server logs in a central location with journald integration. For each tool call, I log which agent made the call, which tool was called, the input parameters, and the returned result. I also collect Prometheus metrics to monitor how long each tool takes and whether any errors occurred. This allows me to quickly detect anomalies in agent behavior or regressions in tool performance. For example, when I noticed on April 28th that a tool's average response time had increased from 50ms to 500ms, I immediately checked the logs to find the root cause of the problem.
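One way to capture those fields consistently is a small decorator around each tool function. The sketch below uses only the standard logging module and emits one JSON line per call; the logger name, the audited decorator, and the field names are assumptions for illustration, and it leaves out the Prometheus side entirely.

```python
import functools
import json
import logging
import time

logger = logging.getLogger("tool_server.audit")  # hypothetical logger name


def audited(tool_name):
    """Wrap a tool so every call emits one JSON log line, success or failure."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(agent_id, **params):
            started = time.perf_counter()
            entry = {"agent_id": agent_id, "tool": tool_name, "params": params}
            try:
                result = func(**params)
                entry.update(outcome="success", result=result)
                return result
            except Exception as exc:
                entry.update(outcome="error", error=str(exc))
                raise
            finally:
                elapsed_ms = (time.perf_counter() - started) * 1000
                entry["duration_ms"] = round(elapsed_ms, 3)
                logger.info(json.dumps(entry))
        return wrapper
    return decorator


@audited("query_stock")
def query_stock(product_id):
    stock = {"PRD001": 150}  # stand-in for the real lookup
    return {"product_id": product_id, "quantity": stock[product_id]}
```

Calling query_stock("planner-1", product_id="PRD001") returns the stock data and logs the agent, tool, parameters, result, and duration in one record. Because the line is JSON, a log collector can filter by tool or agent without custom parsing.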

Real-World Integrations and Challenges Faced

Getting the tool server up and running was one step, but the real challenge was integrating these tools with real-world systems. A tool isn't just a Python function; it might need to connect to a database, call an external API, or execute an operating system command. These integrations bring many technical challenges.

Database integrations are a common scenario I encountered. In a manufacturing ERP, the AI planning agent needed to approve a production order. This approval process involved writing to several tables in PostgreSQL. However, due to network delays, the approval sometimes failed, or the agent thought the approval had failed and called it again. This led to duplicate records or inconsistent data. In this scenario, I had to make the tool idempotent: sending the same request several times must leave the system in the same state as sending it once. I solved this problem by adding an idempotency_key to the APIs and checking this key in the database. The fix took me 2 days.

🔥 Idempotency Error and Data Inconsistency

Ensuring idempotency is critical in any scenario where there is a risk of calling a tool multiple times. Especially in situations like financial transactions, order creation, or production order placement, a non-idempotent tool can lead to serious data inconsistencies and operational problems.
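A compact way to see the idempotency_key idea is to let a unique constraint reject replays. The sketch below uses an in-memory SQLite database as a stand-in for PostgreSQL; the approvals table, its columns, and the function names are hypothetical and much simpler than a real ERP schema.

```python
import sqlite3


def open_approvals_db():
    """Create an in-memory database with one row per idempotency key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE approvals ("
        " idempotency_key TEXT PRIMARY KEY,"
        " order_id TEXT NOT NULL)"
    )
    return conn


def approve_order(conn, order_id, idempotency_key):
    """Approve once per key.

    Returns True if this call performed the approval and False if the
    same key was already processed (a retry or duplicate call).
    """
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "INSERT INTO approvals (idempotency_key, order_id) VALUES (?, ?)",
                (idempotency_key, order_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

If the agent retries approve_order with the same key after a timeout, the second call returns False and no duplicate row is written. The agent side has to generate the key once per logical operation and reuse it on every retry.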

Transaction management was another important issue. If a tool wrote data to multiple different systems, these operations needed to be atomic (all or nothing). For example, if an order creation tool recorded entries in both the order table in the ERP and the inventory table in the stock system, these two operations needed to succeed together or be rolled back together. Instead of distributed transactions, I generally adopted an eventual consistency model using the transactional outbox or event-sourcing patterns. This made the systems more flexible but required additional effort to ensure data consistency.
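The outbox idea can be sketched with a single local transaction: the business row and an event row are written together, and a separate step later delivers the event to the other system. The example below uses in-memory SQLite; the orders and outbox tables, the order_created event name, and the publish callback are illustrative assumptions rather than my actual schema.

```python
import json
import sqlite3


def open_orders_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (
            order_id TEXT PRIMARY KEY, product_id TEXT, quantity INTEGER);
        CREATE TABLE outbox (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            event_type TEXT, payload TEXT, published INTEGER DEFAULT 0);
        """
    )
    return conn


def create_order(conn, order_id, product_id, quantity):
    """Write the order and its outbox event in one transaction."""
    payload = json.dumps(
        {"order_id": order_id, "product_id": product_id, "quantity": quantity}
    )
    with conn:  # both inserts commit together or roll back together
        conn.execute(
            "INSERT INTO orders VALUES (?, ?, ?)", (order_id, product_id, quantity)
        )
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("order_created", payload),
        )


def publish_pending(conn, publish):
    """Deliver unpublished events in order; returns how many were sent."""
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, event_type, payload in rows:
        publish(event_type, json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

If the process crashes between publish and the UPDATE, the event is sent again on the next run, so the consumer (here, the stock system) still needs to handle duplicates. That is the price of trading a distributed transaction for eventual consistency.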

Finally, tools had the potential for long-running operations. For example, generating a large report or running a complex simulation could take minutes. It was impractical for the agent to wait this long. In such cases, I switched to an asynchronous processing model using a queue backed by Redis. The tool server would queue the request and immediately return an operation ID. The agent could then periodically query the status of the operation with this ID. This approach prevented the agent from being blocked and allowed the tool server to process more requests concurrently. The choice of Redis's maxmemory eviction policy and connection pool tuning were also important details I had to pay attention to during this process.
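The submit-then-poll contract can be shown without Redis at all. The sketch below replaces the queue with an in-process thread pool, so it only illustrates the operation ID flow, not durability or multi-worker scaling; OperationStore and generate_report are hypothetical names.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class OperationStore:
    """In-memory stand-in for the queue: runs jobs in threads, tracks them by ID."""

    def __init__(self, max_workers=4):
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        self._futures = {}

    def submit(self, func, *args, **kwargs):
        """Start the job and return immediately with an operation ID."""
        operation_id = uuid.uuid4().hex
        self._futures[operation_id] = self._executor.submit(func, *args, **kwargs)
        return operation_id

    def status(self, operation_id):
        """What the agent receives each time it polls with the ID."""
        future = self._futures.get(operation_id)
        if future is None:
            return {"status": "unknown"}
        if not future.done():
            return {"status": "running"}
        error = future.exception()
        if error is not None:
            return {"status": "failed", "error": str(error)}
        return {"status": "done", "result": future.result()}


def generate_report(rows):
    """Placeholder for a slow job such as a large report."""
    return {"rows": rows, "total": sum(rows)}
```

A tool endpoint would call store.submit(generate_report, rows) and return the ID, and a second endpoint would expose store.status(operation_id). With a Redis-backed queue the job state would live outside the process, so it would survive restarts and could be served by several workers.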

Next Steps and Thoughts on Agent Architecture

Building my own MCP tool server was a significant step towards giving my AI agents real-world capabilities. But this journey is just beginning. As I think about future steps and agent architecture, I'm focusing on a few key areas.

First, I want to make tool definitions more sophisticated. In the current structure, I define a separate endpoint and Pydantic model for each tool. However, with more dynamic tool loading mechanisms or metadata-driven tool definitions (e.g., defining tools via a JSON/YAML file), I can both speed up the development process and enable agents to discover a wider set of tools. This will be very useful, especially in my side product's Android spam application, where the AI needs dynamic tool integration to learn new spam patterns and automatically create blocking rules.
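As a sketch of what metadata-driven definitions could look like, the example below loads tool specs from a JSON manifest and binds each one to a Python handler. The manifest layout, the HANDLERS mapping, and the function names are assumptions for illustration; this is a direction I am considering, not something already running.

```python
import json

# Hypothetical manifest; in practice this would be read from a JSON/YAML file.
TOOL_MANIFEST = """
[
  {
    "name": "query_stock",
    "description": "Queries the current stock quantity and location of a product.",
    "required": ["product_id"]
  }
]
"""

# Handlers are plain functions looked up by tool name.
HANDLERS = {
    "query_stock": lambda product_id: {"product_id": product_id, "quantity": 150},
}


def load_registry(manifest_json, handlers):
    """Pair every tool spec in the manifest with its handler."""
    registry = {}
    for spec in json.loads(manifest_json):
        name = spec["name"]
        if name not in handlers:
            raise KeyError(f"No handler registered for tool {name!r}")
        registry[name] = {"spec": spec, "handler": handlers[name]}
    return registry


def call_tool(registry, name, arguments):
    """Validate required parameters from the spec, then run the handler."""
    entry = registry[name]
    missing = [p for p in entry["spec"]["required"] if p not in arguments]
    if missing:
        raise ValueError(f"Missing parameters for {name}: {missing}")
    return entry["handler"](**arguments)
```

With this shape, adding a tool means adding one manifest entry and one handler, and the same manifest can be handed to the agent as its list of available tools.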

Second, I want to work more on agent patterns and tool selection. I aim to enable the agent to make smarter decisions about when to use which tool by leveraging a Retrieval-Augmented Generation (RAG) architecture. When faced with a task, the agent will be able to query a vector database containing descriptions of available tools to select the most appropriate one. This will increase agent efficiency, especially in complex systems with hundreds of tools. I previously experienced a similar trade-off during a VPS migration process; the decision-making mechanism for which tool to use was critical for the success of the automation.

ℹ️ RAG and Tool Selection

RAG provides a powerful mechanism not only for information retrieval but also for AI agents to select the right tool at the right time. By using semantic matching between the agent's context and the descriptions of available tools, smarter and more relevant tool calls can be made.
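The selection step can be illustrated with a toy similarity search. The sketch below deliberately swaps real embeddings and a vector database for bag-of-words vectors and cosine similarity, so it only shows the shape of the lookup; the tool descriptions and function names are assumptions.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': lowercase word counts (a real system would use a model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


TOOL_DESCRIPTIONS = {
    "query_stock": "Queries the current stock quantity and location of a product.",
    "create_production_order": (
        "Creates a new production order for a product on the production line."
    ),
}


def select_tool(task, descriptions):
    """Return the tool whose description is most similar to the task text."""
    task_vector = embed(task)
    return max(
        descriptions,
        key=lambda name: cosine(task_vector, embed(descriptions[name])),
    )
```

For a task like "What is the stock quantity of product PRD001?" this picks query_stock because the description shares the most words with the request. With hundreds of tools, the same lookup would return the top few candidates, and only those definitions would be passed to the model.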

Finally, working with multiple LLM providers and fallback mechanisms will be an important part of my agent architecture. I don't want my agent to be dependent on a single LLM. It should be able to use different providers like Gemini Flash, Groq, Cerebras, or OpenRouter depending on the situation. The tool server can act as part of this multi-provider fallback logic; for example, automatically switching to another provider if a response cannot be obtained from one. This will increase the resilience and flexibility of my system.
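The fallback logic itself is simple to sketch once each provider is wrapped in a function with the same signature. The example below uses stub functions instead of real SDK calls, since the actual client code differs per provider; all names here are hypothetical.

```python
class AllProvidersFailed(Exception):
    """Raised when every provider in the chain has errored."""


def complete_with_fallback(prompt, providers):
    """Try providers in order; return (provider_name, response) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Stubs standing in for real provider clients.
def flaky_provider(prompt):
    raise TimeoutError("no response")


def healthy_provider(prompt):
    return f"echo: {prompt}"


PROVIDERS = [("primary", flaky_provider), ("secondary", healthy_provider)]
```

Here complete_with_fallback("hi", PROVIDERS) skips the failing primary and returns the secondary's answer along with its name, which is useful for logging which provider actually served the request.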

With these developments, my tool server will evolve from being just a tool proxy to forming the foundation of an orchestration layer where agents intelligently and dynamically interact with the real world.

Conclusion

Giving AI agents real-world capabilities dramatically increases their potential. By writing my own Minimal Capable Proxy (MCP) tool server in 50 lines, I enabled my agents not just to "think," but also to "do." This simple structure, quickly set up with FastAPI, established a solid foundation by addressing critical issues like security, observability, and real-world integration.

The most important lesson I learned during this process is that complex problems don't always require the most complex solutions. With a simple, modular, and focused approach, you can significantly expand the capabilities of your AI agents. This has greatly benefited me in many areas, from automating complex workflows in a manufacturing ERP to analyzing data in my side product's financial calculators.

As AI and agent architectures continue to evolve rapidly, such tool servers will remain an indispensable bridge for unlocking the true value of AI in business and our personal projects.

Next step: Designing an Agent Orchestration layer to better manage inter-tool dependencies and complex agent workflows.

Top comments (2)

𝐓𝐡𝐞 𝐋𝐚𝐳𝐲 𝐆𝐢𝐫𝐥 • Jun 14

Excellent article! What I really appreciate is how it strips away the complexity that often surrounds MCP (Model Context Protocol) and demonstrates that building a functional MCP server doesn't require hundreds of lines of code or a massive framework. The step-by-step approach makes the concept approachable even for developers who are just starting to explore AI agents and tool integration.

The most valuable takeaway for me was the emphasis on "real tools for real agents." Many tutorials focus on theoretical concepts, but this article shows how MCP can be used to connect AI systems with practical capabilities that extend beyond simple text generation. The 50-line implementation is especially impressive because it highlights the elegance of the protocol itself rather than hiding it behind abstractions.

I also liked how the article encourages experimentation. Understanding the fundamentals of MCP at this level gives developers the confidence to build custom tools, APIs, and workflows tailored to their own use cases. As AI agents become more capable, knowing how to expose tools through a lightweight MCP server will likely become an essential skill for modern developers.

Thanks for sharing such a clear, concise, and practical guide. Articles like this help bridge the gap between AI theory and real-world implementation, making advanced concepts accessible to a much wider audience. Looking forward to seeing more content around MCP architecture, security considerations, and scaling these implementations for production environments.

Mustafa ERBAY • Jun 14

Thank you so much for the thoughtful comment. 😊

That was exactly the point I wanted to emphasize: agents become truly useful when they can move beyond text generation and safely interact with real systems.

I also agree that MCP should not feel like something only large frameworks or complex agent platforms can use. At its core, the idea is simple: expose clear tools, define their inputs and outputs, secure the boundary, and let the agent call them when needed.

For me, the most important lesson was that tool design is not just an AI problem. It is also an API design, security, observability, and reliability problem. A 50-line server is enough to understand the foundation, but production usage quickly brings topics like authorization scopes, rate limiting, idempotency, retries, audit logs, and tool versioning into the picture.

That is probably where I want to take the next article: how to move from a minimal MCP-style tool server to a production-ready agent tool layer.

Thanks again for reading so carefully and for adding such a valuable perspective. 🙌