The story of how every developer eventually hits the same wall — and what finally fixes it
Let’s Start From the Very Beginning
Forget AI for a moment.
You have a MySQL database. Inside it lives your company’s data — customer records, orders, inventory, whatever. You want to work with that data. So what do you do?
You open a MySQL client. You connect directly. You run queries. Simple.
This works perfectly — as long as you’re the only one who needs the data.
But then your frontend team needs the same data. Your mobile app needs it. Your analytics dashboard needs it. Your reporting tool needs it.
Now everyone is connecting directly to MySQL. Credentials are scattered everywhere. If the database schema changes, every single client breaks. There’s no security layer, no rate limiting, no caching.
Direct database connections don’t scale.
The API Layer Enters the Picture
So the smart engineering move is: you put an API in the middle.
You write a Python (or Node.js) backend. It connects to MySQL. It exposes clean endpoints. Now nobody talks to the database directly — they talk to the API.
This is much better. One database connection. One place to apply security, validation, and business logic. If the database changes, you update the API once and all clients keep working.
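As a rough sketch of what this layer does, the snippet below routes a request path to a query function. It uses Python's built-in sqlite3 as a stand-in for MySQL so it runs anywhere; the table layout, the sample rows, the `/customers` route, and the helper names are all assumptions made for illustration.

```python
import json
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption, so the sketch runs anywhere).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO customers (name) VALUES (?)", [("Ada",), ("Linus",)])

def list_customers():
    """The only place that knows the schema; clients never see SQL."""
    rows = conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
    return [{"id": row[0], "name": row[1]} for row in rows]

# Hypothetical route table: one entry per endpoint the API exposes.
ROUTES = {"/customers": list_customers}

def handle_request(path):
    """Return (status_code, json_body) for a GET request to `path`."""
    handler = ROUTES.get(path)
    if handler is None:
        return 404, json.dumps({"error": "not found"})
    return 200, json.dumps(handler())

status, body = handle_request("/customers")
print(status, body)
```

If the schema changes, only `list_customers` has to change; every caller of `/customers` keeps working.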
This pattern is so fundamental that every developer learns it early in their career. The API becomes the single gateway between your data and the world.
Great. We’ve built solid engineering foundations. Now let’s bring AI into the picture.
Adding LLMs to the Mix
It’s 2024. Your company wants an AI assistant. You want it to answer questions about your data — the very data sitting in that MySQL database your API already serves.
So you think: let me connect my LLM to my API.
You start with one model. Let’s say Gemini.
You write a client file. It calls your Python API, gets data, formats it, and sends it to the Gemini API with your API key. The LLM reads the data and responds intelligently.
It works! Your AI assistant can now answer questions about your customer data.
You show it to the team. Everyone’s excited.
Then someone asks: “Can we also try Claude? I heard it’s better for reasoning.”
More LLMs, More Clients
Sure. You write another client file. This time for Claude.
# claude_client.py
import requests
import anthropic

# Fetch customer data from our own backend API
data = requests.get("http://localhost:8000/customers").json()
user_question = input("Your question: ")

client = anthropic.Anthropic(api_key="YOUR_CLAUDE_KEY")
response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": f"Here is our customer data: {data}\n\n{user_question}"
    }]
)
# Claude returns a list of content blocks; the first one holds the text
print(response.content[0].text)
Then the CTO says: “We should also benchmark against OpenAI.”
# openai_client.py
import requests
from openai import OpenAI

# Fetch customer data from our own backend API
data = requests.get("http://localhost:8000/customers").json()
user_question = input("Your question: ")

client = OpenAI(api_key="YOUR_OPENAI_KEY")
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": f"Here is our customer data: {data}\n\n{user_question}"
    }]
)
# OpenAI returns a list of choices; the first one holds the reply
print(response.choices[0].message.content)
Now you have three client files. Each one:
- Calls your Python API to fetch data
- Formats the data for that specific LLM
- Sends it to the LLM’s API with that model’s unique syntax
- Parses the response in that model’s unique response format
Three files. Manageable. You handle it.
Things are working. But then your API grows.
The Problem Starts Here
Your Python API started simple. One file. A few endpoints.
But over time, the team adds features. New data sources. New business logic. New modules.
Six months later, your backend isn’t one file anymore.
It’s 25 Python files.
Now here’s the question that should keep you up at night:
Your three LLM clients — Gemini, Claude, OpenAI — which of these 25 files do they know about?
Only the ones you hardcoded into them when you wrote them.
The customers.py endpoint, maybe orders.py. Whatever you thought to include back then.
But forecasts.py? campaigns.py? leads.py? The client files have no idea those even exist.
The Real Nightmare: Every Client Needs to Be Updated
Let’s say you add a new module — contracts.py. It's important. Your LLM assistant should definitely be able to query it.
What do you have to do?
You open gemini_client.py. Add the new endpoint. Test it. Deploy. Then open claude_client.py. Add the same endpoint. Test it. Deploy. Then open openai_client.py. Same thing again.
Three files updated for one new backend module.
Now imagine this happening every week. Every sprint. Every time a new Python file gets added to the backend.
And what if you add a fourth LLM? A fifth? Maybe you want to try a local Ollama model. Or a fine-tuned internal model. Every new LLM means another client file that needs to be kept in sync with 25 (and growing) backend files.
This is the wall every team eventually hits. You started with a clean, sensible architecture — MySQL → API → LLM clients. But as the system grows, the number of connections explodes. You’re writing the same integration logic over and over. You’re updating files constantly. One missed update means your AI gives stale or incomplete answers.
Let’s visualize how bad this gets:
Each client knows a different subset of your backend. They’re out of sync with each other. They’re all out of date with the actual backend. And every new LLM you add starts at zero — it knows nothing until you manually wire it up.
This is called the N × M problem.
- N = number of LLM clients (Gemini, Claude, OpenAI, Ollama, your custom model…)
- M = number of backend modules (25 Python files, growing…)
- N × M = the total number of integrations you have to write and maintain
3 LLMs × 25 files = 75 custom integration points to maintain. And that number only goes up.
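To make the drift concrete, here is a small sketch. The module names come from the story above, but which client knows which module is a made-up snapshot, and `integration_points` is a hypothetical helper that simply multiplies N by M.

```python
# Hypothetical snapshot: the endpoints each client file was hardcoded with.
backend_modules = {"customers", "orders", "forecasts", "campaigns", "leads", "contracts"}
client_knows = {
    "gemini": {"customers", "orders"},
    "claude": {"customers", "orders", "forecasts"},
    "openai": {"customers"},
}

# Modules each client silently cannot see.
missing = {name: sorted(backend_modules - known) for name, known in client_knows.items()}
for name, gaps in missing.items():
    print(f"{name} is missing: {gaps}")

def integration_points(n_clients, m_modules):
    """Full coverage means every client is wired to every module: N x M."""
    return n_clients * m_modules

print(integration_points(3, 25))  # 75, the figure from the text
print(integration_points(5, 25))  # 125 once two more LLMs are added
```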
There has to be a better way.
What If There Was One Standard?
Step back and think about what all these clients are actually doing. They’re all:
- Discovering what data/tools are available
- Fetching or invoking those capabilities
- Passing results back to an LLM
The logic is identical. Only the format differs — because each LLM has its own proprietary way of describing tools and functions.
What if we created one universal standard for describing capabilities — and any LLM that speaks that standard could automatically discover and use all 25 of your backend modules?
What if, when you added contracts.py, you only had to register it in one place — and all your LLM clients instantly knew about it?
That’s the idea behind Model Context Protocol (MCP).
Enter MCP: One Standard to Connect Them All
MCP (Model Context Protocol) was introduced by Anthropic in November 2024. The concept is elegant: instead of each LLM client talking directly to your backend in its own custom way, you put a standardized server in the middle.
Your backend exposes its capabilities through the MCP Server. Every LLM client — whether it’s Claude, Gemini, OpenAI, or a local Ollama model — speaks to that one server in the same universal language.
Now when you add contracts.py:
- You register it once in the MCP Server
- All LLM clients automatically discover and use it
- Zero updates to any client file
When you add a brand new LLM — Llama, Mistral, whatever comes next:
- The new client just connects to the existing MCP Server
- It immediately has access to all 25 modules
- Zero integration code to write
The N × M problem collapses to N + M. You maintain your backend modules (M) separately and your LLM clients (N) separately. The MCP Server is the bridge that connects them all.
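The "register once, discover everywhere" idea can be illustrated with a toy model. This is not the MCP SDK: `CapabilityRegistry` and `LLMClient` are hypothetical classes, and the sample data is invented, but the shape of the relationship is the same.

```python
class CapabilityRegistry:
    """Toy stand-in for the server in the middle (not the MCP SDK)."""

    def __init__(self):
        self._capabilities = {}

    def register(self, name, func):
        self._capabilities[name] = func

    def list_capabilities(self):
        return sorted(self._capabilities)

    def call(self, name, **kwargs):
        return self._capabilities[name](**kwargs)


class LLMClient:
    """Holds only a reference to the registry, never a hardcoded endpoint list."""

    def __init__(self, label, registry):
        self.label = label
        self.registry = registry

    def discover(self):
        return self.registry.list_capabilities()


registry = CapabilityRegistry()
registry.register("customers", lambda: ["Ada", "Linus"])
registry.register("orders", lambda: [{"id": 1, "customer": "Ada"}])

clients = [LLMClient(label, registry) for label in ("gemini", "claude", "openai")]

# One registration on the server side...
registry.register("contracts", lambda: [{"id": 7, "status": "signed"}])

# ...and every client sees it on its next discovery call, with no client edits.
for client in clients:
    print(client.label, client.discover())

n, m = len(clients), len(registry.list_capabilities())
print("N x M =", n * m, "| N + M =", n + m)
```

With three clients and three capabilities the gap is small (9 versus 6), but it widens quickly as either side grows.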
How MCP is Structured: The Three Players
MCP defines exactly three roles in every interaction:
The Host
The AI application your user interacts with. Claude Desktop, VS Code with Copilot, Cursor, or your own custom chatbot. The Host contains the LLM and manages the conversation.
The Client
Lives inside the Host. Acts as the translator — converts the LLM’s requests into the MCP protocol format (JSON-RPC 2.0), sends them to the Server, and brings responses back. Each Client has a 1:1 connection with one MCP Server, but a Host can run multiple Clients simultaneously.
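For a feel of what travels over that connection, the sketch below builds two JSON-RPC 2.0 requests as plain Python dicts. `tools/list` and `tools/call` are MCP method names; the `make_request` helper, the tool name, its arguments, and the sample reply are assumptions for illustration.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC pairs each response with its request by id

def make_request(method, params=None):
    """Build a JSON-RPC 2.0 request object as a Python dict."""
    message = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message

# Ask the server which tools it offers.
list_request = make_request("tools/list")

# Invoke one tool by name; "create_order" and its arguments are hypothetical.
call_request = make_request(
    "tools/call",
    {"name": "create_order",
     "arguments": {"customer_id": "c-42", "items": ["sku-1"], "total": 19.99}},
)

# What a successful reply could look like: same id, a result instead of an error.
call_response = {
    "jsonrpc": "2.0",
    "id": call_request["id"],
    "result": {"content": [{"type": "text", "text": "Order 1001 created successfully"}]},
}

print(json.dumps(call_request, indent=2))
```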
The Server
This is what you build. It wraps your backend capabilities — your 25 Python files — and exposes them through three standardized primitives: Resources, Tools, and Prompts.
The Three Primitives: Resources, Tools, and Prompts
This is the core design of MCP — and the mental model is beautifully simple.
Resources — Read This
Resources give the LLM read-only access to data. No side effects. No changes. Just information retrieval.
Going back to our story: if an LLM only needs to read data from modules like customers.py and reports.py to answer a question, you’d expose those modules as Resources.
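import json
from mcp.server.fastmcp import FastMCP  # official Python SDK (pip install mcp)

server = FastMCP("company-data")  # name is illustrative; `db` is your data layer

@server.resource("customers://list")  # resources are addressed by URI
async def get_customers() -> str:
    return json.dumps(await db.fetch_all_customers())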
Resources are perfect for RAG-style workflows. Instead of dumping your entire database into the prompt upfront, you expose data as addressable resources the LLM can fetch on demand — much more efficient.
Resources = Query, never modify
Tools — Do This
Tools are functions the LLM can invoke to take real actions. This is where MCP becomes truly powerful for agentic use cases:
- Create a new order in the database
- Send an email notification
- Update an inventory count
- Generate and export a report
- Trigger a shipping webhook
Each Tool has a typed JSON Schema so the LLM knows exactly what arguments to pass. The LLM decides when a Tool is needed, emits a structured call, and the MCP Client routes it to the Server for execution.
@server.tool()  # with FastMCP, the input schema is derived from the type hints
async def create_order(customer_id: str, items: list[str], total: float) -> str:
    """Create a new order for a customer."""
    order = await db.insert_order(customer_id, items, total)
    return f"Order {order.id} created successfully"
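The typed schema mentioned above can be pictured as a plain dictionary. The field names `name`, `description`, and `inputSchema` follow the MCP tool definition; the description text, the choice of string items, and the tiny `validate_arguments` checker are assumptions for illustration only.

```python
# Roughly what a server could advertise for the create_order tool.
create_order_tool = {
    "name": "create_order",
    "description": "Create a new order for an existing customer.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "customer_id": {"type": "string"},
            "items": {"type": "array", "items": {"type": "string"}},
            "total": {"type": "number"},
        },
        "required": ["customer_id", "items", "total"],
    },
}

# Minimal check covering only the schema features used above; a real
# project would use a full JSON Schema validator instead.
PYTHON_TYPES = {"string": str, "array": list, "number": (int, float)}

def validate_arguments(schema, arguments):
    """Return a list of problems; an empty list means the call looks valid."""
    problems = [f"missing: {key}" for key in schema["required"] if key not in arguments]
    for key, value in arguments.items():
        spec = schema["properties"].get(key)
        if spec is None:
            problems.append(f"unexpected: {key}")
        elif isinstance(value, bool) or not isinstance(value, PYTHON_TYPES[spec["type"]]):
            problems.append(f"wrong type: {key}")
    return problems

good = {"customer_id": "c-42", "items": ["sku-1"], "total": 19.99}
bad = {"customer_id": 42, "items": ["sku-1"]}
print(validate_arguments(create_order_tool["inputSchema"], good))
print(validate_arguments(create_order_tool["inputSchema"], bad))
```

Because the schema is data, any client can read it and learn how to call the tool without custom code for that tool.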
Tools = Take action, produce side effects
Prompts — Use This Template
Prompts are reusable, parameterized templates that standardize common LLM workflows. Users select them explicitly.
For example: a “generate monthly sales report” prompt that takes month and year as parameters and assembles the perfect system message for that task — every time, consistently.
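As a minimal sketch, such a template is just a function from parameters to messages. The function below is plain Python rather than SDK code; the wording of the messages and the generic chat-style message layout are assumptions chosen for illustration.

```python
import calendar

def monthly_sales_report_prompt(month, year):
    """Assemble the messages for a 'generate monthly sales report' prompt."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    period = f"{calendar.month_name[month]} {year}"
    return [
        {"role": "system",
         "content": "You are a sales analyst. Use only the data provided. "
                    "Report revenue, order count, and top customers."},
        {"role": "user",
         "content": f"Generate the monthly sales report for {period}."},
    ]

for message in monthly_sales_report_prompt(3, 2025):
    print(message["role"], "->", message["content"])
```

The same two parameters always produce the same instructions, which is the consistency the template is meant to provide.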
Prompts = Standardize, make repeatable
The golden rule: Resources query. Tools act. Prompts standardize.
Back to Our Story: The Before and After
Before MCP — what our system looked like:
After MCP — what it looks like now:
The architecture went from a tangled web of N × M custom connections to a clean hub-and-spoke model. Your backend team works on the MCP Server. Your AI/client team works on the LLM integrations. They no longer need to constantly sync up every time something changes.
A Quick Example: How It Feels in Practice
Here’s the kind of conversation that becomes possible once everything is connected through MCP:
User: “Show me all customers who placed orders last month but haven’t received their shipment yet, then draft a follow-up email for each of them.”
Without MCP, you’d need to manually wire orders.py, customers.py, shipping.py, and a notification tool into whichever LLM client you're using.
With MCP, the LLM:
- Reads the orders Resource → gets last month's orders
- Reads the customers Resource → gets customer details
- Reads the shipping Resource → checks shipment status
- Filters the unshipped ones
- Calls the draft_email Tool → generates personalized follow-ups
- Reports back to you with a summary
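The steps above can be sketched with stubbed capabilities. Every function name and every piece of sample data here is hypothetical; in a real setup each read or tool call would be routed through the MCP Client, and the LLM would decide the order of the steps.

```python
# Stubbed capabilities with hypothetical sample data.
def read_orders_last_month():
    return [{"order_id": 1, "customer_id": "c1"}, {"order_id": 2, "customer_id": "c2"}]

def read_customers():
    return {"c1": {"name": "Ada", "email": "ada@example.com"},
            "c2": {"name": "Linus", "email": "linus@example.com"}}

def read_shipments():
    return {1: "delivered", 2: "pending"}

def draft_email(name, order_id):
    return f"Hi {name}, your order #{order_id} is delayed. We are on it."

def follow_up_on_unshipped():
    orders = read_orders_last_month()   # step 1: last month's orders
    customers = read_customers()        # step 2: customer details
    shipments = read_shipments()        # step 3: shipment status
    # step 4: keep only orders that have not been delivered
    unshipped = [o for o in orders if shipments.get(o["order_id"]) != "delivered"]
    # step 5: one drafted follow-up per unshipped order
    drafts = [
        {"to": customers[o["customer_id"]]["email"],
         "body": draft_email(customers[o["customer_id"]]["name"], o["order_id"])}
        for o in unshipped
    ]
    # step 6: summary reported back to the user
    return {"unshipped_orders": len(unshipped), "drafts": drafts}

summary = follow_up_on_unshipped()
print(summary)
```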
All of this using standardized MCP calls. No custom glue code. And if tomorrow you want to run this same workflow with Claude instead of Gemini? Just point Claude’s client at the same MCP Server. Done.
The Bigger Picture
MCP, introduced by Anthropic in November 2024, was inspired by the Language Server Protocol (LSP), the standard that lets code editors like VS Code support dozens of programming languages without every editor needing its own custom integration for each language.
MCP does the same thing for AI: instead of every LLM needing a custom plugin for every tool, there’s one protocol that all of them speak.
Since its release, it has been adopted by OpenAI, Google DeepMind, and a rapidly growing ecosystem of developer tools. In December 2025, Anthropic donated the protocol to the Linux Foundation, making it vendor-neutral and community-governed. Today (May 2026), there are 200+ community-built MCP servers for tools like GitHub, Slack, PostgreSQL, Stripe, Figma, and Docker.
It has moved from “interesting Anthropic experiment” to the de facto infrastructure standard for agentic AI systems.
Where to Start
If this story resonates with you and you’re ready to stop writing N × M integrations:
- Browse the MCP server registry at modelcontextprotocol.io — there's likely already a server for the tools you use
- Install Claude Desktop and connect a community MCP server to experience it as a user first
- Build a simple MCP server using the Python SDK (pip install mcp) — wrap one of your existing API endpoints
- Connect it to your preferred LLM client. The same server works with any MCP-compatible client, whether the model behind it is Claude, GPT, or a local model
The documentation is clean, the SDKs are mature, and the community is extremely active.
Final Thoughts
The journey from a MySQL direct connection → REST API → LLM client → MCP isn’t just a technical evolution. It’s a story that every developer who works with AI will live through.
You’ll start simple. You’ll add more LLMs. You’ll add more backend modules. And one day you’ll look at your codebase and realize you’re maintaining 75 custom integration points just to keep three AI clients in sync with a growing backend.
That’s the moment MCP starts making complete sense.
It’s not a fancy new concept. It’s the same lesson we already learned with APIs — you don’t let every client talk directly to the database. You put a standard interface in the middle.
MCP is that standard interface. But for AI.







