DEV Community

TheProdSDE

Posted on • Originally published at pub.towardsai.net

MCP for Backend Engineers: When to Use It (and When to Skip It)

"If you only have one consumer, MCP is a liability."

Real code, real cost, and an honest answer to the question every AI team gets wrong.


TL;DR

  • 1 app · 1 team · 1 toolset → use plain tool calling
  • Multiple teams or AI surfaces → use MCP
  • Existing APIs → wrap them in MCP
  • Early-stage / MVP → skip MCP

MCP is a scaling decision, not a starting point.


Why This Exists

A client had a simple ask: "Let our AI assistant use our REST API."

Three teams built three integrations. All three were wrong:

- Model called /getCustomer when it needed /getOrders
- Auth header dropped silently on retry
- Three teams hardcoded three different endpoint assumptions

Result: 3 integrations. 3 slightly different bugs. 1 very awkward client call.

That's when MCP became the answer. One server wrapping the API. Any AI host connects to it. The client got exactly what they asked for — and we got a pattern reused three times since.

MCP is not about smarter models — it's about cleaner boundaries.


MCP in One Line

MCP = API Gateway + Plugin System + Contract Layer for LLMs

Without MCP                 With MCP
Duplicate integrations      Single contract
Auth bugs everywhere        Centralized auth
Hardcoded tools             Dynamic discovery
Fragile wrappers            Reusable servers
Rebuild per host            Plug into existing system

The Hero Diagram: MCP vs Tool Calling

This is the architecture difference that actually matters.

graph TD
    subgraph TC["❌ Tool Calling — Tightly Coupled"]
        direction TB
        TC_H["AI Host"]
        TC_T1["Tool: get_customer()"]
        TC_T2["Tool: list_orders()"]
        TC_T3["Tool: check_inventory()"]
        TC_H --> TC_T1
        TC_H --> TC_T2
        TC_H --> TC_T3
    end

    subgraph MCP["✅ MCP — Loosely Coupled via Contract"]
        direction TB
        MH1["Host: Web Chat"]
        MH2["Host: Slack Bot"]
        MH3["Host: VS Code"]

        MS1["customer-mcp-server\nOwner: CRM Team"]
        MS2["payments-mcp-server\nOwner: Payments Team"]
        MS3["infra-mcp-server\nOwner: Platform Team"]

        MH1 --> MS1
        MH1 --> MS2
        MH2 --> MS1
        MH2 --> MS3
        MH3 --> MS2
        MH3 --> MS3
    end

The shift: Tool calling is a local decision. MCP is an organizational contract.


What MCP Actually Is

Model Context Protocol (MCP) is an open protocol that standardises how AI applications connect to external tools and data sources. Think of it as USB-C for AI tools — one connector, many servers, any host.

Instead of every IDE, chat UI, or agent framework inventing its own plugin format, MCP defines a common contract using JSON-RPC 2.0 over Streamable HTTP or stdio.
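
On the wire this is plain JSON-RPC 2.0. As a rough sketch, using the method names from the MCP spec (`tools/list`, `tools/call`) with trimmed, illustrative payloads rather than a real transcript:

```python
import json

# Minimal sketch of the JSON-RPC 2.0 messages an MCP client exchanges.
# Method names come from the MCP spec; payloads are trimmed illustrations.

# Discovery: ask the server what tools it exposes.
list_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list",
}

# Invocation: call one tool by name with structured arguments.
call_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "get_customer",
        "arguments": {"customer_id": "C-123"},
    },
}

# The server's result is keyed to the same request id.
call_response = {
    "jsonrpc": "2.0",
    "id": 2,
    "result": {
        "content": [
            {"type": "text", "text": '{"customer_id": "C-123", "tier": "premium"}'}
        ],
    },
}

print(json.dumps(call_request, indent=2))
```

Because every host speaks this same envelope, swapping one host for another changes nothing on the server side.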

Three roles in the spec:

  • Host — the application the user sees (IDE, CLI, chat UI, agent framework)
  • Client — a connector inside the host that speaks the MCP protocol
  • Server — a service that exposes tools, resources, and prompts to the model

The client sits between host and server. It connects to one server at a time and speaks the protocol; the host can orchestrate multiple clients simultaneously.

graph TD
    subgraph HOST["AI Host (Orchestrates Everything)"]
        direction TB
        H["Orchestrator Logic + LLM Calls"]
        C1["MCP Client 1"]
        C2["MCP Client 2"]
        H --> C1
        H --> C2
    end

    subgraph SERVERS["MCP Servers (Owned per Team)"]
        S1["customer-mcp-server"]
        S2["infra-mcp-server"]
    end

    C1 --> S1
    C2 --> S2

Key Insight: The model never talks to your backend directly. It only talks to tools exposed by MCP servers. That's the boundary.


Case Study: Wrapping a Client REST API

Before: Three separate UIs (web, Slack bot, IDE) called the REST API directly. Each integrator made slightly different assumptions about endpoints and auth — producing three brittle integrations and repeated bugs.

After: One MCP server wrapping the existing API. Same tool list and prompts exposed to every host.

Results in production:

  • Integration time reduced by ~60% for new hosts
  • Tool-call auth failures dropped from multiple incidents to zero in the first month (root cause: centralized token handling)
  • Two new hosts onboarded with zero backend changes

This is the practical payoff: a small upfront cost for long-term reduction in duplicated integration work and clearer ownership.


Should You Use MCP? — Decide Before Reading Further

Most tutorials make you read 80% before answering this. Here it is upfront.

If you pick MCP too early, you're trading ~1–2 extra days of setup for zero immediate benefit. Start with the simplest thing. Reach for MCP when the "three wrappers" problem is already real.

Three questions that determine everything:

  1. How many consumers will call these tools?

    • One app, one team → plain tool calling, done
    • Multiple apps or teams → MCP starts making sense
  2. Who owns the tools vs who owns the AI host?

    • Same team owns both → plain tool calling, tight coupling is fine
    • Different teams → MCP gives you the clean boundary
  3. Do these tools already exist as APIs?

    • Yes → wrap them as an MCP server, zero changes to existing backend
    • No → build them either way, MCP adds structure from day one

Five Real Scenarios

Scenario 1 — Internal chatbot for one team
A team wants an AI that queries their own database and sends Slack alerts. One app, one team, they control everything.
Plain tool calling. MCP adds overhead with zero benefit here.

Scenario 2 — Your client's REST API (the exact story above)
Client has an existing API. Wants multiple AI surfaces to consume it — web app today, Slack bot next month, VS Code extension later.
MCP. Wrap the API once, every host connects via the same protocol.

Scenario 3 — Enterprise platform, multiple teams
Payments team, infra team, CRM team — each owns their domain. One central AI console orchestrates across all three.
MCP per domain. Each team ships and secures their own server.

Scenario 4 — Quick prototype / MVP
You're validating an idea over a weekend. Speed matters more than architecture.
Plain tool calling always. Get the idea working first. MCP is a refactor you do when it proves valuable.

Scenario 5 — Agentic workflow with long-running tasks
Agent coordinates across multiple systems, tasks need to be traceable and retryable across sessions.
MCP. The clean server/client boundary makes observability and retry logic significantly easier to implement.

Decision Flow

flowchart TD
    Start([Start])
    Start --> Prototype{Prototype / MVP?}
    Prototype -- Yes --> Plain[Plain tool calling]
    Prototype -- No --> Multi{Multiple hosts / teams?}
    Multi -- No --> Plain
    Multi -- Yes --> Ownership{Same team owns tools & host?}
    Ownership -- Yes --> Plain
    Ownership -- No --> APICheck{APIs already exist?}
    APICheck -- Yes --> MCP[Use MCP]
    APICheck -- No --> Consider[Build either way — MCP adds structure]
    classDef recommend fill:#f9f,stroke:#333,stroke-width:1px;
    class MCP recommend;

Key Insight: MCP is triggered by scale of ownership and consumption — not complexity.


Where MCP Is a Bad Idea

MCP is the wrong choice when:

  • You're at the early-stage or MVP phase
  • Only one host consumes the tools
  • One team owns the full stack
  • Latency is business-critical
  • Tool shapes are still evolving

If you only have one consumer, MCP is architecture cosplay.


Core MCP Concepts

Server-side primitives

1. Tools
Executable actions — often with side effects. The LLM decides when to call them.
Examples: create_invoice, deploy_service, run_sql_query.

2. Resources
Read-only data and context the model is allowed to see — not actions.
Examples: docs, config files, logs, user profile JSON, query results.

3. Prompts
Reusable workflow templates — Postman collections for LLM behaviour. Server-defined recipes the host can invoke.
Examples: bug_report_prompt, deployment_checklist_prompt.

Client-side capabilities

4. Sampling
The server asks the client: "Run a model completion for me using your model access." Useful when servers don't hold model keys.

5. Roots
The server asks: "What file paths or URIs can I operate on?" Organises where tools can act.

6. Elicitation
The server asks the client to collect more information from the user.
Example: "Ask the user which environment to deploy to: dev/staging/prod."


Request / Response Lifecycle

The most important thing to understand: the Host orchestrates LLM calls — not the MCP Client. The client only handles server communication. This distinction is what most sequence diagrams get wrong.

sequenceDiagram
    autonumber
    participant U as User
    participant H as Host / Orchestrator
    participant L as LLM
    participant C as MCP Client
    participant S as MCP Server

    U ->> H : Query
    H ->> C : discover tools
    C ->> S : tools/list (JSON-RPC)
    S -->> C : tool schemas
    C -->> H : return schemas
    H ->> L : prompt + tool schemas
    L -->> H : tool call decision
    H ->> C : execute tool call
    C ->> S : tools/call(name, args)
    S -->> C : tool result
    C -->> H : return result
    H ->> L : result + continue prompt
    L -->> H : final answer
    H -->> U : output

Key Insight: The model decides what to call, but the server controls what exists. That's the contract.


Minimal MCP Server (Python)

Three things to watch in this block:

  • @mcp.tool() is an executable action with side effects
  • @mcp.resource() is read-only context — the model sees it, doesn't call it
  • The /health endpoint is not optional — Container Apps and AKS need it for readiness probes. Skip it and your pods restart in a loop.
# server.py
# requires: fastmcp>=2.0.0, python-dotenv

import os
import uuid
from dotenv import load_dotenv
from fastmcp import FastMCP

load_dotenv()

mcp = FastMCP(
    name="customer-service",
    instructions="Customer service MCP server.",
)

# ── Tool: get_customer ────────────────────────────────────────────────────────
@mcp.tool()
async def get_customer(customer_id: str) -> dict:
    """Fetch customer profile by ID."""
    # Replace with your real DB call
    return {
        "customer_id": customer_id,
        "name": "Priya Sharma",
        "email": "priya@example.com",
        "tier": "premium",
        "last_order_date": "2026-03-15",
    }

# ── Tool: get_orders ─────────────────────────────────────────────────────────
@mcp.tool()
async def get_orders(customer_id: str, limit: int = 5) -> dict:
    """Fetch recent orders for a customer."""
    return {
        "customer_id": customer_id,
        "orders": [
            {"order_id": f"ORD-{i}", "status": "delivered", "amount": 1200 + i * 50}
            for i in range(1, limit + 1)
        ],
    }

# ── Tool: create_support_ticket ───────────────────────────────────────────────
@mcp.tool()
async def create_support_ticket(
    customer_id: str,
    issue: str,
    priority: str = "medium",
) -> dict:
    """Create a support ticket for a customer issue."""
    return {
        "ticket_id": f"TKT-{uuid.uuid4().hex[:8].upper()}",
        "customer_id": customer_id,
        "issue": issue,
        "priority": priority,
        "status": "open",
    }

# ── Resource: customer playbook ───────────────────────────────────────────────
@mcp.resource("resource://customer-playbook", mime_type="text/markdown")
async def customer_playbook() -> str:
    """Internal guidance for handling customer incidents."""
    # FastMCP 2.x resources are addressed by URI and can return their
    # content directly; this is read-only context, not an action.
    return """\
# Customer Handling Playbook

- Always greet the customer by name.
- Premium customers: consider goodwill credit for severe issues.
- Escalate to L3 if incident impacts > 10 customers.
- SLA: 4 hours for premium, 24 hours for standard.
"""

# ── Prompt: email draft ───────────────────────────────────────────────────────
@mcp.prompt
async def email_draft_prompt(customer_name: str, issue: str) -> str:
    """Template for customer apology emails; arguments become prompt parameters."""
    return (
        "You are a senior customer success manager.\n"
        f"Write a short apology email to {customer_name} about: {issue}.\n"
        "Tone: professional, empathetic. Under 100 words."
    )

# ── Health check — required for ACA and AKS readiness probes ─────────────────
@mcp.custom_route("/health", methods=["GET"])
async def health(request):
    # FastMCP serves over Starlette, so return a Starlette response
    from starlette.responses import JSONResponse
    return JSONResponse({"status": "ok", "server": "customer-mcp"})

if __name__ == "__main__":
    mcp.run(
        transport="streamable-http",
        host="0.0.0.0",
        port=int(os.getenv("PORT", 8000)),
    )

Start it:

pip install "fastmcp>=2.0.0" python-dotenv
python server.py
# Live at http://localhost:8000/mcp

Inspect without writing a single line of client code:

npx @modelcontextprotocol/inspector http://localhost:8000/mcp

MCP Client + Agent Loop

The key line is client.list_tools() — tools are discovered dynamically from the server, not hardcoded in the client. Every host that connects to this server gets the same tool list automatically, even as you add new tools.

# client.py
# requires: fastmcp>=2.0.0, openai>=1.0.0, python-dotenv

import asyncio
import json
import os
from dotenv import load_dotenv
from fastmcp import Client
from openai import AsyncOpenAI

load_dotenv()

openai_client = AsyncOpenAI(api_key=os.getenv("OPENAI_API_KEY"))
MCP_URL = os.getenv("MCP_SERVER_URL", "http://localhost:8000/mcp")


async def run(query: str) -> str:
    async with Client(MCP_URL) as client:

        # Discover tools from the server — not hardcoded here
        tools = await client.list_tools()
        openai_tools = [
            {
                "type": "function",
                "function": {
                    "name": t.name,
                    "description": t.description,
                    "parameters": t.inputSchema,
                },
            }
            for t in tools
        ]

        print(f"\n[MCP] {len(tools)} tools discovered: {[t.name for t in tools]}")
        messages = [{"role": "user", "content": query}]

        while True:
            resp = await openai_client.chat.completions.create(
                model="gpt-4o",
                messages=messages,
                tools=openai_tools,
                tool_choice="auto",
            )
            msg = resp.choices[0].message
            messages.append(msg)

            # No tool calls = final answer
            if not msg.tool_calls:
                return msg.content

            # Execute each tool call via the MCP server
            for tc in msg.tool_calls:
                args = json.loads(tc.function.arguments)
                print(f"[Tool] {tc.function.name}({args})")

                result = await client.call_tool(tc.function.name, args)
                tool_content = (
                    result.content[0].text if result.content else "{}"
                )
                messages.append({
                    "role": "tool",
                    "tool_call_id": tc.id,
                    "content": tool_content,
                })


if __name__ == "__main__":
    queries = [
        "Look up customer C-123 and tell me their tier.",
        "Get the last 3 orders for customer C-123.",
        "C-123 had a late delivery. Open a ticket and draft an apology email.",
    ]
    for q in queries:
        print(f"\n[Query] {q}")
        print(asyncio.run(run(q)))
        print("-" * 60)

Run it (server must be up):

python client.py

Auth — The Part Most Tutorials Skip

TL;DR:
Internal / pod-to-pod → skip auth entirely, use network policy.
Public-facing → OAuth 2.1 + PKCE, no exceptions. The spec mandates it.
Everything else is a practical middle ground.

For remote MCP servers (Streamable HTTP), the spec prescribes OAuth 2.1 Authorization Code + PKCE. The MCP client acts as the OAuth client; your server acts as the resource server; an Authorization Server (Keycloak, Auth0, Azure AD) issues the tokens.

The flow:

  1. Client hits server without a token → 401 with WWW-Authenticate pointing to /.well-known/oauth-protected-resource
  2. Client discovers the auth server from that metadata
  3. Client runs OAuth 2.1 Authorization Code + PKCE
  4. Client retries MCP requests with Authorization: Bearer <token>
  5. Server validates the token and serves tools/resources

Here's the token validation middleware — the piece that took longest to get right because the aud (audience) claim check is easy to miss. Without it, a token issued for a different service will pass validation silently.

# auth_middleware.py
import os
import httpx
from authlib.jose import jwt, JsonWebKey
from starlette.requests import Request
from starlette.responses import JSONResponse

MCP_RESOURCE_ID = os.getenv("MCP_RESOURCE_ID", "https://your-mcp-server/")
AUTH_SERVER_URL  = os.getenv("AUTH_SERVER_URL", "")
_jwks_cache: dict | None = None

SKIP_PATHS = {"/health", "/.well-known/oauth-protected-resource"}


async def fetch_jwks() -> dict:
    global _jwks_cache
    if _jwks_cache:
        return _jwks_cache
    async with httpx.AsyncClient() as http:
        # Keycloak's JWKS path; Auth0 and Entra ID expose theirs at different well-known endpoints
        r = await http.get(f"{AUTH_SERVER_URL}/protocol/openid-connect/certs")
        r.raise_for_status()
        _jwks_cache = r.json()
    return _jwks_cache


async def oauth_middleware(request: Request, call_next):
    if request.url.path in SKIP_PATHS:
        return await call_next(request)

    auth = request.headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        return JSONResponse(
            status_code=401,
            content={"error": "missing_token"},
            headers={"WWW-Authenticate": f'Bearer realm="{MCP_RESOURCE_ID}"'},
        )

    token = auth.removeprefix("Bearer ").strip()

    try:
        jwks   = await fetch_jwks()
        claims = jwt.decode(token, JsonWebKey.import_key_set(jwks))
        claims.validate()

        # ── The check most tutorials miss ───────────────────────────
        aud = claims.get("aud", [])
        if isinstance(aud, str):
            aud = [aud]
        if MCP_RESOURCE_ID not in aud:
            return JSONResponse(status_code=403, content={"error": "wrong_audience"})

        request.state.user = dict(claims)
        return await call_next(request)

    except Exception as e:
        return JSONResponse(status_code=401, content={"error": str(e)})

Wire it into your server:

# Add to server.py after: mcp = FastMCP(...)
# Exact wiring is version-dependent; in fastmcp 2.x you can grab the ASGI app
# and attach the function as Starlette HTTP middleware, then serve with uvicorn.
from starlette.middleware.base import BaseHTTPMiddleware
from auth_middleware import oauth_middleware

app = mcp.http_app()  # Starlette app exposing /mcp
app.add_middleware(BaseHTTPMiddleware, dispatch=oauth_middleware)
# Run: uvicorn server:app --host 0.0.0.0 --port 8000

Real-World Cost (Numbers, Not Theory)

Expect:

  • +3–100ms MCP protocol overhead per call (varies by gateway)
  • +280–950ms total round-trip latency including tool execution (p50–p95)
  • +1.8–4.2 seconds for multi-step agentic flows (3-agent chains, p50–p95)
  • Extra infra: containers, auth service, monitoring
  • Non-trivial engineering overhead during setup

If your system is latency-sensitive or single-tool focused, MCP will feel slower — because it is.

You're buying reuse and consistency, not speed.
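
To see where the multi-step numbers come from, here's a back-of-envelope sketch: three sequential tool hops at the per-hop p50–p95 range above stack to roughly 0.8–2.9s of transport overhead alone, and model inference time on top is what pushes observed totals into the 1.8–4.2s band. (The helper below is illustrative, not from any library.)

```python
# Back-of-envelope latency budget for chained tool calls, using the
# per-hop round-trip range quoted above (280-950 ms, p50-p95).
HOP_P50_MS, HOP_P95_MS = 280, 950

def chain_budget_ms(steps: int) -> tuple[int, int]:
    """Best/worst-case transport latency added by `steps` sequential hops."""
    return steps * HOP_P50_MS, steps * HOP_P95_MS

low_ms, high_ms = chain_budget_ms(3)
print(f"3 sequential hops add {low_ms}-{high_ms} ms before any model time")
```

That arithmetic is why "chain fewer, fatter tools" beats "chain many thin tools" in latency-sensitive flows.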


Observability & Debugging

Make MCP servers and client-host interactions observable from day one:

  • Trace every request — propagate a correlation ID across host → client → server → backend and include it in logs and responses
  • Metrics to track — mcp_tool_call_latency, mcp_tool_call_errors, mcp_discovery_time, mcp_auth_failures, per-tool success rates
  • Structured JSON logs — (timestamp, level, correlation_id, tool, args, duration, error) to make postmortems fast
  • Distributed tracing — use OpenTelemetry to connect host traces to server traces for full request/response visibility
  • Troubleshooting checklist — confirm /.well-known/oauth-protected-resource, check token aud claim, verify /health readiness, replay failing tool calls against a local mock server

Testing & QA

Treat the MCP server like a public contract:

  • Contract tests — verify the tool list and input/output schemas remain compatible across releases. Fail CI on breaking schema changes.
  • Unit tests — each @mcp.tool() should have unit tests for edge cases
  • Integration tests — run the server against a mocked backend and a simulated client that performs discovery + tool calls
  • Smoke tests — lightweight end-to-end checks that run on deploy (e.g., a small request to get_customer and get_orders)
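
A contract test can be as small as snapshotting the discovered schemas and diffing them in CI. A sketch (the snapshot shape and helper names are mine; in a real suite the tool list would come from `client.list_tools()` against a staging server):

```python
# Fail CI when the tool contract drifts in a breaking way.

def schema_snapshot(tools: list[dict]) -> dict:
    """Reduce a discovered tool list to the fields consumers depend on."""
    return {
        t["name"]: {
            "description": t["description"],
            "parameters": t["inputSchema"],
        }
        for t in tools
    }

def assert_compatible(current: dict, golden: dict) -> None:
    """Removing or reshaping a tool is breaking; adding a new one is fine."""
    for name, spec in golden.items():
        assert name in current, f"tool removed: {name}"
        assert current[name]["parameters"] == spec["parameters"], \
            f"schema changed: {name}"

# Example run against an in-memory tool list instead of a live server:
tools = [{
    "name": "get_customer",
    "description": "Fetch customer profile by ID.",
    "inputSchema": {"type": "object",
                    "properties": {"customer_id": {"type": "string"}}},
}]
golden = schema_snapshot(tools)        # in CI: load from a committed JSON file
assert_compatible(schema_snapshot(tools), golden)
print("contract OK")
```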

Production Checklist

  • [ ] Health/readiness: expose /health and use readiness probes (ACA/AKS)
  • [ ] Replicas: min-replicas >= 1 for interactive workloads to avoid cold starts
  • [ ] Auth: validate aud claim and enforce least-privilege scopes
  • [ ] Rate limiting & quotas: protect downstream systems from runaway agents
  • [ ] Secrets: store credentials in Key Vault / secret manager — avoid plain env vars for DB credentials
  • [ ] Observability: metrics, logs, and traces wired to your platform
  • [ ] Cost & latency: budget for an extra 280–950ms network hop per tool call

Deploying on Azure

MCP is not Azure-specific — it runs on any platform that supports HTTP. But if you're already in the Azure ecosystem, here's the honest breakdown.

flowchart TD
    A["Infra context?"]
    A -->|Greenfield| B["Azure Container Apps\n✅ Default choice"]
    A -->|Existing cluster| C["AKS + Workload Identity"]
    A -->|Low usage / event-driven| D["Azure Functions"]
    A -->|Existing App Service| E["App Service /mcp"]

Azure Deployment Decision Table

Situation                           Best fit                 Why
Greenfield, no existing infra       Azure Container Apps     Least ops overhead, managed ingress
Already running AKS                 AKS + Ingress            Reuse cluster auth, colocate workloads
Multiple servers, same team         ACA shared environment   Internal discovery, Dapr sidecars
Infrequent tools / cost-sensitive   Azure Functions          Per-invocation billing, zero idle cost
Existing App Service web app        App Service /mcp         Zero infrastructure change
Enterprise, Istio service mesh      AKS + mTLS               Zero trust alongside OAuth 2.1

Option 1 — Azure Container Apps (start here)

Recommended for most teams starting fresh. Handles autoscaling, ingress, Entra ID integration, and scale-to-zero with no MCP-specific platform knowledge needed.

RG="mcp-rg"
ACR="<your-acr-name>"

# Build and push
az acr build --registry $ACR --image customer-mcp-server:latest .

# Create environment
az containerapp env create \
  --name mcp-env --resource-group $RG --location eastus

# Store credentials in Key Vault — never directly in env vars
az keyvault secret set \
  --vault-name mcp-keyvault \
  --name database-url \
  --value "postgresql://user:pass@host:5432/db"

# Deploy
az containerapp create \
  --name customer-mcp-server \
  --resource-group $RG \
  --environment mcp-env \
  --image $ACR.azurecr.io/customer-mcp-server:latest \
  --target-port 8000 \
  --ingress external \
  --min-replicas 1 \
  --max-replicas 10 \
  --secrets db-url=keyvaultref:<key-vault-secret-uri> \
  --env-vars DATABASE_URL=secretref:db-url

min-replicas 1 is critical for interactive use — scale-to-zero causes cold start that breaks mid-conversation flow. Expect ~1–3s latency even on warm replicas; design your agent loop timeouts accordingly (30s is a safe outer bound).
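
That outer bound can be enforced directly in the agent loop with asyncio.wait_for. A sketch (the wrapper name is mine; swap the stand-in coroutine for your real `client.call_tool(...)`):

```python
import asyncio

TOOL_TIMEOUT_S = 30  # safe outer bound from the deployment note above

async def call_tool_bounded(coro, timeout_s: float = TOOL_TIMEOUT_S):
    """Run one tool call with a hard deadline.

    On timeout, return a tool-style error payload the model can reason
    about, instead of hanging the whole conversation on a cold replica.
    """
    try:
        return await asyncio.wait_for(coro, timeout=timeout_s)
    except asyncio.TimeoutError:
        return {"error": "tool_timeout", "timeout_s": timeout_s}

async def slow_tool():
    await asyncio.sleep(0.2)  # stand-in for client.call_tool(name, args)
    return {"status": "ok"}

async def main():
    print(await call_tool_bounded(slow_tool(), timeout_s=1.0))   # completes
    print(await call_tool_bounded(slow_tool(), timeout_s=0.05))  # times out

asyncio.run(main())
```

Feeding the timeout back as a tool result lets the model retry or apologize gracefully rather than leaving the user staring at a spinner.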

Option 2 — AKS (if you already run a cluster)

Use Workload Identity over storing credentials in env vars. Reuse cluster auth and network policies.

# mcp-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: customer-mcp-server
  namespace: mcp-system
spec:
  replicas: 2
  selector:
    matchLabels:
      app: customer-mcp-server
  template:
    metadata:
      labels:
        app: customer-mcp-server
    spec:
      containers:
        - name: mcp-server
          image: <your-acr>.azurecr.io/customer-mcp-server:latest
          ports:
            - containerPort: 8000
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: mcp-secrets
                  key: database-url
          readinessProbe:
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 15
            periodSeconds: 20
---
apiVersion: v1
kind: Service
metadata:
  name: customer-mcp-server
  namespace: mcp-system
spec:
  selector:
    app: customer-mcp-server
  ports:
    - port: 8000
      targetPort: 8000
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: mcp-ingress
  namespace: mcp-system
spec:
  rules:
    - host: mcp.yourdomain.internal
      http:
        paths:
          - path: /mcp
            pathType: Prefix
            backend:
              service:
                name: customer-mcp-server
                port:
                  number: 8000
kubectl create namespace mcp-system
kubectl apply -f mcp-deployment.yaml
kubectl get pods -n mcp-system

Option 3 — Azure Functions (infrequent, event-driven tools)

Best for tools called rarely — scheduled jobs, webhook-triggered actions, batch processors. Per-invocation billing means zero idle cost. Not suitable for interactive agent loops — cold start breaks conversational flow.

Option 4 — App Service (add /mcp to an existing service)

Mount the MCP server on /mcp alongside existing routes. Zero infrastructure change. Lowest friction path for adding MCP to a service that already exists.


Common Failure Modes

  • Token misconfiguration — auth breaks silently on retry
  • Tool explosion — too many vaguely named tools confuse the model
  • Latency stacking — chained MCP calls compound response time (budget 280–950ms per hop)
  • Business logic duplication — logic living in both the server and the calling agent
  • Weak observability — no tracing across the client–server boundary
  • No ownership boundaries — multiple teams modifying the same server

Critical rule: One MCP server = one owning team. Violating this recreates the exact problem MCP was meant to solve.


What MCP Does NOT Solve

  • Bad tool design
  • Weak or vague prompts
  • Latency at the model level
  • Capability gaps in the underlying LLM
  • Business logic duplication inside your tools

MCP standardises access. It does not improve what you expose.


Migration Path

Don't start with MCP. Migrate to it when the pain appears:

  1. Start with plain tool calling
  2. Identify tools being duplicated across teams or surfaces
  3. Wrap those shared tools in an MCP server
  4. Add auth, deployment, and observability incrementally

Avoid premature standardisation. It's just premature optimisation with a fancier name.




If this saved you from building an MCP server you didn't need — hit clap. It takes one second and tells the algorithm this is worth distributing. If you've already wired MCP into production, drop your setup in the comments — specifically curious what the auth layer looks like on your end.


The Honest Verdict

MCP won't make your architecture simpler. It will make it honest — every tool, resource, and auth scope exactly where it belongs, exposed through a contract any host can consume.

If you're building for one app, one team: skip it. Plain tool calling ships faster and is entirely adequate.

But if you're building something multiple systems or teams will consume — especially if those tools already exist as APIs — MCP ends the "three copies of the same integration, each slightly wrong" problem permanently.

That's the only problem it solves. It solves it very well.




Written by TheProdSDE · Follow for production-grade AI systems, backend architecture, and honest engineering takes.
