Shanahan

Posted on Jun 28

Wiring Django to Claude: Generating an MCP server from OpenAPI

#ai #django #mcp #python

If you build Django APIs, you probably use drf-spectacular to generate your OpenAPI schema. That schema already describes every single endpoint you have. Here is how to automatically transform it into a Model Context Protocol (MCP) server. As well as how to survive the harder engineering problems that appear once real AI agents actually starts calling it.

If you want an LLM to operate your Django API, not just talk about it, you have to give it tools. The naïve way is to hand-write one MCP tool per endpoint: name, description, input schema, the call itself. It works on day one, and it rots by day thirty. Every new endpoint, every renamed field, every changed query param becomes a second place you have to remember to update. Maintenance scales with your API surface, which is exactly the thing you were hoping the LLM would help you tame.

But you've already described your entire API once. If you use drf-spectacular, and if you're doing DRF in 2026 you almost certainly do, that's an OpenAPI 3 schema sitting in your project that already knows every path, parameter, and type. That schema can be your tool definitions. You just have to teach something to read it.

DRF views (drf-spectacular) ──▶ OpenAPI schema ──▶ MCP tools ──▶ Claude (or any other LLMs)

This post comes in two halves. First, the wiring: how you turn that schema into a working MCP server, with real code, so you understand what off-the-shelf tools are doing under the hood. Then the harder half: the problems that have nothing to do with OpenAPI and everything to do with putting a real agent in front of those tools.

First, the honest part: this isn't new

Generating LLM tools from an OpenAPI spec is a solved problem. On the Django side, packages like django-rest-framework-mcp do almost exactly this: they hook directly into your DRF configuration and expose your Views and ViewSets as MCP tools with minimal setup. There are also generic openapi → mcp converters (openapi-to-mcp, openapi-mcp-generator), and FastMCP ships native OpenAPI support.

So why read on? Because if you only ever pip install one of those, you never learn what it's doing. The day it doesn't quite fit your API, you're stuck debugging a black box. I built a small, readable reference implementation so you can see every piece. If you want a maintained dependency, use FastMCP. If you want to understand how it works, keep reading.

Part 1: The wiring, in five pieces

1. Introspect (in-process, no HTTP self-calling). The reflex is to fetch /api/schema/ over HTTP. Don't. drf-spectacular will hand you the same schema in-process, no running server required:

from drf_spectacular.generators import SchemaGenerator

generator = SchemaGenerator()
schema = generator.get_schema(request=None, public=True)

That's it: a plain dict describing every endpoint. (If your schema lives somewhere else, a remote service or a non-DRF API, you fall back to fetching a URL and resolving its $ref pointers yourself.)

2. Generate tool specs. Walk the schema's paths and turn each operation into a tool. The operationId becomes the name, the parameters become a JSON Schema. Path params are always required, query params honour their own flag:

for param in operation.get("parameters", []):
    if "$ref" in param:                         # params can be references
        param = resolve_ref(schema, param["$ref"])
    location = param.get("in")                   # "path" or "query"
    if location not in ("path", "query"):
        continue
    properties[param["name"]] = dict(param.get("schema") or {"type": "string"})
    if location == "path" or param.get("required"):
        required.append(param["name"])

Two gotchas the happy-path tutorials skip.

First, $ref resolution: a parameter can arrive as {"$ref": "#/components/parameters/Foo"} instead of an inline object. Resolve the pointer or you'll generate broken tools.

Second, name collisions and charset: MCP tool names allow only a limited character set, and two operations can share an operationId. Sanitize and de-duplicate, or the client rejects your entire tool list.

3. Serve: the one MCP decision that matters. The official mcp Python SDK gives you two ways to define tools. The high-level FastMCP @tool decorator is nice for tools you know at write time. It's the wrong tool for this job. We're generating tools at runtime, each with a JSON Schema we can't know in advance. For that, you drop down to the low-level Server:

from mcp.server.lowlevel import Server
import mcp.types as types

server = Server("my-django-api")

@server.list_tools()
async def list_tools():
    return [types.Tool(name=s.name, description=s.description,
                       inputSchema=s.input_schema) for s in specs]

@server.call_tool()
async def call_tool(name, arguments):
    spec = by_name[name]
    text = await execute(spec, arguments)
    return [types.TextContent(type="text", text=text)]

If you take one thing from this half of the post, take this: @tool is for static tools. The low-level Server is for tools you generate. Reaching for the decorator first and then fighting it is the most common wrong turn.

4. Execute. A tool spec is really two things: a schema to advertise, and the routing info to make the call. At call time you split the arguments back apart. Path params get substituted into the URL template, query params get attached as a query string. Then fire the request:

for key, value in arguments.items():
    if key in path_params:
        path = path.replace("{" + key + "}", str(value))
    elif key in query_params:
        query[key] = value

url = base_url.rstrip("/") + path
async with httpx.AsyncClient(timeout=timeout) as client:
    resp = await client.request(spec.method, url, params=query, headers=headers)

5. Two transports, one server. The same Server object serves over stdio (how Claude Desktop and Claude Code launch a local server) and Streamable HTTP (for a deployed server).

One gotcha that cost me an hour the first time: in stdio mode, stdout is the protocol channel. One stray print() corrupts the stream, and the client silently shows zero tools.🙃

Always log your diagnostics to stderr instead.

Two decisions worth making consciously

Safe by default. Only GET endpoints become tools. Auto-exposing your whole CRUD surface to an LLM, DELETE included, is how you end up explaining to your team why the agent dropped a production row. Writes are strictly opt-in (INCLUDE_METHODS = ["GET", "POST"]). Check your package's default and override it on purpose, not by accident.

Pro-tip: even within safe GET requests, you've probably got views you never want an LLM anywhere near. Heavy reporting endpoints, billing aggregations, anything expensive. Because the schema is generated in-process, you can hide those with drf-spectacular's native @extend_schema(exclude=True) decorator. It drops the endpoint from both your OpenAPI docs and the generated tool list in one move.

Auth: the part that bites. A generated tool is useless if it can't authenticate, so credentials are applied to every outgoing request, configured once (token / bearer / custom header). But I'll be straight about the ceiling: this uses static credentials.

A better version, forwarding the calling user's own credentials so each tool runs with their permissions instead of a shared service account, is a line between demo and deployment. Don't let any tutorial, this one included, tell you auth is "done" while it's pointing at a single static token.

Try it

The repo ships a runnable example/ DRF project, a tiny shop (products + orders, with an in_stock filter), so you can watch the whole loop work in about two minutes:

git clone https://github.com/Shanahan-Suresh/django-openapi-mcp
cd django-openapi-mcpnstall "git+https://github.com/Shanahan-Suresh/django-openapi-mcp"

In Django settings.py

INSTALLED_APPS = [..., "rest_framework", "drf_spectacular", "django_openapi_mcp"]
DJANGO_OPENAPI_MCP = {"BASE_URL": "http://127.0.0.1:8000"}

On a new terminal run

python manage.py run_mcp_server --transport stdio

Wire it into Claude Desktop's claude_desktop_config.json, restart, and all your endpoints are tools in your own MCP server:

Now for the fun part. Open a chat and ask Claude to interact with your Django app: "List all products that are in stock."

Claude will automatically map your prompt to the products_list tool, understand it needs to pass the in_stock=true query parameter based on your schema, and ask for permission to run it:

Once you allow it, the tool hits your local Django API, returns the JSON, and Claude formats it into a neat table. You just gave an LLM read-access to your database with zero hand-written glue code.

Part 2: Generating tools is the easy part

Everything above is an afternoon's work, and several packages will do it for you. Then you point a real agent at those tools and ask it something multi-step: "find this customer's most recent order and tell me whether everything in it is still in stock".

You discover the tools were never the hard part. The hard part is everything between the model and the tools.

In real deployment that agent usually sits behind a chat interface, a thin web UI where someone types a question and watches the tool calls stream back. The surface doesn't change any of the problems below. They're the same whether the caller is a chat app, Claude Desktop, or a cron job.

Generated tools aren't choosable tools. A raw schema gives you one tool per operation, named after its operationId. Fine for a human reading docs, nearly useless for a model deciding which tool to reach for. When the user asks "is their last order still fulfillable?", which of orders_list, orders_retrieve, and products_retrieve fire, and in what order?

You end up adding a thin layer of semantic enrichment, an index of what each tool needs and yields, so the agent can reason over capabilities ("I need something that returns an order id") instead of pattern-matching tool names.

The reasoning loop. A single tool call rarely answers a real question. "Is their last order in stock?" is at least three steps, and step 2 depends on step 1's output.

That's a ReAct-style loop: reason → call a tool → append the result to the conversation → reason again, over a single persistent message history so the model can carry an id from one step to the next.

Errors → classify → recover → retry. This is the layer that surprised me most. Generic retries don't fix semantic errors. If a call fails because a required field is missing or an id is malformed, calling it again unchanged fails identically.

So instead of "retry N times," you classify the failure and pick a recovery strategy matched to it:

Missing fields: Derive them from context (e.g., extract a username from an email).
Wrong identifiers: If the model tries to pass a name ("Bob's latest order") instead of a required ID field, intercept it, use a lookup tool to find the ID, and retry.
Malformed data: Alter wrong formats into the correct shape before hitting the API again.

Keep these as small, ranked, pluggable strategies. A real fraction of first-attempt failures get fixed without going back to the model at all: faster, cheaper, more reliable. The discipline that keeps it safe: derive, never invent. Reformatting a value the user gave you is fine, fabricating one they didn't is how you get confidently wrong answers.

Surviving the model layer. Your loop is only as available as the model behind it, and quota exhaustion and rate limits aren't edge cases in production. They're Tuesdays. A 429/402 is a different animal from a transient blip, retrying the same over-quota provider just burns time. Put an abstraction over the provider, classify quota errors separately from retryable ones, and fall back primary → secondary.

The thread running through all of it

None of those Part 2 problems are about OpenAPI or MCP. Tool generation is just transport, it gets a list of capabilities in front of the model. But everything that makes an agent actually succeed (choosing tools, chaining them, recovering from failure) lives above that line.

If there is one principle that ties both halves of this post together, it's this:
Engineer defensively, rather than hoping the model behaves.

Block destructive operations at generation time.
Derive missing data, never let the model invent it.
Classify errors before you blindly retry.

An LLM is a brilliant, entirely unreliable component. Production engineering is the harness that makes it dependable despite all that.

Wrapping up

If you are building this for work tomorrow and need a quick reliable, maintained package, check out django-mcp-server or django-rest-framework-mcp.

But if you wanted to understand what's happening under the hood, you do now. Feel free to try out the reference repo by cloning it, breaking it down and seeing how the wiring works.