I built FlashMCP, a gateway that turns any REST API into an MCP server on the fly. Here's how the architecture works.
The request flow
When a POST hits flashmcp.dev/api.stripe.com:
- URL parsing: Extract target host (api.stripe.com), base path, and optional ?spec= parameter
- Short-circuit: initialize and ping don't need a spec. Return immediately.
- Spec resolution: Check Cloudflare Cache API → fetch explicit ?spec= URL → auto-discover
- Tool mapping: Parse OpenAPI spec, map all endpoints to MCP tool definitions
- JSON-RPC dispatch: Route tools/list or tools/call to the right handler
- Upstream proxy: Build request, forward auth headers, call the real API
- Response formatting: JSON → text, images → base64 content blocks, 204 → confirmation message
All in a single Cloudflare Worker. No containers, no state, no cold starts.
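To make the first two steps concrete, here is a minimal sketch of URL parsing and the spec-free short-circuit. It is not FlashMCP's actual code: the names parseGatewayUrl, GatewayTarget, and needsSpec are hypothetical, and I'm assuming the first path segment is always the target host.

```typescript
// Hypothetical shape for illustration; the real internal types are not shown in this post.
interface GatewayTarget {
  host: string;           // e.g. "api.stripe.com"
  basePath: string;       // e.g. "/v1", or "" when the URL has no extra segments
  specUrl: string | null; // value of ?spec=, if the caller provided one
}

function parseGatewayUrl(requestUrl: string): GatewayTarget {
  const url = new URL(requestUrl);
  // The pathname looks like "/api.stripe.com/v1": the first segment is the target host.
  const segments = url.pathname.split("/").filter((s) => s.length > 0);
  const host = segments.shift();
  if (host === undefined) {
    throw new Error("Missing target host in path");
  }
  return {
    host,
    basePath: segments.length > 0 ? "/" + segments.join("/") : "",
    specUrl: url.searchParams.get("spec"),
  };
}

// JSON-RPC methods that can be answered without loading an OpenAPI spec.
const SPEC_FREE_METHODS = new Set<string>(["initialize", "ping"]);

function needsSpec(method: string): boolean {
  return !SPEC_FREE_METHODS.has(method);
}
```

Checking needsSpec before any spec resolution means initialize and ping never pay for a cache lookup or a discovery round trip.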
Auto-discovery: finding the spec
Most APIs don't tell you where their OpenAPI spec is. FlashMCP uses a 3-phase discovery system:
Phase 1: apis.guru — A community directory of 2,500+ API specs. We check if the target host has an entry.
Phase 2: HTTP Link header — Per RFC 8631, APIs can advertise their spec via:
Link: </openapi.json>; rel="service-desc"
Phase 3: .well-known/api-catalog — Per RFC 9727, fetch https://target/.well-known/api-catalog for a JSON catalog of API descriptions.
In total, discovery makes at most 3-4 HTTP requests, and results are cached for 5 minutes.
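Phase 2 is the easiest one to show in code. The sketch below pulls the service-desc target out of a Link header and resolves it against the API's base URL. The function name findServiceDesc is hypothetical, and the parser is deliberately simplified: it handles the common header shapes, not every quoting rule in RFC 8288.

```typescript
// Returns the absolute URL of the first link whose rel includes "service-desc",
// or null when the header advertises no spec. Simplified parser, see the note above.
function findServiceDesc(linkHeader: string, baseUrl: string): string | null {
  // Each entry is "<target>" followed by its parameters, up to the next "<".
  const entry = /<([^>]*)>([^<]*)/g;
  let match: RegExpExecArray | null;
  while ((match = entry.exec(linkHeader)) !== null) {
    const target = match[1] ?? "";
    const params = match[2] ?? "";
    // rel may be quoted (and hold several space-separated types) or a bare token.
    const rel = /;\s*rel\s*=\s*(?:"([^"]*)"|([^\s;,]+))/i.exec(params);
    if (rel === null) continue;
    const relTypes = (rel[1] ?? rel[2] ?? "").toLowerCase().split(/\s+/);
    if (relTypes.indexOf("service-desc") !== -1) {
      // Relative targets such as "/openapi.json" resolve against the API's origin.
      return new URL(target, baseUrl).toString();
    }
  }
  return null;
}
```

For example, findServiceDesc('</openapi.json>; rel="service-desc"', "https://api.example.com") resolves to https://api.example.com/openapi.json.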
The $ref problem
OpenAPI specs use JSON $ref pointers for schema reuse. A pet store spec might have:
{
  "Pet": {
    "properties": {
      "category": { "$ref": "#/components/schemas/Category" },
      "tags": {
        "items": { "$ref": "#/components/schemas/Tag" }
      }
    }
  }
}
Simple, right? Until you hit circular references:
{
  "TreeNode": {
    "properties": {
      "children": {
        "items": { "$ref": "#/components/schemas/TreeNode" }
      }
    }
  }
}
FlashMCP's resolveSchemaRefs() tracks visited references and replaces circular refs with { type: "object", description: "(circular reference)" }. No external dependencies — just recursive descent with a Set.
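Here is a minimal sketch of that idea, written from the description above rather than copied from the real source. Two assumptions: only local #/components/schemas/ refs are resolved, and a ref is removed from the Set once it has been fully resolved, so a schema reused in two places is not mistaken for a cycle. The actual resolveSchemaRefs() may differ on both points.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

const REF_PREFIX = "#/components/schemas/";

function resolveSchemaRefs(
  node: Json,
  schemas: { [name: string]: Json },
  inProgress: Set<string> = new Set<string>()
): Json {
  if (Array.isArray(node)) {
    return node.map((item) => resolveSchemaRefs(item, schemas, inProgress));
  }
  if (node === null || typeof node !== "object") {
    return node; // primitives are copied through unchanged
  }
  const ref = node["$ref"];
  if (typeof ref === "string" && ref.indexOf(REF_PREFIX) === 0) {
    const target = schemas[ref.slice(REF_PREFIX.length)];
    if (target === undefined) return node; // unknown ref: leave it as-is
    if (inProgress.has(ref)) {
      // This ref is already being expanded further up the call stack: a cycle.
      return { type: "object", description: "(circular reference)" };
    }
    inProgress.add(ref);
    const resolved = resolveSchemaRefs(target, schemas, inProgress);
    inProgress.delete(ref);
    return resolved;
  }
  // Plain object: resolve each property recursively.
  const out: { [key: string]: Json } = {};
  for (const key of Object.keys(node)) {
    const value = node[key];
    if (value !== undefined) out[key] = resolveSchemaRefs(value, schemas, inProgress);
  }
  return out;
}
```

With the TreeNode schema above, the outer TreeNode is expanded once and the nested items entry becomes the placeholder object instead of recursing forever.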
Body flattening
LLMs are bad at nesting. When an API expects:
POST /pets
{ "name": "Rex", "status": "available" }
A naive MCP server would define the tool as:
createPet({ body: { name: "Rex", status: "available" } })
LLMs frequently forget the body wrapper. FlashMCP flattens body fields into the top-level schema:
createPet({ name: "Rex", status: "available" })
At call time, FlashMCP rebuilds the request body from every argument that is not a declared parameter.
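Both halves of that trick fit in a short sketch. The names flattenTool and splitArguments are hypothetical, and I'm assuming that when a body field collides with a parameter name, the parameter wins; the real gateway may resolve collisions differently.

```typescript
type Schema = { [key: string]: unknown };

interface FlatTool {
  inputSchema: { type: "object"; properties: { [name: string]: Schema }; required: string[] };
  paramNames: Set<string>; // remembered so the body can be rebuilt at call time
}

// Merge path/query/header parameters and body fields into one flat input schema.
function flattenTool(
  params: { name: string; schema: Schema; required?: boolean }[],
  body: { properties: { [name: string]: Schema }; required?: string[] } | null
): FlatTool {
  const properties: { [name: string]: Schema } = {};
  const required: string[] = [];
  const paramNames = new Set<string>();
  for (const p of params) {
    properties[p.name] = p.schema;
    paramNames.add(p.name);
    if (p.required === true) required.push(p.name);
  }
  if (body !== null) {
    for (const key of Object.keys(body.properties)) {
      const schema = body.properties[key];
      // Assumption: on a name collision the parameter wins and the body field is skipped.
      if (schema === undefined || paramNames.has(key)) continue;
      properties[key] = schema;
    }
    for (const key of body.required ?? []) {
      if (!paramNames.has(key)) required.push(key);
    }
  }
  return { inputSchema: { type: "object", properties, required }, paramNames };
}

// At call time: declared parameters go to the URL/headers, everything else is the body.
function splitArguments(args: { [name: string]: unknown }, paramNames: Set<string>) {
  const parameters: { [name: string]: unknown } = {};
  const body: { [name: string]: unknown } = {};
  for (const key of Object.keys(args)) {
    (paramNames.has(key) ? parameters : body)[key] = args[key];
  }
  return { parameters, body };
}
```

For a hypothetical call with a petId path parameter, { petId: 7, name: "Rex", status: "available" } splits into parameters { petId: 7 } and body { name: "Rex", status: "available" }.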
Pagination for large APIs
Some APIs have 500+ endpoints. Dumping all tools in one response overwhelms LLMs. FlashMCP uses cursor-based pagination — 50 tools per page, with a nextCursor for more.
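A minimal sketch of that pagination, assuming the cursor is simply the offset of the next tool encoded as a string. MCP treats cursors as opaque, so the real encoding may well be different; paginateTools and ToolPage are hypothetical names.

```typescript
interface ToolPage<T> {
  tools: T[];
  nextCursor?: string; // omitted on the last page
}

function paginateTools<T>(all: T[], cursor?: string, pageSize: number = 50): ToolPage<T> {
  // Assumption: the cursor is the start offset as a decimal string.
  const start = cursor === undefined ? 0 : Number(cursor);
  if (!Number.isInteger(start) || start < 0 || start > all.length) {
    throw new Error("Invalid cursor");
  }
  const end = Math.min(start + pageSize, all.length);
  const tools = all.slice(start, end);
  return end < all.length ? { tools, nextCursor: String(end) } : { tools };
}
```

A client keeps calling tools/list with the returned nextCursor until a page comes back without one.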
What I learned
- KV TTL minimum is 60 seconds: Cloudflare KV throws a 400 if you set expirationTtl below 60. This surfaces as a generic 1101 error to clients.
- new URL(path, base) strips base paths: new URL("/pets", "https://api.com/v3") gives https://api.com/pets, not https://api.com/v3/pets. Use string concatenation.
- Stateless is enough: MCP supports sessions, but for a proxy gateway, stateless works perfectly. Each request is independent.
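The first two lessons are easy to guard against with small helpers. This is a sketch with hypothetical names (safeExpirationTtl, joinUrl), not code from the gateway itself.

```typescript
// Cloudflare KV rejects expirationTtl values below 60 seconds, so clamp before writing.
const KV_MIN_TTL_SECONDS = 60;

function safeExpirationTtl(seconds: number): number {
  return Math.max(KV_MIN_TTL_SECONDS, Math.ceil(seconds));
}

// Join a base URL and a path with plain string concatenation so the base path survives.
function joinUrl(base: string, path: string): string {
  return base.replace(/\/+$/, "") + "/" + path.replace(/^\/+/, "");
}

// The pitfall: an absolute path replaces the base path entirely.
const stripped = new URL("/pets", "https://api.com/v3").toString(); // "https://api.com/pets"
const joined = joinUrl("https://api.com/v3", "/pets");              // "https://api.com/v3/pets"
```

Passing safeExpirationTtl(ttl) as the expirationTtl option keeps a too-small TTL from turning into an uncaught exception inside the Worker.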
Try it: flashmcp.dev | Playground | Gallery