Docat
Why LLMs Suck at Calling APIs (And How Flat Schemas Fix It)

Every AI tool-calling framework has the same dirty secret: LLMs are terrible at constructing nested JSON.

If you've built MCP servers, OpenAI function calls, or any tool-use integration, you've seen it. The model hallucinates field names, forgets closing braces, puts arrays where objects should be, and confidently generates invalid payloads.

The fix is surprisingly simple: flatten your schemas.


The Problem

Consider a typical REST API endpoint:

```http
POST /api/orders
Content-Type: application/json

{
  "customer": {
    "name": "Alice",
    "address": {
      "street": "123 Main St",
      "city": "Portland",
      "state": "OR",
      "zip": "97201"
    }
  },
  "items": [
    { "sku": "WIDGET-1", "quantity": 2 }
  ],
  "shipping": {
    "method": "express",
    "instructions": "Leave at door"
  }
}
```

When you expose this as an MCP tool with the full nested schema, you're asking the LLM to:

  1. Remember a 3-level deep JSON structure
  2. Keep track of which braces belong to which object
  3. Correctly nest address inside customer, not at the root
  4. Generate a valid array of objects for items
  5. Get all of this right in a single generation pass with no backtracking

The success rate drops with every level of nesting. In our testing, a 3-level nested schema produces malformed tool calls 15-25% of the time. At 4+ levels, it's worse.


Why Nesting Fails

LLMs generate tokens left-to-right. They don't have a "syntax checker" running in parallel — they predict the next token based on context.

When generating nested JSON:

```
{
  "customer": {
    "name": "Alice",
    "address": {
      "street": "123 Main St",
```

By this point, the model is tracking three open braces and must remember the exact structure to close them correctly. Each additional level of nesting adds more state the model has to carry through generation.

Common failure modes:

  • Premature closing: "street": "123 Main St" } } — closes address and customer too early
  • Wrong level: Puts city at the customer level instead of inside address
  • Missing objects: Omits the address wrapper entirely, putting street/city at the customer level
  • Array confusion: Generates "items": { "sku": "..." } instead of an array

These aren't bugs in the model. They're a natural consequence of autoregressive generation on complex structures.


The Solution: Flat Schemas

Instead of exposing the nested schema, flatten it:

```json
{
  "customer_name": { "type": "string", "description": "Customer name" },
  "customer_address_street": { "type": "string" },
  "customer_address_city": { "type": "string" },
  "customer_address_state": { "type": "string" },
  "customer_address_zip": { "type": "string" },
  "items_0_sku": { "type": "string", "description": "First item SKU" },
  "items_0_quantity": { "type": "integer" },
  "shipping_method": { "type": "string", "enum": ["standard", "express"] },
  "shipping_instructions": { "type": "string" }
}
```

The LLM now generates a simple key-value map:

```json
{
  "customer_name": "Alice",
  "customer_address_street": "123 Main St",
  "customer_address_city": "Portland",
  "customer_address_state": "OR",
  "customer_address_zip": "97201",
  "items_0_sku": "WIDGET-1",
  "items_0_quantity": 2,
  "shipping_method": "express",
  "shipping_instructions": "Leave at door"
}
```

No nesting. No ambiguity. The tool's executor reconstructs the proper nested JSON before sending the HTTP request.

Results:

  • Tool-call accuracy: ~95%+ (up from ~75-85% with nested schemas)
  • Average retries per call: 0.1 (down from 0.5-1.0)
  • Token usage: roughly the same (flat keys are longer, but fewer retries)

When NOT to Flatten

Flattening isn't always the right choice:

  • Very simple schemas (1 level deep, < 5 fields) — nesting is fine, no accuracy impact
  • User-facing tools where the schema IS the API — users expect the real structure
  • Dynamic/recursive schemas — trees, linked lists, etc. can't be statically flattened

For the 80% of cases where you're wrapping a REST API for LLM consumption, flatten everything.


Implementing This in MCP

If you're building MCP tools by hand, here's a minimal flattener:

```typescript
function flattenSchema(
  schema: Record<string, any>,
  prefix = "",
  result: Record<string, any> = {}
): Record<string, any> {
  for (const [key, value] of Object.entries(schema.properties ?? {})) {
    const fullKey = prefix ? `${prefix}_${key}` : key;

    if (value.type === "object" && value.properties) {
      // Nested object: customer.address.city -> customer_address_city
      flattenSchema(value, fullKey, result);
    } else if (value.type === "array" && value.items?.type === "object") {
      // Array of objects: flatten the first element, items[0].sku -> items_0_sku
      flattenSchema(value.items, `${fullKey}_0`, result);
    } else {
      result[fullKey] = { ...value };
    }
  }
  return { type: "object", properties: result };
}
```

Then in your tool executor, unflatten the args back into the nested structure before making the API call.
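The reverse step can be sketched like this. It is a minimal sketch with one loud assumption: original field names contain no underscores, so splitting flat keys on `_` is unambiguous. A production version would instead keep a reverse key map built during flattening.

```typescript
// Minimal unflattener sketch: rebuilds nested objects/arrays from
// underscore-delimited flat keys ("items_0_sku" -> items[0].sku).
// ASSUMPTION: original field names contain no underscores; otherwise
// use a reverse map recorded while flattening.
function unflattenArgs(flat: Record<string, unknown>): Record<string, any> {
  const root: Record<string, any> = {};
  for (const [flatKey, value] of Object.entries(flat)) {
    const parts = flatKey.split("_");
    let node: any = root;
    for (let i = 0; i < parts.length - 1; i++) {
      const key = parts[i];
      // A numeric next segment means the current key holds an array.
      const nextIsIndex = /^\d+$/.test(parts[i + 1]);
      if (node[key] === undefined) node[key] = nextIsIndex ? [] : {};
      node = node[key];
    }
    node[parts[parts.length - 1]] = value;
  }
  return root;
}
```

Feeding it the flat args from earlier yields the nested `customer`/`items` payload the API expects.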

Or skip all of this and use mcp-openapi, which does it automatically:

```bash
npx mcp-openapi --spec https://your-api.com/openapi.json
```

It reads your OpenAPI spec, flattens all parameter schemas, generates MCP tools, and handles the unflatten→HTTP request mapping internally. Zero config.


Response Side: Smart Truncation

The other half of the problem is responses. Large API responses (paginated lists, deeply nested objects) can blow up the LLM's context window or get hard-truncated mid-JSON, leaving the model confused.

Smart truncation strategies:

1. Array slicing — Show first N items + metadata:

```json
[
  { "id": 1, "name": "Widget A" },
  { "id": 2, "name": "Widget B" },
  { "_meta": "showing 2 of 847 items. Use offset/limit to paginate." }
]
```
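The slicing itself is simple to implement; here is a minimal sketch (the function name and `_meta` wording are illustrative, following the example above):

```typescript
// Array-slicing truncation sketch: keep the first `limit` items and
// append a metadata entry telling the model how to page for the rest.
function sliceForContext<T>(items: T[], limit = 2): (T | { _meta: string })[] {
  if (items.length <= limit) return items;
  return [
    ...items.slice(0, limit),
    { _meta: `showing ${limit} of ${items.length} items. Use offset/limit to paginate.` },
  ];
}
```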

2. Depth pruning — Summarize beyond a certain depth:

```json
{
  "user": {
    "name": "Alice",
    "orders": "[array(23)]",
    "preferences": "[object(8 keys)]"
  }
}
```
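A minimal depth-pruner might look like this (a sketch; the summary strings follow the example above):

```typescript
// Depth-pruning sketch: beyond `maxDepth`, replace arrays and objects
// with short summaries like "[array(23)]" / "[object(8 keys)]" so the
// model still sees the shape without the full payload.
function pruneDepth(value: unknown, maxDepth: number): unknown {
  if (value === null || typeof value !== "object") return value;
  if (maxDepth === 0) {
    return Array.isArray(value)
      ? `[array(${value.length})]`
      : `[object(${Object.keys(value).length} keys)]`;
  }
  if (Array.isArray(value)) return value.map((v) => pruneDepth(v, maxDepth - 1));
  return Object.fromEntries(
    Object.entries(value).map(([k, v]) => [k, pruneDepth(v, maxDepth - 1)])
  );
}
```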

3. Field filtering — Use JMESPath to extract only relevant fields:

```
items[].{id: id, name: name, price: price}
```
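Full JMESPath support comes from a library; as a minimal sketch of the same idea, a hand-rolled whitelist projection (names are illustrative, not part of any real API):

```typescript
// Field-filtering sketch: project each item down to a whitelist of
// fields, mimicking items[].{id: id, name: name, price: price}.
// A real implementation would evaluate the JMESPath expression itself.
function projectFields<T extends Record<string, unknown>>(
  items: T[],
  fields: string[]
): Partial<T>[] {
  return items.map((item) =>
    Object.fromEntries(
      fields.filter((f) => f in item).map((f) => [f, item[f]])
    ) as Partial<T>
  );
}
```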

These techniques preserve the shape of the data while keeping it within context limits. The LLM can still understand the structure and ask for more details if needed.


Key Takeaways

  1. LLMs fail on nested JSON — 15-25% error rate at 3+ levels of nesting
  2. Flat schemas fix this — simple key-value pairs reduce errors to < 5%
  3. The tool executor handles reconstruction — flatten for the LLM, unflatten for the API
  4. Smart truncation > hard truncation — preserve data structure, not just characters
  5. For OpenAPI specs, automate it — tools like mcp-openapi handle this end-to-end

If you're building MCP servers or any LLM tool integration, try flattening your schemas. The accuracy improvement is immediate and dramatic.

Star the repo: github.com/Docat0209/mcp-openapi
