Docat
Why LLMs Suck at Calling APIs (And How Flat Schemas Fix It)

Every AI tool-calling framework has the same dirty secret: LLMs are terrible at constructing nested JSON.

If you've built MCP servers, OpenAI function calls, or any tool-use integration, you've seen it. The model hallucinates field names, forgets closing braces, puts arrays where objects should be, and confidently generates invalid payloads.

The fix is surprisingly simple: flatten your schemas.


The Problem

Consider a typical REST API endpoint:

```http
POST /api/orders
Content-Type: application/json

{
  "customer": {
    "name": "Alice",
    "address": {
      "street": "123 Main St",
      "city": "Portland",
      "state": "OR",
      "zip": "97201"
    }
  },
  "items": [
    { "sku": "WIDGET-1", "quantity": 2 }
  ],
  "shipping": {
    "method": "express",
    "instructions": "Leave at door"
  }
}
```

When you expose this as an MCP tool with the full nested schema, you're asking the LLM to:

  1. Remember a 3-level deep JSON structure
  2. Keep track of which braces belong to which object
  3. Correctly nest address inside customer, not at the root
  4. Generate a valid array of objects for items
  5. Get all of this right in a single generation pass with no backtracking

The success rate drops with every level of nesting. In our testing, a 3-level nested schema produces malformed tool calls 15-25% of the time. At 4+ levels, it's worse.


Why Nesting Fails

LLMs generate tokens left-to-right. They don't have a "syntax checker" running in parallel — they predict the next token based on context.

When generating nested JSON:

```
{
  "customer": {
    "name": "Alice",
    "address": {
      "street": "123 Main St",
```

By this point, the model is tracking three open braces and must remember the exact structure to close them correctly. Each additional level of nesting adds more state the model has to carry through generation.

Common failure modes:

  • Premature closing: "street": "123 Main St" } } — closes address and customer too early
  • Wrong level: Puts city at the customer level instead of inside address
  • Missing objects: Omits the address wrapper entirely, putting street/city at the customer level
  • Array confusion: Generates "items": { "sku": "..." } instead of an array

These aren't bugs in the model. They're a natural consequence of autoregressive generation on complex structures.


The Solution: Flat Schemas

Instead of exposing the nested schema, flatten it:

```json
{
  "customer_name": { "type": "string", "description": "Customer name" },
  "customer_address_street": { "type": "string" },
  "customer_address_city": { "type": "string" },
  "customer_address_state": { "type": "string" },
  "customer_address_zip": { "type": "string" },
  "items_0_sku": { "type": "string", "description": "First item SKU" },
  "items_0_quantity": { "type": "integer" },
  "shipping_method": { "type": "string", "enum": ["standard", "express"] },
  "shipping_instructions": { "type": "string" }
}
```

The LLM now generates a simple key-value map:

```json
{
  "customer_name": "Alice",
  "customer_address_street": "123 Main St",
  "customer_address_city": "Portland",
  "customer_address_state": "OR",
  "customer_address_zip": "97201",
  "items_0_sku": "WIDGET-1",
  "items_0_quantity": 2,
  "shipping_method": "express",
  "shipping_instructions": "Leave at door"
}
```

No nesting. No ambiguity. The tool's executor reconstructs the proper nested JSON before sending the HTTP request.

Results:

  • Tool-call accuracy: ~95%+ (up from ~75-85% with nested schemas)
  • Average retries per call: 0.1 (down from 0.5-1.0)
  • Token usage: roughly the same (flat keys are longer, but fewer retries)

When NOT to Flatten

Flattening isn't always the right choice:

  • Very simple schemas (1 level deep, < 5 fields) — nesting is fine, no accuracy impact
  • User-facing tools where the schema IS the API — users expect the real structure
  • Dynamic/recursive schemas — trees, linked lists, etc. can't be statically flattened

For the 80% of cases where you're wrapping a REST API for LLM consumption, flatten everything.


Implementing This in MCP

If you're building MCP tools by hand, here's a minimal flattener:

```typescript
function flattenSchema(
  schema: Record<string, any>,
  prefix = "",
  result: Record<string, any> = {}
): Record<string, any> {
  for (const [key, value] of Object.entries(schema.properties ?? {})) {
    const fullKey = prefix ? `${prefix}_${key}` : key;

    if (value.type === "object" && value.properties) {
      // Nested object: customer.address.city -> customer_address_city
      flattenSchema(value, fullKey, result);
    } else if (value.type === "array" && value.items?.type === "object") {
      // Array of objects: flatten the first element, items[0].sku -> items_0_sku
      flattenSchema(value.items, `${fullKey}_0`, result);
    } else {
      result[fullKey] = { ...value };
    }
  }
  return { type: "object", properties: result };
}
```

Then in your tool executor, unflatten the args back into the nested structure before making the API call.
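The reverse step can be sketched like this. It is a minimal sketch with one loud assumption: original field names contain no underscores, so splitting flat keys on `_` is unambiguous. A production version would instead keep a reverse key map built during flattening.

```typescript
// Minimal unflattener sketch: rebuilds nested objects/arrays from
// underscore-delimited flat keys ("items_0_sku" -> items[0].sku).
// ASSUMPTION: original field names contain no underscores; otherwise
// use a reverse map recorded while flattening.
function unflattenArgs(flat: Record<string, unknown>): Record<string, any> {
  const root: Record<string, any> = {};
  for (const [flatKey, value] of Object.entries(flat)) {
    const parts = flatKey.split("_");
    let node: any = root;
    for (let i = 0; i < parts.length - 1; i++) {
      const key = parts[i];
      // A numeric next segment means the current key holds an array.
      const nextIsIndex = /^\d+$/.test(parts[i + 1]);
      if (node[key] === undefined) node[key] = nextIsIndex ? [] : {};
      node = node[key];
    }
    node[parts[parts.length - 1]] = value;
  }
  return root;
}
```

Feeding it the flat args from earlier yields the nested `customer`/`items` payload the API expects.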

Or skip all of this and use mcp-openapi, which does it automatically:

```bash
npx mcp-openapi --spec https://your-api.com/openapi.json
```

It reads your OpenAPI spec, flattens all parameter schemas, generates MCP tools, and handles the unflatten→HTTP request mapping internally. Zero config.


Response Side: Smart Truncation

The other half of the problem is responses. Large API responses (paginated lists, deeply nested objects) can blow up the LLM's context window or get hard-truncated mid-JSON, leaving the model confused.

Smart truncation strategies:

1. Array slicing — Show first N items + metadata:

```json
[
  { "id": 1, "name": "Widget A" },
  { "id": 2, "name": "Widget B" },
  { "_meta": "showing 2 of 847 items. Use offset/limit to paginate." }
]
```
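The slicing itself is simple to implement; here is a minimal sketch (the function name and `_meta` wording are illustrative, following the example above):

```typescript
// Array-slicing truncation sketch: keep the first `limit` items and
// append a metadata entry telling the model how to page for the rest.
function sliceForContext<T>(items: T[], limit = 2): (T | { _meta: string })[] {
  if (items.length <= limit) return items;
  return [
    ...items.slice(0, limit),
    { _meta: `showing ${limit} of ${items.length} items. Use offset/limit to paginate.` },
  ];
}
```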

2. Depth pruning — Summarize beyond a certain depth:

```json
{
  "user": {
    "name": "Alice",
    "orders": "[array(23)]",
    "preferences": "[object(8 keys)]"
  }
}
```
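A minimal depth-pruner might look like this (a sketch; the summary strings follow the example above):

```typescript
// Depth-pruning sketch: beyond `maxDepth`, replace arrays and objects
// with short summaries like "[array(23)]" / "[object(8 keys)]" so the
// model still sees the shape without the full payload.
function pruneDepth(value: unknown, maxDepth: number): unknown {
  if (value === null || typeof value !== "object") return value;
  if (maxDepth === 0) {
    return Array.isArray(value)
      ? `[array(${value.length})]`
      : `[object(${Object.keys(value).length} keys)]`;
  }
  if (Array.isArray(value)) return value.map((v) => pruneDepth(v, maxDepth - 1));
  return Object.fromEntries(
    Object.entries(value).map(([k, v]) => [k, pruneDepth(v, maxDepth - 1)])
  );
}
```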

3. Field filtering — Use JMESPath to extract only relevant fields:

```
items[].{id: id, name: name, price: price}
```
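Full JMESPath support comes from a library; as a minimal sketch of the same idea, a hand-rolled whitelist projection (names are illustrative, not part of any real API):

```typescript
// Field-filtering sketch: project each item down to a whitelist of
// fields, mimicking items[].{id: id, name: name, price: price}.
// A real implementation would evaluate the JMESPath expression itself.
function projectFields<T extends Record<string, unknown>>(
  items: T[],
  fields: string[]
): Partial<T>[] {
  return items.map((item) =>
    Object.fromEntries(
      fields.filter((f) => f in item).map((f) => [f, item[f]])
    ) as Partial<T>
  );
}
```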

These techniques preserve the shape of the data while keeping it within context limits. The LLM can still understand the structure and ask for more details if needed.


Key Takeaways

  1. LLMs fail on nested JSON — 15-25% error rate at 3+ levels of nesting
  2. Flat schemas fix this — simple key-value pairs reduce errors to < 5%
  3. The tool executor handles reconstruction — flatten for the LLM, unflatten for the API
  4. Smart truncation > hard truncation — preserve data structure, not just characters
  5. For OpenAPI specs, automate it — tools like mcp-openapi handle this end-to-end

If you're building MCP servers or any LLM tool integration, try flattening your schemas. The accuracy improvement is immediate and dramatic.

Star the repo: github.com/Docat0209/mcp-openapi
