DEV Community

Nova

JSON-First Prompting: Valid Structured Output (Plus a Repair Loop)

If you use LLMs for anything beyond “brainstorm ideas”, you quickly run into the same bottleneck:

You don’t want prose.

You want structured output you can pipe into code:

  • a list of tasks with owners
  • a set of test cases
  • a config file
  • a PR checklist
  • a JSON object you can validate

The problem: LLMs love to almost follow your format.

This post is my practical approach to JSON-first prompting—including a recovery flow for when the model outputs invalid JSON.


The principle: treat output like an API response

When you want JSON, stop prompting like you’re chatting and start prompting like you’re designing an API contract.

That means:

  • explicitly define the schema
  • forbid extra keys
  • specify error behavior
  • validate output in code

A copy/paste JSON-first prompt

Use this template as a starting point.

You are a careful assistant that outputs ONLY valid JSON.

Return a JSON object matching this schema:
{
  "title": string,
  "summary": string,
  "tasks": [
    {
      "id": string,
      "description": string,
      "priority": "low" | "medium" | "high",
      "estimate_hours": number,
      "dependencies": string[]
    }
  ]
}

Rules:
- Output must be VALID JSON (double quotes, no trailing commas).
- Output must contain ONLY the JSON object. No markdown, no commentary.
- Do not include keys not in the schema.
- If the input is missing critical info, set summary to "NEEDS_INFO" and include a "tasks" entry with id "question" describing what you need.

Input:
<PASTE YOUR NOTES HERE>
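
To make the contract concrete, here's the kind of response this prompt should produce (illustrative values, not from a real run):

```json
{
  "title": "Onboarding flow cleanup",
  "summary": "Tasks extracted from the sprint notes.",
  "tasks": [
    {
      "id": "t1",
      "description": "Add validation to the signup form",
      "priority": "high",
      "estimate_hours": 4,
      "dependencies": []
    },
    {
      "id": "t2",
      "description": "Write tests for the validation logic",
      "priority": "medium",
      "estimate_hours": 2,
      "dependencies": ["t1"]
    }
  ]
}
```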

This does a few important things:

  • It says “ONLY JSON” twice (people underestimate how much that helps)
  • It defines acceptable enums
  • It specifies what to do when info is missing (instead of hallucinating)

Add a “strictness knob” when it matters

Sometimes you want the model to ask questions. Sometimes you want it to make assumptions.

I add a strictness knob:

Strictness: 0 = assume reasonable defaults, 1 = ask questions when unsure.
Strictness = 1

Then include:

- If Strictness = 1 and any required field can’t be derived, return NEEDS_INFO behavior.

It prevents the model from guessing estimates, priorities, etc.


The reality: you still need validation

Even with a great prompt, you should validate.

Here’s a minimal Node.js snippet that:

1) extracts the JSON object from a response (helpful when a model leaks extra text around it)
2) parses it
3) validates basic shape

// parse-json.js

// Grab everything from the first "{" to the last "}".
// Naive, but it handles models that wrap the object in prose or markdown fences.
export function extractJson(text) {
  const start = text.indexOf("{");
  const end = text.lastIndexOf("}");
  if (start === -1 || end === -1 || end <= start) {
    throw new Error("No JSON object found");
  }
  return text.slice(start, end + 1);
}

export function parseJson(text) {
  const jsonText = extractJson(text);
  return JSON.parse(jsonText);
}

// Spot-check the shape; not a full JSON Schema validator.
export function assertSchema(obj) {
  if (typeof obj.title !== "string") throw new Error("title must be string");
  if (!Array.isArray(obj.tasks)) throw new Error("tasks must be array");
  for (const t of obj.tasks) {
    if (typeof t.id !== "string") throw new Error("task.id must be string");
    if (typeof t.description !== "string") throw new Error("task.description must be string");
  }
}

This isn’t a full JSON Schema validator, but it catches 90% of “oops” errors fast.
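
Here's what that looks like on a response that leaks prose around the object, one of the most common failures. The helper is inlined so the snippet runs standalone:

```javascript
// Same extractJson as in parse-json.js, inlined for a standalone demo.
function extractJson(text) {
  const start = text.indexOf("{");
  const end = text.lastIndexOf("}");
  if (start === -1 || end === -1 || end <= start) {
    throw new Error("No JSON object found");
  }
  return text.slice(start, end + 1);
}

// A typical "almost followed the format" response.
const leaky =
  "Sure! Here is your JSON:\n" +
  '{"title": "Sprint plan", "tasks": [{"id": "t1", "description": "Set up CI"}]}\n' +
  "Let me know if you need anything else.";

const obj = JSON.parse(extractJson(leaky));
console.log(obj.title);        // "Sprint plan"
console.log(obj.tasks.length); // 1
```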


Recovery flow: when the JSON is invalid

When parsing fails, don’t throw away the whole response. Instead, send a repair prompt.

Repair prompt (copy/paste)

Your previous output was invalid JSON.

Fix it.

Rules:
- Output ONLY valid JSON.
- Keep the same data and meaning.
- Do not add new keys.

Here is the invalid output:
<PASTE>

This works surprisingly well.

Bonus: validate → repair → validate loop

In practice, I do:

1) ask for JSON
2) parse + validate
3) if fail, run repair prompt
4) parse + validate again

You can even automate this loop.
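
A minimal sketch of that loop. `callModel(prompt)` is a hypothetical stand-in for whatever LLM client you use, and `validate(obj)` is any function that throws on a bad shape (like `assertSchema` from parse-json.js):

```javascript
// Sketch: validate → repair → validate, with one repair attempt by default.
// callModel(prompt) is a stand-in for your LLM client;
// validate(obj) should throw when the shape is wrong.
async function getValidJson(callModel, prompt, validate, maxRepairs = 1) {
  let raw = await callModel(prompt);
  for (let attempt = 0; ; attempt++) {
    try {
      // Same naive extraction as parse-json.js: first "{" to last "}".
      const start = raw.indexOf("{");
      const end = raw.lastIndexOf("}");
      const obj = JSON.parse(raw.slice(start, end + 1));
      validate(obj);
      return obj;
    } catch (err) {
      if (attempt >= maxRepairs) throw err; // give up after maxRepairs fixes
      raw = await callModel(
        "Your previous output was invalid JSON.\n\nFix it.\n\nRules:\n" +
          "- Output ONLY valid JSON.\n" +
          "- Keep the same data and meaning.\n" +
          "- Do not add new keys.\n\n" +
          "Here is the invalid output:\n" + raw
      );
    }
  }
}
```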


A more robust option: JSON Schema (when available)

Some LLM APIs support “structured output” / JSON Schema mode. If you have it, use it.

But even then, the prompting principles stay the same:

  • define the schema
  • constrain output
  • validate anyway

Because the failure mode isn’t just “invalid JSON”—it’s valid JSON with the wrong meaning.
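
For reference, the task schema from the template translates to JSON Schema like this. `additionalProperties: false` is the machine-checkable version of "do not include keys not in the schema":

```json
{
  "type": "object",
  "additionalProperties": false,
  "required": ["title", "summary", "tasks"],
  "properties": {
    "title": { "type": "string" },
    "summary": { "type": "string" },
    "tasks": {
      "type": "array",
      "items": {
        "type": "object",
        "additionalProperties": false,
        "required": ["id", "description", "priority", "estimate_hours", "dependencies"],
        "properties": {
          "id": { "type": "string" },
          "description": { "type": "string" },
          "priority": { "enum": ["low", "medium", "high"] },
          "estimate_hours": { "type": "number" },
          "dependencies": { "type": "array", "items": { "type": "string" } }
        }
      }
    }
  }
}
```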


Practical example: generate test cases as JSON

Here’s a prompt that generates unit test cases you can feed into a test runner.

Output ONLY JSON.

Schema:
{
  "function": string,
  "cases": [
    {
      "name": string,
      "input": object,
      "expected": object,
      "notes": string
    }
  ]
}

Rules:
- Valid JSON only.
- cases must include: happy path, edge cases, invalid inputs.

Function contract:
- function: normalizeUser(user)
- input: { name?: string, email?: string, age?: number }
- output: { name: string, email: string, age: number, isAdult: boolean }
- behavior:
  - missing name/email => throw "INVALID_USER"
  - missing age => default to 0

Generate 8 cases.

This gives you something you can turn into real tests quickly.


The key mindset shift

When you prompt for structured output, you’re not “asking for help.”

You’re defining:

  • a contract
  • a validator
  • and a retry path

That combo is what makes LLMs usable in pipelines.


Want more copy/paste templates?

I publish the templates I actually use (code review, debugging, planning, writing) in my Prompt Engineering Cheatsheet.
