How I ended up contributing five ways to shape an LLM's output

#ai #opensource #node #javascript

A few months back I was on a project that needed LLMs to return actual structured data, not prose I'd have to regex my way through. Simple enough in theory, except every provider wanted the shape described differently - Anthropic's SDK wanted a tool schema, OpenAI had its own response format, and Ollama was basically raw prompting and hoping. I was rewriting the same schema three times per feature and writing my own retry-on-invalid-output loop for any backend that didn't guarantee valid output.

That's how I found ShapeCraft (@aviasole/shapecraft on npm). Define a schema once, hand it a model, and get back validated data plus a guaranteeLevel telling you how much to trust it. That last part matters because "guaranteed" means something different under OpenAI's native schema enforcement, Anthropic's prompt-and-validate approach, and Ollama's token-level grammar constraints.

It already supported Zod out of the box:

import { z } from "zod";
import { generate, openai } from "@aviasole/shapecraft";

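const schema = z.object({ name: z.string(), score: z.number() });
// Example prompt for illustration; use whatever text your extraction needs.
const prompt = "Extract the player's name and score: Ada scored 42 points.";
const result = await generate(openai({ model: "gpt-4o-mini" }), schema, prompt);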

That covered most of what I needed. But a few things in my actual project didn't fit "define a Zod schema," so instead of working around it, I opened some PRs.

Part of the data I was extracting already had a JSON Schema, generated by another tool upstream. Rewriting it as Zod types just to satisfy the library felt like busywork, so that became the first addition - pass the JSON Schema straight through:

const result = await generate(model, {
  jsonSchema: {
    type: "object",
    properties: { name: { type: "string" }, score: { type: "number" } },
    required: ["name", "score"],
  },
}, prompt);

Then I hit cases where I didn't need an object at all, just a string in a specific format. Wrapping that in an object schema felt like putting a hat on a hat, so next came a plain regex option:

const result = await generate(model, {
  pattern: /^\d{4}-\d{2}-\d{2}$/,
}, "What is today's date?");
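
Worth remembering that a pattern only constrains the shape of the string. The regex above happily accepts "2024-13-45". If the value has to be a real calendar date, a follow-up check along these lines can help. This is plain JavaScript, not part of ShapeCraft, and isRealDate is a name made up for this sketch:

```javascript
// Same pattern as in the example above: shape only, not calendar validity.
const ISO_DATE = /^\d{4}-\d{2}-\d{2}$/;

// Hypothetical helper: true only if the text is a real calendar date.
// Note: Date.UTC maps years 0-99 to 1900-1999, so such years fail this check.
function isRealDate(text) {
  if (!ISO_DATE.test(text)) return false;
  const [year, month, day] = text.split("-").map(Number);
  const date = new Date(Date.UTC(year, month - 1, day));
  // Out-of-range months or days roll over, so the parts stop matching.
  return (
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day
  );
}
```

You could run this on the returned string, or fold it into a validator function if the rule matters enough to retry on.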

After that, a rule that only made sense as logic, not a type - something like "required only if this other field equals X." Technically expressible as a schema, miserable to read. So I added a way to pass a validator function directly, plus a hint so the model still has something to aim for:

const result = await generate(model, {
  validate: (output) => typeof output === "object" && output !== null && "id" in output,
  hint: { type: "object", properties: { id: { type: "string" } } },
}, prompt);
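
For the "required only if this other field equals X" case, the validator might look something like the sketch below. The field names (customerType, gstin) and the rule itself are made up for illustration; the only thing taken from the example above is that the validator receives the output and returns a boolean.

```javascript
// Hypothetical rule: `gstin` is required only when `customerType` is "business".
function validateCustomer(output) {
  // Reject anything that isn't a plain non-null object.
  if (typeof output !== "object" || output === null) return false;
  if (typeof output.customerType !== "string") return false;

  if (output.customerType === "business") {
    // Conditional requirement: must be a non-blank string.
    return typeof output.gstin === "string" && output.gstin.trim() !== "";
  }
  // For every other customer type, `gstin` is optional.
  return true;
}
```

A function like this would go in the validate slot, with a hint describing both fields so the model knows what to aim for.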

The one that mattered most to me was separate from this project entirely. I also do Tally/TDL and GST integration work, and that world runs on XML - nested tags, required fields buried a few levels deep. None of the above helped there since the output isn't JSON-shaped at all. So the last addition was template-based XML: give it an example with typed placeholders, model fills them in.

const result = await generate(model, {
  xml: {
    template: `<book>
  <title>{string}</title>
  <author>{string}</author>
  <year>{number}</year>
</book>`,
    required: ["title", "author"],
  },
}, 'Extract: "Clean Code" by Robert C. Martin, 2008.');

This one took the most back-and-forth. Placeholders are deliberately limited to {string}, {number}, {boolean}, and a typo throws before the model is even called - that's a template bug, not something worth a retry.
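
To make the fail-fast idea concrete, here is a rough sketch of checking placeholders before any model call. It is not ShapeCraft's actual code; assertValidPlaceholders and the error message are invented for this example, and it assumes every {…} in the template is meant to be a placeholder.

```javascript
// The three placeholder types named above.
const ALLOWED_PLACEHOLDERS = new Set(["string", "number", "boolean"]);

// Hypothetical helper: throws on the first unknown placeholder.
function assertValidPlaceholders(template) {
  // Find every {…} span and check its contents against the allowed set.
  for (const match of template.matchAll(/\{([^{}]*)\}/g)) {
    const name = match[1];
    if (!ALLOWED_PLACEHOLDERS.has(name)) {
      throw new Error(`Unknown placeholder {${name}} in XML template`);
    }
  }
}
```

Throwing here rather than retrying makes sense because no amount of re-prompting fixes a typo in the template.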

The trickier part was literal text. Anything outside the {} is supposed to survive untouched, but it's best-effort by default - a model will occasionally "improve" fixed text that reads like an instruction, since nothing marks it as untouchable. There's an enforceLiterals: true flag that force-corrects every literal after the fact if you need that guaranteed. Or, often simpler, just leave the fixed value out of the template and splice it into result.data yourself afterward.
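
If you only want to detect literal drift rather than correct it, one approach is to turn the template into a regex and test the output against it. This is a sketch of that idea only, not what enforceLiterals does internally; both function names are made up, and the match is whitespace-sensitive, so a model that re-indents the XML will fail it.

```javascript
// Hypothetical helper: literals must match exactly, placeholders match anything.
function templateToRegex(template) {
  const pattern = template
    // Split on the supported placeholders, leaving only literal segments.
    .split(/\{(?:string|number|boolean)\}/)
    // Escape regex metacharacters inside each literal segment.
    .map((literal) => literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    // Each placeholder becomes a lazy "anything, including newlines" group.
    .join("([\\s\\S]*?)");
  return new RegExp(`^${pattern}$`);
}

// True when every literal in the template appears unchanged in the output.
function literalsIntact(template, output) {
  return templateToRegex(template).test(output);
}
```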

required also checks non-emptiness at any depth, so <items></items> still triggers a retry instead of counting as present. And combining the arrays option with parse: true gets you back a parsed JS object instead of raw XML, with the nodes you name always coerced into arrays, even when they contain a single item.
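
Both behaviors are easy to picture on an already-parsed object. The sketch below assumes empty elements parse to an empty string or an empty object, which is an assumption for illustration and not necessarily how ShapeCraft represents them; hasContent and ensureArray are hypothetical names.

```javascript
// Hypothetical helper: does this value hold real content at any depth?
function hasContent(value) {
  if (value === null || value === undefined) return false;
  if (typeof value === "string") return value.trim() !== "";
  if (Array.isArray(value)) return value.some(hasContent);
  if (typeof value === "object") return Object.values(value).some(hasContent);
  return true; // numbers and booleans count as present
}

// Hypothetical helper: wrap a lone item so callers can always iterate.
function ensureArray(value) {
  return Array.isArray(value) ? value : [value];
}
```

The array coercion matters because many XML parsers return a bare value for one child and an array for several, which makes downstream code branch on the count.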

One caveat worth knowing up front: XML generation is prompt-driven on every backend, with no token-level grammar constraint of the kind Ollama has for JSON. Reliability therefore depends on how capable the model is, especially with deeply nested templates.

All five of these route through the same generate() call underneath - same retries, same guaranteeLevel, same error types (SchemaViolationError, MaxRetriesExceededError). Different doors into the same room, which is probably why it was easy to keep adding them.

Genuinely curious about a few things: if you've built template-based XML generation before, does enforceLiterals feel like the right approach, or is there a cleaner way without a re-serialize pass? Is there a schema style still missing that you've actually needed in production - YAML, something protobuf-shaped? And for the validator path, is a hint object enough context, or would you want more?

Repo's here: github.com/aviasoletechnologies/shapecraft. Package is @aviasole/shapecraft. I'm using it across a few different projects now, so I'll keep adding to it as I run into more edge cases.
