SEN LLC

Posted on Apr 15

zod-to-schema: A Zero-Config CLI That Runs Your TypeScript In-Process

#typescript #zod #jsonschema #tutorial

A small TypeScript CLI that takes a .ts file, finds every exported Zod schema, and emits JSON Schema — no tsc subprocess, no wiring your own build script. The interesting part isn't the conversion (a library handles that); it's running user-provided TypeScript in-process from a Node CLI without any setup.

📦 GitHub: https://github.com/sen-ltd/zod-to-schema

The problem

A lot of TypeScript projects use Zod for runtime validation. The team loves it: one schema, end-to-end type inference, parse-don't-validate all the way down. Done, sorted, move on.

Then someone asks for your API in OpenAPI. Or a frontend team wants JSON Schema to drive a form builder. Or you need your config schema in a form that VS Code can autocomplete against. All of those consumers want JSON Schema, not Zod.

There's a great library called zod-to-json-schema that does the conversion. But it's a library, not a tool. Using it means writing this:

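// scripts/build-schemas.ts
import { mkdirSync, writeFileSync } from 'node:fs'
import { zodToJsonSchema } from 'zod-to-json-schema'
import * as schemas from '../src/schemas.js'

// Make sure the output directory exists before writing into it.
mkdirSync('dist/schemas', { recursive: true })

for (const [name, schema] of Object.entries(schemas)) {
  // `in` throws on primitives (e.g. an exported string), so check for an object first.
  if (typeof schema === 'object' && schema !== null && '_def' in schema) {
    writeFileSync(
      `dist/schemas/${name}.json`,
      JSON.stringify(zodToJsonSchema(schema, { name }), null, 2)
    )
  }
}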

…and then adding it to your package.json, wiring it into pre-commit, documenting it, keeping it in sync with the schema file when exports get added. It's fine. It's also the same 20 lines of glue in every Zod project I've worked on.

zod-to-schema is that glue as a one-binary CLI:

zod-to-schema src/schemas.ts --split --out dist/schemas/

Drop it in package.json's scripts, done. Or run it via Docker if you don't want a global install. Or run it in CI without an install step: pulling the 175 MB image is enough.

Simple from the outside. The interesting mechanics are inside.

The thing that makes this CLI hard

When the user runs zod-to-schema src/schemas.ts, the CLI has to execute their TypeScript file and inspect its exports. Not parse it, not emit JS from it — actually run it, so that export const User = z.object({...}) becomes a live ZodObject we can introspect.

That means at some point we need to:

1. Read TypeScript source from disk
2. Compile it (or transform it; the important part is "evaluate it")
3. Import the result into the current Node process
4. Walk the exports and find Zod schemas
5. Convert and print

Approach one: shell out to tsc, write JS to a temp dir, import() that. Works but slow, needs a writable tmp dir, needs a tsconfig.json guess, and now you have to reason about "which .ts files does tsc pick up" (usually more than the one the user asked for).

Approach two: shell out to tsx schemas.ts and communicate via stdout. Works but now you're shell-escaping user paths, spawning processes, and building a weird RPC protocol over JSON.

Approach three — the one this CLI uses — is to ask tsx's programmatic API to load the file for us, in-process:

// src/loader.ts
import { tsImport } from 'tsx/esm/api'
import { resolve, isAbsolute } from 'node:path'
import { pathToFileURL } from 'node:url'

export async function loadUserModule(filePath: string) {
  const absolute = isAbsolute(filePath)
    ? filePath
    : resolve(process.cwd(), filePath)
  const url = pathToFileURL(absolute).href
  // tsImport registers a scoped ESM loader hook for the duration of
  // this one import, transforms .ts on the fly, and returns the module.
  return (await tsImport(url, import.meta.url)) as Record<string, unknown>
}

That's the whole loader. tsImport registers a namespaced ESM loader hook, runs a single import(), unregisters. No subprocess, no temp files, no tsc, no tsconfig shenanigans. The second argument is the "parent URL" tsx uses to anchor node_modules resolution for our bundled copy of zod-to-json-schema.

Why tsx and not jiti

I considered jiti too. Both work. The tie-breaker was ESM-first: jiti historically prioritized CommonJS interop and added ESM support later, while tsx is ESM-native and its tsImport() API is specifically designed for the "spawn a scoped loader, import one file, throw it away" workflow. For a CLI whose entire job is "import one user file and read its exports," that's the cleanest fit.

The trade-off both share: your CLI is now running arbitrary TypeScript from disk. There is no sandbox. If the user passes rm-rf-home.ts, you'll faithfully run it. I have strong feelings about this — see the trust boundary section below — but the short version is: this is the same trust level as running tsx schemas.ts yourself, which is exactly what the user was doing before.

Detecting Zod schemas without importing Zod

Once we've got the user's module object, we need to figure out which exports are Zod schemas. The naive version is instanceof z.ZodType. The working version is a duck-type:

// src/detector.ts
export interface ZodLike {
  _def: { typeName: string }
  parse: (input: unknown) => unknown
  safeParse: (input: unknown) => unknown
}

export function isZodSchema(value: unknown): value is ZodLike {
  if (value === null || typeof value !== 'object') return false
  const v = value as Record<string, unknown>
  if (typeof v.parse !== 'function') return false
  if (typeof v.safeParse !== 'function') return false
  const def = v._def
  if (def === null || typeof def !== 'object') return false
  const typeName = (def as Record<string, unknown>).typeName
  return typeof typeName === 'string' && typeName.startsWith('Zod')
}

Why not instanceof? Because the user's zod and our CLI's zod can be different copies at runtime. If the user project installs zod@3.22.0 and we ship with zod@3.23.8, the user's ZodObject is not an instance of our ZodObject — the prototype chain goes through a different file on disk. The class check would return false for every schema that actually works.

Duck-typing sidesteps the problem entirely. We look for the contract, not the constructor. And it's a very distinctive contract: in Zod 3, _def.typeName on every schema is a string starting with Zod (ZodObject, ZodString, ZodUnion, …). Paired with parse + safeParse, the false-positive rate is effectively zero.
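To make the instanceof failure concrete, here is a minimal sketch that simulates two copies of a library by building the same class twice. FakeZodString, loadLibraryCopy and looksLikeZod are hypothetical names for illustration only; nothing here imports Zod, and the structural check simply mirrors the idea behind isZodSchema.

```typescript
// Simulate two copies of the same library: each call builds a fresh class,
// the way two installs of a package on disk yield two distinct constructors.
function loadLibraryCopy() {
  class FakeZodString {
    _def = { typeName: 'ZodString' }
    parse(input: unknown): unknown {
      if (typeof input !== 'string') throw new Error('Expected string')
      return input
    }
    safeParse(input: unknown): unknown {
      return typeof input === 'string'
        ? { success: true, data: input }
        : { success: false }
    }
  }
  return { FakeZodString }
}

const cliCopy = loadLibraryCopy()
const userCopy = loadLibraryCopy()

// A schema built by the "user's" copy of the library.
const userSchema = new userCopy.FakeZodString()

// Same shape, different constructor: the class check fails.
const passesInstanceof = userSchema instanceof cliCopy.FakeZodString

// Structural check: look at the contract, not the constructor.
function looksLikeZod(value: unknown): boolean {
  if (value === null || typeof value !== 'object') return false
  const v = value as Record<string, unknown>
  if (typeof v.parse !== 'function' || typeof v.safeParse !== 'function') return false
  const def = v._def
  if (def === null || typeof def !== 'object') return false
  const typeName = (def as Record<string, unknown>).typeName
  return typeof typeName === 'string' && typeName.startsWith('Zod')
}

const passesDuckType = looksLikeZod(userSchema)
console.log({ passesInstanceof, passesDuckType })
// { passesInstanceof: false, passesDuckType: true }
```

The class check depends on object identity of the constructor, which two installs never share; the structural check depends only on properties that travel with the value.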

The detector then walks Object.keys(mod):

export function detectSchemas(mod: Record<string, unknown>): DetectedSchema[] {
  const out: DetectedSchema[] = []
  for (const name of Object.keys(mod)) {
    const value = mod[name]
    if (isZodSchema(value)) out.push({ name, schema: value })
  }
  return out
}

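Order matters: Object.keys returns string keys in a stable order, so the CLI's output is deterministic across runs. For plain objects that order is insertion order, and the tests assert this explicitly. One caveat: an ES module namespace object, which is what a dynamic import() returns, lists its export names in sorted (code unit) order rather than source order, so for a real module the output is deterministic but alphabetical.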
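A minimal sketch of that ordering behaviour on a plain object. The fake helper and the export names are hypothetical stand-ins, not real Zod schemas, and the filter is a simplified version of the structural check.

```typescript
// Stand-in for a Zod schema: just enough shape for a structural check.
const fake = (typeName: string) => ({
  _def: { typeName },
  parse: (input: unknown) => input,
  safeParse: (input: unknown) => ({ success: true, data: input }),
})

// A plain object built in "file order": User first, then a helper, then the rest.
const mod: Record<string, unknown> = {
  User: fake('ZodObject'),
  formatUser: (name: string) => name.trim(),
  Post: fake('ZodObject'),
  Comment: fake('ZodObject'),
}

// Keep only the names whose value carries a _def.typeName starting with "Zod".
const names = Object.keys(mod).filter((name) => {
  const value = mod[name] as { _def?: { typeName?: unknown } } | null
  const typeName = value?._def?.typeName
  return typeof typeName === 'string' && typeName.startsWith('Zod')
})

console.log(names) // [ 'User', 'Post', 'Comment' ]
```

The helper function is skipped and the remaining names come back in the order they were added to the object.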

Split mode: one file per schema

The --split flag is the one feature I wanted more than any other. In practice, when I emit JSON Schema from a Zod project, I almost always want one file per schema:

dist/schemas/
├── User.json
├── Post.json
└── Comment.json

Not one giant file with a definitions object, because consumers of JSON Schema — IDEs, validators, OpenAPI tooling — usually want to pin $ref to a specific file URL. One-per-file lets you $ref: "./User.json" from a human-written OpenAPI doc and it Just Works.
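As a rough illustration of why per-file output is convenient, the sketch below resolves a relative $ref the way a URL-based consumer might. The document path and the schema fragment are hypothetical, and real tooling may apply its own resolution rules.

```typescript
// A fragment of a hand-written OpenAPI document (shape abbreviated),
// pointing at a file emitted by --split.
const responseSchema = {
  type: 'array',
  items: { $ref: './User.json' },
}

// Hypothetical location of the OpenAPI document on disk.
const documentUrl = 'file:///project/dist/schemas/openapi.json'

// Relative references resolve against the URL of the referencing document.
const resolved = new URL(responseSchema.items.$ref, documentUrl).href
console.log(resolved) // file:///project/dist/schemas/User.json
```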

The split formatter is straightforward:

// src/formatters.ts
import { writeFile, mkdir } from 'node:fs/promises'
import { join } from 'node:path'

export async function writeSplit(
  schemas: ConvertedSchema[],
  target: { outDir: string; format: Format }
): Promise<string[]> {
  await mkdir(target.outDir, { recursive: true })
  const written: string[] = []
  for (const s of schemas) {
    const path = join(target.outDir, `${s.name}.json`)
    await writeFile(path, serialize(s.json, target.format) + '\n', 'utf8')
    written.push(path)
  }
  return written
}

The returned written[] is the interesting bit. main.ts reports each path on stderr (wrote dist/schemas/User.json), and the tests assert on the returned array directly — no need to re-read the filesystem inside test code, which keeps the suite fast and flake-free.

What doesn't survive the trip

The CLI is honest about what JSON Schema can't express. The README has a "Non-goals" section, and this article does too, because the worst kind of build tool is one that silently drops fidelity:

Refinements and transforms get dropped. z.string().refine(s => s.length % 2 === 0) produces a JSON Schema that just says type: "string". The refinement is a runtime JavaScript function; JSON Schema has no general-purpose escape hatch for "call this function." You can sometimes hand-translate it (minLength / maxLength / pattern), but the general case is unsolvable.
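Here is a small sketch of what a hand translation can look like when the refinement happens to be expressible. isSlug stands in for a predicate someone might pass to .refine(), and matchesStringSchema is a hypothetical mini-validator covering only minLength, maxLength and pattern; neither comes from the CLI or from zod-to-json-schema.

```typescript
// Subset of JSON Schema string keywords used in this sketch.
interface StringSchema {
  type: 'string'
  minLength?: number
  maxLength?: number
  pattern?: string
}

// Hypothetical mini-validator covering only the three keywords above.
function matchesStringSchema(schema: StringSchema, value: unknown): boolean {
  if (typeof value !== 'string') return false
  const length = Array.from(value).length // JSON Schema counts code points
  if (schema.minLength !== undefined && length < schema.minLength) return false
  if (schema.maxLength !== undefined && length > schema.maxLength) return false
  if (schema.pattern !== undefined && !new RegExp(schema.pattern, 'u').test(value)) return false
  return true
}

// The kind of predicate someone might pass to .refine():
// a lowercase slug of 3 to 20 characters.
const isSlug = (s: string) =>
  s.length >= 3 && s.length <= 20 && /^[a-z0-9-]+$/.test(s)

// Hand translation of that predicate into declarative keywords.
const slugSchema: StringSchema = {
  type: 'string',
  minLength: 3,
  maxLength: 20,
  pattern: '^[a-z0-9-]+$',
}

// On these samples the predicate and the translated schema agree.
const samples = ['ok', 'hello-world', 'Has Caps', 'a'.repeat(21), 'abc']
const agree = samples.every((s) => isSlug(s) === matchesStringSchema(slugSchema, s))
console.log(agree) // true
```

This only works because the predicate is a length check plus a regular expression. A refinement that calls other code, such as a lookup against a database, has no declarative equivalent to translate into.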

.transform() is the same story, worse. A transform changes the output type, but JSON Schema only describes input shape. The library will emit the pre-transform schema, which is usually what you want, but any logic in the transform is gone.

Custom error messages. z.string().min(3, "too short") loses the "too short" string. Some validators support an errorMessage extension keyword (for example Ajv via ajv-errors), but it is not part of any JSON Schema draft, so nothing universal exists.

draft-2020-12 is partial. zod-to-json-schema internally emits draft-2019-09 shape, and the --target 2020-12 flag relabels $schema. For most of what Zod produces this is a lossless rename. The relevant keyword changes are $recursiveRef → $dynamicRef and items/additionalItems → prefixItems/items; the one place to watch is z.tuple, which zod-to-json-schema renders with array-form items (plus additionalItems for a rest element), so tuple schemas need a rewrite to prefixItems to be valid 2020-12. If you need true draft-2020-12 keywords like unevaluatedProperties, you also need a post-processor. I documented this rather than pretending it was full support.
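If a schema does contain the array form of items (the draft-2019-09 way to describe tuples), a post-processor for the items/additionalItems → prefixItems/items rename could look roughly like this. upgradeTuples is a hypothetical helper, not part of the CLI, and it is deliberately naive: it walks every nested object, so it would also rewrite look-alike data inside enum, const, default or examples values.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json }

// Rewrites draft-2019-09 tuple keywords to their draft-2020-12 names:
//   items: [a, b]      -> prefixItems: [a, b]
//   additionalItems: x -> items: x
// Only applied where `items` is an array (the tuple form).
function upgradeTuples(node: Json): Json {
  if (Array.isArray(node)) return node.map((child) => upgradeTuples(child))
  if (node === null || typeof node !== 'object') return node

  const out: { [key: string]: Json } = {}
  const isTupleForm = Array.isArray(node.items)
  for (const [key, value] of Object.entries(node)) {
    const upgraded = upgradeTuples(value)
    if (isTupleForm && key === 'items') out.prefixItems = upgraded
    else if (isTupleForm && key === 'additionalItems') out.items = upgraded
    else out[key] = upgraded
  }
  return out
}

// A 2019-09 style tuple: [string, number, ...boolean[]]
const tuple2019: Json = {
  type: 'array',
  minItems: 2,
  items: [{ type: 'string' }, { type: 'number' }],
  additionalItems: { type: 'boolean' },
}

const result = upgradeTuples(tuple2019)
console.log(JSON.stringify(result, null, 2))
```

Schemas that use items with a single subschema (ordinary arrays) pass through unchanged, since that form means the same thing in both drafts.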

The trust boundary I keep thinking about

Running user TypeScript in-process is the whole pitch of this CLI. It's also the sharpest edge.

When the user runs zod-to-schema src/schemas.ts, tsx opens that file, transforms it, and evaluates it. Top-level statements execute. Imports resolve. If src/schemas.ts does import './side-effect.ts', side effects fire. This is not unique to zod-to-schema — it's how every tool that "reads your Zod schemas" has to work, short of writing a TypeScript AST walker that interprets Zod calls symbolically (which is a research project, not a CLI).

The right mental model is: running zod-to-schema foo.ts is exactly as dangerous as running tsx foo.ts yourself. If foo.ts is yours, you're fine. If foo.ts came from a PR on a repo you don't trust, you should not run this tool on it outside a sandbox.

For the Docker image, the trust boundary is stronger: the container is non-root and only sees the volume you mount, and you can cut off network access by passing --network none to docker run (Docker attaches a default network otherwise). Running untrusted schemas becomes a meaningful option, at the cost of a ~1s container startup per invocation.

Try it in 30 seconds

docker run --rm -v "$PWD":/work ghcr.io/sen-ltd/zod-to-schema /work/schemas.ts

Or with a local install:

npm install -g zod-to-schema
zod-to-schema src/schemas.ts --split --out dist/schemas/

Full example:

mkdir demo && cd demo
cat > schemas.ts << 'EOF'
import { z } from 'zod'

export const User = z.object({
  id: z.string().uuid(),
  name: z.string().min(1).max(100),
  email: z.string().email(),
  age: z.number().int().min(0).optional(),
  tags: z.array(z.string()),
})

export const Post = z.object({
  id: z.string(),
  title: z.string(),
  author: User,
})
EOF

docker run --rm -v "$PWD":/work ghcr.io/sen-ltd/zod-to-schema /work/schemas.ts
# Combined JSON Schema for User + Post on stdout

docker run --rm -v "$PWD":/work ghcr.io/sen-ltd/zod-to-schema /work/schemas.ts --export User
# Just User

docker run --rm -v "$PWD":/work ghcr.io/sen-ltd/zod-to-schema /work/schemas.ts --split --out /work/out/
ls out/
# Post.json  User.json

That's it. 38 tests, 175 MB Alpine image, four source files, one honest set of limitations. The code lives at https://github.com/sen-ltd/zod-to-schema — MIT licensed, PRs welcome, and I'd love to hear about other "Zod → X" conversions that hit the same "library exists, wiring is the problem" pattern.

Top comments (3)

Albert • May 29

Nice write-up. I like the idea of discovering Zod schemas by actually running the TypeScript module instead of asking people to maintain separate schema files.

The duck-typing part is interesting too. Do you plan to support mixed exports where a file contains both Zod schemas and helper functions, or do you recommend keeping schema exports isolated?

SEN LLC • May 31

Thanks Albert. The honest answer is that mixed exports already work: the duck-typing check (parse, safeParse, and a _def.typeName starting with Zod) silently skips anything that isn't a Zod schema, so a file with export const UserSchema = z.object({...}) next to export function parseUser(...) just emits the schema and ignores the helper. No flag, no config.

Where I'd still split, when projects grow:

Module-level side effects. Running the module means the whole top level executes. If a helper does db.connect() or starts a metrics client at import time, that fires every time you run the CLI. Pure schemas in their own file avoid surprises.
Computed schemas. export const Foo = makeFoo() is fine because evaluation produces a Zod instance the duck-test picks up. But if makeFoo() is expensive or depends on env vars, again — module-level work runs.
CI throughput. When you have 200 schemas spread across the same modules as your domain code, every CLI run pays for all the helper imports. Splitting hot paths cuts that.

So my actual recommendation is: mix freely while the surface is small, split when the schemas outgrow co-located helpers or when import-time work starts costing you. The CLI doesn't care either way — it's a documentation / clarity choice for humans, not a technical constraint.

The duck-typing was honestly the part I worried about least when designing this; it's been the most robust seam in practice.

Albert • Jun 3

Thanks for the detailed breakdown! The point about module-level side effects (like database connections or metrics client instantiation on import) is a crucial caveat when scanning TS files dynamically. Co-locating schemas with raw application code definitely speeds up initial setup, but separating hot paths indeed saves CI cycles and import-time surprises when a project starts to scale. Appreciate the insights!