Anh Quân Nguyễn

Posted on • Originally published at calculators.im

JSON Schema in 10 Minutes — Validation, Types & Real Examples

Two years ago I shipped a webhook handler without input validation. A partner started sending us a slightly malformed payload (an extra field, one missing required field) and our worker silently processed garbage into the database for three days before anyone noticed. By the time I traced it, we had 12,000 corrupt rows and a very awkward customer call.

I learned JSON Schema the next week. This post is the cheat sheet I wish someone had handed me on day one — the keywords I actually use, the gotchas that bit me again later, and the honest comparison with OpenAPI and TypeScript types.

The seven types you'll use

JSON Schema defines seven primitive types: string, number, integer, boolean, object, array, and null. The integer type is a JSON Schema convenience (raw JSON only has number): the schema layer enforces "no fractional part," so from Draft 6 onward a value like 1.0 still counts as an integer. A minimal schema:

{
  "type": "string"
}

That validates any string and rejects everything else. You can also accept a union:

{
  "type": ["string", "null"]
}

Useful for optional fields you want to keep present in the payload rather than omitting. Before I write more than a one-line schema I usually paste a sample payload into a JSON formatter to see the actual shape pretty-printed. Type errors almost always come from misreading the structure.

Objects, required, and the additionalProperties trap

Most real validation work happens on objects. The three keywords you use every day are properties, required, and additionalProperties:

{
  "type": "object",
  "properties": {
    "email":    { "type": "string" },
    "age":      { "type": "integer" },
    "verified": { "type": "boolean" }
  },
  "required": ["email"],
  "additionalProperties": false
}

Three gotchas to internalize. First, properties describes each field but does NOT make any of them required. Without the required array, every property is optional. Second, required is a separate list of property names that must be present (presence only; you still need type to validate the value). Third, additionalProperties: false rejects any property not listed under properties. Without this line, the schema silently accepts arbitrary extra fields. This is the shape of the bug that hit me: the partner was sending email_address instead of email. If email is optional, a schema without additionalProperties: false accepts that payload as "no email + an unknown field." If email is required, validation does fail, but the error reports a missing email and never points at the misspelled key.

Set additionalProperties: false by default. Remove it only when you genuinely want a free-form object. For maps with arbitrary keys but a known value type, use it as a schema instead of a boolean:

{
  "type": "object",
  "additionalProperties": { "type": "number" }
}

That validates any object where every value is a number. Perfect for price lookup tables, feature-flag percentages, or anything keyed dynamically.

String validation: minLength, pattern, format, enum

Real string validation goes beyond "is it a string." The keywords that earn their keep:

  • minLength / maxLength, non-negative integer bounds on the number of characters, which the spec counts as Unicode code points (not bytes, not UTF-16 code units, not graphemes)
  • pattern, ECMA-262 regex the string must match somewhere (use ^...$ anchors for a full match)
  • format, named formats like email, uri, date, date-time, uuid, ipv4, ipv6
  • enum, a fixed list of allowed values (works for any type)
  • const, a single allowed value (equivalent to a one-item enum)
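The different ways of measuring string length only disagree on non-ASCII input, so it is worth seeing the numbers side by side. This is a plain-Python sketch with a made-up sample string, not a validator call; a string like this makes a handy test case for checking how your own validator counts.

```python
# Sample string (an assumption for illustration): one accented letter, one emoji.
text = "h\u00e9llo \U0001F44D"

code_points = len(text)                           # Python str length counts code points
utf16_units = len(text.encode("utf-16-le")) // 2  # each UTF-16 code unit is 2 bytes
utf8_bytes = len(text.encode("utf-8"))            # size on the wire as UTF-8

print(code_points, utf16_units, utf8_bytes)  # 7 8 11
```

A maxLength of 7 accepts or rejects this string depending on which of those three numbers the validator looks at.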

A practical username field:

{
  "type": "string",
  "minLength": 3,
  "maxLength": 20,
  "pattern": "^[a-zA-Z0-9_]+$"
}

One gotcha that cost me a day: format is not guaranteed to be enforced. In Draft 2019-09 and 2020-12 it is an annotation by default, and in earlier drafts validators were allowed to ignore it. You must enable format assertion in your validator. Ajv requires the ajv-formats plugin. Python jsonschema needs a format_checker argument. Without it, "format": "email" documents intent but does not actually reject invalid emails. See the JSON Schema spec for format for the full list and the assertion behavior per draft.

Number validation: minimum, maximum, multipleOf

For numbers and integers, the validation keywords are arithmetic:

  • minimum / maximum, inclusive bounds
  • exclusiveMinimum / exclusiveMaximum, exclusive bounds (from Draft 6 onward these take a number; in Draft 4 they were booleans that modified minimum / maximum)
  • multipleOf, the value must be a multiple of this number

Validating a percentage that must be 0 to 100 in 0.01 increments:

{
  "type": "number",
  "minimum": 0,
  "maximum": 100,
  "multipleOf": 0.01
}

multipleOf has a floating-point trap I keep getting wrong. 0.1 is not exactly representable in IEEE 754, so { "multipleOf": 0.1 } will sometimes reject values you expect to pass. For money, I now store and validate as integer cents ({ "type": "integer", "minimum": 0 }). It is the same precision argument behind storing prices in the smallest currency unit everywhere else in the stack.
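The trap is easy to reproduce without any validator at all. The helper below is hypothetical: it sketches the divide-and-check approach a validator might take, and is not any specific library's implementation.

```python
def is_multiple_naive(value: float, step: float) -> bool:
    """Hypothetical helper: divide, then check for a whole-number quotient."""
    quotient = value / step
    return quotient == int(quotient)

quotient = 0.3 / 0.1
float_ok = is_multiple_naive(0.3, 0.1)

# The same amount expressed in integer cents has no rounding problem.
cents_ok = 30 % 10 == 0

print(quotient)  # 2.9999999999999996
print(float_ok)  # False, even though 0.3 "is" three times 0.1
print(cents_ok)  # True
```

Integer arithmetic is exact, which is why the cents version behaves the way you expect.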

Array validation: items, minItems, uniqueItems

For arrays the workhorses are items (schema applied to every element), minItems / maxItems (length bounds), and uniqueItems (rejects duplicates by deep equality). A list of unique tags:

{
  "type": "array",
  "items": { "type": "string", "minLength": 1 },
  "minItems": 1,
  "maxItems": 10,
  "uniqueItems": true
}

For positional tuples where each index has a different schema, use prefixItems in Draft 2020-12 or items as an array in older drafts. A coordinate pair where index 0 is longitude and index 1 is latitude:

{
  "type": "array",
  "prefixItems": [
    { "type": "number", "minimum": -180, "maximum": 180 },
    { "type": "number", "minimum":  -90, "maximum":  90 }
  ],
  "items": false
}

The trailing "items": false rejects any extra elements beyond the two declared positions, which makes it the array equivalent of additionalProperties: false.

Schema composition: $ref, allOf, oneOf, anyOf

Once your schemas grow past a single page, you will want to break them up and combine them. The four keywords I reach for:

  • $ref, reuse another schema by URI reference (e.g., a JSON Pointer fragment like "#/$defs/address" or an external URL)
  • allOf, data must validate against every subschema (intersection / mixin)
  • anyOf, data must validate against at least one (union, OK if multiple match)
  • oneOf, data must validate against exactly one (XOR, rejects if zero or multiple match)

A reusable address schema referenced from two parents:

{
  "$defs": {
    "address": {
      "type": "object",
      "properties": {
        "street":  { "type": "string" },
        "city":    { "type": "string" },
        "country": { "type": "string", "minLength": 2, "maxLength": 2 }
      },
      "required": ["street", "city", "country"]
    }
  },
  "type": "object",
  "properties": {
    "shipping": { "$ref": "#/$defs/address" },
    "billing":  { "$ref": "#/$defs/address" }
  }
}

For discriminated unions (event types, message kinds), oneOf with a const discriminator is the standard pattern:

{
  "oneOf": [
    { "type": "object", "properties": { "kind": { "const": "email" },
        "to": { "type": "string", "format": "email" } }, "required": ["kind", "to"] },
    { "type": "object", "properties": { "kind": { "const": "sms" },
        "phone": { "type": "string", "pattern": "^\\+[1-9]\\d{1,14}$" } }, "required": ["kind", "phone"] }
  ]
}

A real signup schema

Putting every keyword together, here is roughly the schema I now use for a signup endpoint:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "SignupRequest",
  "type": "object",
  "properties": {
    "email":     { "type": "string", "format": "email", "maxLength": 254 },
    "password":  { "type": "string", "minLength": 12, "maxLength": 128 },
    "username":  { "type": "string", "pattern": "^[a-zA-Z0-9_]{3,20}$" },
    "age":       { "type": "integer", "minimum": 13, "maximum": 120 },
    "country":   { "type": "string", "enum": ["US", "UK", "CA", "AU"] },
    "newsletter":{ "type": "boolean", "default": false },
    "referrals": { "type": "array", "items": { "type": "string", "format": "email" },
                   "maxItems": 5, "uniqueItems": true }
  },
  "required": ["email", "password", "username", "age", "country"],
  "additionalProperties": false
}

It enforces the email format with the 254-character limit commonly derived from RFC 5321, a 12-character minimum password (stricter than the 8-character floor in NIST SP 800-63B Revision 3), a regex-validated username, an integer age within plausible bounds, a closed enum of supported countries, an optional boolean with a documented default, and an optional referral list capped at 5 unique emails. The trailing additionalProperties: false is the line that would have saved me three days and 12,000 rows two years ago.

Tooling: Ajv (Node) and jsonschema (Python)

Declare which draft you target with the $schema keyword at the root. The two production-grade validators I reach for:

  • Ajv for Node.js and the browser, one of the fastest JS validators. Draft 2020-12 support lives in a separate export (ajv/dist/2020); the default export targets Draft-07. Install ajv and ajv-formats together if you use format. Compile schemas once at startup with const validate = ajv.compile(schema), then call validate(data) on every request, so you pay the compilation cost once instead of on every call.
  • jsonschema for Python, the reference Python validator. Use Draft202012Validator(schema).validate(data) or iterate .iter_errors(data) to surface all errors at once instead of failing on the first.

For quick iteration without writing code, I usually paste the schema and a sample payload into a JSON formatter to confirm both parse, then run them through a browser-based validator. When debugging an unexpected failure, a diff checker helps me compare a failing payload against a known-good payload to spot the offending field.

JSON Schema vs OpenAPI vs TypeScript

These three describe data shapes but solve different problems:

  • TypeScript types are compile-time only. They vanish at runtime, so a malformed API payload will silently corrupt your program if you trust the type without validating. Great for developer ergonomics, useless for runtime safety.
  • JSON Schema is runtime validation that works in any language. Use it at API boundaries, for config files, for database documents, and for any cross-language data contract. A single schema can drive validation in your Node service, Python backend, and Go worker without rewriting.
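  • OpenAPI (formerly Swagger) wraps JSON Schema inside an API description. OpenAPI 3.1 uses Draft 2020-12 directly, while 3.0 uses a modified subset, so check which version your tooling speaks before reusing a schema verbatim. It adds endpoints, methods, status codes, authentication, examples, and tooling for client SDK generation. Use it when you are describing an HTTP API and want documentation, client codegen, and validation in one document.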

The stack I default to now: write the JSON Schema as source of truth, generate TypeScript types from it with json-schema-to-typescript, and embed the same schema inside an OpenAPI spec for HTTP routes. One source, three outputs, no drift.

The mistakes I kept making

1. Forgetting additionalProperties: false

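The original bug. Without it, any extra field passes validation. If email is optional, a client typo like { "emial": "x@y.com" } validates as "no email present plus an unknown field." If email is required, you get a missing-property error that never mentions the typo, instead of the clean error you want. Add it by default.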

2. Confusing required with type

Listing a property under properties does NOT make it required. You must also add it to the required array. Conversely, required only checks presence. A wrong-type field still fails, but on the type check, not the required check.

3. Using format without enabling assertion

In Ajv you must require('ajv-formats')(ajv). In Python jsonschema pass format_checker=FormatChecker(). Without this, format: email is metadata only and accepts any string. I burned half a day on this one.

4. oneOf where anyOf is correct

oneOf rejects data that matches more than one subschema. If your subschemas overlap (a value that is both a positive integer and a multiple of 5), oneOf rejects. Use it only for genuinely disjoint cases like discriminated unions.

5. multipleOf with floats

IEEE 754 cannot exactly represent 0.1. { "multipleOf": 0.1 } will reject values you expect to pass. Use integer units (cents, basis points) instead.

6. Recompiling schemas on every request

Ajv's compile() is expensive. The compiled validator is fast. Compile once at module load, store the function, reuse it.

Closing thought

JSON Schema looks verbose at first. Often the schema is longer than the data. That is the point. Every constraint you encode is one bug you cannot ship. Start with your top three API endpoints, then your config files, then your cross-service messages. Within a sprint you will catch at least one bug that would have made it to production.

If you want a sandbox, try the JSON Schema Reference Tutorial and an online validator like jsonschemavalidator.net. And if you ever debug a pattern validation that is misbehaving, a regex tester is faster than guessing.


