DEV Community

Bala Paranj
Bala Paranj

Posted on

Versioned Schema Contracts in a Go CLI: How obs.v0.1 Prevents Silent Breaks

How embedding schema versions in every data file — observations, controls, output, baselines — enables forward compatibility, fail-fast loading, and contract testing without external schema registries.

A user upgrades the CLI from v0.8 to v0.9. Their observation files still say "schema_version": "obs.v0.1". The new CLI adds a field to the output schema. Is the old input still valid? Can the new output be read by downstream tools?

Without versioned schemas, you're guessing. With them, the answer is in the data: the file says what version it speaks, the tool says what versions it accepts, and the mismatch is a clear error — not a silent corruption.

Every Data File Carries Its Version

{
  "schema_version": "obs.v0.1",
  "captured_at": "2026-01-01T00:00:00Z",
  "assets": [...]
}
Enter fullscreen mode Exit fullscreen mode
dsl_version: ctrl.v1
id: CTL.S3.PUBLIC.001
name: Block Public Access
unsafe_predicate:
  any:
    - field: properties.public
      op: eq
      value: true
Enter fullscreen mode Exit fullscreen mode
{
  "schema_version": "out.v0.1",
  "kind": "ASSESSMENT",
  "run": {...},
  "findings": [...]
}
Enter fullscreen mode Exit fullscreen mode

Four schema versions in the system:

Schema Format Purpose
obs.v0.1 JSON Observation snapshots (cloud resource state)
ctrl.v1 YAML Control definitions (security rules)
out.v0.1 JSON Evaluation output (findings, verdicts)
baseline.v0.1 JSON Baseline artifacts (accepted posture)

Fail-Fast at Load Time

The loader checks the version before parsing the body:

func (v *Validator) validateDocument(raw []byte, cfg docConfig, opts ...Option) (*diag.Assessment, error) {
    // 1. Parse just the version field
    var partial struct {
        Version string `json:"schema_version" yaml:"dsl_version"`
    }
    cfg.Unmarshal(raw, &partial)

    // 2. Check if this version is accepted
    if !slices.Contains(cfg.Accepted, actual) {
        return unsupportedVersionResult(actual, cfg.Accepted,
            "Use a supported schema version"), nil
    }

    // 3. Validate the full document against the versioned schema
    diags, err := v.Validate(Request{
        Kind:          schemas.Kind(cfg.Kind),
        ActualVersion: actual,
        Data:          raw,
    })
    // ...
}
Enter fullscreen mode Exit fullscreen mode

If an observation file says "schema_version": "obs.v0.2" and the tool only knows obs.v0.1, the error is immediate and clear:

Schema version "obs.v0.2" is not supported. Accepted versions: obs.v0.1
Enter fullscreen mode Exit fullscreen mode

No partial parsing. No silent field dropping. No "it loaded but something is wrong."

Schemas are Embedded in the Binary

//go:embed embedded/*/*/*.json
var embeddedFS embed.FS
Enter fullscreen mode Exit fullscreen mode

The JSON Schema files are compiled into the binary via go:embed. The tool doesn't need network access, a schema registry, or a config directory to validate input. This is critical for air-gapped environments.

The schema for controls includes additionalProperties: false at every level:

{
  "type": "object",
  "additionalProperties": false,
  "required": ["dsl_version", "id", "name", "description", "type"],
  "properties": {
    "op": {
      "type": "string",
      "enum": ["eq", "ne", "lt", "gt", "missing", "present", "contains", "in"]
    }
  }
}
Enter fullscreen mode Exit fullscreen mode

A typo like operater: eq (instead of op: eq) is caught immediately — unknown fields are rejected, not silently ignored.

Why Versions Live in the Data, Not the Tool

The version is in the JSON/YAML file, not in the CLI's flag or config. This means:

  • Files are self-describing. You can identify what a file is by reading it — no filename convention needed.
  • Multiple versions can coexist. A directory can contain obs.v0.1 and (future) obs.v0.2 files. The loader handles each according to its declared version.
  • Downstream tools know what they're reading. A CI pipeline that consumes out.v0.1 output can validate it against the published schema for that version.

Versioned schema contracts are used in Stave, a Go CLI for offline security evaluation. The embedded JSON Schemas validate input at load time with additionalProperties: false catching typos. The trace.v0.1 logic trace schema was added following the same pattern.

Top comments (0)