How embedding schema versions in every data file — observations, controls, output, baselines — enables forward compatibility, fail-fast loading, and contract testing without external schema registries.
A user upgrades the CLI from v0.8 to v0.9. Their observation files still say "schema_version": "obs.v0.1". The new CLI adds a field to the output schema. Is the old input still valid? Can the new output be read by downstream tools?
Without versioned schemas, you're guessing. With them, the answer is in the data: the file says what version it speaks, the tool says what versions it accepts, and the mismatch is a clear error — not a silent corruption.
Every Data File Carries Its Version
{
"schema_version": "obs.v0.1",
"captured_at": "2026-01-01T00:00:00Z",
"assets": [...]
}
dsl_version: ctrl.v1
id: CTL.S3.PUBLIC.001
name: Block Public Access
unsafe_predicate:
any:
- field: properties.public
op: eq
value: true
{
"schema_version": "out.v0.1",
"kind": "ASSESSMENT",
"run": {...},
"findings": [...]
}
Four schema versions in the system:
| Schema | Format | Purpose |
|---|---|---|
obs.v0.1 |
JSON | Observation snapshots (cloud resource state) |
ctrl.v1 |
YAML | Control definitions (security rules) |
out.v0.1 |
JSON | Evaluation output (findings, verdicts) |
baseline.v0.1 |
JSON | Baseline artifacts (accepted posture) |
Fail-Fast at Load Time
The loader checks the version before parsing the body:
func (v *Validator) validateDocument(raw []byte, cfg docConfig, opts ...Option) (*diag.Assessment, error) {
// 1. Parse just the version field
var partial struct {
Version string `json:"schema_version" yaml:"dsl_version"`
}
cfg.Unmarshal(raw, &partial)
// 2. Check if this version is accepted
if !slices.Contains(cfg.Accepted, actual) {
return unsupportedVersionResult(actual, cfg.Accepted,
"Use a supported schema version"), nil
}
// 3. Validate the full document against the versioned schema
diags, err := v.Validate(Request{
Kind: schemas.Kind(cfg.Kind),
ActualVersion: actual,
Data: raw,
})
// ...
}
If an observation file says "schema_version": "obs.v0.2" and the tool only knows obs.v0.1, the error is immediate and clear:
Schema version "obs.v0.2" is not supported. Accepted versions: obs.v0.1
No partial parsing. No silent field dropping. No "it loaded but something is wrong."
Schemas are Embedded in the Binary
//go:embed embedded/*/*/*.json
var embeddedFS embed.FS
The JSON Schema files are compiled into the binary via go:embed. The tool doesn't need network access, a schema registry, or a config directory to validate input. This is critical for air-gapped environments.
The schema for controls includes additionalProperties: false at every level:
{
"type": "object",
"additionalProperties": false,
"required": ["dsl_version", "id", "name", "description", "type"],
"properties": {
"op": {
"type": "string",
"enum": ["eq", "ne", "lt", "gt", "missing", "present", "contains", "in"]
}
}
}
A typo like operater: eq (instead of op: eq) is caught immediately — unknown fields are rejected, not silently ignored.
Why Versions Live in the Data, Not the Tool
The version is in the JSON/YAML file, not in the CLI's flag or config. This means:
- Files are self-describing. You can identify what a file is by reading it — no filename convention needed.
-
Multiple versions can coexist. A directory can contain
obs.v0.1and (future)obs.v0.2files. The loader handles each according to its declared version. -
Downstream tools know what they're reading. A CI pipeline that consumes
out.v0.1output can validate it against the published schema for that version.
Versioned schema contracts are used in Stave, a Go CLI for offline security evaluation. The embedded JSON Schemas validate input at load time with additionalProperties: false catching typos. The trace.v0.1 logic trace schema was added following the same pattern.
Top comments (0)