Ahmed Hany Gamal

Posted on Apr 17

Implementing a JSON Schema Validator from Scratch - Week 8

#jsonschema #learning #typescript #showdev

It’s been over a month since the last update. I initially planned to continue posting at a lower frequency due to GSoC, but an unexpected task came up, took most of my time, and I wasn’t able to maintain progress on either the blog or my GSoC proposals.
That's been dealt with, so I'm back to my weekly blog posts.

It took a bit to re-familiarize myself with the system, but progress picked up quickly afterward.

This week, I implemented a good number of keywords and refactored parts of the system’s foundation.

Implemented Keywords

I implemented the main property applicator keywords (patternProperties, additionalProperties, unevaluatedProperties).

`patternProperties`

This was straightforward because JavaScript already provides a native ECMA-262 regex engine.
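
To illustrate the idea, here is a minimal sketch rather than the code from the repo. MiniSchema, checkMini, and evaluatePatternProperties are hypothetical names, and the schema type is cut down to boolean schemas plus a single "type" check so the example stays self-contained. It compiles each pattern with the built-in RegExp and records which property names matched, which is the annotation later keywords can use.

```typescript
// Hypothetical, heavily simplified schema: a boolean schema or a "type"-only schema.
type MiniSchema = boolean | { type: "string" | "number" };

// Stand-in for recursive schema evaluation (assumption, not the real evaluator).
function checkMini(schema: MiniSchema, value: unknown): boolean {
  if (typeof schema === "boolean") return schema;
  return typeof value === schema.type;
}

interface PatternResult {
  valid: boolean;
  // Annotation: instance property names matched by at least one pattern.
  matched: string[];
}

function evaluatePatternProperties(
  patterns: Record<string, MiniSchema>,
  instance: Record<string, unknown>,
): PatternResult {
  const matched = new Set<string>();
  let valid = true;
  for (const [source, subschema] of Object.entries(patterns)) {
    // Patterns are not implicitly anchored, so test() (search semantics) is the right call.
    // The "u" flag turns on Unicode mode.
    const regex = new RegExp(source, "u");
    for (const [name, value] of Object.entries(instance)) {
      if (!regex.test(name)) continue;
      matched.add(name);
      if (!checkMini(subschema, value)) valid = false;
    }
  }
  return { valid, matched: Array.from(matched) };
}
```

Because the patterns are unanchored, a pattern like "id" also matches a property named "userid"; anchoring is up to the schema author.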

`additionalProperties`

The implementation relies on annotations produced by properties and patternProperties. Any instance property not covered by those annotations is validated against the additionalProperties schema.
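
A rough sketch of that logic, under the assumption that the sibling annotations arrive as plain arrays of property names (SiblingAnnotations, ApSchema, checkAp, and evaluateAdditionalProperties are hypothetical names, not the real types from the project):

```typescript
// Hypothetical, simplified schema: a boolean schema or a "type"-only schema.
type ApSchema = boolean | { type: "string" | "number" };

// Stand-in for recursive schema evaluation (assumption).
function checkAp(schema: ApSchema, value: unknown): boolean {
  if (typeof schema === "boolean") return schema;
  return typeof value === schema.type;
}

// Assumed shape of the annotations left behind by the sibling keywords.
interface SiblingAnnotations {
  properties?: string[];
  patternProperties?: string[];
}

function evaluateAdditionalProperties(
  schema: ApSchema,
  instance: Record<string, unknown>,
  annotations: SiblingAnnotations,
): { valid: boolean; evaluated: string[] } {
  // Everything already claimed by properties or patternProperties is skipped.
  const covered = new Set<string>([
    ...(annotations.properties ?? []),
    ...(annotations.patternProperties ?? []),
  ]);
  const evaluated: string[] = [];
  let valid = true;
  for (const [name, value] of Object.entries(instance)) {
    if (covered.has(name)) continue;
    evaluated.push(name);
    if (!checkAp(schema, value)) valid = false;
  }
  return { valid, evaluated };
}
```

With additionalProperties set to false, any property that appears in neither annotation makes the instance invalid.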

`unevaluatedProperties`

This ended up being the simplest to implement.
The implementation relies on the evaluatedProperties field of the pending output unit used during evaluation. Any instance property not covered by evaluatedProperties is validated against the unevaluatedProperties schema.
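
As a sketch only: assuming the pending output unit exposes evaluatedProperties as a Set of names (PendingUnit, UpSchema, checkUp, and evaluateUnevaluatedProperties are hypothetical, and the real output unit surely carries more fields), the keyword boils down to one loop.

```typescript
// Hypothetical, simplified schema: a boolean schema or a "type"-only schema.
type UpSchema = boolean | { type: "string" | "number" };

// Stand-in for recursive schema evaluation (assumption).
function checkUp(schema: UpSchema, value: unknown): boolean {
  if (typeof schema === "boolean") return schema;
  return typeof value === schema.type;
}

// Assumed minimal shape of the pending output unit.
interface PendingUnit {
  valid: boolean;
  evaluatedProperties: Set<string>;
}

function evaluateUnevaluatedProperties(
  schema: UpSchema,
  instance: Record<string, unknown>,
  unit: PendingUnit,
): void {
  for (const [name, value] of Object.entries(instance)) {
    // Skip anything an earlier keyword (adjacent or in a subschema) already evaluated.
    if (unit.evaluatedProperties.has(name)) continue;
    if (checkUp(schema, value)) {
      // A property that passes here now counts as evaluated.
      unit.evaluatedProperties.add(name);
    } else {
      unit.valid = false;
    }
  }
}
```

The keyword never needs to know which other keywords did the evaluating; it only reads the accumulated set.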

Important Note

These keywords were simple and straightforward to implement due to the architectural changes from weeks 6 and 7:

Evaluation tracking enables unevaluatedProperties
Keyword phases ensure correct execution order, allowing additionalProperties and unevaluatedProperties to rely on prior annotations
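
One way to picture the phase ordering, as a sketch with made-up phase numbers (KEYWORD_PHASE and orderKeywords are hypothetical; the actual phase design is described in the week 6 and 7 posts):

```typescript
// Hypothetical phase numbers: lower phases run first.
const KEYWORD_PHASE = new Map<string, number>([
  ["properties", 0],
  ["patternProperties", 0],
  ["additionalProperties", 1], // needs annotations from phase 0
  ["unevaluatedProperties", 2], // needs everything evaluated before it
]);

// Returns the known keywords of a schema object in execution order.
function orderKeywords(schema: Record<string, unknown>): string[] {
  return Object.keys(schema)
    .filter((keyword) => KEYWORD_PHASE.has(keyword))
    .sort((a, b) => (KEYWORD_PHASE.get(a) ?? 0) - (KEYWORD_PHASE.get(b) ?? 0));
}
```

However the keywords are written in the schema, additionalProperties runs after properties and patternProperties, and unevaluatedProperties runs last.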

Foundational Changes

It's time for my favorite weekly task: refactoring the system's foundation.

Path Handling

Path construction (evaluationPath, schemaLocation, instanceLocation) is now handled centrally by ValidationContext.evaluate, instead of being managed by individual keywords. Keywords are only responsible for their additional path steps.
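
A simplified sketch of the idea follows. PathContext, Paths, child, and escapeToken are hypothetical names; the real ValidationContext.evaluate has a different signature and does much more. The keyword hands over only its own steps, and the context appends them to the parent paths as JSON Pointer tokens.

```typescript
interface Paths {
  evaluationPath: string;
  schemaLocation: string;
  instanceLocation: string;
}

// Escape one JSON Pointer reference token (RFC 6901): "~" becomes "~0", "/" becomes "~1".
// The "~" replacement must run first so the "~" in "~1" is not escaped again.
function escapeToken(token: string): string {
  return token.replace(/~/g, "~0").replace(/\//g, "~1");
}

function toPointerSuffix(steps: string[]): string {
  return steps.map((step) => "/" + escapeToken(step)).join("");
}

class PathContext {
  readonly paths: Paths;

  constructor(paths: Paths) {
    this.paths = paths;
  }

  // Keywords pass only their additional steps; joining happens here, in one place.
  // Simplification: evaluationPath and schemaLocation get the same suffix, which
  // stops being true once a reference keyword jumps to another schema.
  child(schemaSteps: string[], instanceSteps: string[] = []): PathContext {
    const schemaSuffix = toPointerSuffix(schemaSteps);
    return new PathContext({
      evaluationPath: this.paths.evaluationPath + schemaSuffix,
      schemaLocation: this.paths.schemaLocation + schemaSuffix,
      instanceLocation: this.paths.instanceLocation + toPointerSuffix(instanceSteps),
    });
  }
}
```

For example, a properties keyword evaluating the property "a/b" would call child(["properties", "a/b"], ["a/b"]) and never touch the path strings itself.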

This is the only foundational change that's been completed; I'm still working on the next two.

Validator Restructuring

The validator.ts file doesn't just export a validate function anymore. A single validate function would make supporting reference keywords impractical.

validator.ts now has a Validator class, which includes a Draft object and a schema registry.

Schemas can be added to the schema registry either by calling the registerSchema function or by including keywords like $id and $anchor in the schema (more on that in the next section).
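
To show why a class helps here, a bare-bones sketch: the registry has to live somewhere that outlasts a single validation call so reference keywords can look schemas up later. ValidatorSketch, DraftSketch, and resolve are hypothetical names, and the registerSchema signature is an assumption; the actual Validator class and Draft object are more involved.

```typescript
type RegSchema = boolean | { [keyword: string]: unknown };

// Hypothetical stand-in for the Draft object (e.g. which keywords a draft supports).
interface DraftSketch {
  name: string;
}

class ValidatorSketch {
  readonly draft: DraftSketch;
  // Schema registry: absolute URI -> schema.
  private readonly registry = new Map<string, RegSchema>();

  constructor(draft: DraftSketch) {
    this.draft = draft;
  }

  // Explicit registration under a caller-provided URI (assumed signature).
  registerSchema(uri: string, schema: RegSchema): void {
    this.registry.set(uri, schema);
  }

  // What a reference keyword would call to find its target.
  resolve(uri: string): RegSchema | undefined {
    return this.registry.get(uri);
  }
}
```

Two validators built this way keep separate registries, so schemas registered for one never leak into the other.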

Pre-validation Work

Keywords are split into two groups: evaluation keywords and indexing/registration keywords.
The evaluation keywords are the normal/default ones, while the indexing/registration keywords are the ones that update the schema registry.

When a schema is registered via the registerSchema function, the system scans it for indexing keywords and updates the schema registry accordingly. Each keyword defines how its entries are resolved (e.g., $anchor vs $id).
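
The scan could look roughly like the sketch below. IndexingKeyword, INDEXING_KEYWORDS, and indexSchema are hypothetical names, and as a simplification it only descends into a handful of subschema locations. Each indexing keyword supplies its own rule for turning a value into a registry key: $id resolves against the current base URI and becomes the new base, while $anchor becomes a fragment on the current base.

```typescript
interface IndexingKeyword {
  // Computes the registry key for this keyword's value, given the current base URI.
  key: (value: string, baseUri: string) => string;
  // Whether the computed key becomes the base URI for this schema and its subschemas.
  changesBase: boolean;
}

// Order matters: $id must run before $anchor so anchors attach to the new base.
const INDEXING_KEYWORDS: Array<[string, IndexingKeyword]> = [
  ["$id", { key: (value, base) => new URL(value, base).href, changesBase: true }],
  ["$anchor", { key: (value, base) => base + "#" + value, changesBase: false }],
];

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function indexSchema(schema: unknown, baseUri: string, registry: Map<string, unknown>): void {
  if (!isRecord(schema)) return; // boolean schemas carry no keywords
  let base = baseUri;
  for (const [name, keyword] of INDEXING_KEYWORDS) {
    const value = schema[name];
    if (typeof value !== "string") continue;
    const key = keyword.key(value, base);
    registry.set(key, schema);
    if (keyword.changesBase) base = key;
  }
  // Simplification: only these subschema locations are scanned.
  for (const name of ["properties", "patternProperties", "$defs"]) {
    const group = schema[name];
    if (!isRecord(group)) continue;
    for (const subschema of Object.values(group)) indexSchema(subschema, base, registry);
  }
  for (const name of ["additionalProperties", "unevaluatedProperties", "items"]) {
    indexSchema(schema[name], base, registry);
  }
}
```

After the scan, a nested schema with "$id": "item.json" inside a root at https://example.com/root.json is reachable under https://example.com/item.json, and anchors inside it hang off that new base.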

Even though there are a lot of foundational changes, the process has been surprisingly enjoyable.
I believe this is because each component in the system is fairly decoupled from the rest, and I mostly worked on a new component this week, with minimal changes to the existing ones.
This is vastly different from week 6 (which I truly didn't enjoy), where I had to refactor an existing component inside an existing system. I had to be extra cautious not to break the component or the system around it, since even small changes could have unintended side effects.

Conclusion

I believe the remaining difficult keywords are:

The static reference keywords ($id, $ref)
The dynamic reference keywords ($dynamicRef, $dynamicAnchor)
The if / then / else keywords

Hopefully I can finish them all by the end of this month.

Also, I expect to continue refining the system’s foundation weekly, but that doesn’t bother me as much now, especially since I believe I've written it well enough to easily update and refactor.

As always, the code can be found on GitHub
