Welcome to the Tsonnet series!
If you haven't been following the series so far, you can check out how it all started in the first post of the series.
In the previous post, I fixed Tsonnet's lazy evaluation inconsistency in objects.
Now it's time to tackle object self-references.
The problem: objects can't talk to themselves
Consider this perfectly reasonable configuration:
{
one: 1,
two: self.one + 1
}
This should evaluate to { "one": 1, "two": 2 }, but currently Tsonnet has no clue what's going on:
dune exec -- tsonnet samples/objects/self_reference.jsonnet
samples/objects/self_reference.jsonnet:3:14 Unexpected char: .
3: two: self.one + 1
^^^^^^^^^^^^^^^^^^^^^
The lexer doesn't even recognize the dot! We need object field access, stat.
Time to catch up.
The lexer and parser dance
First things first -- let's teach our lexer about dots and the self keyword:
diff --git a/lib/lexer.mll b/lib/lexer.mll
index ab2f910..7b0ec29 100644
--- a/lib/lexer.mll
+++ b/lib/lexer.mll
@@ -61,7 +61,9 @@ rule read =
| '~' { BITWISE_NOT }
| '=' { ASSIGN }
| ';' { SEMICOLON }
+ | '.' { DOT }
| "local" { LOCAL }
+ | "self" { SELF }
| id { ID (Lexing.lexeme lexbuf) }
| _ { raise (SyntaxError ("Unexpected char: " ^ Lexing.lexeme lexbuf)) }
| eof { EOF }
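Rule order matters here. As far as I know, ocamllex picks the longest match and, on a tie, the rule that appears first, so "self" has to sit above id, while something like selfish still lexes as a plain identifier. Here is a minimal hand-rolled sketch of that behaviour in plain OCaml. It is not the generated lexer; the token type and the tokenize helper are made up for illustration:

```ocaml
(* Hypothetical token type mirroring the DOT, SELF and ID tokens above. *)
type token = DOT | SELF | ID of string

let is_ident_char c =
  (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
  || (c >= '0' && c <= '9') || c = '_'

(* Tokenize a string made only of dots, identifiers and spaces. *)
let tokenize (s : string) : token list =
  let n = String.length s in
  let rec go i acc =
    if i >= n then List.rev acc
    else if s.[i] = ' ' then go (i + 1) acc
    else if s.[i] = '.' then go (i + 1) (DOT :: acc)
    else if is_ident_char s.[i] then begin
      (* Longest match: consume the whole word first... *)
      let j = ref i in
      while !j < n && is_ident_char s.[!j] do incr j done;
      let word = String.sub s i (!j - i) in
      (* ...then decide whether the word is the keyword. *)
      let tok = if word = "self" then SELF else ID word in
      go !j (tok :: acc)
    end
    else failwith ("Unexpected char: " ^ String.make 1 s.[i])
  in
  go 0 []
```

With this sketch, tokenize "self.one" gives [SELF; DOT; ID "one"], and tokenize "selfish" gives a single ID.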
The parser follows suit, understanding that SELF DOT field makes an ObjectFieldAccess:
diff --git a/lib/parser.mly b/lib/parser.mly
index efbd372..3a35c36 100644
--- a/lib/parser.mly
+++ b/lib/parser.mly
@@ -18,6 +18,8 @@
%token COMMA
%token LEFT_CURLY_BRACKET RIGHT_CURLY_BRACKET
%token COLON
+%token DOT
+%token SELF
%token PLUS MINUS MULTIPLY DIVIDE
%left PLUS MINUS
%left MULTIPLY DIVIDE
@@ -52,6 +54,7 @@ assignable_expr:
| e1 = assignable_expr; op = bin_op; e2 = assignable_expr { BinOp (with_pos $startpos $endpos, op, e1, e2) }
| op = unary_op; e = assignable_expr { UnaryOp (with_pos $startpos $endpos, op, e) }
| varname = ID; LEFT_SQR_BRACKET; e = assignable_expr; RIGHT_SQR_BRACKET { IndexedExpr (with_pos $startpos $endpos, varname, e) }
+ | SELF; DOT; field = ID { ObjectFieldAccess (with_pos $startpos $endpos, field) }
;
Extending the AST
We need two new AST nodes to capture self-references:
diff --git a/lib/ast.ml b/lib/ast.ml
index 5cd601a..3b428aa 100644
--- a/lib/ast.ml
+++ b/lib/ast.ml
@@ -38,6 +38,8 @@ type expr =
| Ident of position * string
| Array of position * expr list
| Object of position * object_entry list
+ | ObjectSelf of Env.env_id
+ | ObjectFieldAccess of position * string
| BinOp of position * bin_op * expr * expr
| UnaryOp of position * unary_op * expr
| Local of position * (string * expr) list
Here's the clever bit: ObjectSelf carries an Env.env_id -- a unique identifier for each object. This lets us know exactly which self belongs to which object.
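To picture why a per-object ID helps, here is a rough sketch of a counter-based generator. It is only loosely modelled on the Env.Id.generate and Env.Id.reset calls that appear in this post: the real generate seems to return a result (it is used with let*), while this hypothetical Id module returns a plain int to keep things short:

```ocaml
(* Hypothetical stand-in for Env.Id: a simple monotonically increasing counter. *)
module Id = struct
  let counter = ref 0

  (* Hand out the next unused ID. *)
  let generate () =
    incr counter;
    !counter

  (* Start over, e.g. between two passes over the same AST. *)
  let reset () = counter := 0
end

(* An outer object and an object nested inside it get different IDs,
   so each "self" can be traced back to exactly one object. *)
let outer_id = Id.generate ()
let inner_id = Id.generate ()
```

Resetting between passes would make a later pass hand out the same IDs for the same objects, which is presumably why the type checker calls Env.Id.reset when it finishes.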
Putting self into the environment
When interpreting objects, we were already generating the object ID, but now we add self to the environment:
and interpret_object env (pos, entries) =
let* obj_id = Env.Id.generate () in
+ let env' = Env.add_local "self" (ObjectSelf obj_id) env in
(* First add locals and object fields to env *)
- let* env' = List.fold_left
+ let* env'' = List.fold_left
(fun result entry ->
let* env' = result in
match entry with
@@ -114,7 +117,7 @@ and interpret_object env (pos, entries) =
| ObjectField (attr, expr) ->
ok (Env.add_obj_field attr expr obj_id env')
)
- (ok env)
+ (ok env')
entries
in
Notice how we add "self" as a special variable pointing to the object's unique ID? This is how we track which object we're in.
When we encounter self.field, the magic happens in interpret_object_field_access:
and interpret_object_field_access env (pos, field) =
let* (_, evaluated_expr) = Env.find_var "self" env
~succ:(fun env' expr ->
match expr with
| ObjectSelf obj_id ->
Env.get_obj_field field obj_id env'
~succ:interpret
~err:(Error.error_at pos)
| _ ->
Error.error_at pos "Can't use self outside of an object"
)
~err:(Error.error_at pos)
in ok (env, evaluated_expr)
This looks up self in the environment, extracts the object ID, and uses it to retrieve the requested field. Neat!
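If you want to play with the idea in isolation, here is a stripped-down sketch of the same lookup. Everything in it is a simplification I made up for illustration: a tiny expr type, a string-keyed map as the environment, and a field_key helper whose "1->one" shape is only guessed from the identifiers that show up in Tsonnet's error messages:

```ocaml
module Env = Map.Make (String)

(* A deliberately tiny expression language; all names are hypothetical. *)
type expr =
  | Num of int
  | Add of expr * expr
  | SelfField of string   (* self.field *)
  | ObjSelf of int        (* marker stored under "self" *)

(* Assumed key format, modelled on the "1->a" seen in error messages. *)
let field_key obj_id field = string_of_int obj_id ^ "->" ^ field

let rec eval env = function
  | Num n -> Ok n
  | Add (a, b) ->
    (match eval env a, eval env b with
     | Ok x, Ok y -> Ok (x + y)
     | (Error _ as e), _ | _, (Error _ as e) -> e)
  | ObjSelf _ -> Error "self is not a value"
  | SelfField field ->
    (match Env.find_opt "self" env with
     | Some (ObjSelf obj_id) ->
       (match Env.find_opt (field_key obj_id field) env with
        | Some e -> eval env e   (* the field is evaluated only on demand *)
        | None -> Error ("Field not found: " ^ field))
     | _ -> Error "Can't use self outside of an object")

(* { one: 1, two: self.one + 1 } registered as object 1 *)
let env =
  Env.empty
  |> Env.add "self" (ObjSelf 1)
  |> Env.add (field_key 1 "one") (Num 1)
  |> Env.add (field_key 1 "two") (Add (SelfField "one", Num 1))
```

Evaluating SelfField "two" in env walks from self to object 1, then to the stored expression for two, which in turn asks for one.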
Our basic case now works:
$ dune exec -- tsonnet samples/objects/self_reference.jsonnet
{ "one": 1, "two": 2 }
But wait -- what about using self outside of objects?
Scope checking: keeping self safe
Let's see what should happen when we use self outside an object:
local _one = 1;
local _two = self.one + 1;
{
one: _one,
two: _two
}
Jsonnet correctly catches this:
$ jsonnet samples/errors/object_self_out_of_scope.jsonnet
samples/errors/object_self_out_of_scope.jsonnet:2:14-18 Can't use self outside of an object.
local _two = self.one + 1;
But after implementing object field access, Tsonnet misbehaves:
$ dune exec -- tsonnet samples/errors/object_self_out_of_scope.jsonnet
{ "one": 1, "two": 1 }
Like, WHAT?!
This is happening because Tsonnet is LAZY! 🥁
Yeah, lazy evaluation is tricky sometimes: behavior can look wrong when it isn't. But here it really is a bug. We need scope validation to catch improper self usage before evaluation begins.
Things aren't so simple, however. We can't keep self confined to object scope the way we do with local variables. Because self is added to the environment to be evaluated later, the expression that uses it may no longer be in the scope where it was written by the time it is evaluated, like the little surprise we got earlier.
So, how did I solve this? By introducing a new module that performs an eager pass through the AST.
I know, one more pass. Now we are going to have:
- Lexing
- Parsing
- Scope checking
- Type checking
- Interpretation
It's well worth it, I promise.
Here's the scope validator in its entirety:
(* This module handles eager scope analysis to ensure that identifiers
like 'self' are used in appropriate contexts before lazy evaluation begins.
*)
open Ast
open Result
open Syntax_sugar
type context = {
in_object: bool;
object_depth: int;
current_locals: string list;
}
let empty_context = {
in_object = false;
object_depth = 0;
current_locals = []
}
let enter_object_scope context = {
context with
in_object = true;
object_depth = context.object_depth + 1
}
let add_locals_to_context locals context = {
context with
current_locals = locals @ context.current_locals
}
let rec _validate expr context =
match expr with
| Unit | Null _ | Number _ | String _ | Bool _ -> ok ()
| Ident (pos, varname) ->
(* Identifier validation - the heart of scope checking *)
validate_ident pos varname context
| Array (_, exprs) ->
validate_expression_list exprs context
| Object (_, entries) ->
(* Object validation - this is where scope context changes *)
validate_object entries context
| ObjectFieldAccess (pos, _) ->
validate_object_field_access pos context
| Local (_, vars) ->
validate_locals vars context
| Seq exprs ->
validate_expression_list exprs context
| BinOp (_, _, e1, e2) ->
validate_binop e1 e2 context
| UnaryOp (_, _, expr) ->
_validate expr context
| IndexedExpr (_, _, index_expr) ->
_validate index_expr context
| _ ->
(* For any other expression types, no special scope validation needed *)
ok ()
and validate_ident pos varname context =
if varname = "self" && not context.in_object
then Error.trace ("Can't use self outside of an object") pos >>= error
else ok ()
and validate_expression_list exprs context =
List.fold_left
(fun acc expr -> acc >>= fun _ -> _validate expr context)
(ok ())
exprs
and validate_object entries context =
let object_context = enter_object_scope context in
(* First pass: collect all local variable names from ObjectExpr entries
This handles cases like: { local x = 1; field: x } *)
let local_vars = collect_local_variables entries in
let context_with_locals = add_locals_to_context local_vars object_context in
validate_object_entries entries context_with_locals
and validate_object_entries entries context =
List.fold_left
(fun acc entry ->
acc >>= fun _ ->
match entry with
| ObjectField (_, expr) -> _validate expr context
| ObjectExpr expr -> _validate expr context
)
(ok ())
entries
and collect_local_variables entries =
List.fold_left
(fun acc entry ->
match entry with
| ObjectExpr (Local (_, vars)) ->
acc @ (List.map (fun (name, _) -> name) vars)
| _ -> acc
)
[]
entries
and validate_object_field_access pos context =
(* This catches cases like: local x = self.field; outside of objects *)
if not context.in_object
then Error.trace ("Can't use self outside of an object") pos >>= error
else ok ()
and validate_locals vars context =
(* This is crucial - it catches: local x = self.field; outside objects *)
List.fold_left
(fun acc (_, expr) -> acc >>= fun _ -> _validate expr context)
(ok ())
vars
and validate_binop e1 e2 context =
_validate e1 context >>= fun _ -> _validate e2 context
(* This function performs a single eager pass through the AST to validate
that all identifiers are used in appropriate scopes. It catches scope
errors before lazy evaluation begins, while still preserving the lazy
evaluation benefits in the main type checker. *)
let validate expr = _validate expr empty_context
The validator tracks whether we're inside an object and catches invalid self usage:
and validate_object_field_access pos context =
(* This catches cases like: local x = self.field; outside of objects *)
if not context.in_object
then Error.trace ("Can't use self outside of an object") pos >>= error
else ok ()
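Boiled down to its essence, the whole pass is a recursive walk that carries one flag. Here is a self-contained sketch over a made-up mini-AST (not Tsonnet's real one) in which in_object is a plain bool argument instead of a context record:

```ocaml
(* Hypothetical mini-AST; only the shapes needed for the illustration. *)
type expr =
  | Num of int
  | SelfField of string                    (* self.field *)
  | Add of expr * expr
  | Object of (string * expr) list         (* { field: expr, ... } *)
  | Local of (string * expr) list * expr   (* local x = e; body *)

let ( >>= ) = Result.bind

let rec validate in_object = function
  | Num _ -> Ok ()
  | SelfField _ ->
    if in_object then Ok ()
    else Error "Can't use self outside of an object"
  | Add (a, b) ->
    validate in_object a >>= fun () -> validate in_object b
  | Object fields ->
    (* Entering an object is the only place the flag flips to true. *)
    validate_all true (List.map snd fields)
  | Local (binds, body) ->
    (* Bound expressions are checked eagerly, even if never evaluated. *)
    validate_all in_object (List.map snd binds) >>= fun () ->
    validate in_object body

and validate_all in_object exprs =
  List.fold_left
    (fun acc e -> acc >>= fun () -> validate in_object e)
    (Ok ())
    exprs
```

Calling validate false on a Local whose binding contains SelfField "one" returns the error right away, while the same SelfField inside an Object passes, no matter how lazily the interpreter would treat it later.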
Now we integrate scope validation into the type checker:
let check expr =
- translate expr Env.empty >>= fun _ -> Env.Id.reset (); ok expr
+ Scope.validate expr
+ >>= fun _ -> translate expr Env.empty
+ >>= fun _ ->
+ Env.Id.reset ();
+ ok expr
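The nice property of chaining the passes with >>= is that a scope error stops everything: the type checker never sees the expression. A small sketch of that short-circuiting, with hypothetical stand-ins for both passes that simply record whether they ran:

```ocaml
let ( >>= ) = Result.bind

(* Records which (hypothetical) passes actually ran, most recent first. *)
let trace = ref []

(* Stand-in for Scope.validate; ok_scope decides whether it succeeds. *)
let scope_validate ok_scope expr =
  trace := "scope" :: !trace;
  if ok_scope then Ok expr else Error "Can't use self outside of an object"

(* Stand-in for the type-checking pass; always succeeds here. *)
let type_check expr =
  trace := "types" :: !trace;
  Ok expr

let check ok_scope expr =
  scope_validate ok_scope expr >>= fun _ ->
  type_check expr >>= fun _ ->
  Ok expr
```

When the scope pass fails, check returns its Error untouched and trace only contains "scope".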
Perfect! Now invalid self usage gets caught early:
$ dune exec -- tsonnet samples/errors/object_self_out_of_scope.jsonnet
samples/errors/object_self_out_of_scope.jsonnet:2:13 Can't use self outside of an object
2: local _two = self.one + 1;
^^^^^^^^^^^^^^^^^^^^^^^^^^
The infinite loop protection
Now that self-references work, we need to prevent infinite cycles between object fields. Consider this problematic case:
{
a: self.b,
b: self.a,
}
Jsonnet detects this after hitting a stack limit:
$ jsonnet samples/semantics/invalid_binding_cycle_object_fields.jsonnet
RUNTIME ERROR: max stack frames exceeded.
samples/semantics/invalid_binding_cycle_object_fields.jsonnet:2:8-14 object <anonymous>
samples/semantics/invalid_binding_cycle_object_fields.jsonnet:3:8-14 object <anonymous>
samples/semantics/invalid_binding_cycle_object_fields.jsonnet:2:8-14 object <anonymous>
samples/semantics/invalid_binding_cycle_object_fields.jsonnet:3:8-14 object <anonymous>
samples/semantics/invalid_binding_cycle_object_fields.jsonnet:2:8-14 object <anonymous>
samples/semantics/invalid_binding_cycle_object_fields.jsonnet:3:8-14 object <anonymous>
samples/semantics/invalid_binding_cycle_object_fields.jsonnet:2:8-14 object <anonymous>
samples/semantics/invalid_binding_cycle_object_fields.jsonnet:3:8-14 object <anonymous>
samples/semantics/invalid_binding_cycle_object_fields.jsonnet:2:8-14 object <anonymous>
samples/semantics/invalid_binding_cycle_object_fields.jsonnet:3:8-14 object <anonymous>
...
samples/semantics/invalid_binding_cycle_object_fields.jsonnet:3:8-14 object <anonymous>
samples/semantics/invalid_binding_cycle_object_fields.jsonnet:2:8-14 object <anonymous>
samples/semantics/invalid_binding_cycle_object_fields.jsonnet:3:8-14 object <anonymous>
samples/semantics/invalid_binding_cycle_object_fields.jsonnet:2:8-14 object <anonymous>
samples/semantics/invalid_binding_cycle_object_fields.jsonnet:3:8-14 object <anonymous>
samples/semantics/invalid_binding_cycle_object_fields.jsonnet:2:8-14 object <anonymous>
samples/semantics/invalid_binding_cycle_object_fields.jsonnet:3:8-14 object <anonymous>
samples/semantics/invalid_binding_cycle_object_fields.jsonnet:2:8-14 object <anonymous>
Field "a"
During manifestation
That's messy. We can do better with compile-time detection.
We extend our existing cycle detection to handle object fields:
and check_expr_for_cycles venv expr seen =
match expr with
| Unit | Null _ | Number _ | String _ | Bool _ -> ok ()
| Array (_, exprs) -> iter_for_cycles venv seen exprs
- | Object (_, entries) ->
- List.fold_left
- (fun ok entry -> ok >>= fun _ ->
- match entry with
- | ObjectField (_, expr) -> check_expr_for_cycles venv expr seen
- | ObjectExpr expr -> check_expr_for_cycles venv expr seen
- )
- (ok ())
- entries
+ | Object (_, entries) -> check_object_for_cycles venv entries seen
+ | ObjectFieldAccess (pos, field) -> check_object_field_for_cycles venv field pos seen
| Ident (pos, varname) -> check_cyclic_refs venv varname seen pos
The new check_object_field_for_cycles function handles self.field references by looking up the object ID and checking for cycles:
and check_object_field_for_cycles venv field pos seen =
(match Env.find_opt "self" venv with
| Some (TobjectSelf obj_id) ->
let obj_field = Env.uniq_field_ident obj_id field in
check_cyclic_refs venv obj_field seen pos
| _ -> ok ()
)
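The underlying technique is a depth-first walk that remembers which names are already on the current path. Here is a minimal sketch of it, assuming a hypothetical dependency table instead of the real environment, and assuming field keys shaped like the 1->a identifiers that appear in the error output:

```ocaml
module Deps = Map.Make (String)

(* Hypothetical dependency table: each key lists the names it refers to. *)
let rec check_cyclic_refs deps name seen =
  if List.mem name seen then
    (* We came back to a name that is already on the current path. *)
    Error ("Cyclic reference found for " ^ name)
  else
    match Deps.find_opt name deps with
    | None -> Ok ()
    | Some refs ->
      List.fold_left
        (fun acc r ->
           Result.bind acc (fun () -> check_cyclic_refs deps r (name :: seen)))
        (Ok ())
        refs

(* { a: self.b, b: self.a } as object 1 *)
let cyclic = Deps.(empty |> add "1->a" ["1->b"] |> add "1->b" ["1->a"])

(* { one: 1, two: self.one + 1 } as object 1 *)
let acyclic = Deps.(empty |> add "1->one" [] |> add "1->two" ["1->one"])
```

Starting from "1->a" in the cyclic table, the walk goes a, then b, then finds a again in seen and reports it; the acyclic table comes back with Ok ().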
And for objects themselves, we check all fields for cycles during type checking, right after we translate the local variables:
and translate_object venv pos entries =
let* obj_id = Env.Id.generate () in
let venv' = Env.add_local "self" (TobjectSelf obj_id) venv in
(* Translate locals *)
let* venv'' = List.fold_left
(fun result entry ->
let* venv = result in
match entry with
| ObjectExpr expr ->
let* (venv', _) = translate expr venv in (ok venv')
| ObjectField (attr, expr) ->
ok (Env.add_obj_field attr (Lazy expr) obj_id venv)
)
(ok venv')
entries
in
(* Check for cyclical references among object fields *)
let* () = List.fold_left
(fun ok' entry -> ok' >>= fun _ ->
match entry with
| ObjectField (attr, _) ->
check_cyclic_refs venv'' (Env.uniq_field_ident obj_id attr) [] pos
| _ -> ok'
)
(ok ())
entries
in
(* Then translate object fields *)
let* entry_types = List.fold_left
(fun result entry ->
let* entries' = result in
match entry with
| ObjectField (attr, _) ->
let* (_, entry_ty) = Env.get_obj_field attr obj_id venv''
~succ:translate_lazy
~err:(Error.error_at pos)
in ok (entries' @ [TobjectField (attr, entry_ty)])
| _ ->
result
)
(ok [])
entries
in
ok (venv, Tobject entry_types)
Now cyclical object fields are caught at type check time:
$ dune exec -- tsonnet samples/semantics/invalid_binding_cycle_object_fields.jsonnet
samples/semantics/invalid_binding_cycle_object_fields.jsonnet:3:7 Cyclic reference found for 1->a
3: b: self.a,
^^^^^^^^^^^^^^
And even mixed cycles between local variables and object fields:
$ dune exec -- tsonnet samples/semantics/invalid_binding_cycle_object_field_and_local.jsonnet
samples/semantics/invalid_binding_cycle_object_field_and_local.jsonnet:2:14 Cyclic reference found for 1->b
2: local a = self.b,
^^^^^^^^^^^^^^^^^^^^^
And cram tests won't let these errors creep in again:
diff --git a/samples/semantics/invalid_binding_cycle_object_field_and_local.jsonnet b/samples/semantics/invalid_binding_cycle_object_field_and_local.jsonnet
new file mode 100644
index 0000000..39f6957
--- /dev/null
+++ b/samples/semantics/invalid_binding_cycle_object_field_and_local.jsonnet
@@ -0,0 +1,4 @@
+{
+ local a = self.b,
+ b: a,
+}
diff --git a/samples/semantics/invalid_binding_cycle_object_fields.jsonnet b/samples/semantics/invalid_binding_cycle_object_fields.jsonnet
new file mode 100644
index 0000000..22a0905
--- /dev/null
+++ b/samples/semantics/invalid_binding_cycle_object_fields.jsonnet
@@ -0,0 +1,4 @@
+{
+ a: self.b,
+ b: self.a,
+}
diff --git a/test/cram/semantics.t b/test/cram/semantics.t
index b26b131..56d4cec 100644
--- a/test/cram/semantics.t
+++ b/test/cram/semantics.t
@@ -56,3 +56,17 @@
1: local a = b;
^^^^^^^^^^^^
[1]
+
+ $ tsonnet ../../samples/semantics/invalid_binding_cycle_object_fields.jsonnet
+ ../../samples/semantics/invalid_binding_cycle_object_fields.jsonnet:3:7 Cyclic reference found for 1->a
+
+ 3: b: self.a,
+ ^^^^^^^^^^^^^^
+ [1]
+
+ $ tsonnet ../../samples/semantics/invalid_binding_cycle_object_field_and_local.jsonnet
+ ../../samples/semantics/invalid_binding_cycle_object_field_and_local.jsonnet:2:14 Cyclic reference found for 1->b
+
+ 2: local a = self.b,
+ ^^^^^^^^^^^^^^^^^^^^^
+ [1]
Conclusion
The error messages still show internal identifiers like 1->b instead of self.b, but that's polish for another day. The core functionality is solid -- objects can now reference themselves safely and correctly.
Here is the entire diff.
In the upcoming post, I will most likely tackle the outermost object reference. Don't know what it is? Then see you in the next one.
Thanks for reading Bit Maybe Wise! If you enjoyed watching objects become self-aware (without the robot uprising), subscribe for more tales of scope validation, infinite loop prevention, and the occasional "WHAT?!" moment when lazy evaluation gets too lazy!