
Hercules Lemke Merscher

Originally published at bitmaybewise.substack.com

Tsonnet #16 - Late binding and Jsonnet inconsistency

Welcome to the Tsonnet series!

If you're just joining, you can check out how it all started in the first post of the series.

In the previous post, we added support for local variables.

There's one aspect that I intentionally left out of scope that I want to talk about now: late binding.

What is late binding? Paraphrasing Wikipedia:

Late binding is a computer programming mechanism in which the method being called upon an object, or the function being called with arguments, is looked up by name at runtime.

Tsonnet doesn't have objects or functions yet, but we still aim to make the whole language lazily evaluated -- a requirement for compatibility with Jsonnet.

The Rationale for Lazy Semantics

Jsonnet's design rationale has a page, Rationale for Lazy Semantics, that explains why the language adopts lazy evaluation.

It says:

Therefore, for consistency, the whole language is lazy.

However, Jsonnet is rather inconsistent with late binding.

Let me show you.

Late binding examples

Here's a late binding assignment in Jsonnet:

// samples/variables/late_binding_array.jsonnet
local x = ["apple", y[1]],
y = [x[0], "banana"];
x

This works according to the design rationale:

$ jsonnet samples/variables/late_binding_array.jsonnet
[
   "apple",
   "banana"
]

Now, let's take a step back and try a simpler scenario, this time with separate local statements:

// samples/variables/late_binding_simple.jsonnet
local a = 1;
local c = a + b;
local b = 2;
c

It does not work:

$ jsonnet samples/variables/late_binding_simple.jsonnet
samples/variables/late_binding_simple.jsonnet:2:15-16 Unknown variable: b

local c = a + b;


Our minds are conditioned to read code linearly, but under lazy semantics the local c binding should stay unevaluated and cause no error. By the time c is actually required, b, which was not in scope earlier, is defined, so c is valid.

I know this seems confusing if you're coming from a language that eagerly evaluates definitions from top to bottom. How are we supposed to use something that's not in scope yet, right? With lazy semantics, evaluation is deferred: the implementation only keeps track of which names refer to which still-unevaluated expressions.
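To make the idea concrete, here is a minimal sketch of deferred evaluation. It is not Tsonnet's code: the tiny expr type and the eval function are made up for illustration. The environment stores unevaluated expressions, so a binding may mention a name that is only added later.

```ocaml
(* A toy expression language, hypothetical and much smaller than Tsonnet's AST. *)
type expr =
  | Num of int
  | Var of string
  | Add of expr * expr

module Env = Map.Make (String)

(* Evaluate [e] against [env], which maps names to UNEVALUATED expressions. *)
let rec eval (env : expr Env.t) (e : expr) : (int, string) result =
  match e with
  | Num n -> Ok n
  | Var name ->
    (match Env.find_opt name env with
     | Some bound -> eval env bound (* evaluated only now, on demand *)
     | None -> Error ("Undefined variable: " ^ name))
  | Add (l, r) ->
    (match eval env l, eval env r with
     | Ok a, Ok b -> Ok (a + b)
     | (Error _ as err), _ | _, (Error _ as err) -> err)

(* local a = 1; local c = a + b; local b = 2; c *)
let env =
  Env.empty
  |> Env.add "a" (Num 1)
  |> Env.add "c" (Add (Var "a", Var "b")) (* b is not bound yet; nothing is evaluated here *)
  |> Env.add "b" (Num 2)

let result = eval env (Var "c") (* Ok 3 *)
```

Nothing is computed while the environment is being built. Only the final eval walks from c to a and b, and by then both are bound.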

Turns out, Tsonnet supports late binding out of the box, given the implementation covered in the previous post:

$ dune exec -- tsonnet samples/variables/late_binding_simple.jsonnet
3

Array access doesn't work yet:

$ dune exec -- tsonnet samples/variables/late_binding_array.jsonnet
samples/variables/late_binding_array.jsonnet:1:22 Invalid syntax

1: local x = ["apple", y[1]],
   ^^^^^^^^^^^^^^^^^^^^^^^

But only because we haven't implemented index-based access for arrays.

Let's solve this now.

Index-based access for arrays

We need a new variant in the expr type to cover index-based access:

diff --git a/lib/ast.ml b/lib/ast.ml
index cf2bf81..d712ba5 100644
--- a/lib/ast.ml
+++ b/lib/ast.ml
@@ -32,6 +32,7 @@ type expr =
   | Local of position * (string * expr) list
   | Unit
   | Seq of expr list
+  | IndexedExpr of position * string * expr

 let dummy_pos = {
   startpos = Lexing.dummy_pos;
@@ -43,4 +44,36 @@ let dummy_expr = Unit
 let pos_from_lexbuf (lexbuf : Lexing.lexbuf) : position =
   { startpos = lexbuf.lex_curr_p;
     endpos = lexbuf.lex_curr_p;
-  };
+  }
+
+let string_of_type = function
+  | Null _ -> "Null"
+  | Number (_, number) ->
+    (match number with
+    | Int _ -> "Int"
+    | Float _ -> "Float")
+  | Bool _ -> "Bool"
+  | String _ -> "String"
+  | Ident _ -> "Identity"
+  | Array _ -> "Array"
+  | Object _ -> "Object"
+  | BinOp (_, bin_op, _, _) ->
+    let prefix = "Binary Operation" in
+    let bin_op = match bin_op with
+    | Add -> "+"
+    | Subtract -> "-"
+    | Multiply -> "*"
+    | Divide -> "/"
+    in prefix ^ " " ^ bin_op
+  | UnaryOp (_, unary_op, _) ->
+    let prefix = "Unary Operation" in
+    let unary_op = match unary_op with
+    | Plus -> "+"
+    | Minus -> "-"
+    | Not -> "!"
+    | BitwiseNot -> "~"
+    in prefix ^ " " ^ unary_op
+  | Local _ -> "Local"
+  | Unit -> "()"
+  | Seq _ -> "Sequence"
+  | IndexedExpr _ -> "Indexed Expression"

The string_of_type function will come in handy for producing friendlier error messages.

The parser can then make use of IndexedExpr:

diff --git a/lib/parser.mly b/lib/parser.mly
index c4fc9b4..bf88591 100644
--- a/lib/parser.mly
+++ b/lib/parser.mly
@@ -50,7 +50,8 @@ assignable_expr:
   | e = scoped_expr { e }
   | e = literal { e }
   | e1 = assignable_expr; op = bin_op; e2 = assignable_expr { BinOp (with_pos $startpos $endpos, op, e1, e2) }
   | op = unary_op; e = assignable_expr; { UnaryOp (with_pos $startpos $endpos, op, e) }
+  | varname = ID; LEFT_SQR_BRACKET; e = assignable_expr; RIGHT_SQR_BRACKET { IndexedExpr (with_pos $startpos $endpos, varname, e) }
   ;

 scoped_expr:

Let's add a new function to the Env module to facilitate the variable lookups:

let find_var varname env ~succ ~err =
  match Map.find_opt varname env with
  | Some expr ->
    let* (env', evaluated_expr) = succ env expr in
    (* Since `succ` has evaluated expr, we can now memoize it
       and subsequent look ups operating in this new environment
       will already have it evaluated *)
    let updated_env = Map.add varname evaluated_expr env' in
    Result.ok (updated_env, evaluated_expr)
  | None -> err ("Undefined variable: " ^ varname)

Memoization is a bonus that makes subsequent lookups of the same variable faster. It's nice when we can get a performance boost almost for free.
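To see the memoization at work in isolation, here is a small self-contained sketch. It keeps the shape of find_var, but everything around it is hypothetical: the two-case expr type, the eval function, and the evaluations counter exist only to make the effect observable.

```ocaml
module Env = Map.Make (String)

(* Hypothetical stand-in for Tsonnet's expr: either a pending sum or a final value. *)
type expr =
  | Value of int
  | Sum of int * int

(* Counts how often real work happens, to make the effect of memoization visible. *)
let evaluations = ref 0

let eval env expr =
  match expr with
  | Value _ -> Ok (env, expr)
  | Sum (a, b) ->
    incr evaluations;
    Ok (env, Value (a + b))

(* Same shape as find_var above: evaluate through [succ], then store the result. *)
let find_var varname env ~succ ~err =
  match Env.find_opt varname env with
  | Some expr ->
    Result.bind (succ env expr) (fun (env', evaluated) ->
      Ok (Env.add varname evaluated env', evaluated))
  | None -> err ("Undefined variable: " ^ varname)

let lookup name env = find_var name env ~succ:eval ~err:(fun msg -> Error msg)

let env0 = Env.add "x" (Sum (1, 2)) Env.empty
let (env1, first) = Result.get_ok (lookup "x" env0)
let (_, second) = Result.get_ok (lookup "x" env1)
```

The second lookup runs against env1, where x is already a Value, so the counter stays at 1. Note that the benefit only exists if callers keep using the returned environment; looking up x in env0 again would redo the work.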

Now comes the interpreter.

We were returning arrays as-is before, but to make indexed access work properly, we need to evaluate each entry before returning it:

diff --git a/lib/tsonnet.ml b/lib/tsonnet.ml
index f136fd6..574d2af 100644
--- a/lib/tsonnet.ml
+++ b/lib/tsonnet.ml
@@ -1,8 +1,6 @@
 open Ast
 open Result
-
-let (let*) = Result.bind
-let (>>=) = Result.bind
+open Syntax_sugar

 (** [parse s] parses [s] into an AST. *)
 let parse (filename: string) =
@@ -60,11 +58,21 @@ let interpret_unary_op (op: unary_op) (evaluated_expr: expr) =
 (** [interpret expr] interprets and reduce the intermediate AST [expr] into a result AST. *)
 let rec interpret env expr =
   match expr with
-  | Null _ | Bool _ | String _ | Number _ | Array _ | Object _ -> ok (env, expr)
+  | Null _ | Bool _ | String _ | Number _ | Object _ -> ok (env, expr)
+  | Array (pos, exprs) ->
+    (let rec eval' env' exprs' =
+      match exprs' with
+      | [] -> ok (env', [])
+      | e :: exprs ->
+        (let* (env1, expr') = interpret env' e in
+        let* (env2, rest) = eval' env1 exprs in
+        ok (env2, expr' :: rest))
+    in eval' env exprs >>= fun (env3, exprs') -> ok (env3, Array (pos, exprs'))
+    )
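The pattern in the Array case, evaluating items left to right while handing the updated environment to the next one, can be shown on its own. This is only a sketch: eval_item is a made-up evaluator, and the "environment" is just an int counter, so that the threading and the short-circuit on errors are easy to see.

```ocaml
let ( let* ) = Result.bind

(* Hypothetical evaluator: doubles a number and counts calls in the "environment". *)
let eval_item count n =
  if n < 0 then Error "negative item" else Ok (count + 1, n * 2)

(* Evaluate every item left to right, passing the updated environment along. *)
let rec eval_all env = function
  | [] -> Ok (env, [])
  | item :: rest ->
    let* (env1, item') = eval_item env item in
    let* (env2, rest') = eval_all env1 rest in
    Ok (env2, item' :: rest')
```

For example, eval_all 0 [1; 2; 3] yields Ok (3, [2; 4; 6]), while a list containing a negative number stops at the first Error and returns it unchanged.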

The Ident match can now use Env.find_var:

   | Ident (pos, varname) ->
-    (match Env.Map.find_opt varname env with
-    | Some expr -> interpret env expr
-    | None -> Error.trace ("Undefined variable: " ^ varname) pos >>= error)
+    Env.find_var varname env
+      ~succ:(fun env' expr -> interpret env' expr)
+      ~err:(fun err_msg -> Error.trace err_msg pos >>= error)
   | BinOp (pos, op, e1, e2) ->
     (let* (env1, e1') = interpret env e1 in
     let* (env2, e2') = interpret env1 e2 in

Not much changed here, but we now have memoized lookups.

And the IndexedExpr case to wrap it up:

@@ -93,6 +101,23 @@ let rec interpret env expr =
     | [] -> ok (env, Unit)
     | [expr] -> interpret env expr
     | (expr :: exprs) -> interpret env expr >>= fun (env', _) -> interpret env' (Seq exprs))
+  | IndexedExpr (pos, varname, index_expr) ->
+    Env.find_var varname env
+      ~succ:(fun env' expr ->
+      match expr with
+      | Array (_, exprs) ->
+        let* (env', idx_expr') = interpret env' index_expr in
+        (match idx_expr' with
+        | Number (_, Int i)->
+          (let len = List.length exprs in
+          if i >= 0 && i < len
+          then ok (env', List.nth exprs i)
+          else Error.trace ("Index out of bounds. Trying to access index " ^ string_of_int i ^ " but \"" ^ varname ^ "\" length is " ^ string_of_int len) pos >>= error)
+        | expr' -> Error.trace ("Expected Integer index, got " ^ Ast.string_of_type expr') pos >>= error
+        )
+      | _ -> Error.trace ("Expected array, found: " ^ varname) pos >>= error
+      )
+      ~err:(fun err_msg -> Error.trace err_msg pos >>= error)

 let run (filename: string) : (string, string) result =
   let env = Env.Map.empty in

Env.find_var resolves the variable being indexed. Once we know it holds an array, we evaluate the index expression and run the checks: the index must be an integer, and it must be within the array's bounds.

It looks ugly, I know, sorry! Refactoring will follow, but this is enough to get this important feature working.
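As one possible direction, and not necessarily the refactoring this series will end up with, the index checks could live in a small function of their own. The sketch below is hypothetical: the index type is a simplified stand-in for the evaluated index expression, index_array is an invented name, and position tracking is left out.

```ocaml
(* Hypothetical, simplified index value: Tsonnet's real AST carries positions too. *)
type index =
  | Int of int
  | Str of string

(* Return the item at [idx], or an error message matching the ones used above. *)
let index_array (name : string) (items : 'a list) (idx : index) : ('a, string) result =
  match idx with
  | Str _ -> Error "Expected Integer index, got String"
  | Int i ->
    let len = List.length items in
    if i >= 0 && i < len
    then Ok (List.nth items i)
    else
      Error
        (Printf.sprintf
           "Index out of bounds. Trying to access index %d but \"%s\" length is %d"
           i name len)
```

With this shape, index_array "list" [1; 2; 3] (Int 1) returns Ok 2, and the two error cases produce the same messages as the interpreter code above.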

Attentive readers might have noticed the open Syntax_sugar. This is a new module that collects the operator aliases reused across the project:

let (let*) = Result.bind
let (>>=) = Result.bind
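Both aliases point at Result.bind, so they are interchangeable; which one to use is a matter of readability. Here is a quick illustration with a made-up safe_div function (not part of Tsonnet), written once in each style:

```ocaml
let ( let* ) = Result.bind
let ( >>= ) = Result.bind

(* Hypothetical helper that can fail. *)
let safe_div a b = if b = 0 then Error "Division by zero" else Ok (a / b)

(* Style 1: let* reads like sequential code. *)
let with_let_star a b c =
  let* x = safe_div a b in
  let* y = safe_div x c in
  Ok (x + y)

(* Style 2: >>= chains the continuations explicitly. *)
let with_bind a b c =
  safe_div a b >>= fun x ->
  safe_div x c >>= fun y ->
  Ok (x + y)
```

Both functions return Ok 8 for the arguments 12 2 3, and both stop at the first Error when a divisor is zero.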

Let's add a few new sample files to cover some use cases -- I'm ignoring the others already presented earlier.

This one covers an out-of-bounds array index:

// samples/errors/array_index_out_of_bounds.jsonnet
local list = [1,2,3];
local index = 4;
list[index]

This one covers indexing an array with a non-integer value:

// samples/errors/array_index_not_int.jsonnet
local list = [1,2,3];
local index = "O";
list[index]

Cram tests for the happy paths:

diff --git a/test/cram/variables.t b/test/cram/variables.t
index b9f54f1..2889c7f 100644
--- a/test/cram/variables.t
+++ b/test/cram/variables.t
@@ -9,3 +9,10 @@

   $ tsonnet ../../samples/variables/scoped.jsonnet
   42
+
+  $ tsonnet ../../samples/variables/late_binding_simple.jsonnet
+  3
+
+  $ tsonnet ../../samples/variables/late_binding_array.jsonnet
+  [ "apple", "banana" ]
+

And here are the cram tests for the unhappy paths:

diff --git a/test/cram/errors.t b/test/cram/errors.t
index 7b2e2ab..6a85e3b 100644
--- a/test/cram/errors.t
+++ b/test/cram/errors.t
@@ -37,3 +37,24 @@
      ^^^^^^^^^^^^^^^^
   [1]

+  $ tsonnet ../../samples/errors/undefined_local.jsonnet
+  ../../samples/errors/undefined_local.jsonnet:3:0 Undefined variable: c
+  
+  3: c
+     ^
+  [1]
+
+  $ tsonnet ../../samples/errors/array_index_out_of_bounds.jsonnet
+  ../../samples/errors/array_index_out_of_bounds.jsonnet:3:0 Index out of bounds. Trying to access index 4 but "list" length is 3
+  
+  3: list[index]
+     ^^^^^^^^^^^
+  [1]
+
+  $ tsonnet ../../samples/errors/array_index_not_int.jsonnet
+  ../../samples/errors/array_index_not_int.jsonnet:3:0 Expected Integer index, got String
+  
+  3: list[index]
+     ^^^^^^^^^^^
+  [1]
+

And we are done:

$ dune exec -- tsonnet samples/variables/late_binding_simple.jsonnet
3

$ dune exec -- tsonnet samples/variables/late_binding_array.jsonnet
[ "apple", "banana" ]

$ dune exec -- tsonnet samples/errors/array_index_not_int.jsonnet
samples/errors/array_index_not_int.jsonnet:3:0 Expected Integer index, got String

3: list[index]
   ^^^^^^^^^^^

$ dune exec -- tsonnet samples/errors/array_index_out_of_bounds.jsonnet
samples/errors/array_index_out_of_bounds.jsonnet:3:0 Index out of bounds. Trying to access index 4 but "list" length is 3

3: list[index]
   ^^^^^^^^^^^

Conclusion

We can now write code declaratively in Tsonnet, independent of declaration order. The new cram tests will catch any future change that breaks this property.

Eventually, when Tsonnet is ready for practical use, editor tooling such as jump to definition will help with the non-linear declarations. Tools surrounding the language are as important as the language itself.

As mentioned, refactoring will follow to address the ugly IndexedExpr case. The interpret function gets more bloated with every feature we add, which makes it harder to read. I will explore some improvements soon.

The diff with all the changes can be seen here.


Don't bind yourself late to Tsonnet updates! Subscribe to Bit Maybe Wise now, but feel free to read the posts in any order you want.

Photo by Erik Mclean on Unsplash
