DEV Community

Hercules Lemke Merscher
Hercules Lemke Merscher

Posted on • Originally published at bitmaybewise.substack.com

Tsonnet #37 - Call me maybe, but make it typed, part 3

Welcome to the Tsonnet series!

If you're not following along, check out how it all started in the first post of the series.

In the previous post, we added default function arguments:

Now let's implement closures.

Telephone cabins

Captured and ready to be called

We should accept a closure that is going to be assigned to a local, and immediately called closures:

// samples/functions/closure.jsonnet

// binding closure
local id = function(x) x;
// immediately called closure
(function(x) id(x) * id(x))(5)
Enter fullscreen mode Exit fullscreen mode

Parsing: teaching the grammar new tricks

We need two new AST variants:

diff --git a/lib/ast.ml b/lib/ast.ml
index f4c1472..e317b6e 100644
--- a/lib/ast.ml
+++ b/lib/ast.ml
@@ -92,6 +92,8 @@ type expr =
   | IndexedExpr of position * string * expr
   | FunctionDef of position * (string * (string * expr option) list * expr)
   | FunctionCall of position * string * expr list
+  | Closure of position * ((string * expr option) list * expr)
+  | ClosureCall of position * (string * expr option) list * expr * expr list

 and object_entry =
   | ObjectField of string * expr
Enter fullscreen mode Exit fullscreen mode

Then we update the lexer to recognise the new keyword:

diff --git a/lib/lexer.mll b/lib/lexer.mll
index a4c3745..4d65f55 100644
--- a/lib/lexer.mll
+++ b/lib/lexer.mll
@@ -80,6 +80,7 @@ rule read =
   | "self" { SELF }
   | "$" { TOP_LEVEL_OBJ }
   | "in" { IN }
+  | "function" { FUNCTION }
   | id { ID (Lexing.lexeme lexbuf) }
   | _ { raise (SyntaxError ("Unexpected char: " ^ Lexing.lexeme lexbuf)) }
   | eof { EOF }
Enter fullscreen mode Exit fullscreen mode

And the parser:

diff --git a/lib/parser.mly b/lib/parser.mly
index 290b280..8b3821e 100644
--- a/lib/parser.mly
+++ b/lib/parser.mly
@@ -13,6 +13,7 @@
 %token NULL
 %token <bool> BOOL
 %token <string> STRING
+%token FUNCTION
 %token LEFT_SQR_BRACKET RIGHT_SQR_BRACKET
 %token LEFT_PAREN RIGHT_PAREN
 %token COMMA
@@ -42,11 +43,13 @@
 prog:
   | e = expr; EOF { e }
   | e = expr_seq; EOF { e }
+  | e = closure; EOF { e }
   ;

 expr:
   | e = assignable_expr { e }
   | e = vars { e }
+  | e = closure_call { e }
   ;
Enter fullscreen mode Exit fullscreen mode

I stumbled upon a shift-reduce issue trying to make closure part of the assignable_expr parsing rule — it would make more semantic sense there, but the grammar conflicts beat me. I'll come back to fix it later.

Here's the short version of why it's not feasible as-is: if closure lives in assignable_expr, then (function(x) x) becomes reachable via two paths simultaneously — as a parenthesised closure via scoped_expr, and as the first half of a closure_call. When the parser sees ( function(x) body . ) and hits ), it can't decide whether to reduce into scoped_expr or shift ) as part of closure_call — that would require two tokens of lookahead, which LR(1) doesn't have. The only clean fix would be generalising funcall to accept any expression as the callee, so (function(x) x)(5) parses naturally as "call this expression with these args" — but that's a bigger refactor touching the AST, type checker, and interpreter.

Here's how the new closure and closure_call rules are implemented for now:

@@ -171,7 +174,9 @@ obj_field_access:
   ;

 var:
-  varname = ID; ASSIGN; e = assignable_expr { (varname, e) };
+  | varname = ID; ASSIGN; e = assignable_expr { (varname, e) }
+  | varname = ID; ASSIGN; e = closure { (varname, e) }
+  ;

 vars:
   | LOCAL; vars = separated_nonempty_list(COMMA, var) { Local (with_pos $startpos $endpos, vars) }
@@ -203,4 +208,19 @@ funcall:
   | fname = ID;
     LEFT_PAREN; params = separated_nonempty_list(COMMA, assignable_expr); RIGHT_PAREN
     { FunctionCall (with_pos $startpos $endpos, fname, params) }
+  ;
+
+closure:
+  | FUNCTION;
+    LEFT_PAREN; params = separated_nonempty_list(COMMA, fundef_param); RIGHT_PAREN;
+    body = assignable_expr { Closure (with_pos $startpos $endpos, (params, body)) }
+  ;
+
+closure_call:
+  | LEFT_PAREN; FUNCTION;
+    LEFT_PAREN; def_params = separated_nonempty_list(COMMA, fundef_param); RIGHT_PAREN;
+    body = assignable_expr;
+    RIGHT_PAREN;
+    LEFT_PAREN; call_params = separated_nonempty_list(COMMA, assignable_expr); RIGHT_PAREN;
+    { ClosureCall (with_pos $startpos $endpos, def_params, body, call_params) }
   ;
Enter fullscreen mode Exit fullscreen mode

Type checking: the elephant in the room

The new type variants mirror the ones in ast.ml:

diff --git a/lib/type.ml b/lib/type.ml
index 05bc187..ee912b3 100644
--- a/lib/type.ml
+++ b/lib/type.ml
@@ -19,6 +19,8 @@ type tsonnet_type =
   | Tunresolved
   | TfunctionDef of (string * tsonnet_type) list (* params: name * type *) * expr (* body *) * tsonnet_type (* return *)
   | TfunctionCall of tsonnet_type list * tsonnet_type
+  | Tclosure of (string * tsonnet_type) list * expr
+  | TclosureCall of tsonnet_type list * tsonnet_type
 and t_object_entry =
   | TobjectField of string * tsonnet_type
   | TobjectExpr of tsonnet_type
Enter fullscreen mode Exit fullscreen mode

The collect_free_idents function needs updating for the new variants — I also noticed I'd forgotten FunctionCall here earlier (oops!):

@@ -84,6 +95,10 @@ let rec collect_free_idents = function
   | ObjectFieldAccess (_, _, exprs) -> List.concat_map collect_free_idents exprs
   | IndexedExpr (_, name, e) -> name :: collect_free_idents e
   | Local (_, vars) -> List.concat_map (fun (_, e) -> collect_free_idents e) vars
+  | FunctionCall (_, name, args) -> name :: List.concat_map collect_free_idents args
+  | Closure (_, (_, body)) -> collect_free_idents body
+  | ClosureCall (_, _, body, args) ->
+    collect_free_idents body @ List.concat_map collect_free_idents args
   | _ -> []
Enter fullscreen mode Exit fullscreen mode

Then we pattern-match on the new AST nodes in translate:

@@ -178,6 +193,9 @@ let rec translate venv expr =
   | IndexedExpr (pos, varname, index_expr) -> translate_indexed_expr venv (pos, varname, index_expr)
   | FunctionDef (pos, def) -> translate_function_def venv (pos, def)
   | FunctionCall (pos, fname, params) -> translate_function_call venv (pos, fname, params)
+  | Closure (pos, (params, body)) -> translate_closure venv (pos, params, body)
+  | ClosureCall (pos, def_params, body, call_params) ->
+    translate_closure_call venv (pos, def_params, body, call_params)
   | expr' ->
     error (Error.Msg.type_invalid_expr (string_of_type expr'))
Enter fullscreen mode Exit fullscreen mode

I also added a catch-all for Tany in binary operations — if either side is Tany, we let it through rather than erroring:

@@ -472,6 +490,7 @@ and translate_bin_op venv pos op e1 e2 =
   | LessThan, Tstring, Tstring -> ok (venv'', Tbool)
   | LessThanOrEqual, Tstring, Tstring -> ok (venv'', Tbool)
   | In, Tstring, (Tobject _ | Tany | TruntimeObject _ | TobjectPtr _) -> ok (venv'', Tbool)
+  | _, Tany, _ | _, _, Tany -> ok (venv'', Tany)
   | _ -> Error.error_at pos Error.Msg.invalid_binary_op
Enter fullscreen mode Exit fullscreen mode

translate_function_call also needs to handle Tclosure — if a bound name resolves to a closure, we re-dispatch to translate_closure_call:

@@ -561,9 +580,70 @@ and translate_function_call venv (pos, fname, call_params) =
       let resolved_fun = TfunctionDef (resolved_params, body_expr, resolved_return) in
       let venv_with_resolved_fun = Env.add_local fname resolved_fun venv' in
       ok (venv_with_resolved_fun, resolved_return)
+  | Some (Lazy expr) ->
+    let* (venv', resolved) = translate venv expr in
+    (* Re-dispatch with the resolved type *)
+    let venv'' = Env.add_local fname resolved venv' in
+    translate_function_call venv'' (pos, fname, call_params)
+  | Some (Tclosure (def_params, body_expr)) ->
+    translate_closure_call venv (pos, 
+      List.map (fun (name, _ty) -> (name, None)) def_params,
+      body_expr, call_params)
   | _ ->
     Error.error_at pos (Error.Msg.var_not_found fname)
Enter fullscreen mode Exit fullscreen mode

translate_closure is pretty similar to translate_function_def:

and translate_closure venv (_pos, params, body) =
  let* params_typed = List.fold_left
    (fun acc (name, default) ->
      let* params' = acc in
      match default with
      | Some default_expr ->
        let* (_, default_ty) = translate venv default_expr in
        ok (params' @ [(name, default_ty)])
      | None ->
        ok (params' @ [(name, Tunresolved)])
    )
    (ok [])
    params
  in
  ok (venv, Tclosure (params_typed, body))
Enter fullscreen mode Exit fullscreen mode

And translate_closure_call is pretty similar to translate_function_call:

and translate_closure_call venv (pos, def_params, body_expr, call_params) =
  let num_call = List.length call_params in
  let num_def = List.length def_params in
  let num_required =
    List.length (List.filter (fun (_, default) -> Option.is_none default) def_params)
  in
  if num_call < num_required || num_call > num_def
  then Error.error_at pos (Error.Msg.wrong_number_of_params num_def num_call)
  else
    let* (venv', resolved_params) =
      List.fold_left
        (fun acc (index, (param_name, default)) ->
          let* (venv', params') = acc in
          if index < num_call
          then
            let call_param = List.nth call_params index in
            let* (venv'', call_param_type) = translate venv' call_param in
            ok (venv'', params' @ [(param_name, call_param_type)])
          else
            match default with
            | Some default_expr ->
              let* (venv'', default_type) = translate venv' default_expr in
              ok (venv'', params' @ [(param_name, default_type)])
            | None -> Error.error_at pos (Error.Msg.wrong_number_of_params num_def num_call)
        )
        (ok (venv, []))
        (List.mapi (fun i p -> (i, p)) def_params)
    in
    let body_venv = List.fold_left
      (fun env (name, ty) -> Env.add_local name ty env)
      venv'
      resolved_params
    in
    let* (_, body_type) = translate body_venv body_expr in
    ok (venv, body_type)
Enter fullscreen mode Exit fullscreen mode

There's room for simplification here, but let's move on for now.

It's worth pausing on a behaviour difference between Tsonnet and standard Jsonnet. Given this sample:

// samples/semantics/invalid_function_call_type.jsonnet
local my_function(x) = x * 2;
my_function(3) && my_function("oops")
Enter fullscreen mode Exit fullscreen mode

Jsonnet fails at runtime after evaluating the first call, because && expects a boolean and gets a number first:

$ jsonnet samples/semantics/invalid_function_call_type.jsonnet
RUNTIME ERROR: Unexpected type number, expected boolean
    samples/semantics/invalid_function_call_type.jsonnet:2:1-38 $
    During evaluation
Enter fullscreen mode Exit fullscreen mode

Tsonnet instead type-checks the function calls before even getting to the &&, and catches the type mismatch earlier:

dune exec -- tsonnet samples/semantics/invalid_function_call_type.jsonnet
ERROR: samples/semantics/invalid_function_call_type.jsonnet:2:18 Expected type Number, got String

2: my_function(3) && my_function("oops")
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Enter fullscreen mode Exit fullscreen mode

I think that's a more sensible error. Worth noting: I want to support polymorphic parameters and return types eventually, but for this prototype, this is enough.

Interpreting: where closures actually close over things

During interpretation, a closure definition is returned as-is — it just gets stored in the environment. The actual work happens at call time:

diff --git a/lib/interpreter.ml b/lib/interpreter.ml
index 6ad259e..7412fb9 100644
--- a/lib/interpreter.ml
+++ b/lib/interpreter.ml
@@ -28,6 +28,8 @@ let rec interpret env expr =
   | IndexedExpr (pos, varname, index_expr) -> interpret_indexed_expr env (pos, varname, index_expr)
   | FunctionDef (pos, def) -> interpret_function_def env (pos, def)
   | FunctionCall (pos, fname, params) -> interpret_function_call env (pos, fname, params)
+  | Closure _ -> ok (env, expr)
+  | ClosureCall (pos, def_params, body, params) -> interpret_closure_call env (pos, def_params, body, params)
Enter fullscreen mode Exit fullscreen mode

The function application logic was straightforward to generalise into apply_function, which both interpret_function_call and interpret_closure_call now share:

and apply_function env pos def_params body call_params =
  let num_call = List.length call_params in
  let num_def = List.length def_params in
  let num_required =
    List.length (
      List.filter (fun (_, default) -> Option.is_none default) def_params
    )
  in
  if num_call < num_required || num_call > num_def
  then
    Error.error_at pos (Error.Msg.wrong_number_of_params num_def num_call)
  else
    let* evaluated_call_params =
      List.fold_left
        (fun acc param ->
          let* params = acc in
          let* (_, v) = interpret env param in
          ok (params @ [v])
        )
        (ok [])
        call_params
    in
    let* bindings =
      List.fold_left
        (fun acc (index, (param_name, default)) ->
          let* bindings = acc in
          if index < num_call
          then
            ok (bindings @ [(param_name, List.nth evaluated_call_params index)])
          else
            match default with
            | Some default_expr -> ok (bindings @ [(param_name, default_expr)])
            | None -> Error.error_at pos (Error.Msg.wrong_number_of_params num_def num_call)
        )
        (ok [])
        (List.mapi (fun i p -> (i, p)) def_params)
    in
    let env' = List.fold_left
      (fun env (k, v) -> Env.add_local k v env)
      env
      bindings
    in
    let* (_, result) = with_fresh_evaluating_bindings
      (fun () -> interpret env' body)
    in
    ok (env, result)
Enter fullscreen mode Exit fullscreen mode

interpret_function_call and interpret_closure_call now just delegate to it:

 and interpret_function_call env (pos, fname, call_params) =
   match Env.find_opt fname env with
+  | Some (Closure (pos, (def_params, body)))
   | Some (FunctionDef (pos, (_, def_params, body))) ->
-    ...
+    apply_function env pos def_params body call_params
   | _ ->
     Error.error_at pos (Error.Msg.var_not_found fname)

+and interpret_closure_call env (pos, def_params, body, call_params) =
+  apply_function env pos def_params body call_params
Enter fullscreen mode Exit fullscreen mode

I also used this opportunity to fix a leftover hard-coded error string I'd missed during an earlier refactoring:

diff --git a/lib/error.ml b/lib/error.ml
-  let type_wrong_number_of_params expected got =
-    Printf.sprintf "Expected %d argument(s), got %d" expected got
+  let wrong_number_of_params expected got =
+    Printf.sprintf "Expected %d argument(s), got %d" expected got
Enter fullscreen mode Exit fullscreen mode

The rename also moves the message out of the type_-prefixed block, since both the type checker and the interpreter now share it.

Does it work? (Yes it does)

$ dune exec -- tsonnet samples/functions/closure.jsonnet
25
Enter fullscreen mode Exit fullscreen mode

And the cram test:

diff --git a/test/cram/functions.t b/test/cram/functions.t
index fa29868..5c77837 100644
--- a/test/cram/functions.t
+++ b/test/cram/functions.t
@@ -6,3 +6,6 @@

   $ tsonnet ../../samples/functions/default_args.jsonnet
   12
+
+  $ tsonnet ../../samples/functions/closure.jsonnet
+  25
Enter fullscreen mode Exit fullscreen mode

Conclusion

Closures are in. Binding them to locals, calling them immediately, passing them around — it all works. The type checker was the trickier part, requiring extra plumbing to re-dispatch when a bound name turns out to be a closure, but the interpreter side practically wrote itself once apply_function was factored out. The shift-reduce issue in the parser is still there, nagging at me from the grammar file — but that's a problem for future me, probably involving a refactor to let funcall accept any expression as callee.

Here is the entire diff.

Next up, I think this can be simplified, and I will do so.


Thanks for reading Bit Maybe Wise! Closures capture their environment. This newsletter captures compilers being built in the open. Subscribe before the environment gets GC'd.

Photo by The Now Time on Unsplash

Top comments (0)