DEV Community

Jason Barr
Jason Barr

Posted on

Create Your Own Programming Language 5: Vectors and Records

Welcome to the next entry in the "Create Your Own Programming Language" series. In this article we're going to add vectors and records to the Wanda Programming Language, plus member expressions with dot syntax.

I know, I know, if you're a hardcore Lisper that's heresy. It's not my first heresy and it will be far from my last one, so it is what it is. It does complicate reading and parsing a bit, but that's the price we pay for convenience.

We're also going to add some destructuring in variable assignment, but only one level deep. No nested destructuring because I didn't want to make things overly complex. Type checking for destructuring was an interesting exercise, though, so I wanted to at least show you how to implement some destructuring even if it's not as fully-featured as what you can do in JavaScript or Clojure. We're also only allowing destructuring in variable declarations, not in set expressions.

As always, if you haven't read the last post in the series about variables and building the type checker you'll want to make sure to do that before continuing on with this one.

We good? Good. Let's go.

Fixing a Mistake from Last Time

The first thing we need to do is fix a mistake I made in the previous article. I've fixed the code there, but there's a good chance you read that article before I made the fixes in it so I'm going to mention them here as well.

What happened is I didn't adequately make sure the TypeChecker class returns type-annotated nodes based on the original AST nodes.

In its original state some AST nodes got decorated with types, but others did not. Here are the updated methods to make sure all nodes in the AST get decorated with types:

  checkProgram(node, env) {
    let body = [];
    let type;
    let i = 0;
    for (let expr of node.body) {
      if (i === node.body.length - 1) {
        const node = this.check(expr, env);
        type = node.type;
        body.push(node);
      } else {
        const node = this.check(expr, env);
        body.push(node);
      }
    }

    return { kind: node.kind, body, srcloc: node.srcloc, type };
  }

  checkDoExpression(node, env) {
    let body = [];
    for (let expr of node.body) {
      const node = this.check(expr, env);
      body.push(node);
    }

    return {
      kind: node.kind,
      body,
      srcloc: node.srcloc,
      type: infer(node, env),
    };
  }

  checkVariableDeclaration(node, env) {
    if (node.typeAnnotation) {
      const annotType = fromTypeAnnotation(node.typeAnnotation);
      check(node.expression, annotType, env);
      env.checkingOn = true;
      env.set(node.lhv.name, annotType);
      return { ...node, type: annotType };
    }

    const type = infer(node, env);
    env.set(node.lhv.name, type);
    return {
      kind: node.kind,
      lhv: node.lhv,
      expression: this.check(node.expression, env),
      srcloc: node.srcloc,
      type,
    };
  }

  checkSetExpression(node, env) {
    if (env.checkingOn) {
      const nameType = env.getType(node.lhv.name);
      check(node.expression, nameType, env);
      return { ...node, type: nameType };
    }

    return {
      kind: node.kind,
      lhv: node.lhv,
      expression: this.check(node.expression, env),
      srcloc: node.srcloc,
      type: infer(node, env),
    };;
  }
Enter fullscreen mode Exit fullscreen mode

Everything else is ok as it was originally published. Now let's move on to the new changes!

Preliminary Changes

First, I had to remove colons as valid symbol characters in src/lexer/utils.js. Make sure your isSymbolChar function looks like this:

export const isSymbolChar = (ch) =>
  /[=@~<>%&|?\\/^*&#'\p{L}\p{N}_$!+-]/u.test(ch);
Enter fullscreen mode Exit fullscreen mode

I had accidentally included them originally, so I want to make sure you remove the colon from the symbol character regex like I did.

Note that this means you can now write variable type annotations with the colon adjacent to the symbol instead of having to put a space in between, e.g.:

(var (x: number) 7)
Enter fullscreen mode Exit fullscreen mode

I also changed the constructor for SyntaxException in src/shared/exceptions.js so it can take an optional 3rd parameter expected:

constructor(value, srcloc, expected = "") {
    super(
      `Syntax Exception: invalid syntax ${value} found at ${srcloc.file} (${
        srcloc.line
      }:${srcloc.col})${expected ? ` (expected ${expected})` : ""}`
    );
  }
Enter fullscreen mode Exit fullscreen mode

Changes to The Lexer

We need to add a few tokens to the lexer. Vector literals will be written by putting a list of expressions between square brackets ([ and ]), and record literals will be written by putting entries between curly braces ({ and }). We'll also include the ampersand (&) for use as a rest variable indicator during destructuring. We'll also add a token for the dot for use with member expressions.

Let's add character matchers for braces, brackets, and the ampersand in src/lexer/utils.js:

// after isRParen
export const isLBrack = (ch) => /\[/.test(ch);
export const isRBrack = (ch) => /\]/.test(ch);
export const isLBrace = (ch) => /\{/.test(ch);
export const isRBrace = (ch) => /\}/.test(ch);
export const isAmp = (ch) => /&/.test(ch);
Enter fullscreen mode Exit fullscreen mode

Now let's add the token types to the TokenTypes enum in src/lexer/TokenTypes.js:

export const TokenTypes = {
  // after RParen
  LBrack: "LBrack",
  RBrack: "RBrack",
  LBrace: "LBrace",
  RBrace: "RBrace",
  Dot: "Dot",
  Amp: "Amp",
};
Enter fullscreen mode Exit fullscreen mode

Finally, the changes to the actual lexer. We'll need to add the new functions to the imports in src/lexer/Lexer.js:

import {
  isAmp,
  // ...
  isLBrace,
  isLBrack,
  //...
  isRBrace,
  isRBrack,
  //...
} from "./utils.js";
Enter fullscreen mode Exit fullscreen mode

To keep it simple I've just added additional else if cases in the main loop in the tokenize method:

// after else if (isRParen(ch)) ...
      } else if (isLBrack(ch)) {
        const { pos, line, col, file } = this.input;
        this.input.next(); // skip over punc
        tokens.push(
          Token.new(TokenTypes.LBrack, ch, SrcLoc.new(pos, line, col, file))
        );
      } else if (isRBrack(ch)) {
        const { pos, line, col, file } = this.input;
        this.input.next(); // skip over punc
        tokens.push(
          Token.new(TokenTypes.RBrack, ch, SrcLoc.new(pos, line, col, file))
        );
      } else if (isLBrace(ch)) {
        const { pos, line, col, file } = this.input;
        this.input.next(); // skip over punc
        tokens.push(
          Token.new(TokenTypes.LBrace, ch, SrcLoc.new(pos, line, col, file))
        );
      } else if (isRBrace(ch)) {
        const { pos, line, col, file } = this.input;
        this.input.next(); // skip over punc
        tokens.push(
          Token.new(TokenTypes.RBrace, ch, SrcLoc.new(pos, line, col, file))
        );
      } else if (isDot(ch)) {
        const { pos, line, col, file } = this.input;
        this.input.next(); // skip over punc
        tokens.push(
          Token.new(TokenTypes.Dot, ch, SrcLoc.new(pos, line, col, file))
        );
      } else if (isAmp(ch)) {
        const { pos, line, col, file } = this.input;
        this.input.next(); // skip over punc
        tokens.push(
          Token.new(TokenTypes.Amp, ch, SrcLoc.new(pos, line, col, file))
        );
      } else {
// error case...
Enter fullscreen mode Exit fullscreen mode

If you'd prefer to refactor this into a dedicated method for lexing punctuation tokens, that's fine too. You'd probably need to add a character matching function for punctuation tokens in src/lexer/utils.js and add it to your import.

Changes to The Reader

We need to add two methods to the Reader class. First, update your imports in src/reader/Reader.js to import the Exception class:

import { Exception } from "../shared/exceptions.js";
Enter fullscreen mode Exit fullscreen mode

The two methods we're going to add are expect, which looks for a certain kind of token and throws an error if it doesn't get it, and lookahead, which looks ahead n tokens in the stream and returns that token:

  expect(expected, actual) {
    if (expected !== actual) {
      throw new Exception(`Expected ${expected} token; got ${actual}`);
    }
  }

  lookahead(n = 1) {
    return this.tokens[this.pos + n];
  }
Enter fullscreen mode Exit fullscreen mode

Now we need to make some changes in src/reader/read.js to handle the new forms. We'll need to handle vector literals, record literals, member expressions with dot syntax, and both vector and record patterns for destructuring.

We're going to keep vector and record patterns very simple: each is just a list of symbols. No renaming in record patterns like you can do when destructuring objects in JavaScript, so no need to worry about colons in record patterns.

[a b c]
{a b c}
Enter fullscreen mode Exit fullscreen mode

Indicate a rest variable with an ampersand, e.g.:

(var [a &b] [1 2 3])
Enter fullscreen mode Exit fullscreen mode

A vector literal is just a list of expressions between square brackets, e.g.:

[1 2 3]
Enter fullscreen mode Exit fullscreen mode

Note that you can put commas between the expressions and the lexer will read them as whitespace:

[1, 2, 3]
Enter fullscreen mode Exit fullscreen mode

A record literal is a list of key/value pairs (keys and values separated by colons) between curly braces:

{a: "hi"}
Enter fullscreen mode Exit fullscreen mode

A member expression is an object and property with a dot between them. An object can be any expression, but a property will always be a valid symbol:

object.property
Enter fullscreen mode Exit fullscreen mode

Reading vector/record literals and patterns is fairly straight forward: we look for [ or { and then read accordingly.

Reading member expressions requires a little more work.

Pratt Parsing

We're actually going to use a simplified version of Vaughan Pratt's precedence algorithm to read member expressions. Pratt's algorithm can be used to read just about any expression type, including ternary, binary, and unary expressions, but we're just going to use it for member expressions. We won't be introducing binary expression syntax into Wanda or anything like that.

Pratt's algorithm works by parsing in a loop. You assign all operations a numeric precedence. The default is 0, so any operator will override the default precedence.

Start by parsing the left hand expression. Then you peek at the next token. If it's an operator, get its precedence.

Then you enter a loop. While the precedence for the current operator is greater than the precedence of the just-parsed expression, you continue parsing expressions until you either run out of expressions to parse or you run into an operator with lower precedence.

Since we only have one operator, the dot for member expressions, we just keep parsing member expressions until we run out of dots.

First, we'll need a helper function to get the current token's precedence:

const getPrec = (token) => PREC[token?.type] ?? 0;
Enter fullscreen mode Exit fullscreen mode

Then at the top of read.js add this object:

const PREC = {
  [TokenTypes.Dot]: 90,
};
Enter fullscreen mode Exit fullscreen mode

The object is there so getPrec can get the precedence for the dot operator, which is the only token that has a precedence value.

Next, a function to read member expressions. In src/reader/read.js add the readMemberExpression function (I have it just above readForm in my code):

const readMemberExpression = (reader, left) => {
  const tok = reader.next();
  const prec = getPrec(tok);
  reader.expect(TokenTypes.Dot, tok.type);
  const property = readExpr(reader, prec);
  return {
    type: "MemberExpression",
    object: left,
    property,
    srcloc: left.srcloc,
  };
};
Enter fullscreen mode Exit fullscreen mode

Here's the new version of readExpr that implements Pratt's algorithm:

const readExpr = (reader, bp = 0) => {
  let left = readForm(reader);
  let tok = reader.peek();
  let prec = getPrec(tok);

  while (bp < prec) {
    left = readMemberExpression(reader, left);
    tok = reader.peek();
    prec = getPrec(tok);
  }

  return left;
};
Enter fullscreen mode Exit fullscreen mode

Believe it or not, you can use Pratt's algorithm to parse just about any kind of expression. I have a toy project that can parse basically any kind of expression as long as you feed it a table with precedence, arity, and associativity if you want to see what's possible with a Pratt parser (obviously since it's a toy it's not for production use - there are definitely bugs).

I've added some checks to readForm to throw SyntaxExceptions if it runs into a closing brace or bracket in the wrong position, and made it dispatch to readVector and readMaybeRecord functions on encountering an opening bracket or brace:

const readForm = (reader) => {
  const tok = reader.peek();

  switch (tok.type) {
    case TokenTypes.RParen:
      // there shouldn't be an RParen here
      throw new SyntaxException(tok.value, tok.srcloc);
    case TokenTypes.RBrack:
      // there shouln't be an RBrack here
      throw new SyntaxException(tok.value, tok.srcloc);
    case TokenTypes.RBrace:
      // there shouldn't be an RBrace here
      throw new SyntaxException(tok.value, tok.srcloc);
    case TokenTypes.LParen:
      return readList(reader);
    case TokenTypes.LBrack:
      return readVector(reader);
    case TokenTypes.LBrace:
      return readMaybeRecord(reader);
    default:
      return readAtom(reader);
  }
};
Enter fullscreen mode Exit fullscreen mode

The readVector function is straightforward:

const readVector = (reader) => {
  // Get srcloc info from opening bracket and skip it
  let tok = reader.next();
  const srcloc = tok.srcloc;
  let lastTok = tok;
  tok = reader.peek();
  /** @type {Form[]} */
  let members = [];

  while (tok?.type !== TokenTypes.RBrack) {
    if (!tok) {
      throw new SyntaxException("EOF", lastTok.srcloc, "]");
    }

    members.push(readExpr(reader));
    lastTok = tok;
    tok = reader.peek();
  }

  // skip closing bracket
  reader.skip();

  return {
    type: "VectorLiteral",
    members,
    srcloc,
  };
};
Enter fullscreen mode Exit fullscreen mode

The readMaybeRecord function dispatches to readRecordLiteral and readRecordPattern. readRecordLiteral uses readProperty as a helper. Here are all 4 functions:

const readMaybeRecord = (reader) => {
  let tok = reader.next();
  const srcloc = tok.srcloc;
  // First token after brace should always be a symbol
  tok = reader.peek();
  reader.expect(TokenTypes.Symbol, tok.type);
  tok = reader.lookahead(1);

  if (tok.type === TokenTypes.Keyword && tok.value === ":") {
    // record literal = { prop : value, prop2: value2 }
    return readRecordLiteral(reader, srcloc);
  } else {
    // record pattern = { prop, prop2 }
    return readRecordPattern(reader, srcloc);
  }
};

const readProperty = (reader) => {
  let tok = reader.peek();
  let srcloc = tok.srcloc;

  reader.expect(TokenTypes.Symbol, tok.type);

  const key = readExpr(reader);

  tok = reader.peek();
  reader.expect(":", tok.value);
  reader.skip();

  const value = readExpr(reader);

  return {
    type: "Property",
    key,
    value,
    srcloc,
  };
};

const readRecordLiteral = (reader, srcloc) => {
  let tok = reader.peek();
  /** @type {Property[]} */
  let properties = [];
  let lastTok = tok;

  while (tok?.type !== TokenTypes.RBrace) {
    if (!tok) {
      throw new SyntaxException("EOF", lastTok.srcloc, "}");
    }

    reader.expect(TokenTypes.Symbol, tok.type);
    properties.push(readProperty(reader));
    lastTok = tok;
    tok = reader.peek();
  }

  // skip closing brace
  reader.skip();

  return {
    type: "RecordLiteral",
    properties,
    srcloc,
  };
};

const readRecordPattern = (reader, srcloc) => {
  /** @type {Token[]} */
  let properties = [];
  let tok = reader.peek();
  let lastTok = tok;

  while (tok?.type !== TokenTypes.RBrace) {
    if (!tok) {
      if (!tok) {
        throw new SyntaxException("EOF", lastTok.srcloc, "}");
      }
    }

    tok = reader.peek();
    properties.push(readExpr(reader));
    lastTok = tok;
    tok = reader.peek();
  }

  // skip closing brace
  reader.skip();

  return {
    type: "RecordPattern",
    properties,
    srcloc,
  };
};
Enter fullscreen mode Exit fullscreen mode

That's it for changes to the reader, and hopefully now you understand parsing a little better!

Changes to The Parser

First, we need to add some new node types to the ASTTypes enum in src/parser/ast.js:

export const ASTTypes = {
  // additional node types
  VectorLiteral: "VectorLiteral",
  VectorPattern: "VectorPattern",
  Property: "Property",
  RecordLiteral: "RecordLiteral",
  RecordPattern: "RecordPattern",
  MemberExpression: "MemberExpression",
};
Enter fullscreen mode Exit fullscreen mode

Let's also add node constructors:

export const AST = {
  // additional node constructors...
  VectorPattern(members, srcloc, rest = false) {
    return {
      kind: ASTTypes.VectorPattern,
      members,
      srcloc,
      rest,
    };
  },

  VectorLiteral(members, srcloc) {
    return {
      kind: ASTTypes.VectorLiteral,
      members,
      srcloc,
    };
  },

  Property(key, value, srcloc) {
    return {
      kind: ASTTypes.Property,
      key,
      value,
      srcloc,
    };
  },

  RecordPattern(properties, srcloc, rest = false) {
    return {
      kind: ASTTypes.RecordPattern,
      properties,
      srcloc,
      rest,
    };
  },

  RecordLiteral(properties, srcloc) {
    return {
      kind: ASTTypes.RecordLiteral,
      properties,
      srcloc,
    };
  },

  MemberExpression(object, property, srcloc) {
    return {
      kind: ASTTypes.MemberExpression,
      object,
      property,
      srcloc,
    };
  },
};
Enter fullscreen mode Exit fullscreen mode

In src/parser/parse.js we need to handle parsing the new forms. In parseExpr we'll delegate to a new parseComplexForm function to handle the new forms:

const parseExpr = (form) => {
  if (form instanceof Cons) {
    return parseList(form);
  }

  if (form instanceof Token && form.type === TokenTypes.Amp) {
    // Will be handled when parsing vector or record pattern
    return form;
  }

  if (form instanceof Token) {
    return parsePrimitive(form);
  }

  return parseComplexForm(form);
};
Enter fullscreen mode Exit fullscreen mode

We'll need a parseProperty helper function for parsing record literals:

const parseProperty = (form) => {
  const key = parseExpr(form.key);
  const value = parseExpr(form.value);
  return AST.Property(key, value, form.srcloc);
};
Enter fullscreen mode Exit fullscreen mode

The parseComplexForm function handles all the forms that aren't primitives or lists, so all the new forms added today:

const parseComplexForm = (form) => {
  switch (form.type) {
    case "VectorLiteral": {
      const members = form.members.map(parseExpr);
      return AST.VectorLiteral(members, form.srcloc);
    }
    case "RecordLiteral": {
      const properties = form.properties.map(parseProperty);
      return AST.RecordLiteral(properties, form.srcloc);
    }
    case "RecordPattern": {
      let properties = [];
      let rest = false;

      for (let prop of form.properties) {
        if (prop instanceof Token && prop.type === TokenTypes.Amp) {
          rest = true;
          continue;
        }

        if (prop instanceof Token && prop.type !== TokenTypes.Symbol) {
          throw new SyntaxException(prop.type, prop.srcloc, TokenTypes.Symbol);
        }

        properties.push(parseExpr(prop));
      }

      return AST.RecordPattern(properties, form.srcloc, rest);
    }
    case "MemberExpression": {
      const object = parseExpr(form.object);
      const property = parseExpr(form.property);
      return AST.MemberExpression(object, property, form.srcloc);
    }
    default:
      // this should never happen
      throw new SyntaxException(form.type, form.srcloc);
  }
};
Enter fullscreen mode Exit fullscreen mode

You probably noticed that vector pattern is missing. That's because in the reader phase a vector pattern is indistinguishable from a vector literal made up only of symbols. We'll have to convert vector literals to vector patterns if they're in the left hand position of a variable declaration.

Here's the function to do that:

const convertVectorLiteralToVectorPattern = (parsedLhv) => {
  let members = [];
  let rest = false;
  for (let mem of parsedLhv.members) {
    if (mem instanceof Token && mem.type === TokenTypes.Amp) {
      rest = true;
      continue;
    }

    if (mem.kind === ASTTypes.VectorLiteral) {
      mem = convertVectorLiteralToVectorPattern(mem);
    } else if (
      mem.kind !== ASTTypes.Symbol &&
      mem.kind !== ASTTypes.RecordPattern
    ) {
      throw new SyntaxException(
        mem.kind,
        mem.srcloc,
        `${ASTTypes.Symbol} or ${ASTTypes.RecordPattern}`
      );
    }

    members.push(mem);
  }

  return AST.VectorPattern(members, parsedLhv.srcloc, rest);
};
Enter fullscreen mode Exit fullscreen mode

And here's the new version of parseVariableDeclaration that handles vector and record patterns:

const parseVariableDeclaration = (decl) => {
  let [_, lhv, expression] = decl;

  let parsedLhv,
    typeAnnotation = null;
  if (lhv instanceof Cons) {
    // has type annotation
    const realLhv = lhv.get(0);
    // convert to array and get rid of ":" when passing into parseTypeAnnotation
    typeAnnotation = parseTypeAnnotation([...lhv.cdr].slice(1));
    parsedLhv = parseExpr(realLhv);
  } else {
    parsedLhv = parseExpr(lhv);
  }

  if (parsedLhv.kind === ASTTypes.VectorLiteral) {
    parsedLhv = convertVectorLiteralToVectorPattern(parsedLhv);
  }

  const parsedExpression = parseExpr(expression);

  return AST.VariableDeclaration(
    parsedLhv,
    parsedExpression,
    decl.srcloc,
    typeAnnotation
  );
};
Enter fullscreen mode Exit fullscreen mode

Finally, we need to be able to parse type annotations for vector and record types. A vector type annotation is similar to that for a list: you simply write vector and put the type in a list, e.g. vector (number).

The type annotation for a record type is like an object type literal in TypeScript. It looks like an object literal, but with a type instead of a value for the object property:

type Person = {
  name: string
  age: number
}
Enter fullscreen mode Exit fullscreen mode

Note that we don't put semicolons after property types because in Wanda a semicolon starts a comment.

First, add the new type annotation types to the TATypes enum in src/parser/parseTypeAnnotation.js:

export const TATypes = {
  // additional TA types
  Vector: "Vector",
  Object: "Object",
};
Enter fullscreen mode Exit fullscreen mode

Parsing vector annotations is just like parsing list annotations:

const parseVectorAnnotation = (type) => {
  const vectorType = parseTypeAnnotation(type);
  return { kind: TATypes.Vector, vectorType };
};
Enter fullscreen mode Exit fullscreen mode

Don't forget to add the vector case to the switch statement in parseTypeAnnotation:

      // other cases
      case "vector":
        return parseVectorAnnotation(annotation[1]);
      // default case
Enter fullscreen mode Exit fullscreen mode

We find a quirk when we go to parse object type literals: the reader has already processed them as RecordLiteral forms! That's to be expected because the reader has no way to disambiguate between a form and a type annotation. So we'll check for a type of "RecordLiteral" on the annotation and dispatch that to parseObjectAnnotation in parseTypeAnnotation:

export const parseTypeAnnotation = (annotation) => {
  // checking for Cons and array
  if (annot.type === "RecordLiteral") {
    return parseObjectAnnotation(annot);
  }
  // switch statement inside if (annot.type === TokenTypes.Symbol)
};
Enter fullscreen mode Exit fullscreen mode

Finally, here's the parseObjectAnnotation function:

const parseObjectAnnotation = (annot) => {
  let properties = [];
  for (let prop of annot.properties) {
    const name = prop.key.value;
    const propType = parseTypeAnnotation(prop.value);
    properties.push({ name, propType });
  }

  return { kind: TATypes.Object, properties };
};
Enter fullscreen mode Exit fullscreen mode

That's it for changes to the parser! Now we need to add some shared types and functions before moving on to the type checker.

Changes to Shared Functionality

I've added 3 new exception types to the compiler. I didn't add a TypeException last time when first making the type checker, but I did for this article. I've also added ReferenceException and RuntimeException. In src/shared/exceptions.js:

export class TypeException extends Exception {
  constructor(msg, srcloc) {
    super(`${msg} at ${srcloc.file} ${srcloc.line}:${srcloc.col}`);
  }
}

export class ReferenceException extends Exception {
  constructor(msg, srcloc) {
    super(`${msg} at ${srcloc.file} ${srcloc.line}:${srcloc.col}`);
  }
}

export class RuntimeException extends Exception {
  constructor(msg) {
    super(msg);
    this[Symbol.for(":dict")] = { message: msg };
  }
}
Enter fullscreen mode Exit fullscreen mode

RuntimeException won't have access to AST nodes, so there's no parameter for srcloc info in its constructor.

Also, since RuntimeException may be accessible from inside Wanda code, the exception object gets the same :dict property all Wanda objects will get. I've also added it to Exception. I'll talk more about the :dict property below.

There are also a few utils we'll need to eventually share between different compiler modules, so let's go ahead and add them to src/shared/utils.js:

export const isNullish = (obj) => obj == null;

export const hasProperty = (obj, prop) => !isNullish(obj?.[prop]);

export const hasMethod = (obj, methodName) =>
  hasProperty(obj, methodName) && typeof obj[methodName] === "function";
Enter fullscreen mode Exit fullscreen mode

Changes to The Type Checker

We've got a fair bit of work to do to add new types to the type checker.

Note that we're just going to consider records to be object types, and we'll use the same object types when we add classes and objects to the language later.

Object types are structural, so the type checker cares more about what shape an object has and the types of its properties than it does about what class was used to construct an object. This greatly simplifies object type checking, but it will result in some less helpful error messages when debugging object types unless we do a little extra work when it comes to handling object classes.

First, we need to add new members to the TypeTypes enum in src/typechecker/types.js:

export const TypeTypes = {
  // other types...
  Vector: "Vector",
  Property: "Property",
  Object: "Object",
};
Enter fullscreen mode Exit fullscreen mode

In src/typechecker/constructors.js I've renamed listType to list and added constructors for vector and object:

export const list = (listType) => ({ kind: TypeTypes.List, listType });

export const vector = (vectorType) => ({
  kind: TypeTypes.Vector,
  vectorType,
});

export const object = (properties) => ({ kind: TypeTypes.Object, properties });
Enter fullscreen mode Exit fullscreen mode

We also need isVector and isObject validators in src/typechecker/validators.js:

export const isVector = (type) => {
  return type.kind === TypeTypes.Vector;
};

export const isObject = (type) => {
  return type.kind === TypeTypes.Object;
};
Enter fullscreen mode Exit fullscreen mode

We need to be able to construct concrete types from vector and object type annotations in src/typechecker/fromTypeAnnotation.js. I've also added the ability to annotate a type as any, which is currently necessary in some cases because we don't yet have generics in the language:

    case TATypes.AnyLiteral:
      return Type.any;
    // additional switch cases from NumberLiteral to List...
    case TATypes.Vector: {
      const vectorType = fromTypeAnnotation(typeAnnotation.vectorType, typeEnv);
      return Type.vector(vectorType);
    }
    case TATypes.Object: {
      const propTypes = typeAnnotation.properties.map((prop) => ({
        name: prop.name,
        type: fromTypeAnnotation(prop.propType, typeEnv),
      }));
      return Type.object(propTypes);
    }
    // default case...
Enter fullscreen mode Exit fullscreen mode

We'll add a propType helper function to easily get the type of an object property in src/typechecker/propType.js:

export const propType = (type, name) => {
  const prop = type.properties.find(({ name: propName }) => propName === name);
  return prop ? prop.type : null;
};
Enter fullscreen mode Exit fullscreen mode

This function returns null instead of throwing an error if the property type isn't found so we can instead throw the error from a point in the type checker closer to the actual error and thus provide better error messages.

Now we need to be able to check if an object is a subtype of another. An object is a subtype of another if it has all the same properties as the supertype and the corresponding properties are all subtypes of the same property on the supertype object. A subtype object can also have additional properties, but those aren't checked because they're not relevant to checking against the supertype.

We also add a case for vector types in src/typechecker/isSubtype.js:

  // additional cases
  if (Type.isVector(type1) && Type.isVector(type2)) {
    return isSubtype(type1.vectorType, type2.vectorType);
  }

  if (Type.isObject(type1) && Type.isObject(type2)) {
    return type2.properties.every(({ name: type2name, type: type2type }) => {
      const type1type = propType(type1, type2name);
      if (!type1type) return false;
      else return isSubtype(type1type, type2type);
    });
  }
  // default return
Enter fullscreen mode Exit fullscreen mode

We also need to be able to convert our new types to strings in src/typechecker/typeToString.js:

    // additional cases...
    case TypeTypes.Vector:
      return `vector (${typeToString(type.vectorType)})`;
    case TypeTypes.Object:
      return `{ ${type.properties
        .map((p) => `${p.name} : ${typeToString(p.type)}`)
        .join(", ")} }`;
    // default case
Enter fullscreen mode Exit fullscreen mode

We're also going to add functions to unify types. Unification is something you do when attempting to infer types. You unify types within a set of constraints and produce the most general type that satisfies all constraints for all types being unified.

It's especially important in Hindley-Damas-Milner type inference, which is used in ML-family languages like OCaml, but it's useful in bidirectional type checking as well.

Unification algorithms can get pretty complex, but ours is simple: we just return the common supertype of all types being unified or null if there is no common supertype. If any of the types being unified is the any type, we return that and effectively turn type checking off.

Note that in the future we'll add a new kind of type called intersection types that will introduce a common supertype for all types. When we do that, we'll return that type instead of null.

We'll add a function to unify 2 types and another to unify a list of types in src/typechecker/unify.js:

export const unify = (type1, type2) => {
  if (isSubtype(type1, type2)) {
    return type2;
  } else if (isSubtype(type2, type1)) {
    return type1;
  }

  return null;
};

export const unifyAll = (...types) => {
  return types.reduce((unified, type) => {
    if (unified === null) return null;
    if (Type.isAny(unified)) return unified;

    return unify(unified, type);
  });
};
Enter fullscreen mode Exit fullscreen mode

Next we'll add inference for our new types in src/typechecker/infer.js. First, update exports to include the new SyntaxException and new unifyAll and propType functions:

import { Exception, TypeException } from "../shared/exceptions.js";
import { unifyAll } from "./unify.js";
import { propType } from "./propType.js";
Enter fullscreen mode Exit fullscreen mode

Now in the infer function we need new dispatch cases for the new AST nodes:

    // previous cases
    case ASTTypes.VectorLiteral:
      return inferVectorLiteral(ast, env);
    case ASTTypes.RecordLiteral:
      return inferRecordLiteral(ast, env);
    case ASTTypes.MemberExpression:
      return inferMemberExpression(ast, env);
    // default case
Enter fullscreen mode Exit fullscreen mode

And we need to infer types for vector and record literals and member expressions:

const inferVectorLiteral = (node, env) => {
  if (node.members.length === 0) {
    // change this to never when we add union types
    return Type.vector(Type.any);
  }

  const types = node.members.map((m) => infer(m, env));
  const unified = unifyAll(...types);

  if (unified === null && env.checkingOn) {
    throw new TypeException(
      `Incompatible types in Vector literal`,
      node.srcloc
    );
  } else if (unified === null) {
    return Type.any;
  }

  return Type.vector(unified);
};

const inferRecordLiteral = (node, env) => {
  const properties = node.properties.map((prop) => ({
    kind: Type.Type.Property,
    name: prop.key.name,
    type: infer(prop.value, env),
  }));
  return Type.object(properties);
};

const inferMemberExpression = (node, env) => {
  const prop = node.property;
  const object = infer(node.object, env);

  if (!Type.isObject(object)) {
    if (env.checkingOn) {
      throw new TypeException(
        `Member expression expects object type; ${Type.toString(object)} given`,
        node.srcloc
      );
    } else {
      return Type.any;
    }
  }

  const type = propType(object, prop.name);

  if (!type && env.checkingOn) {
    throw new TypeException(
      `Property ${prop.name} not found on object of type ${Type.toString(
        object
      )}`,
      node.srcloc
    );
  }

  return type ?? Type.any;
};
Enter fullscreen mode Exit fullscreen mode

Note that in vector literals we unify all types included in the vector, which returns a common subtype. If unification fails (returns null) and type checking is on, we throw an error.

When inferring a member expression, we require the object's type to be an object type. We'll need to do some work here later on if we decide to allow method calls on non-object types (i.e. calling a method like String.prototype.toUpperCase).

When inferring a record literal we simply collect the inferred type of all properties and use them to construct an object type.

In src/typechecker/check.js we need to update our imports:

import { TypeException } from "../shared/exceptions.js";
import { isSubtype } from "./isSubtype.js";
import { propType } from "./propType.js";
Enter fullscreen mode Exit fullscreen mode

We'll update check to use the new TypeException and dispatch to our checkObject function for checking object types:

export const check = (ast, type, env) => {
  if (ast.kind === ASTTypes.RecordLiteral && Type.isObject(type)) {
    return checkObject(ast, type, env);
  }

  const inferredType = infer(ast, env);

  if (!isSubtype(inferredType, type)) {
    throw new TypeException(
      `Type ${Type.toString(
        inferredType
      )} is not a valid subtype of ${Type.toString(type)}`,
      ast.srcloc
    );
  }
};
Enter fullscreen mode Exit fullscreen mode

Here's checkObject. Note that it gets property names from the node, checks if every property name on the type is present in the node's property names, then checks the type of each property. Note that for record literals we require the node to have all properties specified on the type, without extra properties. This effectively disallows using record literals as subtypes of an object type on the assumption that since it's a literal object adding any extra properties is a mistake (the same assumption TypeScript makes). We can revisit this later if we decide we need to allow record literals to be subtypes of the types we check them against:

const checkObject = (ast, type, env) => {
  const astProps = ast.properties.map((prop) => ({
    name: prop.key.name,
    expr: prop.value,
  }));

  type.properties.forEach(({ name }) => {
    const astProp = astProps.find(({ name: astName }) => astName === name);

    if (!astProp) {
      throw new TypeException(
        `Property ${name} not found on record literal`,
        ast.srcloc
      );
    }
  });

  astProps.forEach(({ name, expr }) => {
    const pType = propType(type, name);

    if (!pType) {
      throw new TypeException(
        `Property ${name} not found on object of type ${Type.toString(type)}`,
        ast.srcloc
      );
    }

    check(expr, pType, env);
  });
};
Enter fullscreen mode Exit fullscreen mode

The TypeChecker class needs edits both to handle checking types for the new AST nodes and for handling destructuring in variable assignment.

First, we have several new imports so I'm just going to give you the whole list:

import { ASTTypes } from "../parser/ast.js";
import { Exception, TypeException } from "../shared/exceptions.js";
import { TypeEnvironment } from "./TypeEnvironment.js";
import { check } from "./check.js";
import { infer } from "./infer.js";
import { fromTypeAnnotation } from "./fromTypeAnnotation.js";
import { Type } from "./Type.js";
import { propType } from "./propType.js";
Enter fullscreen mode Exit fullscreen mode

Now here are the new cases to dispatch in the switch statement in the check method:

      // other cases...
      case ASTTypes.MemberExpression:
        return this.checkMemberExpression(node, env);
      case ASTTypes.RecordLiteral:
        return this.checkRecordLiteral(node, env);
      case ASTTypes.VectorLiteral:
        return this.checkVectorLiteral(node, env);
      // default case
Enter fullscreen mode Exit fullscreen mode

The methods for these cases are actually pretty simple. Here's checkMemberExpression:

  checkMemberExpression(node, env) {
    return { ...node, type: infer(node, env) };
  }
Enter fullscreen mode Exit fullscreen mode

checkRecordLiteral:

  checkRecordLiteral(node, env) {
    return { ...node, type: infer(node, env) };
  }
Enter fullscreen mode Exit fullscreen mode

and checkVectorLiteral:

  checkVectorLiteral(node, env) {
    return { ...node, type: infer(node, env) };
  }
Enter fullscreen mode Exit fullscreen mode

Since we're disallowing destructuring in set expressions, add this to the top of checkSetExpression:

    if (node.lhv.kind !== ASTTypes.Symbol) {
      throw new TypeException(
        `Cannot use destructuring with set! assignment`,
        node.srcloc
      );
    }
    // rest of method
Enter fullscreen mode Exit fullscreen mode

The big change comes to checkVariableDeclaration with the addition of destructuring.

If we have a vector destructuring pattern, we need to make sure the type being assigned is either a vector or a list.

If we have a record destructuring pattern, we need to be sure the type being assigned is an object.

We also need to handle rest variables. In the case of a vector pattern, this is simple: just assign the vector or list type to the rest variable.

With a record it gets more complicated. We have to keep track of which properties have already been used, then construct a new object type of the remaining properties and assign that to the rest variable.

Here's the new version of checkVariableDeclaration:

  checkVariableDeclaration(node, env) {
    let type;

    if (node.typeAnnotation) {
      type = fromTypeAnnotation(node.typeAnnotation, env);
      check(node.expression, type, env);
      env.checkingOn = true;
    } else {
      type = infer(node, env);
    }

    if (node.lhv.kind === ASTTypes.Symbol) {
      env.set(node.lhv.name, type);
    } else if (node.lhv.kind === ASTTypes.VectorPattern) {
      if (!Type.isVector(type) && !Type.isList(type)) {
        throw new TypeException(
          `Vector pattern destructuring must take a vector or list type`,
          node.srcloc
        );
      } else {
        let i = 0;
        for (let mem of node.lhv.members) {
          if (node.lhv.rest && i === node.lhv.members.length - 1) {
            env.set(mem.name, type);
          } else {
            env.set(mem.name, type.vectorType ? type.vectorType : type.listType);
          }
          i++;
        }
      }
    } else if (node.lhv.kind === ASTTypes.RecordPattern) {
      if (!Type.isObject(type)) {
        throw new TypeException(
          `Cannot destructure non-object type with record pattern`,
          node.srcloc
        );
      } else {
        let i = 0;
        /** @type {string[]} */
        let used = [];
        for (let prop of node.lhv.properties) {
          if (node.lhv.rest && i === node.lhv.properties.length - 1) {
            const unusedProps = type.properties.filter(
              (p) => !used.includes(p.name)
            );

            env.set(prop.name, Type.object(unusedProps));
          } else {
            const pType = propType(type, prop.name);

            if (!pType) {
              throw new TypeException(
                `Property ${
                  prop.name
                } not found on object of type ${Type.toString(type)}`,
                node.srcloc
              );
            }

            env.set(prop.name, pType);
            used.push(prop.name);
          }

          i++;
        }
      }
    }

    return {
      kind: node.kind,
      lhv: node.lhv,
      expression: this.check(node.expression, env),
      srcloc: node.srcloc,
      type,
    };
  }
Enter fullscreen mode Exit fullscreen mode

That's it for changes to the type checker. As you can see, checking types can be a lot of work. IMO It's worth it, though, when you consider how many bugs type checking can catch.

Changes to The Runtime

We need to make some updates to the runtime to handle our new expressions.

The biggest change is that we need to actually make the runtime object available from within compiled code. The reason we need to do that is because we're converting raw JS objects to a special object structure for Wanda.

Wanda objects have a :dict keyword property under the hood, and the :dict property is what actually holds its key/value pairs. The base object is for metadata. That means we need a getField function in the runtime that handles getting values from the :dict property of a Wanda object and can also get values from a raw JS object if it needs to.

We'll make some other changes as well, but that's the biggest one.

First, let's add a function to src/runtime/utils.js:

export const makeKeyword = (str) => Symbol.for(`:${str}`);
Enter fullscreen mode Exit fullscreen mode

Next we're going to add some functions to convert back and forth between Wanda and JS values. Add these 2 functions to src/runtime/conversion.js:

import { makeFunction } from "./makeFunction.js";
import { hasDict, hasMetaField, makeObject } from "./object.js";
import { makeKeyword } from "./utils.js";

export const makeWandaValue = (val) => {
  switch (typeof val) {
    case "undefined":
      return null;
    case "object":
      if (val === null) {
        return null;
      }

      if (Array.isArray(val)) {
        return val;
      }

      if (val.constructor?.name === "Cons") {
        return val;
      }

      return makeObject(val);
    default:
      return val;
  }
};

export const makeJSValue = (val) => {
  if (val === null) {
    return null;
  }

  switch (typeof val) {
    case "object":
      if (hasDict(val)) {
        return val[makeKeyword("dict")];
      }
    default:
      return val;
  }
};
Enter fullscreen mode Exit fullscreen mode

As you can see, this file requires 3 functions from src/runtime/object.js, which doesn't exist yet. Let's create it and add the following functions:

import { RuntimeException } from "../shared/exceptions.js";
import { hasProperty, hasMethod as hM } from "../shared/utils.js";
import { makeSymbol } from "./makeSymbol.js";
import { makeKeyword } from "./utils.js";

const DICT = makeKeyword("dict");

export const hasDict = (obj) => hasProperty(obj, DICT);

export const hasField = (obj, field) => {
  if (hasDict(obj)) {
    return hasProperty(obj[DICT], field);
  }

  return hasProperty(obj[field]);
};

export const hasMethod = (obj, method) => {
  if (hasDict(obj)) {
    return hM(obj[DICT], method);
  }

  return hM(obj, method);
};

export const getField = (obj, field) => {
  const value = hasDict(obj) ? obj[DICT][field] : obj?.[field];

  if (value === undefined) {
    failRuntime(
      `Field ${
        typeof field === "symbol" ? field.description : field
      } not found on object`
    );
  }

  if (typeof value === "function") {
    return hasDict(obj) ? value.bind(obj[DICT]) : value.bind(obj);
  }

  return value;
};

export const hasMetaField = (obj, field) => {
  const kw = makeKeyword(field);
  return hasProperty(obj[kw]);
};

export const getMetaField = (obj, field) => {
  const kw = makeKeyword(field);
  const value = obj[kw];

  if (value === undefined) {
    failRuntime(
      `Field ${
        typeof field === "symbol" ? field.description : field
      } not found on object`
    );
  }

  return value;
};

export const addMetaField = (obj, field, value) => {
  const meta = makeKeyword(field);
  Object.defineProperty(obj, meta, {
    configurable: false,
    enumerable: false,
    writable: false,
    value,
  });
};

export const makeObject = (obj) => {
  let newObj = {};
  addMetaField(newObj, "dict", obj);
  addMetaField(newObj, "constructor", function (...args) {
    return new obj.constructor(...args);
  });

  Object.defineProperty(newObj[makeKeyword("constructor")], "name", {
    configurable: false,
    enumerable: false,
    writable: false,
    value: obj.constructor?.name ?? "WandaObject",
  });

  // to allow destructuring
  for (let [k, v] of Object.entries(obj)) {
    newObj[makeSymbol(k)] = v;
  }

  return newObj;
};

export const failRuntime = (msg) => {
  throw new RuntimeException(msg);
};
Enter fullscreen mode Exit fullscreen mode

Now we're finally going to use the makeFunction function in src/runtime/makeFunction.js (although we'll be adding more to it in the future):

import { makeWandaValue } from "./conversion.js";
import { addMetaField } from "./object.js";

export const makeFunction = (func) => {
  const fn = (...args) => makeWandaValue(func(...args));
  addMetaField(fn, "wanda", true);

  return fn;
};
Enter fullscreen mode Exit fullscreen mode

We'll need a function to convert a string to a Wanda number in src/runtime/makeNumber.js. This will do more if I decide to add things like arbitrary-precision decimals, rational numbers, or other kinds of numbers to Wanda, but this will do fine for now:

export const makeNumber = (str) => Number(str);
Enter fullscreen mode Exit fullscreen mode

Now we need to add a package. We'll need the ability to generate unique, valid JavaScript identifiers in the compiler. You'll see one reason why when we get to the changes to the emitter, but it has to do with handling destructuring. We're going to use the cuid2 package that generates unique identifiers.

npm install @paralleldrive/cuid2
Enter fullscreen mode Exit fullscreen mode

Now import it into src/runtime/makeSymbol.js:

import { createId } from "@paralleldrive/cuid2";
Enter fullscreen mode Exit fullscreen mode

And add this function at the bottom of the file:

export const makeGenSym = () => PREFIX + createId();
Enter fullscreen mode Exit fullscreen mode

Finally, let's update src/runtime/makeRuntime.js:

import { fail } from "../shared/fail.js";
import { makeFunction } from "./makeFunction.js";
import * as utils from "./utils.js";
import * as obj from "./object.js";
import { makeSymbol, makeGenSym } from "./makeSymbol.js";
import { makeWandaValue, makeJSValue } from "./conversion.js";
import { makeNumber } from "./number.js";

export const makeRuntime = () => {
  return {
    ...utils,
    makeFunction,
    ...obj,
    makeSymbol,
    makeGenSym,
    makeWandaValue,
    makeJSValue,
    fail,
    makeNumber,
  };
};
Enter fullscreen mode Exit fullscreen mode

That's it for changes to the runtime! Now let's look at changes to the emitter.

Changes to The Emitter

The first change to the emitter is we need to add the runtime object in the global environment so it's available to compiled code, in src/emitter/emitGlobalEnv.js:

import path from "path";
import { ROOT_PATH } from "../../root.js";
import { makeGlobal } from "../runtime/makeGlobals.js";

export const emitGlobalEnv = () => {
  const globalEnv = makeGlobal();
  let code = `import { makeGlobal } from "${path.join(
    ROOT_PATH,
    "./src/runtime/makeGlobals.js"
  )}";
import { makeRuntime } from "${path.join(
    ROOT_PATH,
    "./src/runtime/makeRuntime.js"
  )}";

const globalEnv = makeGlobal();
rt = makeRuntime();
`;

  for (let [k] of globalEnv) {
    code += `${k} = globalEnv.get("${k}");\n`;
  }

  return code;
};
Enter fullscreen mode Exit fullscreen mode

Now in the emitter we can call the runtime functions in emitted code as long as we prefix them with rt.

Now for changes to the Emitter class. First, let's update the imports into src/emitter/Emitter.js:

import { ASTTypes } from "../parser/ast.js";
import { ReferenceException, SyntaxException } from "../shared/exceptions.js";
import { Namespace } from "../shared/Namespace.js";
import { makeGenSym, makeSymbol } from "../runtime/makeSymbol.js";
Enter fullscreen mode Exit fullscreen mode

We need to add cases in the emit method for all our new AST nodes:

      // other cases...
      case ASTTypes.MemberExpression:
        return this.emitMemberExpression(node, ns);
      case ASTTypes.RecordLiteral:
        return this.emitRecordLiteral(node, ns);
      case ASTTypes.RecordPattern:
        return this.emitRecordPattern(node, ns);
      case ASTTypes.VectorLiteral:
        return this.emitVectorLiteral(node, ns);
      case ASTTypes.VectorPattern:
        return this.emitVectorPattern(node, ns);
      // default case...
Enter fullscreen mode Exit fullscreen mode

We're going to edit some methods that emit primitive values to use the new makeWandaValue and makeNumber runtime functions:

  emitBoolean(node, ns) {
    return `rt.makeWandaValue(${node.value})`;
  }

  emitNil(node, ns) {
    return `rt.makeWandaValue(${null})`;
  }

  emitNumber(node, ns) {
    return `rt.makeNumber("${node.value}")`;
  }

  emitString(node, ns) {
    return `rt.makeWandaValue(${"`" + node.value.slice(1, -1) + "`"})`;
  }
Enter fullscreen mode Exit fullscreen mode

In emitting member expressions we'll use the rt.getField function:

  emitMemberExpression(node, ns) {
    return `rt.getField(${this.emit(node.object, ns)}, "${
      node.property.name
    }")`;
  }
Enter fullscreen mode Exit fullscreen mode

In emitting record literals we'll use the rt.makeObject function:

  emitRecordLiteral(node, ns) {
    let code = "rt.makeObject({";

    for (let prop of node.properties) {
      code += `"${prop.key.name}": ${this.emit(prop.value, ns)}, `;
    }

    code += "})";
    return code;
  }
Enter fullscreen mode Exit fullscreen mode

We'll use rt.makeWandaValue for emitting vector literals:

  emitVectorLiteral(node, ns) {
    let code = "rt.makeWandaValue([";

    for (let mem of node.members) {
      code += `${this.emit(mem, ns)}, `;
    }

    code += "])";
    return code;
  }
Enter fullscreen mode Exit fullscreen mode

Emitting record patterns is fairly straightforward except for one thing: handling rest variables. Since we're using the :dict property on Wanda variables, that greatly complicates destructuring objects.

For most properties it's simple, because we're making the JavaScript identifier versions of the property names and putting them on the base objects, but handling the rest variable is a whole ordeal and we're just not going to mess with it in this method. We'll handle it when emitting variable declarations instead. So we just break out of the loop when we get to the rest variable here.

  emitRecordPattern(node, ns) {
    let code = "{";

    let i = 0;
    for (let prop of node.properties) {
      if (node.rest && i === node.properties.length - 1) {
        // this is the rest variable, which requires extra work
        // see: this.emitVariableDeclarationAssignment
        break;
      } else {
        code += `${this.emit(prop, ns)}, `;
      }

      i++;
    }

    code += "}";
    return code;
  }
Enter fullscreen mode Exit fullscreen mode

Emitting vector patterns, on the other hand, is just what you'd expect:

  emitVectorPattern(node, ns) {
    let code = "[";

    let i = 0;
    for (let mem of node.members) {
      if (node.rest && i === node.members.length - 1) {
        code += `...${this.emit(mem, ns)}`;
      } else {
        code += `${this.emit(mem, ns)}, `;
      }
      i++;
    }

    code += "]";
    return code;
  }
Enter fullscreen mode Exit fullscreen mode

There are big changes to emitVariableDeclaration to handle destructuring. First, we need to make sure to populate the namespace with the mappings from Wanda identifiers to JavaScript identifiers. Then we'll delegate emitting the actual code to the new emitVariableDeclarationAssignment method.

  emitVariableDeclaration(node, ns) {
    if (node.lhv.kind === ASTTypes.Symbol) {
      const name = node.lhv.name;

      if (ns.has(name)) {
        throw new ReferenceException(
          `Name ${name} has already been accessed in the current namespace; cannot access name before its definition`,
          node.srcloc
        );
      }

      const translatedName = makeSymbol(name);
      ns.set(name, translatedName);
    } else if (
      node.lhv.kind === ASTTypes.VectorPattern ||
      node.lhv.kind === ASTTypes.RecordPattern
    ) {
      const members =
        node.lhv.kind === ASTTypes.RecordPattern
          ? node.lhv.properties
          : node.lhv.members;

      for (let mem of members) {
        if (ns.has(mem.name)) {
          throw new ReferenceException(
            `Name ${mem.name} has already been accessed in the current namespace; cannot access name before its definition`,
            mem.srcloc
          );
        }
        ns.set(mem.name, makeSymbol(mem.name));
      }
    }

    return this.emitVariableDeclarationAssignment(
      node.lhv,
      node.expression,
      ns
    );
  }
Enter fullscreen mode Exit fullscreen mode

In emitVariableDeclarationAssignment, handling cases where the LHV is a symbol or a vector pattern are straightforward: just emit both sides of the assignment in the context of a variable declaration.

  emitVariableDeclarationAssignment(lhv, rhv, ns) {
    if (lhv.kind === ASTTypes.Symbol) {
      return `var ${makeSymbol(lhv.name)} = ${this.emit(rhv, ns)}`;
    } else if (lhv.kind === ASTTypes.VectorPattern) {
      return `var ${this.emit(lhv, ns)} = ${this.emit(rhv, ns)}`;
    } // handling record pattern...
  }
Enter fullscreen mode Exit fullscreen mode

Handling the record pattern is a bit of a mess. We generate a unique identifier and assign the actual object to it. Then we destructure the record pattern off the unique identifier, skipping over the rest variable.

We use the type the AST has been decorated with to construct a new object containing the properties that haven't yet been used, then emit another variable declaration that assigns that object to the rest variable's identifier.

And that's how we do it.

This is a case where inferring types helps us even if the type checker is turned off.

Here's the emitVariableDeclarationAssignment method in its entirety:

  emitVariableDeclarationAssignment(lhv, rhv, ns) {
    if (lhv.kind === ASTTypes.Symbol) {
      return `var ${makeSymbol(lhv.name)} = ${this.emit(rhv, ns)}`;
    } else if (lhv.kind === ASTTypes.VectorPattern) {
      return `var ${this.emit(lhv, ns)} = ${this.emit(rhv, ns)}`;
    } else if (lhv.kind === ASTTypes.RecordPattern) {
      /* Note that this DOES NOT WORK on rest variables in nested record patterns */
      // create random variable to hold object being destructured
      const gensym = makeGenSym();
      let origObjCode = `var ${gensym} = ${this.emit(rhv, ns)}`;
      let code = `${origObjCode};\n`;

      code += `var ${this.emit(lhv, ns)} = ${gensym};\n`;

      if (lhv.rest) {
        /** @type {string[]} */
        let used = [];
        let i = 0;
        for (let prop of lhv.properties) {
          if (i !== lhv.properties.length - 1) {
            used.push(prop.name);
          }
          i++;
        }

        /** @type {import("../parser/ast.js").Property[]} */
        const unusedProps = rhv.type.properties.filter((p) => {
          return !used.includes(p.name);
        });

        const restVarName = lhv.properties[lhv.properties.length - 1].name;
        let restObjCode = "{ ";

        for (let prop of unusedProps) {
          restObjCode += `"${prop.name}": rt.getField(${gensym}, "${prop.name}"), `;
        }

        restObjCode += "}";

        code += `var ${makeSymbol(restVarName)} = ${restObjCode}`;
      }

      return code;
    }
  }
Enter fullscreen mode Exit fullscreen mode

That's all for changes to the emitter. Now for changes to the printer.

Changes to The Printer

All the changes in src/printer/printString.js are to the "object" case in the switch statement:

    // other cases
    case "object":
      if (value === null) {
        return "nil";
      } else if (value.constructor?.name === "Cons") {
        return printList(value);
      } else if (Array.isArray(value)) {
        return `[${value.map(printString).join(", ")}]`;
      }

      return hasDict(value)
        ? JSON.stringify(value[Symbol.for(":dict")], null, 2)
        : JSON.stringify(value, null, 2);
    // default case
Enter fullscreen mode Exit fullscreen mode

We're punting a bit by just using JSON.stringify, but we'll probably clean that up later.

We've also added methods to the ASTPrinter class to print the new nodes. First, the new cases for the switch statement in print:

      // other cases...
case ASTTypes.RecordLiteral:
        return this.printRecordLiteral(node, indent);
      case ASTTypes.RecordPattern:
        return this.printRecordPattern(node, indent);
      case ASTTypes.VectorLiteral:
        return this.printVectorLiteral(node, indent);
      case ASTTypes.VectorPattern:
        return this.printVectorPattern(node, indent);
      case ASTTypes.MemberExpression:
        return this.printMemberExpression(node, indent);
      // default case
Enter fullscreen mode Exit fullscreen mode

And the methods:

  printRecordLiteral(node, indent) {
    const prStr = `${prIndent(indent)}RecordLiteral:`
    prStr += `${prIndent(indent + 2)}Properties:`

    for (let prop of node.properties) {
      prStr += `${this.print(prop.key, indent + 4)}`;
      prStr += `${this.print(prop.value, indent + 4)}`;
    }

    return prStr
  }

  printRecordPattern(node, indent) {
    return `${prIndent(indent)}${node.properties.map(p => p.name).join(", ")}`;
  }

  printVectorLiteral(node, indent) {
    let prStr = `${prIndent(indent)}VectorLiteral:`

    for (let mem of node.members) {
      prStr += `${this.print(mem, indent + 2)}`;
    }

    return prStr;
  }

  printVectorPattern(node, indent) {
    return `${prIndent(indent)}${node.members.map (p => p.name).join(", ")}`;
  }

  printMemberExpression(node, indent) {
    let prStr = `${prIndent(indent)}MemberExpression:`
    prStr += `${prIndent(indent + 2)}Object:`
    prStr += `${this.print(node.object, indent + 4)}`;
    prStr += `${prIndent(indent + 2)}Property:`
    prStr += `${this.print(node.property, indent + 4)}`

    return prStr;
  }
Enter fullscreen mode Exit fullscreen mode

That's it for changes to the printer, and that does it for changes to the compiler to add these new features.

Go ahead and fire up a REPL using your wanda command and check them out!

Conclusion

We added 5 new node types, 2 new kinds of types, the ability to access record members with dot notation, and toplevel destructuring to our compiler. You're doing a fantastic job if you're keeping up with everything!

As always, feel free to let me know in the comments if you're having any difficulties. If there are bugs I'll try to fix them and update the article accordingly.

You can see the state of the code as of the end of this article in the Wanda repo on the relevant tag.

There are tests in the repo, but I must confess I haven't updated tests for this article so there's nothing new there since the last time I published. I need to do better at keeping up with tests.

There are also JSDoc blocks for nearly all functions and types defined in the repo to help with Intellisense if you want to hack on the compiler a bit. If you come up with any awesome new features, feel free to submit a pull request!

In the next article we'll implement functions and the ability to define your own functions, which will include the ability to define function contracts in native JS modules. Stay tuned!

Top comments (0)