Brian Neville-O'Neill

Posted on Jul 8, 2019 • Originally published at blog.logrocket.com on Mar 7, 2019

TypeScript vs PureScript: Not all compilers are created equal

#typescript #functionalprogrammi #purescript #javascript

There are many ways to skin a cat, and for each one there is a statically typed language that compiles to JavaScript. Among the best known we have TypeScript (TS), PureScript (PS), Elm, Reason, Flow and Dart. But why are there so many? Why do they exist and what is their rationale? In this post, we’ll take a look at the first two languages. We’ll analyze their similarities and differences, see how they approach different situations and understand what guarantees they provide.

Types, restrictions and guarantees

All languages have types, even the most permissive ones such as JavaScript (JS). At some point, there is a mapping between the logical values of our program and their physical representation. Knowing how they are translated will help us understand things like why in some languages 0.1 + 0.2 != 0.3. Languages like JavaScript, Python and PHP are dynamically typed, which implies that when the code is executed and there’s a problem with the types, the interpreter will need to decide whether to coerce the values or throw a runtime error.

"Hello" + 1 // "Hello1"
null.f()    // TypeError: Cannot read property 'f' of null

Coercing strings and numbers can be really handy when creating messages, but as Gary Bernhardt’s epic talk WAT shows, it can get weird really fast, which can lead to unexpected errors.

In contrast, statically typed languages such as TypeScript or PureScript make us think about types explicitly. Most languages will infer most of the types so we don’t have to be too verbose, but at some point, we’ll have to provide some information about the data we want to compute, and how we are going to compute it. That information will help other programmers (or even our future self) understand the code, and it will allow our tooling to give us information and warnings, apply automatic fixes and even assist with refactoring. If there is a problem with the program, we’ll have an error at compile time, so the feedback loop will be shorter.

Each language can introduce different restrictions that impact the way we program. These restrictions will give us certain guarantees that will increase our confidence in the code. For example, if the language doesn’t allow us to use null, we’ll have a guarantee that we won’t have NullPointerExceptions, the billion dollar mistake, and we’ll probably need a different concept to represent failure or emptiness.

TypeScript vs PureScript

TypeScript is a language created by Microsoft in 2012 with the idea to help developers work with large-scale applications. It is a JavaScript superset, which means that a valid JavaScript program is a valid TypeScript program. This decision tells us a lot about their rationale: instead of creating new language features (e.g. traits, a pipe operator, pattern matching, etc.), TypeScript focuses on adding ways to type existing JavaScript, closely following the specification updates. It’s stated very clearly in their latest roadmap goals, when they say “Types on every desk, in every home, for every JS developer” and “Productivity through strong tooling”.

PureScript is a language created by Phil Freeman in 2013 and it’s maintained by the community. It is a strict, purely functional language inspired by Haskell. As such, it provides many features and restrictions aimed at improving code correctness and developer productivity, such as immutability, pattern matching, currying, type classes and do expressions, among others. It uses JavaScript as the main compilation target because of the benefits of running on the web, server, mobile and even Google Sheets, but it can also compile to C, C++ and even Erlang.

TypeScript took off in 2015 when Angular announced that it was building its second version with it. The decision to closely follow JS, the developer experience from using tools like VSCode and the confidence given by embracing its restrictions, encouraged other teams to rewrite big projects like Vue, Jest and Yarn. According to the State of JS 2018, TypeScript adoption doubled from 2016 to 2018. All of this resulted in an explosion of learning resources and a big, healthy ecosystem.

PureScript is not that popular in comparison, but functional programming, in general, has caught the eyes of many developers. Languages like PHP or Java added lambda expressions, which enable the use of higher-order patterns, and the popularity of libraries like React or Redux helped people adopt pure functions and immutability. Other languages such as Elm have bigger communities and are a really good starting point in the functional world, but PS has some nice features that we’ll analyze in the post. Despite being small, the PureScript community is very active in the functional programming Slack (#purescript channel) and on its Discourse page.

Dissecting the output

A compiler transforms a higher-level language into a lower-level language. For example, C and Go compile to machine code that can be executed directly on a device; Scala and Kotlin compile to Java bytecode, intended to be run in the Java Virtual Machine (JVM); and TypeScript and PureScript compile to JavaScript. The difference from the previous examples is that both machine code and Java bytecode are very low-level binary languages, while JavaScript is a high-level language that still needs to be interpreted by an engine like Chrome’s V8 or Firefox’s SpiderMonkey.

In this section, we’ll analyze the result of the compiled output of a simple hello world module. In both cases, we’ll export a main function that prints two lines in the console and uses a helper private function. The TypeScript source pretty much resembles the compiled output. Notice that the type information is removed and some module code is added, but other than that, the code is the same.

TypeScript with commonjs module and es5 target
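The listing itself is not reproduced in this text. As a rough stand-in, here is a minimal sketch of a module matching the description above, assuming the private helper is the exclaim function referred to later in the post and that main prints through console.log:

```typescript
// Private helper: it is not exported, so only this module can use it.
function exclaim(str: string) {
  return str + "!!!";
}

// The only export. Nothing is printed until a caller invokes main().
export function main(): void {
  console.log("Hello");
  console.log(exclaim("World"));
}
```

Compiling a file like this mostly strips the annotations (: string, : void) and rewrites the export for the chosen module system.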

TypeScript has many compiler options that can increase or decrease the strictness level and change how the output is constructed. For example, the target option, which defaults to es5, allows us to use newer language features such as arrow functions, async-await and destructuring in older browsers. Another option is module, which we can set to whatever best suits our build process. By default, it uses commonjs, which is the default module loader in Node and can also serve as the input for Browserify, Webpack or Parcel. If we set the option to es6, then the output will resemble the input even more because it keeps the es6 imports, which can later be fed to tools like Rollup.

Even though both examples do exactly the same thing, they don’t look much alike. That’s because PureScript tries to look more like Haskell than JavaScript. Coming from a C-family language, this syntax can seem strange, but we’ll explain it as we go. For now, notice that the type information is also missing from the output. Being static means that all type checks are performed at compile time and incur no runtime overhead. By default, PS uses commonjs as its module definition, so you can use it in Node directly or feed it to an external bundler. You can also instruct the compiler to bundle all your files using globals.

The compiled code doesn’t look like something we’d write as our first choice. It has a lot of weird words like Semigroup, Bind and Effect, and we can see it has an extra level of indirection inside the main function, where we first create a computation using Effect_Console.log("Hello"), and then immediately execute it using (). This indirection is due to a restriction imposed by the language. As its name implies, PureScript code must be pure. It’s not obvious here, but this restriction will allow us to compose and extend our computations, building complex features out of simpler ones.

The purity restriction gives us powerful guarantees. We said both examples do exactly the same thing, and at this moment they do nothing (at least not by themselves). In both cases, we are creating a module that exports a main function, and that’s it. If we want the code to actually run we should, at some point, call main(). In TypeScript we could’ve added the invocation in the same file; after all, it doesn’t impose the purity restriction on us. PureScript, on the other hand, forbids us from doing it, thus it assures us that importing a module can’t result in executing unknown side effects, such as connecting to a database. A library such as colors could use the freedom JS/TS gives to “improve its syntax” by automatically patching the String.prototype when you import the library. Introducing new properties to String.prototype could seem innocuous at first, but as SmooshGate showed us, it could become a problem.
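To make the difference concrete, here is a small sketch of the two styles in TypeScript. The sideEffects array is a hypothetical stand-in for a real effect such as opening a database connection or patching a prototype:

```typescript
// Hypothetical log of effects, standing in for a real side effect.
const sideEffects: string[] = [];

// Style 1: a top-level statement. It runs as soon as the module is
// imported, whether or not the importer wanted it. TS/JS allow this.
sideEffects.push("ran at import time");

// Style 2: the effect lives inside an exported function, so the
// caller decides if and when it happens.
export function main(): void {
  sideEffects.push("ran because main() was called");
}
```

In this sketch, anyone importing the module gets the first entry for free, while the second only appears after an explicit main() call. PureScript only permits the second style.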

The purity assurances come with a cost. To interact with existing JavaScript from PureScript we need to create bindings using the Foreign Function Interface, and make sure all impure code gets wrapped. TypeScript, being closer to the metal (if you can call JS a metal), just requires us to provide typing information, and we have the freedom to choose when we want to be pure, and when we don’t.

Expressing types

In order to let other users and tooling know what our data and functions look like, we need to provide type information. TypeScript, being a JavaScript superset, belongs to the C-family syntax, in which values, keywords and type information are intertwined in the same expression. Among the basic types we have the JS primitive types, which don’t distinguish between float and integer types: there is just number.

const PI: number = 3.1416

let age: number = 32

Another common C convention is that identifiers such as PI, SOME_REGEX or API_URL are written in uppercase to indicate they are constant values (as if the const keyword wasn’t enough). Keep in mind that for complex types, constant values are not the same as immutable values. This example is overly verbose and could be simplified. The compiler can infer from the value that the type is number, so there is no need to be explicit; here we’re just showing the complete syntax.
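The difference between constant and immutable is easy to miss, so here is a short sketch. The identifiers PRIMES, READONLY_PRIMES and CONFIG are made up for illustration:

```typescript
// const only prevents rebinding the identifier...
const PRIMES = [2, 3, 5];
PRIMES.push(7); // ...the array it points to can still be mutated.

// A readonly type makes the compiler reject mutation. This is a
// compile-time check only; nothing changes at runtime.
const READONLY_PRIMES: ReadonlyArray<number> = [2, 3, 5];
// READONLY_PRIMES.push(7) // would not compile: push is not part of ReadonlyArray

// Object.freeze also blocks mutation at runtime (shallowly).
const CONFIG = Object.freeze({ retries: 3 });
```

So const answers “can this name point somewhere else?”, while readonly types and Object.freeze answer “can the value itself change?”.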

If we recall the exclaim function, we can notice that only the input was typed. It’s common in simple cases like this to omit the return type and let the inference system save our precious keystrokes. But we could add the type explicitly to work as a post-condition, making sure the compiler fails if we have some discrepancy.

function exclaim (str: string): string {
    return str + "!!!";
}

We need to provide explicit types for the input of a top-level function; if we don’t, the compiler will infer the unsafe type any. This can lead to errors as the any silently propagates, which is why TS added a strictness option called noImplicitAny that reports an error instead. To increase developer productivity through tooling, in version 3.2 TypeScript added a quick fix to its language services to suggest a type from the function usage.

Given its rationale, TypeScript has a lot of flexibility in the ways we can write functions and express their types. In the following example, exclaim1 and exclaim2 are analogous. There are many places where you have to add a function type definition, and it can be confusing knowing which syntax to use.

interface Exclaimable {
    exclaim1 (str: string): string
    exclaim2: (str: string) => string
}

If we are working with JavaScript files, we can avoid using a special syntax and just write the types using JSDoc. Features like this allow newcomers to experience some of the TypeScript benefits without going all in, and they are the kind of decision that makes me think of TS as tooling more than a new language (having special syntax just for the sake of being more expressive).

/**
 * Adds two numbers together
 * @param {number} a The first number to add
 * @param {number} b The second number to add
 */
function add (a, b) {
    return a + b
}

In the following example, the functions sub and div are also analogous, but the latter is written as an arrow function, which is more concise. Receiving two parameters makes these functions harder to compose, so for mul we decided to take one argument at a time, which enables us to create new functions like times2 from it.

function sub (a: number, b: number) {
  return a - b
}

const div = (a: number, b: number) => a / b

const mul = (a: number) => (b: number) => a * b

const times2 = mul(2)

The downside of having mul written like this is that it looks weird when we want to call it with both arguments: mul(2)(4). If we want the best of both worlds, we can use a curry function like Ramda’s, but it also has some limitations in TS, as it does not work with generic functions.

import { curry } from "ramda"

// curry lets us pass the arguments all at once or one at a time
const mul = curry((a: number, b: number) => a * b)
mul(2, 2) // 4
mul(2)(2) // 4
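To see what such a helper has to do, here is a minimal hand-written sketch for the two-argument case. curry2 is a hypothetical name and this is not how Ramda implements curry; it only illustrates how overloads can describe both call styles to the type checker:

```typescript
// Hypothetical two-argument curry helper (not Ramda's implementation).
function curry2<A, B, C>(fn: (a: A, b: B) => C) {
  // Overload signatures: one argument returns a function, two return the result.
  function curried(a: A): (b: B) => C;
  function curried(a: A, b: B): C;
  function curried(a: A, b?: B): C | ((b: B) => C) {
    // arguments.length tells a missing second argument apart from an
    // explicit undefined.
    return arguments.length >= 2 ? fn(a, b as B) : (next: B) => fn(a, next);
  }
  return curried;
}

const mul = curry2((a: number, b: number) => a * b);
const times2 = mul(2);

mul(2, 4);  // both arguments at once
mul(2)(4);  // one argument at a time
times2(21); // partially applied
```

The sketch is fixed to exactly two parameters; a general-purpose curry needs far more elaborate types, which is where the limitations mentioned above come from.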

PureScript, like Elm and Haskell, has a Hindley-Milner based type system, which is well suited for a functional language and makes the transition between them easier. We can notice that the type information is placed above, using “::” to separate the identifier from its type, and on a new line we use “=” to separate the identifier from its value. Even if the compiler can infer the type correctly, PS will warn us if we don’t provide explicit information for all top-level expressions.

pi :: Number
pi = 3.1416

age :: Int
age = 32

Because the language is focused on correctness, its primitive types distinguish between floating-point numbers and integers. Also, notice that we don’t need the const or let keyword and that we write pi in lowercase, as we have the guarantee that all data is immutable.

When we describe functions, the types are also written above the function implementation, decoupling the parameter name from its type. We use an arrow to separate the input from the output, so a type like “String → String” means “A function that given a string, returns a string”. If we don’t know the output type we can use an underscore to produce a warning like “Wildcard type definition has the inferred type String”.

exclaim :: String -> String
exclaim str = str <> "!!!"

what :: String -> _
what str = str

Unlike TypeScript, there is only one way to define a function type, which resembles the arrow function way in TS. All functions are automatically curried without the generic limitation, so we can create times2 just like before. By partially applying the number 2 to mul we change the signature “Number → Number → Number” into “Number → Number”.

add :: Number -> Number -> Number
add a b = a + b

sub :: Number -> Number -> Number
sub a b = a - b

div :: Number -> Number -> Number
div a b = a / b

mul :: Number -> Number -> Number
mul a b = a * b

times2 :: Number -> Number
times2 = mul 2

A big syntax difference from C-family languages is that function application is not done by surrounding the parameters with parentheses; it is done by separating them with a space, so the PS expression “mul 2 4” is the same as the TS expression “mul(2)(4)”. It can be confusing at first, but it enables clearer syntax, as we’ll see in the next section.

Also notice that in both versions of “times2”, the b parameter is implicit. This technique is called point-free programming, which can save us the keystrokes of doing something like “const times2 = b => mul(2)(b)”. This is a powerful technique, but it shouldn’t be abused as there are times where it can reduce the legibility.

A language made for composition

In this section, we’ll leave TypeScript to rest for a bit and focus on what makes PureScript a language made with composition in mind. Let’s recall the main function from the section “dissecting the output”. There are three things we haven’t talked about: A special symbol “do”, a not so special symbol “$”, and the type of main, which doesn’t look like a function.

main :: Effect Unit
main = do
  log "Hello"
  log $ exclaim "World"

PureScript has a language feature called do notation that does different things depending on the underlying type. We could write a whole post describing it in detail, but for now, let’s just say it’s a way for us to call one effectful computation after the other in a way that resembles imperative programming.

To help us investigate $ and Effect Unit we’ll use the REPL to see the type of an expression and the kind of type. We need to have pulp installed and then execute “pulp repl”. Using the :t command we can see that log is a function that receives a String and returns an Effect Unit, the type of our main “function”.

$ pulp repl
PSCi, version 0.12.2
Type :? for help

> import Prelude
> import Effect
> import Effect.Console

> :t log
String -> Effect Unit

All the expressions inside “do” must return an Effect Unit. The first call to log is trivial but the second poses a problem, as we want to log the exclaimed string. Given that function application is done using a space, if we write the expression log exclaim "World", the compiler will throw an error because it understands that we are passing two arguments to a function that only accepts one. There are three common ways to write the expression that we want: with parentheses, with apply ($) and with applyFlipped (#).

> :t log "Hello"
Effect Unit

> :t log exclaim "World"
Error found:
  Could not match type                    
    String -> String                    
  with type          
    String

> :t log (exclaim "World")
Effect Unit
> :t log $ exclaim "World"
Effect Unit
> :t exclaim "World" # log
Effect Unit

The symbols $ and # are not language features; they are just normal functions called apply and applyFlipped respectively, and they are defined in the standard library Prelude. The special feature is that we can define an infix operator for any function of two arguments. As the documentation says, apply lets you omit parentheses in some cases, making the code easier to read.

Looking at the source code, the implementation is pretty straightforward, but the types could use some explanation as these are the first abstract functions that we see. If we look at apply, the first part declares two type variables “a” and “b” that could be any concrete type. Then we receive two arguments, a function “f” of type (a → b) and a value “x” of type “a”. If we use log as our “f”, we can substitute the types to see that “a” will be of type String, and “b” will be Effect Unit. The implementation is just applying the function “f” to the argument “x”. Notice that applyFlipped is the same, but it first receives the value and then the function.

apply :: forall a b. (a -> b) -> a -> b
apply f x = f x

infixr 0 apply as $

applyFlipped :: forall a b. a -> (a -> b) -> b
applyFlipped x f = f x

infixl 1 applyFlipped as #
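For readers more at home in TypeScript, the same two functions can be sketched with generics. TS has no user-defined infix operators, so these are ordinary curried functions rather than a translation of $ and #; the exclaim helper is the one used throughout the post:

```typescript
// apply: take a function, then a value, and call the function with it.
function apply<A, B>(f: (a: A) => B): (x: A) => B {
  return (x: A): B => f(x);
}

// applyFlipped: the same thing, but the value comes first.
// B is declared on the inner function because it is only known once
// the function argument arrives.
function applyFlipped<A>(x: A): <B>(f: (a: A) => B) => B {
  return <B>(f: (a: A) => B): B => f(x);
}

const exclaim = (str: string) => str + "!!!";

apply(exclaim)("World");        // like: exclaim $ "World"
applyFlipped("World")(exclaim); // like: "World" # exclaim
```

The type variables A and B play the role of “a” and “b” in the forall of the PureScript signatures.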

Once again, there is nothing special about $ and #. The language decisions that make this possible are: function application is done with a space, parentheses only serve to define precedence, and any function of two arguments can be infixed. This is a very powerful concept that Guy Steele describes in his talk Growing a Language: it involves well-thought-out syntax primitives that can compose into more complex constructs and can eventually be used to define a Domain Specific Language.

In JavaScript/TypeScript there are many language features that could be implemented in PureScript userland without having to go through a committee. The pipe operator, a stage 1 proposal that could enable better syntax for functional programmers, does the same as PS applyFlipped (#). Async/await is a feature built around Promises that allows us to write code more imperatively; in PS we could combine do notation with the type Aff. And the optional chaining operator, also at stage 1 at the time of writing, could be replaced with do notation and the Maybe type.
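To give a feel for the last point, here is a sketch in TypeScript of how a Maybe type can stand in for optional chaining. Everything here (Maybe, just, nothing, bind, fromNullable and the two record interfaces) is a hypothetical illustration, not a library API, and TS has no do notation, so the chaining is written with an explicit bind:

```typescript
// Hypothetical Maybe type as a discriminated union.
type Maybe<A> = { tag: "Just"; value: A } | { tag: "Nothing" };

function just<A>(value: A): Maybe<A> {
  return { tag: "Just", value };
}
const nothing: Maybe<never> = { tag: "Nothing" };

// Lift a possibly missing value into Maybe.
function fromNullable<A>(a: A | null | undefined): Maybe<A> {
  return a == null ? nothing : just(a);
}

// bind chains a step that may itself come back empty; once a step
// yields Nothing, the remaining steps are skipped.
function bind<A, B>(ma: Maybe<A>, f: (a: A) => Maybe<B>): Maybe<B> {
  return ma.tag === "Just" ? f(ma.value) : nothing;
}

interface AddressRecord { street?: string }
interface UserRecord { address?: AddressRecord }

// Roughly what user?.address?.street expresses.
function streetOf(user: UserRecord): Maybe<string> {
  return bind(fromNullable(user.address), (address) => fromNullable(address.street));
}
```

In PureScript, do notation removes the nesting that bind calls accumulate as the chain gets longer.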

Now that all the expressions inside do return the same type, let’s go back to the REPL to understand what the type means. We can use the :k command to inspect the kind of a type. For example, Unit and Number are regular types, but Effect and Array are type constructors. A type constructor is a function for types instead of values, hence the similar syntax “Type → Type”. The constructor can be applied to a type using a space (just like normal function application), so Array Number and Effect Unit will have the same kind “Type”. The type Unit has a single value, unit, so it carries no information, and it’s analogous to void in TypeScript.

> :k Number
Type

> :k Unit
Type

> :k Effect
Type -> Type

> :k Array
Type -> Type

> :k Effect Unit
Type
> :k Array Number
Type

We can think of Array as a simple data structure or we can think of it as a way to express a computation of multiple values. In the same way, we can think of Effect as a computation that modifies the world. Purely functional languages forbid side effects, which enables a whole set of guarantees, but a program’s main goal is to modify the world in some way, whether by reading a file, mutating the DOM, etc. We can cope with this limitation by working with types that represent the effectful computations.

As we saw in the section “dissecting the output”, all Effects were compiled to functions, adding an extra level of indirection. This allows us to compose those computations before actually running them. In the first eight minutes of his talk “Constraints Liberate, Liberties Constrain”, Runar Bjarnason gives one of the best explanations of this concept that I’ve seen.

If we are gonna work with explosives, it's easier to work with the TNT than the exploded pieces.

The talk also includes this quote from David J. Wheeler:

We can solve any problem by introducing an extra level of indirection.
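The indirection can be imitated in TypeScript by modelling an effect as a zero-argument function. This is only a sketch of the idea: Effect, logTo and andThen are hypothetical names, and an array is used as the output sink so the behaviour is easy to inspect:

```typescript
// Hypothetical model of an effect: a function that performs the side
// effect only when it is finally called.
type Effect<A> = () => A;

// Describe "append this line to the sink". Nothing is written yet.
function logTo(sink: string[], line: string): Effect<void> {
  return () => {
    sink.push(line);
  };
}

// Compose two effects into a single one that runs them in order.
function andThen<A, B>(first: Effect<A>, second: Effect<B>): Effect<B> {
  return () => {
    first();
    return second();
  };
}

const output: string[] = [];
const program = andThen(logTo(output, "Hello"), logTo(output, "World!!!"));
// At this point output is still empty: program is only a description.
program(); // the extra () is what actually runs it
```

Because program is a plain value until it is called, it can be passed around, combined with other effects or never run at all, which is the “TNT” of the quote above.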

A nice thing about expressing your computations this way is that you can encode what you want to do and some notion of how you want to do it, all in the type system. And we can create our programs as a combination of multiple computations like this:

Effect Unit: A synchronous effectful computation that changes the world in some way, such as writing a file, logging to the console, mutating the DOM, etc.
Array Student: A computation of multiple Students
Maybe User: A computation that may resolve to a user or may be empty
Either String Prime: A synchronous computation that can resolve to a prime number or fail with a string message
Aff BlogPost: An asynchronous effectful computation that can resolve to a blog post
State AST Number: A stateful computation that works with an AST and returns a Number
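As one concrete illustration of the list above, the Either String Prime entry could be approximated in TypeScript like this. Either and toPrime are hypothetical names chosen for the sketch, and the result is a plain number because TS has no built-in Prime type:

```typescript
// Hypothetical Either type: Left carries the failure, Right the result.
type Either<E, A> = { tag: "Left"; error: E } | { tag: "Right"; value: A };

// Accept n only if it is prime; otherwise explain why it was rejected.
function toPrime(n: number): Either<string, number> {
  if (Math.floor(n) !== n || n < 2) {
    return { tag: "Left", error: `${n} is not an integer greater than 1` };
  }
  // Trial division up to the square root of n.
  for (let d = 2; d * d <= n; d++) {
    if (n % d === 0) {
      return { tag: "Left", error: `${n} is divisible by ${d}` };
    }
  }
  return { tag: "Right", value: n };
}
```

The return type alone tells the caller that the computation can fail and what a failure looks like, so the error case cannot be forgotten without the compiler noticing.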

In this post, we’ve seen some differences between TypeScript and PureScript, most notably their rationale, their reason to exist. As always, the decision to use them over plain JavaScript depends more on factors like what your team is comfortable with, how much you care about correctness vs speed of development, etc. Knowing what each language provides will help you make an informed decision. Please comment or share if you find it useful.
