canonical

Posted on Nov 9

A Monad Guide for Beginners

#programming #functionalreactiveprogramming #haskell #beginners

Recently, a new colleague joined our company. His surname is Bai, and he's quite young, so we all call him Xiao Bai. Xiao Bai has been learning functional programming lately, and a few days ago he came to me with a question.

Xiao Bai: I graduated from a top-tier 985 university. Why do I still feel confused after reading so many introductions to Monads? Is it the articles that are poorly written, or is it my comprehension that's the problem?

Me: What was your major?

Xiao Bai: Polymer Science. After graduation, I studied programming on my own at home for half a year.

Me: I see...

Finally, I decided to write this article to help Xiao Bai figure this out.

What is a Monad? From a pragmatic perspective, a Monad is a specific design pattern unique to functional programming. Why is it unique to functional programming? Because it primarily solves the problem of how to compose functions.

I. Functions

To study how functions compose, we first need to define what a function is. In programming languages, a function can be understood as a mapping from type A to type B, formally written as f: A -> B. A function f takes an input of type A and returns an output of type B.

Notably, functions themselves can also be inputs or outputs; these are called higher-order functions. With the concept of higher-order functions, we can define a transformation curry, which reduces multi-parameter functions to single-parameter functions. Theoretically, this means we only need to study single-parameter functions.

function add(x, y) {
    return x + y;
}

// After currying, it becomes a function that takes one parameter and returns a new function
function curry_add(x) {
    return function(y) {
        return x + y;
    }
}

add(3, 4) === curry_add(3)(4); // Both results are 7

// Mathematically, this can be seen as a type transformation:
// (A, B) -> C   is equivalent to  A -> (B -> C)

The somewhat intimidating form store => next => action => { ... } seen in Redux middleware is essentially a curried multi-parameter function.

The most important operation on functions is composition. If we have a function g: A -> B and another function f: B -> C, we can compose them into a new function h: A -> C.

// To better match the left-to-right execution order of code, we define compose(g, f) to mean data flows through g first, then through f
function compose(g, f) {
    return function(x) {
        // First execute g(x), then pass its result as input to f
        return f(g(x));
    }
}
// It's worth noting that this convention is the reverse of the `compose` definition order in some functional programming libraries (like Ramda).
// Our left-to-right composition defined here is commonly known as the `pipe` function in the community, as it resembles data flowing through a pipe.

The most important property of function composition is that it satisfies associativity. For an execution order of h -> g -> f, whether we first compose h and g, or first compose g and f, the final result is the same.

// Associativity: (h -> g) -> f  is equivalent to  h -> (g -> f)
compose(compose(h, g), f)  // is effectively equivalent to
compose(h, compose(g, f))

Satisfying associativity means we can compose arbitrarily because the computation result is independent of the local grouping order. This allows us to chain a long series of functions like a pipeline, e.g., f(g(h(x))) can be viewed as compose(h, g, f)(x) (assuming compose supports multiple arguments).

Monad essentially talks about functions satisfying associativity, but it's not simply about ordinary functions satisfying associativity (which is obvious). Rather, it's about functions of a special form that can also satisfy associativity through a new method of composition.

II. A Problem: Composing Promises

Suppose we have the following two asynchronous functions, and we want to compose them into a new asynchronous function:

// String -> Promise<User>
async function getUserById(userId) { /* ... */ }

// User -> Promise<Dept>
async function getDeptByUser(user) { /* ... */ }

Our goal is to create a function getDeptByUserId whose type should be String -> Promise<Dept>. If we try to use the ordinary compose defined above:

// Wrong attempt
const getDeptByUserId_wrong = compose(getUserById, getDeptByUser);

Why is this wrong? Let's look at the execution process:

getUserById(userId) returns a Promise<User>.
The ordinary compose would directly pass this Promise<User> object to getDeptByUser.
But getDeptByUser expects an input of type User, not Promise<User>!

Where is the problem? The return values of these functions are wrapped in a "container" – Promise. We need a new method of composition to handle functions that return values inside such "containers".

// g is the first function, f is the second function
function composeM(g, f) {
   return function(x) {
       // 1. First execute g(x), getting a Promise
       // 2. Use .then to extract the value from the Promise, then pass it to f
       return g(x).then(f);
    }
}

// Correct composition: first getUserById, then getDeptByUser
const getDeptByUserId = composeM(getUserById, getDeptByUser);

This special composeM perfectly solves the problem. More importantly, it can be proven that this new composition method also satisfies associativity! This is the gateway to Monads.

III. Another Example: Composing Lists

Let's look at another common "container" – arrays (List). Consider the following two functions:

// number -> number[] (List<number>)
function positive(x) {
    return x > 0 ? [x] : [];
}

// number -> number[] (List<number>)
function duplicate(x) {
    return [x, x];
}

Again, we cannot use ordinary compose to combine them. Array.prototype.flatMap happens to be our tool.

// g is the first function, f is the second function
function composeM_List(g, f) {
    return function(x) {
        return g(x).flatMap(f);
    }
}

// Composition: first positive, then duplicate
const p = composeM_List(positive, duplicate);
console.log([1, -1, 2].flatMap(p)); // Outputs [1, 1, 2, 2]

We again find that for functions returning arrays, we can also define a composeM that satisfies associativity.

IV. Monad

Now, the mathematicians are going to perform. Mathematics is a science that only cares about "form". Things that are formally the same can be considered completely equivalent in mathematics; this is the power of "abstraction".

In the Promise example, the function type was A -> Promise.
In the List example, the function type was A -> List.

Their common pattern is: A -> M.

M is just a symbol, an abstraction for a context.

When M is Promise, it represents an asynchronous computation.
When M is List, it represents a computation that may produce zero or more results.
When M is Identity (i.e., Identity<T> = T), then A -> Identity degenerates to A -> B, and we are back to ordinary functions.

The core of a Monad is to define a composition operation composeM that satisfies associativity and a unit element unit for functions of the type A -> M (called Kleisli arrows in category theory).

A system that satisfies associativity is called a Semigroup in mathematics. If this system also has an Identity Element, it upgrades to a Monoid.

unit is a special function that puts an ordinary value a into the Monad container. Its type is A -> M<A>.

// For Promise
function unit_Promise(a) {
    return Promise.resolve(a);
}

// For List
function unit_List(a) {
    return [a];
}

So, the "plain language" version of the cryptic phrase "A Monad is just a monoid in the category of endofunctors"² is: It is a compositional system for functions of type A -> M that has a unit element and satisfies associativity.

From composeM to bind (flatMap)

If we change to an "object-oriented" perspective, not starting from composing functions but from an existing monadic value ma (an object of type M<A>), we can define a more common method, usually called bind or flatMap.

bind: (ma: M<A>, f: A -> M) -> M

For Promise, it's then; for List, it's flatMap.

bind and composeM are equivalent: composeM(g, f) is equivalent to x => g(x).bind(f).

Monad Laws

The associativity and unit properties of composeM can be equivalently expressed using bind and unit. These are the famous Monad Laws. For clarity, we use "effectively equivalent" to describe these laws because, in practical programming, this usually means producing the same value or final state, not strict object reference equality (===).

Left Identity: unit(x).bind(f) is effectively equivalent to f(x) Putting a value into a container and then binding a function is equivalent to directly applying the value to the function.
Promise.resolve(x).then(f) is effectively equivalent to f(x) ¹
[x].flatMap(f) is effectively equivalent to f(x)
Right Identity: ma.bind(unit) is effectively equivalent to ma
Binding the unit function to a container does nothing.
promise.then(Promise.resolve) is effectively equivalent to promise
list.flatMap(x => [x]) is effectively equivalent to list
Associativity: ma.bind(f).bind(g) is effectively equivalent to ma.bind(x => f(x).bind(g))
A series of bind operations can be grouped arbitrarily.
promise.then(f).then(g) is effectively equivalent to promise.then(x => f(x).then(g))
list.flatMap(f).flatMap(g) is effectively equivalent to list.flatMap(x => f(x).flatMap(g))

Strictly speaking, .then(f) defers the execution of f to the next microtask, while directly calling f(x) might be synchronous. But from the perspective of data flow and final results, they are equivalent.

Annotation on the "Cryptic Text" for the Curious: An Endofunctor is that type constructor M that maps types within a category back to the same category (e.g., T -> M<T>); a Category here can be simply understood as the type system of your programming language; and a Monoid is precisely the compositional system we discussed that has a unit element (unit) and a binary operation (composeM) satisfying associativity.

V. Monad as a Design Pattern

Monad is a general abstraction mechanism for the process of functional computation. The key is to unify forms, unify operation patterns, and reuse code. Because this pattern is very common, some languages provide syntactic sugar for easier writing. For example, async/await is syntactic sugar for the Promise Monad, and for-comprehension / do syntax in languages like Scala/Haskell can be used for any Monad.

// for-comprehension syntax
for {
  x <- mx
  y <- my
} yield x + y

// Is desugared by the compiler into a series of nested flatMap/map calls
mx.flatMap { x =>
  my.map { y =>
    x + y
  }
}

This nested structure also explains why we prefer the more readable chained calls when writing code by hand: mx.flatMap(f).flatMap(g).

VI. What Does Monad Actually Do?

Compared to ordinary function composition, the key of Monad is the introduction of a wrapper structure M, which is equivalent to wrapping the value in a context (monadic value = a value in a context).

This allows us to hide some common logic (like asynchronous waiting, null checks, state passing, logging, etc.) into the handling of this context within the implementation of bind. This lets our main logic code be expressed clearly like a simple chain of functions. This is somewhat similar to the role of Aspect-Oriented Programming (AOP).

VII. State Monad: The Art of Encapsulating Side Effects

Monad has another special significance for functional languages: it provides a mechanism for environmental encapsulation, isolating side effects into a certain environment object, ensuring the "purity" of core functions. The State Monad is the best example.

Suppose we need to use random numbers in our program:
function addRandom(a) { return a + Random.nextInt(); }
This function depends on the global variable Random and has side effects. To make it pure, we must pass the state as a parameter:
function addRandom(a, random) { return [a + random.nextInt(), random]; }
The return value is a tuple [new value, new state]. This is a general pattern. The type signature of any pure function that requires state can be written as (Value, State) -> (NewValue, NewState).

Using currying, we can transform it into Value -> (State -> (NewValue, NewState)).
This perfectly matches the Monad form A -> M!

A is Value
M is State -> (NewValue, NewState)
M is the structure State -> (..., NewState), which we call State s
B is NewValue

So, our function type is A -> State s B, where State s B is an alias for s -> (B, s).

State s corresponds to the symbol M. Again, mathematics is a thorough formalism; as long as things can be replaced according to fixed rules into a specified form, they are essentially the same thing mathematically.

Once the forms match, we can define unit and bind:

// unit :: a -> State s a
function unit(a) {
    // The role of unit is to wrap the value a into the State Monad.
    // It returns a function that takes a state s and returns a and s unchanged.
    return function(s) {
        return [a, s];
    }
}

// bind :: State s a -> (a -> State s b) -> State s b
function bind(ma, f) {
    // The role of bind is to compose two stateful computations.
    // It also returns a function, which represents the entire composed computation.
    return function(s) {
        // 1. Execute the first computation ma with the initial state s, obtaining a new value a and a new state s1.
        const [a, s1] = ma(s);
        // 2. Pass the new value a to function f, obtaining the next stateful computation f(a).
        const mb = f(a);
        // 3. Execute the next computation mb with the new state s1, obtaining the final value b and the final state s2.
        const [b, s2] = mb(s1);
        // 4. Return the final result and the final state.
        return [b, s2];
    }
}

// We can use a simple flowchart to visually understand this state passing process:
//
//  Initial State(s) ---> [ Execute ma ] --produces-->  Value(a) and New State(s1)
//                     |
//                     | (Value a is used to generate the next computation)
//                     v
//  (s1 is passed in) ---> [ Execute f(a) ] --produces--> Final Value(b) and Final State(s2)
//
//  Final return: [b, s2]

The implementation of bind cleverly encapsulates the tedious process of "passing the state".

Why can the State Monad encapsulate side effects? Because it implements delayed computation. bind only assembles a bunch of functions into a larger function; it does not execute immediately. Only when you finally provide this large function with an initial state does the entire computation chain actually start.

VIII. Other Monads

The Monad pattern is ubiquitous in programming:

Option/Maybe Monad: Used for handling potentially null values. The ?. chained calls in Kotlin/Swift are its manifestation. If it encounters null, the entire chain "short-circuits" and returns null, avoiding nested if (a != null) checks.
Either/Result Monad: Used for handling operations that might fail. It can carry either a successful value or error information about the failure.
IO Monad: Used to encapsulate side effects interacting with the outside world (like reading/writing files, printing to the console), wrapping impure operations so that the rest of the program remains pure.

I hope this article helps dispel the fog around Monads. It's not magic, but a powerful, general, and elegant pattern for organizing and abstracting code. Mastering it means mastering a core idea of functional programming.