...Wat?
This is an article about how the most well known villain in the JS universe isn't really evil, just misunderstood.
Going to hell in a callbasket
I'm not going to dig too deep into the background of the term "callback hell"; instead I'll just point you to this nice article that explains the problem and some typical solutions. If you're unfamiliar with the term, please go read that article; I'll wait.
Ok. So we're going to copy and paste the problematic code from the article, and then we're going to see how we might solve the problem without using promises and async/await:
const verifyUser = function(username, password, callback) {
  dataBase.verifyUser(username, password, (error, userInfo) => {
    if (error) {
      callback(error);
    } else {
      dataBase.getRoles(username, (error, roles) => {
        if (error) {
          callback(error);
        } else {
          dataBase.logAccess(username, error => {
            if (error) {
              callback(error);
            } else {
              callback(null, userInfo, roles);
            }
          });
        }
      });
    }
  });
};
Flattening the pyramid
If we look at the code, we notice that every time we perform an asynchronous operation, we have to pass a callback to receive the result. Because we're defining all the result-receiving callbacks inline as anonymous functions, we end up with this huge pyramid of doom.
As a first step, let's perform a simple refactoring where we just copy and paste each anonymous callback function into a separate variable, introducing curried arguments to explicitly pass around variables that were being captured from the surrounding scope:
const verifyUser = (username, password, callback) =>
  dataBase.verifyUser(username, password, f(username, callback));
const f = (username, callback) => (error, userInfo) => {
  if (error) {
    callback(error);
  } else {
    dataBase.getRoles(username, g(username, userInfo, callback));
  }
};
const g = (username, userInfo, callback) => (error, roles) => {
  if (error) {
    callback(error);
  } else {
    dataBase.logAccess(username, h(userInfo, roles, callback));
  }
};
const h = (userInfo, roles, callback) => (error, _) => {
  if (error) {
    callback(error);
  } else {
    callback(null, userInfo, roles);
  }
};
If nothing else it's certainly a little flatter, but we now have some new problems with this code:
- The if (error) { ... } else { ... } business is being repeated everywhere
- Our variable names for our intermediate expressions are meaningless
- verifyUser, f, g and h are all tightly coupled to each other, since they reference each other directly
Seeing the pattern
Before we deal with any of those issues though, let's note some similarities between these expressions.
All of these functions accept some data and a callback parameter. f, g and h additionally accept a pair of arguments (error, something), of which only one will be a non-null/undefined value. If error is non-null, the functions immediately feed error to callback and terminate. Otherwise, they use something to do some more work, causing callback to eventually be fed a different error, or null and some result value.
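The shared shape described above can be tried out on a toy example. This is only a sketch: parseNumber and double are hypothetical names made up for illustration and are not part of the dataBase API.

```javascript
// Hypothetical error-first function: calls back with (error) or (null, result)
const parseNumber = (text, callback) => {
  const n = Number(text);
  if (Number.isNaN(n)) {
    callback(new Error(`not a number: ${text}`));
  } else {
    callback(null, n);
  }
};

// A step shaped like f, g and h: forward an error, otherwise do more work
const double = callback => (error, n) => {
  if (error) {
    callback(error);
  } else {
    callback(null, n * 2);
  }
};

const seen = [];
const record = (error, result) => seen.push(error ? error.message : result);

parseNumber("21", double(record));   // success path: the value is doubled
parseNumber("oops", double(record)); // error path: the doubling is skipped
console.log(seen);
```

The final callback sees either an error or a result, never both, which is the property the rest of the refactoring leans on.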
Keeping these commonalities in mind, we'll embark on a program of refactoring our intermediate expressions so they look increasingly similar.
Cosmetic changes
I find if statements really verbose, so we'll take a moment now to replace all these if statements with ternary expressions. Since the return values are all being discarded anyway, this doesn't cause any change in the behavior of the code.
I'm also going to reduce the visual noise by shortening the repetitive error and callback variables to e and cb respectively:
const verifyUser = (username, password, cb) =>
  dataBase.verifyUser(username, password, f(username, cb));
const f = (username, cb) => (e, userInfo) =>
  e ? cb(e) : dataBase.getRoles(username, g(username, userInfo, cb));
const g = (username, userInfo, cb) => (e, roles) =>
  e ? cb(e) : dataBase.logAccess(username, h(userInfo, roles, cb));
const h = (userInfo, roles, cb) => (e, _) =>
  e ? cb(e) : cb(null, userInfo, roles);
Currying aggressively
Because we're about to start performing some serious gymnastics with function parameters, I'm going to take this opportunity to curry all the function arguments that can be curried. This introduces uniformity and facilitates further refactoring.
We can't easily curry the functions which accept a pair of arguments (e, xyz), since the underlying dataBase API (which is opaque to us) requires the callback to simultaneously accept a possible error and a possible result. But all other occurrences of multi-parameter functions can (and will) be eliminated by currying.
We'll start with the dataBase methods:
// Curried wrapper around the `dataBase` API
const DB = {
  verifyUser: username => password => cb =>
    dataBase.verifyUser(username, password, cb),
  getRoles: username => cb =>
    dataBase.getRoles(username, cb),
  logAccess: username => cb =>
    dataBase.logAccess(username, cb)
}
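Since dataBase is opaque here, it can help to have something runnable to poke at. The sketch below pairs the wrapper with a hypothetical in-memory stand-in for dataBase; the stub's behavior (the "secret" password, the fixed roles, the synchronous callbacks) is an assumption made purely for illustration.

```javascript
// Hypothetical in-memory stand-in for the opaque `dataBase` API.
// It calls back synchronously to keep the sketch short; a real API would not.
const dataBase = {
  verifyUser: (username, password, cb) =>
    password === "secret"
      ? cb(null, { name: username })
      : cb(new Error("bad credentials")),
  getRoles: (username, cb) => cb(null, ["admin"]),
  logAccess: (username, cb) => cb(null)
};

// Same curried wrapper as above
const DB = {
  verifyUser: username => password => cb =>
    dataBase.verifyUser(username, password, cb),
  getRoles: username => cb =>
    dataBase.getRoles(username, cb),
  logAccess: username => cb =>
    dataBase.logAccess(username, cb)
};

// Partial application yields a reusable "task" that only awaits a callback
const rolesOfAlice = DB.getRoles("alice");

const results = [];
rolesOfAlice((e, roles) => results.push(e || roles));
DB.verifyUser("alice")("wrong")((e, user) => results.push(e ? e.message : user));
console.log(results);
```

Note that DB.getRoles("alice") does not touch the database by itself; nothing happens until a callback is supplied.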
Now we will replace all usages of dataBase with wrapped operations from DB, and curry any remaining multi-parameter functions. Additionally, we'll replace the cb(null, userInfo, roles) in h with cb(null, { userInfo, roles }), so that a callback always receives precisely two arguments: a possible error and a possible result.
const verifyUser = username => password => cb =>
  DB.verifyUser(username)(password)(f(username)(cb));
const f = username => cb => (e, userInfo) =>
  e ? cb(e) : DB.getRoles(username)(g(username)(userInfo)(cb));
const g = username => userInfo => cb => (e, roles) =>
  e ? cb(e) : DB.logAccess(username)(h(userInfo)(roles)(cb));
const h = userInfo => roles => cb => (e, _) =>
  e ? cb(e) : cb(null, { userInfo, roles });
Turning it inside out
Let's do some more refactoring. For reasons that will become clear momentarily, we're going to pull all the error checking code "outwards" one level. Instead of each step doing its own error checking, we'll use an anonymous function that receives the error e or result v of the current step, and forwards the result and callback to the next step if there are no problems:
const verifyUser = username => password => cb =>
  DB.verifyUser(username)(password)((e, v) =>
    e ? cb(e) : f(username)(cb)(v)
  );
const f = username => cb => userInfo =>
  DB.getRoles(username)((e, v) =>
    e ? cb(e) : g(username)(userInfo)(cb)(v)
  );
const g = username => userInfo => cb => roles =>
  DB.logAccess(username)((e, _) =>
    e ? cb(e) : h(userInfo)(roles)(cb)
  );
const h = userInfo => roles => cb => cb(null, { userInfo, roles });
Note how the error handling has entirely disappeared from our final function: h. It simply accepts a couple of parameters, builds up some composite result from them, and immediately turns around and feeds the result into a given callback. Let's rewrite h to show this more clearly:
const h = userInfo => roles => {
  const result = { userInfo, roles };
  return cb => cb(null, result);
}
The cb parameter is now being passed in various positions, so for consistency, we'll move around the arguments so that all the data goes first and the callback goes last:
const verifyUser = username => password => cb =>
  DB.verifyUser(username)(password)((e, v) =>
    e ? cb(e) : f(username)(v)(cb)
  );
const f = username => userInfo => cb =>
  DB.getRoles(username)((e, v) =>
    e ? cb(e) : g(username)(userInfo)(v)(cb)
  );
const g = username => userInfo => roles => cb =>
  DB.logAccess(username)((e, _) =>
    e ? cb(e) : h(userInfo)(roles)(cb)
  );
const h = userInfo => roles => {
  const result = { userInfo, roles };
  return cb => cb(null, result);
}
verifyUser and f now look almost identical. They both:
- Receive some data and a callback
- Perform some asynchronous operation
- Receive an error or a value
- If the result is an error, immediately pass it to the callback
- Otherwise, pass the successful result and callback into some further step (<next step>(v)(cb))
g is very similar, but there is a twist. Instead of receiving a v argument and passing it on to the next step if there are no problems, it unconditionally discards any successful result and passes only the callback to the next step.
To smooth out this wrinkle, we will rewrite g so that it imitates the other two functions and passes on its (undefined) result. To deal with the unwanted result, we will introduce a dummy argument to the "next step", so that it discards whatever was passed:
const g = username => userInfo => roles => cb =>
  DB.logAccess(username)((e, v) =>
    e ? cb(e) : (_ => h(userInfo)(roles))(v)(cb) // the "next step" discards the result
  );
Now it follows the same formula as verifyUser and f. For clarity, let's explicitly copy the asynchronous operation and "next step" of each function into local variables:
const verifyUser = username => password => {
  const task = DB.verifyUser(username)(password);
  const next = f(username);
  return cb => task((e, v) => e ? cb(e) : next(v)(cb));
}
const f = username => userInfo => {
  const task = DB.getRoles(username);
  const next = g(username)(userInfo);
  return cb => task((e, v) => e ? cb(e) : next(v)(cb));
}
const g = username => userInfo => roles => {
  const task = DB.logAccess(username);
  const next = _ => h(userInfo)(roles);
  return cb => task((e, v) => e ? cb(e) : next(v)(cb));
}
const h = userInfo => roles => {
  const result = { userInfo, roles };
  return cb => cb(null, result);
}
Do you see the pattern?
Factoring out the pattern
By this point it is hopefully obvious that something very repetitive is happening. It looks like someone has copied and pasted code for handling errors and threading around callbacks into every function. Of course, this is deliberate; we have refactored our way to a unified pattern, so that we may copy and paste the repetition out.
Now, in one fell swoop, we can move all the error handling and callback threading business into a pair of helper functions:
const after = task => next =>
  cb => task((e, v) => e ? cb(e) : next(v)(cb));
const succeed = v =>
  cb => cb(null, v);
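To get a feel for these two helpers in isolation, here is a small sketch that uses them on toy tasks instead of database calls. The fail helper and the report callback are hypothetical additions for illustration only.

```javascript
const after = task => next =>
  cb => task((e, v) => e ? cb(e) : next(v)(cb));
const succeed = v =>
  cb => cb(null, v);

// Hypothetical task that always fails, for illustration only
const fail = message => cb => cb(new Error(message));

const log = [];
const report = (e, v) => log.push(e ? `error: ${e.message}` : `ok: ${v}`);

// Two successful steps in sequence
after(succeed(1))(x => succeed(x + 1))(report);

// A failure in the first step means the second step never runs
let secondStepRan = false;
after(fail("boom"))(x => {
  secondStepRan = true;
  return succeed(x);
})(report);

console.log(log, secondStepRan);
```

The error check lives in exactly one place, inside after, and every step sequenced through it gets the short-circuiting behavior for free.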
Our steps turn into:
const verifyUser = username => password =>
  after
    (DB.verifyUser(username)(password))
    (f(username));
const f = username => userInfo =>
  after
    (DB.getRoles(username))
    (g(username)(userInfo));
const g = username => userInfo => roles =>
  after
    (DB.logAccess(username))
    (_ => h(userInfo)(roles));
const h = userInfo => roles =>
  succeed({ userInfo, roles });
The error handling and callback threading have disappeared!
It's a good idea to pause here for a second. Try to inline the definitions of after and succeed into these new expressions, to convince yourself that they are equivalent to the ones we refactored away.
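If you would rather check this mechanically, the sketch below compares g written with the helpers against g with after inlined by hand. The DB stub (logAccess failing for a user named "mallory") and the run helper are hypothetical and exist only to make the comparison runnable.

```javascript
const after = task => next =>
  cb => task((e, v) => e ? cb(e) : next(v)(cb));
const succeed = v =>
  cb => cb(null, v);

// Hypothetical stub: logAccess fails for the user "mallory"
const DB = {
  logAccess: username => cb =>
    username === "mallory" ? cb(new Error("log failed")) : cb(null)
};

const h = userInfo => roles => succeed({ userInfo, roles });

// g written with the helpers
const gShort = username => userInfo => roles =>
  after(DB.logAccess(username))(_ => h(userInfo)(roles));

// g with `after` inlined by hand
const gLong = username => userInfo => roles => {
  const task = DB.logAccess(username);
  const next = _ => h(userInfo)(roles);
  return cb => task((e, v) => e ? cb(e) : next(v)(cb));
};

// Run a version of g and serialize whatever reaches the final callback
const run = g => username => {
  let outcome;
  g(username)({ name: username })(["admin"])((e, v) => {
    outcome = e ? e.message : v;
  });
  return JSON.stringify(outcome);
};

const sameOnSuccess = run(gShort)("alice") === run(gLong)("alice");
const sameOnFailure = run(gShort)("mallory") === run(gLong)("mallory");
console.log(sameOnSuccess, sameOnFailure);
```

Agreement on two inputs is of course only a sanity check; the real argument is the substitution you can carry out on paper.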
Ok, so we're getting warmer! f, g and h don't seem to be doing much of anything anymore though...
Pruning dead weight
...so let's get rid of them! All we have to do is to work our way backwards from h and inline each function into the definition that references it:
// Inline h into g
const g = username => userInfo => roles =>
  after(DB.logAccess(username))(_ =>
    succeed({ userInfo, roles })
  );
// Inline g into f
const f = username => userInfo =>
  after(DB.getRoles(username))(roles =>
    after(DB.logAccess(username))(_ =>
      succeed({ userInfo, roles })
    )
  );
// Inline f into verifyUser
const verifyUser = username => password =>
  after(DB.verifyUser(username)(password))(userInfo =>
    after(DB.getRoles(username))(roles =>
      after(DB.logAccess(username))(_ =>
        succeed({ userInfo, roles })
      )
    )
  );
We can use referential transparency to introduce some temporary variables and make it a little more readable:
const verifyUser = username => password => {
  const auth = DB.verifyUser(username)(password);
  const roles = DB.getRoles(username);
  const log = DB.logAccess(username);
  return after(auth)(u =>
    after(roles)(r =>
      after(log)(_ =>
        succeed({ userInfo: u, roles: r })
      )
    )
  );
};
And there you have it! This is quite concise, doesn't repeat any error checking, and is roughly analogous to the Promise version from the article we linked earlier. You invoke verifyUser like so:
const main = verifyUser("someusername")("somepassword");
main((e, o) => (e ? console.error(e) : console.log(o)));
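To see the whole thing run end to end, here is a sketch against a hypothetical in-memory dataBase that records which methods were called. The stub (including getRoles failing for a user named "mallory" and the synchronous callbacks) is an assumption for illustration, not the real API.

```javascript
const after = task => next =>
  cb => task((e, v) => e ? cb(e) : next(v)(cb));
const succeed = v =>
  cb => cb(null, v);

// Hypothetical in-memory `dataBase` that records which methods get called.
// It calls back synchronously to keep the sketch short.
const calls = [];
const dataBase = {
  verifyUser: (username, password, cb) => {
    calls.push("verifyUser");
    cb(null, { name: username });
  },
  getRoles: (username, cb) => {
    calls.push("getRoles");
    if (username === "mallory") {
      cb(new Error("no roles"));
    } else {
      cb(null, ["admin"]);
    }
  },
  logAccess: (username, cb) => {
    calls.push("logAccess");
    cb(null);
  }
};

const DB = {
  verifyUser: username => password => cb =>
    dataBase.verifyUser(username, password, cb),
  getRoles: username => cb => dataBase.getRoles(username, cb),
  logAccess: username => cb => dataBase.logAccess(username, cb)
};

const verifyUser = username => password => {
  const auth = DB.verifyUser(username)(password);
  const roles = DB.getRoles(username);
  const log = DB.logAccess(username);
  return after(auth)(u =>
    after(roles)(r =>
      after(log)(_ =>
        succeed({ userInfo: u, roles: r }))));
};

// Building the computation performs no database calls yet
const main = verifyUser("alice")("pw");
const callsBeforeRun = calls.length;

const outcomes = [];
const report = (e, o) => outcomes.push(e ? e.message : o);
main(report);
verifyUser("mallory")("pw")(report); // fails at getRoles, so logAccess is skipped
console.log(callsBeforeRun, calls, outcomes);
```

Two things are worth noticing: nothing is called until main receives its callback, and a failure part-way through skips every later step without any explicit error check in verifyUser.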
Final code
// Tools for sequencing callback APIs
const after = task => next =>
  cb => task((e, v) => e ? cb(e) : next(v)(cb));
const succeed = v =>
  cb => cb(null, v);
// Curried wrapper around the `dataBase` API
const DB = {
  verifyUser: username => password => cb =>
    dataBase.verifyUser(username, password, cb),
  getRoles: username => cb =>
    dataBase.getRoles(username, cb),
  logAccess: username => cb =>
    dataBase.logAccess(username, cb)
}
// Our implementation
const verifyUser = username => password => {
  const auth = DB.verifyUser(username)(password);
  const roles = DB.getRoles(username);
  const log = DB.logAccess(username);
  return after(auth)(u =>
    after(roles)(r =>
      after(log)(_ =>
        succeed({ userInfo: u, roles: r })
      )
    )
  );
};
The M-word
Are we done? Well, some of us might still find the code in verifyUser a little too triangular. There are ways to fix this, but in order to explain how I first have to fess up to something.
I didn't independently discover the definitions of after and succeed in the process of refactoring this code. I actually had the definitions up front, since I copied them from a Haskell library where they go by the name of >>= and pure. Together, these two functions constitute the definition of the "continuation monad".
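For readers who have met the monad laws before, the sketch below spot-checks the three of them (left identity, right identity, associativity) for after and succeed on a few sample values. It is a sanity check on examples rather than a proof, and observe, inc and halve are hypothetical helpers introduced just for this purpose.

```javascript
const after = task => next =>
  cb => task((e, v) => e ? cb(e) : next(v)(cb));
const succeed = v =>
  cb => cb(null, v);

// Observe a continuation by running it once and recording what it reports
const observe = m => {
  let seen;
  m((e, v) => {
    seen = e ? `error: ${e.message}` : `value: ${JSON.stringify(v)}`;
  });
  return seen;
};

// Hypothetical sample steps
const inc = x => succeed(x + 1);
const halve = x =>
  x % 2 === 0 ? succeed(x / 2) : (cb => cb(new Error(`${x} is odd`)));
const m = succeed(3);

// Left identity: after(succeed(a))(f) behaves like f(a)
const leftIdentity = observe(after(succeed(3))(inc)) === observe(inc(3));

// Right identity: after(m)(succeed) behaves like m
const rightIdentity = observe(after(m)(succeed)) === observe(m);

// Associativity: regrouping nested steps does not change the outcome
const associativity =
  observe(after(after(m)(inc))(halve)) ===
  observe(after(m)(x => after(inc(x))(halve)));

// The same regrouping also agrees when a step fails
const associativityOnError =
  observe(after(after(m)(halve))(inc)) ===
  observe(after(m)(x => after(halve(x))(inc)));

console.log(leftIdentity, rightIdentity, associativity, associativityOnError);
```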
Why is this relevant? Well, it turns out that there are many handy ways to sequence together monadic computations that don't suffer from the pyramid-of-doom effect.
To illustrate, let's start by formatting the definition of verifyUser a little bit differently:
const verifyUser = username => password => {
  const auth = DB.verifyUser(username)(password);
  const roles = DB.getRoles(username);
  const log = DB.logAccess(username);
  // The opening parenthesis keeps automatic semicolon insertion
  // from ending the statement right after `return`
  return (
    after(auth) (u =>
    after(roles)(r =>
    after(log)  (_ =>
    succeed({ userInfo: u, roles: r }))))
  );
};
If you squint and ignore the parentheses, you might notice the similarity between this definition and the following Haskell function:
-- In Haskell, function application does not require parentheses,
-- and binary functions may be applied infix
verifyUser :: Username -> Password -> IO (UserInfo, Roles)
verifyUser username password =
  let
    auth = DB.verifyUser username password
    roles = DB.getRoles username
    log = DB.logAccess username
  in
    auth >>= \u ->
    roles >>= \r ->
    log >>= \_ ->
    pure (u, r)
This pattern of using >>= and functions to introduce new variables captured from the steps of a monadic computation is so common that there is special syntax sugar for it, called "do-notation". Here is the same computation in Haskell written with do-notation:
verifyUser' :: Username -> Password -> IO (UserInfo, Roles)
verifyUser' username password =
  let
    auth = DB.verifyUser username password
    roles = DB.getRoles username
    log = DB.logAccess username
  in
    do
      u <- auth
      r <- roles
      _ <- log
      pure (u, r)
Although we do not have general purpose do-notation in JS (perhaps we should!), there are various ways to simulate it. A detailed explanation of monads and do-notation is beyond the scope of this article, but for illustrative purposes, here is one way to write verifyUser in JS with a simulated do-notation library:
const { mdo } = require("@masaeedu/do");
// `Cont` is our implementation of the continuation monad
const Cont = monad({ pure: succeed, bind: after });
const verifyUser = username => password => {
  const auth = DB.verifyUser(username)(password);
  const roles = DB.getRoles(username);
  const log = DB.logAccess(username);
  return mdo(Cont)(({ u, r }) => [
    [u, () => auth ],
    [r, () => roles],
    () => log ,
    () => Cont.pure({ userInfo: u, roles: r })
  ]);
};
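If you want to experiment without a library, one alternative way to simulate do-notation for this particular monad is with a generator function, treating each yield as a bind. This is a minimal sketch and not how @masaeedu/do works: doCont is a hypothetical helper, the DB stub is made up, and the trick only holds up when every continuation invokes its callback at most once, since a generator cannot be rewound.

```javascript
const after = task => next =>
  cb => task((e, v) => e ? cb(e) : next(v)(cb));
const succeed = v =>
  cb => cb(null, v);

// Hypothetical helper: run a generator, treating each `yield` as a bind.
// A fresh generator is created every time the resulting continuation is run.
const doCont = genFn => cb => {
  const gen = genFn();
  const step = input => {
    const { value, done } = gen.next(input);
    return done ? succeed(value) : after(value)(step);
  };
  step(undefined)(cb);
};

// Hypothetical curried stub; getRoles fails for "mallory"
const DB = {
  verifyUser: username => password => cb => cb(null, { name: username }),
  getRoles: username => cb =>
    username === "mallory" ? cb(new Error("no roles")) : cb(null, ["admin"]),
  logAccess: username => cb => cb(null)
};

const verifyUser = username => password =>
  doCont(function* () {
    const u = yield DB.verifyUser(username)(password);
    const r = yield DB.getRoles(username);
    yield DB.logAccess(username);
    return { userInfo: u, roles: r };
  });

const outcomes = [];
const report = (e, o) => outcomes.push(e ? e.message : o);
verifyUser("alice")("pw")(report);
verifyUser("mallory")("pw")(report);
console.log(outcomes);
```

The generator body reads top to bottom much like the Haskell do block, while errors still short-circuit through after.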
This is well and good, but it is also worth noting that some monadic computations have a "fixed" structure, i.e. they might not utilize the result of previous steps to decide what to do next. Since such computations have no real need for explicitly binding over and naming the results of intermediate steps, they can be built up more conveniently by "traversing" a fixed container of the steps, which will eventually produce a corresponding container of results.
Luckily for us, our example is just such a "fixed structure" computation, in that each step is independent of the results of previous steps. This means it can also be written in the following, more concise ways:
const verifyUser = username => password => {
  const auth = DB.verifyUser(username)(password);
  const roles = DB.getRoles(username);
  const log = DB.logAccess(username);
  // Applicative lifting
  const f = u => r => _ => ({ userInfo: u, roles: r });
  return Cont.lift(f)([auth, roles, log]);
};
const verifyUser = username => password => {
  const auth = DB.verifyUser(username)(password);
  const roles = DB.getRoles(username);
  const log = DB.logAccess(username);
  // Traverse a dictionary of continuations into a continuation of a dictionary
  return Obj.sequence(Cont)({
    userInfo: auth,
    roles: roles,
    _: log
  })
};
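Cont.lift and Obj.sequence are not defined anywhere in this post. As a rough idea of what such helpers might do, here is a sketch that derives both from after and succeed. The names liftCont and sequenceObj are hypothetical, and this is not the API of any particular library.

```javascript
const after = task => next =>
  cb => task((e, v) => e ? cb(e) : next(v)(cb));
const succeed = v =>
  cb => cb(null, v);

// Hypothetical: feed the results of a list of continuations, in order,
// to a curried function
const liftCont = f => tasks =>
  tasks.reduce(
    (accTask, task) => after(accTask)(g => after(task)(v => succeed(g(v)))),
    succeed(f)
  );

// Hypothetical: turn an object of continuations into a continuation of an object
const sequenceObj = tasksByKey =>
  Object.entries(tasksByKey).reduce(
    (accTask, [key, task]) =>
      after(accTask)(acc => after(task)(v => succeed({ ...acc, [key]: v }))),
    succeed({})
  );

// Stub tasks standing in for the DB calls (they call back synchronously)
const auth = succeed({ name: "alice" });
const roles = succeed(["admin"]);
const log = succeed(undefined);

let lifted;
let sequenced;
liftCont(u => r => _ => ({ userInfo: u, roles: r }))([auth, roles, log])(
  (e, v) => { lifted = v; }
);
sequenceObj({ userInfo: auth, roles: roles, _: log })(
  (e, v) => { sequenced = v; }
);
console.log(lifted, sequenced);
```

Both helpers here still run the steps one after another, in list or key order. Because neither ever inspects an intermediate result to decide what to do next, the shape of the computation is fixed up front, which is what makes this style possible.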
A detailed analysis of all the ways to build up monadic and applicative computations is beyond the scope of this post, but suffice it to say that there are a number of powerful, elegant tools for synthesizing computations in an arbitrary monad. By recognizing that our callback-based model of asynchronicity is monadic (specifically, that it corresponds to the continuation monad) and witnessing the relevant monad operations, we can apply these general purpose tools to async programming.
Conclusion
Okay, we made it! What are the takeaways? I hope I've managed to persuade you of the following:
- Referentially transparent refactoring is a powerful technique for eliminating repetition and discovering useful patterns
- "Callback hell" is not a problem innate to callbacks, but to a particular calling discipline for callback-based APIs. With the right approach, callback-based APIs can be concise and elegant to work with
- The concept of a "monad" in a programming context is not (merely) academic mumbo jumbo, but is a useful tool for recognizing and exploiting patterns that arise naturally in everyday programming
Further work
I have deliberately avoided introducing type signatures or concepts like monads until the very end of the post in order to keep things approachable. Perhaps in a future post we can re-derive this abstraction with the monad and monad-transformer concepts foremost in our minds, and with special attention to the types and laws.
Acknowledgements
Many thanks to @jlavelle, @mvaldesdeleon and @gabejohnson for providing feedback and suggestions on this post.
Top comments (17)
But why would you EVER want to do this if there is already a continuation monad implementation in the language itself that:
Yes, I'm talking about Promises. Yes, Promises are effectively continuation monads (not to the strictest definition of a monad but neither is the implementation in this post). The .then function is the monadic bind (e.g. '>>=') and Promise.resolve is 'pure' (which you don't need as often because .then will automatically perform a pure of a value that isn't a monad)
Here is a code example showing how they're methodologically equivalent (and qualitatively better) than what is described in the post. I'm doing this in the hopes that I never have to see anybody doing this in actual code I have to work with. Stop trying to be smart. Don't reinvent the wheel. Please...
The ways in which promises do not form a monad (and in fact not even pure values) are actually pretty well understood. In particular, then is not a lawful bind, and join is impossible to express (because simply referring to a value causes effects to start happening).
Moreover you cannot generally express interaction with traversable containers, functor composition, monad transformers, or other general purpose abstractions with respect to promises, because again, they don't actually form a monad.
Regarding "neither is the implementation in this post": I'm not sure you've actually grasped the content of the post if this is the conclusion you've arrived at. The operations given in this post precisely form a monad (including obeying all the relevant laws).
In the construction above, verifyUser is a pure, continuation returning function. In your snippet, async function verifyUser(user, password) { ... } is not even really a function in the functional programming sense of the word.
As a very simple example, the promise produced by mapping a Promise-based implementation of deleteUser over an array of usernames and taking the first element doesn't represent deleting the first user; instead every user in the database would be deleted. Conversely, doing the same thing with a deleteUser based on a lawful asynchronicity monad, as given in the post, would be no different than taking the first element and then applying deleteUser to it. Both would produce a continuation representing deleting the first user (nothing would actually start happening "behind the scenes").
I'm not going to argue about strict definitions with you because you don't know what you're talking about and are opinionated about your misconceptions. I simply don't have the time.
They are the same - please inform yourself more thoroughly.
Not being able to implement join has nothing to do with it being a monad. It's because they're automatically flattened. You don't have to implement m (m a) -> m a if m (m a) is equivalent to m a. But again - it has nothing to do with it being a monad.
My point is that they can be used the same way as your continuation monad implementation and as such should be favoured over it because they're standard language constructs. Period.
Also, of course async function verifyUser(user, password) { ... } is a pure function. It's referentially transparent in the sense that given the same parameters the Promise returned will always be the same. How that promise is consumed doesn't matter. Again - inform yourself.
Lazy evaluation also doesn't have anything to do with purity or it being a monad. (regarding your deleteUser example. You're mixing up concepts that you don't seem to understand)
Thanks, the definition you posted above is helpful. Try evaluating const map = f => bind(x => pure(f(x))); map(pure)(pure(5)) to understand why this is not actually a lawful implementation of bind.
Without having a join operation (which can be recovered as bind(id) from a lawful bind), it's actually meaningless to talk about a "monad". Monads are fundamentally defined by an associative join and an idempotent pure, together forming a monoid.
This isn't about lazy evaluation vs strict evaluation, but rather about pure vs impure evaluation. The term verifyUser(user, password) does not purely evaluate to a representation of an effect; instead it immediately starts performing effects in the course of its evaluation. The result of evaluating it is not dependent only on its inputs, but also on the state of the world.
This means verifyUser isn't actually a function in the functional programming sense of the word, preventing us from reasoning equationally in programs that involve it. For example the following program:
is not the same program as:
when using promises. It is when using a lawful asynchronicity monad (e.g. the continuation monad above). Whether this is bad or good depends on whether you prefer an imperative or functional style of reasoning.
Your definition of monads is wrong. It has nothing to do with join, their time of evaluation or 'imperative vs functional reasoning' lol.
Here are the monadic laws proven with the Promise definitions from above - in js.
You're also wrong about the fact that the promises don't evaluate to the representation of an effect first. Of course they do. The point in time the underlying implementation decides to consume that value has no significance whatsoever. As I said - you're mixing up concepts, don't understand monads and likely don't understand Promises either.
This is the first time I've heard that monads have nothing to do with the join operation. You should share this revolutionary insight with the mathematics community.
Regarding the "proof" of the monadic laws above, unfortunately the laws don't hold for the definitions given (the proof-by-single-example notwithstanding). In fact, the definitions are not even well-typed.
Conveniently, to disprove something requires only a single counterexample:
I'd like to have discussed how the word "monad" refers to a particular kind of endofunctor with join and pure natural transformations, but I really have to take a break from this conversation. I don't mind discussing things with people I disagree with, but the complete lack of manners displayed in your comments goes poorly with your total ignorance of the subject.
Very interesting. A few things I like...
I like how verifyUser has now become Lazy.
I love the M[">>="] function. First time I have seen an object used like this.
Without Promises, you have eliminated a round trip to the event loop (for sync functions).
I had a hard time finding the @masaeedu/mdo package. I did find @masaeedu/do though. Typo? Could you throw me a link?
Cheers!
Hi @joelnet . Yeah, sorry. That's a typo on my part. The repo is here: github.com/masaeedu/do. There's no README yet, sadly, but here is a slightly more fleshed out runkit with usage examples for various monads: runkit.com/masaeedu/do-notation
Awesome thanks. There's some magic in that lib that I'm gonna have to play with to fully understand. I didn't know it was possible to do something like this:
Those proxies are some interesting things.
I recently did something related with the W Combinator. I also need to improve the docs :( But I like how your implementation allows you to assign values.
What's currying doing for this code, exactly?
Hi Pablo. What currying is doing for this code is turning an impure function of type (ignoring errors for simplicity):
, which is not a monadic value, into a pure function of the type:
for which the return type ({ userInfo: UserInfo, roles: Roles } -!-> Undefined) -!-> Undefined (and more generally, the type (a -!-> r) -!-> r) forms a monad.
We're changing our perspective so that all of our impure functions of multiple arguments can instead be interpreted as pure functions that accept one less argument and return an impure, callback-accepting computation with possibility of failure.
Thanks for writing this up! I was first introduced to currying in Scala but I was a junior at the time and didn't get it at all. This shows it in a neat example.
I also like your step-by-step refactoring process. Cool to see how other people approach it.
Awesome! I first saw applicatives in Scala. Being self taught from the imperative braces world, not enough of that has rubbed off to transfer onto other languages. I would really like to see more articles about do that are written in as approachable a way as this one.
"Perhaps in a future post we can re-derive this abstraction with the monad and monad-transformer concepts foremost in our minds, and with special attention to the types and laws." - Perhaps after that you could write it in Malbolge.
I see a Monadic Javascript Framework in the future....
tips fedora
But what about 🅱erformance?
History is littered with situations where someone comes up with higher abstractions and people say "but what about performance". Due to fear, uncertainty and doubt people avoid it for a time. Then whatever people thought might be slow becomes mainstream if it brings higher developer productivity.
Sure some people need to drop into C or even assembler and maybe some of those people are even doing this from JavaScript. They probably measure for themselves before taking anyone else's advice on performance. They won't ask about performance; they tell you what they have measured about performance.
Most folks running JavaScript typically are not dropping to C. Most folks using garbage collected languages are awaiting IO and so high level programming abstractions run satisfactorily. In this article the code is calling a database over a network. The runtime will be doing garbage collection. The runtime will do a lot of optimizations. Swapping around functions probably will use more objects and more final machine code but measuring that difference on real applications outside of synthetic micro benchmarks is often really very hard.
People should try out new approaches and measure the performance in their context. As a rule of thumb, unless you know you're working on performance critical sections of code, it is most often worth trying to optimize for fewer bugs rather than more speed.