A few days ago, I tracked down a bug causing a React component to never update. Debugging showed that the cause was the memoize function of lodash.
In frontend projects, we use memoization for different optimizations:
- Avoid component re-rendering - `React.memo`
- Avoid re-computing internal component state - `useMemo`
- Avoid re-computing information derived from the Redux state - `createSelector` from reselect

The goal is always the same: do not redo an expensive computation if the inputs are the same as in the previous call. It is faster to just return the last computed result directly. (More about memoization on Wikipedia.)
Using memoize in a React app
`useMemo`, `React.memo`, and `createSelector` are usually enough for all your memoization needs. However, hooks don't work in class components. If you still have some in your codebase, you need a custom memoization function to replicate the functionality of `useMemo`. One implementation is described in the React docs.
Lodash being very common, using `lodash/memoize` seems like a good option to implement the pattern without adding (yet) another dependency.
Problem 1: Lodash uses only the first parameter
The problem is that the memoize function from lodash uses only the first parameter as the cache key by default. So, as long as the same first parameter is passed, the function always returns the same result, whatever the other arguments are.
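This first-argument behavior is easy to reproduce with a small sketch (a simplified stand-in for lodash's implementation, not the real code):

```js
// simplified sketch of lodash's default behavior:
// only the FIRST argument is used as the cache key
function memoize(func) {
  const cache = new Map();
  return (...args) => {
    const key = args[0];
    if (!cache.has(key)) {
      cache.set(key, func(...args));
    }
    return cache.get(key);
  };
}

const add = (a, b) => a + b;
const memoizedAdd = memoize(add);

memoizedAdd(1, 2); // 3
memoizedAdd(1, 3); // 3 again: key `1` is already cached, so `b` is ignored
```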
On the other hand, memoize-one and the other implementations used in React or reselect re-compute the function when any parameter changes, so they always return the right result.
The problem is not that this lodash behavior is undocumented. In fact, the documentation states clearly that the first parameter is used as the cache key. The root cause of these errors is that the behavior is very different from the other implementations that often live in the same project and are supposed to provide the same functionality.
Problem 2: You don't need an unlimited cache
While the first difference may lead to visible bugs, this one may affect performance. That is usually hard to detect, but it can have a big impact on the user experience.
Running the memoized function 1000 times with 1000 different keys saves 1000 results in the cache. Does that mean that memoize is a good cache? Kind of. But this is not what we need from a memoize function.
Lodash uses a `Map` to cache all function results associated with a key:

```js
// from https://github.com/lodash/lodash/blob/master/memoize.js
memoized.cache = cache.set(key, result) || cache
...
memoize.Cache = Map
```
This means that, by default, ALL keys and return values are saved forever.
If you don't have a lot of different keys, you won't see the difference. If you are using unique IDs, this can become problematic. Memory leaks are hard to track as they may only happen in specific use cases like a page that stays open for a long time. Using a cache that by default can create leaks is thus not recommended.
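The growth is easy to demonstrate with a sketch (again a simplified stand-in for lodash, with the cache exposed the way lodash exposes `memoized.cache`):

```js
// sketch: a lodash-style memoizer whose Map cache grows without bound
function memoize(func) {
  const cache = new Map();
  const memoized = (...args) => {
    const key = args[0];
    if (!cache.has(key)) {
      cache.set(key, func(...args));
    }
    return cache.get(key);
  };
  memoized.cache = cache; // exposed like lodash's `memoized.cache`
  return memoized;
}

// memoizing by unique id keeps every result alive
const getUser = memoize((userId) => ({ id: userId }));
for (let i = 0; i < 10000; i++) {
  getUser(`user-${i}`);
}
getUser.cache.size; // 10000 entries, and none will ever be evicted
```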
You could configure lodash cache to limit the number of saved values. I would argue that in a frontend application the best limit for a memoize cache is just one value: the latest computed one.
Memoization is used to avoid recomputing expensive things and make rendering faster. But the bottleneck is not recomputing just one thing. Performance issues happen when an application recomputes every expensive operation on every change.
Memoization with a cache containing just the last value allows your application to only do the few expensive computations that are impacted by a change. This should be enough in most cases.
Note: If you have expensive operations that are too slow to be done even once, then memoization is not the right tool to solve that problem anyway.
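A cache of one value is also trivial to implement. Here is a sketch in the spirit of memoize-one (the `memoizeOne` name and the strict-equality comparison mirror that library, but this is not its actual code):

```js
// sketch: remember only the arguments and result of the LAST call
function memoizeOne(func) {
  let lastArgs = null;
  let lastResult;
  return (...args) => {
    const sameArgs =
      lastArgs !== null &&
      lastArgs.length === args.length &&
      lastArgs.every((arg, i) => arg === args[i]);
    if (!sameArgs) {
      lastResult = func(...args);
      lastArgs = args;
    }
    return lastResult;
  };
}

let computations = 0;
const expensiveAdd = memoizeOne((a, b) => {
  computations += 1;
  return a + b;
});

expensiveAdd(1, 2); // 3, computed
expensiveAdd(1, 2); // 3, cached: computations is still 1
expensiveAdd(1, 3); // 4, recomputed because an argument changed
```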
Postmortem: lodash/memoize is no more
The first option to fix the bug is to configure lodash's memoize to match the React, reselect, and memoize-one implementations.
```js
// replace the cache so that only the last value is saved
// (lodash instantiates it with `new memoize.Cache()`, so it must be
// a constructor implementing `has`, `get`, and `set`)
_.memoize.Cache = class {
  has(key) {
    return key === this.key;
  }
  get(key) {
    if (key === this.key) {
      return this.result;
    }
  }
  set(key, result) {
    this.key = key;
    this.result = result;
    return this;
  }
};

// create a resolver that maps all parameters to a key
const keyResolver = (...args) => JSON.stringify(args);

const add = (a, b) => a + b;

// use the resolver in a memoized function
const memoizedAdd = _.memoize(add, keyResolver);
```
While replacing the cache can be done once and for all, the `keyResolver` that uses all parameters as the cache key needs to be added to each new memoized function.
This made me choose a second option: replace the memoize function with another, more straightforward, implementation. The easy part about switching from one memoize implementation to another is that most projects already contain several of them.
I used `defaultMemoize` from reselect as a short-term replacement, and will then either introduce memoize-one or convert the component to be able to use hooks. The other change I'd like to make is adding a linting rule to warn users when they import `lodash/memoize`.
As a more long-term fix for the whole community, we may want to rename the lodash function to something along the lines of `cacheResults(fn, generateKey)`, so that the name better matches the default behavior and doesn't clash with the common memoize implementations.
Oldest comments (6)
There is no perfect memoization library, nor even a barely usable one.
I've got two really big and detailed articles on the subject.
Thanks for your post, Adrien.
Btw, I don't think it is bad that the memoize function just caches on the first key.
Lodash was conceived to work with functional programming patterns, so technically you don't have functions with more than one param. Curried functions just accept one input (in most cases).
Also, if we go back to functional programming principles, you don't need to set a lifetime for your values. If you are working with pure functions, the results will always be the same: same input, same output.
Thank you for doing a write up on this - I was about to use memoize() and thankfully found your article!
You can pass a second function to memoize in order to override the cache key
Great article thanks for writing it! It helped me better understand a few types of caching and how one is different from the others.
Thanks for the post! Although I think this statement is misleading:
Each instance of a memoized function gets its own cache, so when there are no references to the memoized function instance and it's eligible for GC, then so is the cache. I.e. keys and return values are not going to be saved forever unless you keep a reference to the memoized function around forever, similar to anything in JS basically.
`memoize.Cache = Map` is only the cache constructor; it doesn't hold any references to the values or instances.