
You should not use lodash for memoization

Adrien TRAUTH ・5 min read

A few days ago, I tracked down a bug that prevented a React component from ever updating. Debugging showed that the cause was lodash's memoize function.

const lodash = require('lodash');
const add = function(a, b){return a + b};
const memoizedAdd = lodash.memoize(add);
console.log('1 + 1 = 1 + 2', memoizedAdd(1,2) === memoizedAdd(1,1));
This is not really what I expected

In frontend projects we use memoization for different optimizations:

  • Avoid component rendering - React.memo
  • Avoid re-computing internal component state - useMemo
  • Avoid re-computing information derived from the redux state - createSelector from reselect

The goal is always the same: do not redo an expensive computation if inputs are the same as the previous call. It is faster to just return the last computed result directly. More about memoization on Wikipedia

Using memoize in a react app

useMemo, React.memo, and createSelector are usually enough for all your memoization needs. However, hooks don't work in class components. If you still have class components in your codebase, you need a custom memoization function to replicate the functionality of useMemo. One implementation is described in the React docs.

// function component with memoization

const ComponentWithMemo = ({propA, propB}) => {

    const memoizedValue = useMemo(
        () => computeExpensiveValue(propA,propB), 
        [propA, propB]
    );

    return <p>{memoizedValue}</p>
}


//class component with memoization

import memoize from 'memoize-one';

class ComponentWithMemo extends React.Component {
   // Need to define a memoized function in the component
   memoizedCompute = memoize(computeExpensiveValue)

   render() {
       const {propA, propB} = this.props;
       // and call it on render
       const memoizedValue = this.memoizedCompute(propA, propB);
       return <p>{memoizedValue}</p>
   }
}
2 implementations of memoize to avoid re-computing on render

Lodash being very common, using lodash/memoize seems like a good option to implement the pattern without adding (yet) another dependency.

Problem 1: Lodash uses only the first parameter

Here is how the first example is interpreted by lodash internally:

var memoizedAdd = _.memoize(add); // cache = {}
memoizedAdd(1,1) // cache[1] = 2; return 2;
memoizedAdd(1,2) // return cache[1]; <== my bug is here
memoizedAdd(2,1) // cache[2] = 3; return 3;
Step-by-step of the memoized add example

This happens because the memoize function from lodash is only using the first parameter as a cache key by default. So, as long as the same first parameter is passed, the function always returns the same result.
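To make that default concrete, here is a simplified model of a memoize that keys on the first argument. It is a sketch written for illustration, not lodash's actual source: `memoizeFirstArg` is a hypothetical name, and it only mirrors the documented behavior that the first argument is the cache key unless a resolver is supplied.

```javascript
// Simplified model (not lodash's source): the cache key is the first
// argument unless a resolver function is provided.
function memoizeFirstArg(fn, resolver) {
  const cache = new Map();
  return function (...args) {
    const key = resolver ? resolver.apply(this, args) : args[0];
    if (cache.has(key)) {
      return cache.get(key);
    }
    const result = fn.apply(this, args);
    cache.set(key, result);
    return result;
  };
}

const add = (a, b) => a + b;
const memoizedAdd = memoizeFirstArg(add);

const first = memoizedAdd(1, 1);  // cache miss: stores 2 under key 1
const second = memoizedAdd(1, 2); // cache hit on key 1: the second argument is ignored
const third = memoizedAdd(2, 1);  // cache miss: stores 3 under key 2

console.log(first, second, third); // 2 2 3
```

The stale `2` returned for `memoizedAdd(1, 2)` is the same symptom as in the step-by-step trace above.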

On the other hand, memoize-one and the other implementations used in React or reselect recompute the function when any parameter changes, so they always return the right result.
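For comparison, a last-call memoize can be sketched in a few lines. This is a minimal illustration in the spirit of memoize-one, not its actual source: `memoizeLast` is a hypothetical helper, and it assumes arguments are compared one by one with strict equality.

```javascript
// Minimal last-call memoize (hypothetical helper): it remembers only the
// most recent arguments and result, and compares every argument with ===.
function memoizeLast(fn) {
  let lastArgs = null;
  let lastResult;
  return function (...args) {
    const sameArgs =
      lastArgs !== null &&
      lastArgs.length === args.length &&
      args.every((arg, index) => arg === lastArgs[index]);
    if (!sameArgs) {
      lastResult = fn.apply(this, args);
      lastArgs = args;
    }
    return lastResult;
  };
}

let calls = 0;
const add = (a, b) => {
  calls += 1;
  return a + b;
};
const memoizedAdd = memoizeLast(add);

const r1 = memoizedAdd(1, 1); // computed
const r2 = memoizedAdd(1, 1); // same arguments: previous result reused
const r3 = memoizedAdd(1, 2); // second argument changed: recomputed

console.log(r1, r2, r3, calls); // 2 2 3 2
```

Because every argument takes part in the comparison, changing only the second one is enough to trigger a recomputation.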

The problem is not that this lodash behavior is undocumented. In fact, the documentation clearly states that the first parameter is used as the cache key. The root cause of these errors is that this behavior is very different from the other implementations, which often live in the same project and are supposed to provide the same functionality.

Problem 2: You don't need an unlimited cache

While the first difference may lead to visible bugs, this one may affect performance. Such problems are usually hard to detect, but they can have a big impact on the user experience.

const lodash = require('lodash');
const add = function(a, b){return a + b};
const lodashAdd = lodash.memoize(add);

// use the memoized add 1000 times
for(let i = 0; i<1000; i++){
    lodashAdd(i,2);
}

console.log('lodash cache size: ', lodashAdd.cache.size);

Running the memoized function 1000 times with different first arguments saves 1000 results in the cache. Does that mean that memoize is a good cache? Kind of. But this is not what we need from a memoize function.

Lodash uses a Map to cache all function results associated with a key.

// from https://github.com/lodash/lodash/blob/master/memoize.js
memoized.cache = cache.set(key, result) || cache
...
memoize.Cache = Map

This means that ALL keys and return values will be saved (by default) forever.

If you don't have a lot of different keys, you won't see the difference. If you are using unique IDs, this can become problematic. Memory leaks are hard to track as they may only happen in specific use cases like a page that stays open for a long time. Using a cache that by default can create leaks is thus not recommended.

You could configure lodash cache to limit the number of saved values. I would argue that in a frontend application the best limit for a memoize cache is just one value: the latest computed one.
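As an illustration of such a limit, here is a sketch of a bounded cache that evicts its oldest entry when full. `BoundedCache` and its default `maxSize` of 100 are hypothetical choices made for this example; the method set follows the Map-style interface (`clear`, `delete`, `get`, `has`, `set`) that the lodash documentation requires of a replacement cache.

```javascript
// Hypothetical bounded cache exposing the Map-style methods
// clear, delete, get, has, and set. When full, it evicts the oldest entry.
class BoundedCache {
  constructor(maxSize = 100) {
    this.maxSize = maxSize;
    this.entries = new Map(); // a Map iterates its keys in insertion order
  }
  clear() {
    this.entries.clear();
  }
  delete(key) {
    return this.entries.delete(key);
  }
  get(key) {
    return this.entries.get(key);
  }
  has(key) {
    return this.entries.has(key);
  }
  set(key, value) {
    if (!this.entries.has(key) && this.entries.size >= this.maxSize) {
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey);
    }
    this.entries.set(key, value);
    return this;
  }
  get size() {
    return this.entries.size;
  }
}

const cache = new BoundedCache(3);
for (let i = 0; i < 1000; i++) {
  cache.set(i, i + 2);
}

console.log(cache.size);     // 3
console.log(cache.has(0));   // false
console.log(cache.has(999)); // true
```

The constructor has a default size because lodash instantiates the cache without arguments, so a class like this one could be assigned to `_.memoize.Cache` as is.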

Memoization is used to avoid recomputing expensive things and make rendering faster. But the bottleneck is not recomputing just one thing. Performance issues happen when an application recomputes every expensive operation on every change.

Memoization with a cache containing just the last value allows your application to only do the few expensive computations that are impacted by a change. This should be enough in most cases.

Note: If you have expensive operations that are too slow to be done even once, then memoization is not the right tool to solve that problem anyway.

Postmortem: lodash/memoize is no more

The first option to fix the bug is to configure lodash memoize to match the react, reselect, memoize-one... implementations.

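// replace the cache to save one value per memoized function.
// lodash calls `new _.memoize.Cache()`, so this must be a constructor whose
// instances implement the Map methods clear, delete, get, has, and set.
class SingleValueCache {
    clear() {
        this.hasValue = false;
        this.key = this.value = undefined;
    }
    delete(key) {
        const found = this.has(key);
        if (found) this.clear();
        return found;
    }
    has(key) {
        return this.hasValue === true && this.key === key;
    }
    get(key) {
        return this.has(key) ? this.value : undefined;
    }
    set(key, result) {
        this.hasValue = true;
        this.key = key;
        this.value = result;
        return this;
    }
}

_.memoize.Cache = SingleValueCache;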



// create a resolver that maps all parameters to a key
const keyResolver = (...args) => JSON.stringify(args);

const add = (a, b) => a + b;

// use the resolver in a memoized function
const memoizedAdd = _.memoize(add, keyResolver);

While replacing the cache can be done once and for all, the keyResolver to use all parameters as the cache key needs to be added to each new memoized function.
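Before relying on the `keyResolver` above, it may help to check what keys `JSON.stringify` actually produces. The sketch below only uses standard JavaScript; the two caveats it shows are consequences of JSON serialization, not of lodash.

```javascript
// Same resolver as above: serialize every argument into one string key.
const keyResolver = (...args) => JSON.stringify(args);

const keyA = keyResolver(1, 1); // '[1,1]'
const keyB = keyResolver(1, 2); // '[1,2]'
console.log(keyA === keyB);     // false: the second argument now matters

// Caveat: JSON cannot represent undefined, so inside an array it is
// serialized as null and the two calls below share one key.
const keyNull = keyResolver(null);
const keyUndefined = keyResolver(undefined);
console.log(keyNull === keyUndefined); // true: both are '[null]'

// Caveat: objects are keyed by content, not by identity.
const sameContent = keyResolver({ id: 1 }) === keyResolver({ id: 1 });
console.log(sameContent); // true
```

Serializing large arguments on every call also has a cost of its own, which is one more reason to prefer an implementation that compares arguments directly.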

This made me choose a second option: replace the memoize function with another, more straightforward implementation. The easy part of switching from one memoize to another is that most projects already contain several implementations.

I used defaultMemoize from reselect as a short-term replacement, and will later either introduce memoize-one or convert the component so that it can use hooks. The other change I'd like to make is adding a linting rule to warn users when they import lodash/memoize.
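Such a rule could be sketched with ESLint's built-in `no-restricted-imports`. The file name, the `warn` severity, and the message text are assumptions for this example. The rule only inspects `import` statements, so it would not catch `_.memoize` called on a default lodash import.

```javascript
// .eslintrc.js (sketch): warn on both named-import styles of lodash's memoize.
const message =
  'lodash memoize keys on the first argument only and never evicts. ' +
  'Prefer memoize-one, reselect, or useMemo.';

module.exports = {
  rules: {
    'no-restricted-imports': [
      'warn',
      {
        paths: [
          // import memoize from 'lodash/memoize'
          { name: 'lodash/memoize', message },
          // import { memoize } from 'lodash'
          { name: 'lodash', importNames: ['memoize'], message },
        ],
      },
    ],
  },
};
```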

As a longer-term fix for the whole community, we may want to rename the lodash function to something along the lines of cacheResults(fn, generateKey), so that the name better matches the default behavior and does not clash with the common memoize implementations.

Posted on Jul 26 '19 by:

Adrien TRAUTH

@nioufe

Frontend & Log-Management & Monitoring.

Discussion


There is no perfect memoization library, or even a barely usable one.
I've got two really big and detailed articles on the subject.

 

Thanks for your post, Adrien.
Btw, I don't think it is bad that the memoize function only uses the first argument as the cache key.

Lodash was conceived to work with functional programming patterns. So technically, you don't have functions with more than one param. Curried functions accept just one input (in most cases).

Also, if we go back to functional programming principles, you don't need to set a lifetime for your values. If you are working with pure functions, the same input always gives the same output.

 

Thank you for doing a write up on this - I was about to use memoize() and thankfully found your article!