loading...
Cover image for A Beginner's Guide: Memoization

A Beginner's Guide: Memoization

milkstarz profile image malik Updated on ・8 min read

This article was originally posted on malikbrowne.com.

Last week, I was browsing different articles for guidance on the new lifecycle methods in React v16.3. I came across this article that talks about how a lot of developers may be using getDerivedStateFromProps wrong.

If you're not familiar with React, the method simply allows a component to update its internal state, as a result of a change in its props. However, the article recommended to not do something that I would do all the time in my code:

Use getDerivedStateFromProps or componentWillReceiveProps to ensure that a component only performs an expensive calculation for a re-render when the inputs change.

However, an easier and more concise way of performing this can be done with a functional programming technique called memoization.

As a growing programmer interested in performance, I love to come across new functional programming techniques that help speed up the code that I write on a day to day basis. Memoization was something that I had heard other engineers talk about in different algorithm problems. However, I never took the time to see what all the hype was all about - mostly because it sounded really complicated.

In this post, I'm gonna explain what pure functions are, how memoization works, and how YOU can combine them in React components to make your code more performant.

Let's start off by talking about pure functions.

let's get started

What is a pure function?

By definition, a pure function is a function that meets the following criteria:

  1. It is a function that always returns the same result if the same arguments are passed in.
  2. It is a function that does not produce any observable side effects to your application including:
    • Network requests
    • Data mutation
    • Logging to files
    • Change application state
  3. It is a function that only accesses the data you pass into it, making dependencies easy to define.

Something that may help this idea click is an analogy from this article which compares pure functions to a coffee grinder.

A pure function is like a coffee grinder: beans go in, the powder comes out, end of story.

Benefits

There are a few benefits to pure functions - two of them being:

  1. They can lead to more declarative programs which describe how different inputs relate to outputs.
  2. They can increase the testability of your code, and make debugging your code less of a nightmare.

However, it's good to note that side effects, in general, are not bad - which means that we don't have to make every single function pure.

Example of a pure function

Let's say that we have a recursive function that returns the factorial of a number:

const factorial = n => {
     if (n === 1) {
          return n;
      }
    return n * factorial(n - 1)
}

// factorial(4)
// 4! === 4 * 3 * 2 * 1 === 24

If we pass in factorial(4), our calculations would be made and return us the result, 24, every single time.

Since we now know a pure function will return the same value every time, wouldn't it be convenient if our function could remember (or cache) our results? That way the next time someone wants to calculate factorial(100), we could save time and resources and just give them the already stored answer.

That, my friends, is memoization.

What is memoization, really?

By definition,

Memoization is an optimization technique used primarily to speed up programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again.

In layman's terms, this means the function will memorize the solution to a problem if you give it the same question. To achieve a simple solution of memoization, we can implement some type of cache in the form of a map, which our function could then refer to.

Here's what our factorial solution would look like with a memoized function:

// our original factorial function
const factorial = n => {
    if (n === 1) {
         return n;
     }
   return n * factorial(n - 1)
}
// a memoized function used to calculate our factorial
const scopedMemoizedFactorial = () => {
  const fakeCache = {};
  return (value) => {
    if (value in fakeCache) {
      // return the value from our fake cache
      return fakeCache[value];
    }
    else {
      // calculate our factorial
      const result = factorial(value);
      fakeCache[value] = result;
      return result;
    }
  }
}

Things to notice

  • scopedMemoizedFactorial returns a function which is called later. We can do this in JavaScript because functions are first class objects, which means we can use them as higher order functions and return another function.
  • The fakeCache can remember the values because of the closure it's implemented in
  • This only works because the function we're working with is pure, like we talked about before. If it didn't return the same value, our cache wouldn't return the right value for the output!

If you'd like to see an example of a general memoized function, check out this gist which shows a memoization pattern from JavaScript Patterns by Stoyan Stefanov.

Using Memoization in React

For our example, let's pretend we have a third party API that returns back some JSON about all of the users on our application. The data structure looks something like this:

[
    {
        name: "Malik",
        age: 24,
        company: "Meetup",
        // ...and a bunch of other fields like this
    },
    // ...and 996 other entries just like this
]

If you would like to see what the whole data set looks like, check out this link. (Thank you to JSON Generator for this!)

The requirements for our application is to create a search box that will filter through our list of users and return a sorted list of all users whose name matches a query.

The code without memoization would look like this:

class App extends React.PureComponent{
  state = {
    searchValue: ""
  };

  filterList = (list, searchValue) =>
    list.filter(member => member.name.toLowerCase().startsWith(searchValue));

  sortList = list =>
    list.sort((a, b) => {
      if (a.name < b.name) return -1;
      if (a.name > b.name) return 1;
      return 0;
    });

  handleInputChange = searchValue => {
    this.setState({ searchValue, inputChanged: true });
  };

  render() {
    const { searchValue, inputChanged } = this.state;
    const filteredMembers = this.filterList(data, searchValue);
    const members = this.sortList(filteredMembers);

    return (
      <div className="App">
        <h1>No Memoization Example</h1>
        <Search
          searchValue={searchValue}
          onInputChange={e => this.handleInputChange(e.target.value)}
          placeholder="Search for a member"
        />
        <div className="members">
          {members.map(member => {
            return <Member member={member} key={member._id} />;
          })}
        </div>
      </div>
    );
  }
}

Check out the code in action here.

This solution will work perfectly fine in most situations, but with large sets of data the application will slow down a lot.

This happens for two reasons:

  • Filtering large sets of data is an expensive operation
  • Other re-renders of the application will cause the function to call the expensive operation again.

Using the helper memoize-one we can easily add memoization to this example:

import memoize from 'memoize-one';

class App extends React.PureComponent {
  state = {
    searchValue: ""
  };

  filterList = memoize((list, searchValue) =>
    list.filter(member => member.name.toLowerCase().startsWith(searchValue))
  );

  sortList = memoize(list =>
    list.sort((a, b) => {
      if (a.name < b.name) return -1;
      if (a.name > b.name) return 1;
      return 0;
    })
  );

  handleInputChange = searchValue => {
    this.setState({ searchValue });
  };

  render() {
    const { searchValue } = this.state;
    const filteredMembers = this.filterList(data.slice(0, 50), searchValue);
    const members = this.sortList(filteredMembers);

    return (
      <div className="App">
        <h1>With Memoization Example</h1>
        <Search
          searchValue={searchValue}
          onInputChange={e => this.handleInputChange(e.target.value)}
          placeholder="Search for a member"
        />
        <div className="members">
          {members.map(member => {
            return <Member member={member} key={member._id} />;
          })}
        </div>
      </div>
    );
  }
}

memoize-one is great because it only stores the results of the last function call, so you don't have to worry about cache busting issues.

Important notes for performance

The idea of memoization is great and all, but keep in mind the main benefit of memoization: to store the results of expensive function calls.

I took our factorial solution and used the Performance Timeline API to time how long our function calls took (down to the microsecond):

// we use performance.now() to keep track of how long each call takes
const tick = () => performance.now();
const t0 = tick()

optimizedFactorial(5000); // calculated
const t1 = tick();
console.log(`The first call took ${t1 - t0}ms.`);
// The first call took 0.3999999971711077ms.

optimizedFactorial(5000); // cached
const t2 = tick();
console.log(`Our memoized call took ${t2 - t1}ms.`);
// Our memoized call took 2.2000000026309863ms.

optimizedFactorial(4999); // calculated again with different param
const t3 = tick();
console.log(`A call that wasn't stored in our cache took ${t3 - t2}ms.`);
// A call that wasn't stored in our cache took 0.3999999971711077ms

As you can see, on my computer the memoized call took over five times longer to get the same result. This is because, in order for our memoization technique to work, the computer needs to allocate memory for a new variable and instantiate it, which takes a chunk of time respectively before it can perform the calculation.

As a result, we can see that using the memoize technique in this solution would be a premature optimization - and would negatively impact the performance of our application.

Another thing to note is that this solution doesn't handle many pains in relation to "busting" a cache including:

  • Setting a max age or size
  • Exclusions for our cache

Both of these pains can lead to memory leaks in our application, which can be a nightmare to debug. Because of this, a lot of engineers tend to use memoization helpers which have already implemented solutions to the pains to handle those common issues. Some of these include:

In regards to memoization in React, this React blog post covers some of the main constraints. Since they used a similar example, I'll share them below:

  1. In most cases, you’ll want to attach the memoized function to a component instance. This prevents multiple instances of a component from resetting each other’s memoized keys.
  2. You will also want to use a memoization helper with a limited cache size in order to prevent memory leaks over time.
  3. None of the implementations shown in this section will work if props.members is recreated each time the parent component renders. But in most cases, this setup is appropriate.

Conclusion

Memoization is an awesome technique, that if used correctly, can supercharge your applications. Using more functional programming techniques can lead to easier and more predictable code, with high testability.

I highly recommend trying out memoization in one of your application via a package called memoize-one.

If you have any questions about any of the concepts in this article, feel free to leave a question in the comments!

I'm always open to hearing from people in the dev community, so feel free to reach out to me on Twitter as well. Tell me your opinion on using memoization for performance!

See you in the next one.

Posted on Jul 28 '18 by:

milkstarz profile

malik

@milkstarz

your everyday full stack engineer, foodie, and reader of things.

Discussion

markdown guide
 

As a result, we can see that using the memoize technique in this solution would be a premature optimization - and would negatively impact the performance of our application.

While I agree about preventing premature optimization, on recursive functions like your factorial function memoization can store intermediate results, leading to optimized performance even if you don't call the function with the same parameter.

For instance, factorial(4999) could be faster if factorial memoizes all intermediate values from 4999 to 0. Of course, it means a bigger memory cost since the cache will no longer contain 1 value but 5000. But I think the best use case for memoization is on recursive pure functions.

The drawback is that you can't memoize a function from the outside like you did with memoize-one: the function must be implemented with its own cache.

 
 

Reselect is great, and is probably one of the most popular tools used along with Redux.

A cool thing to note is that reselect allows you to pass in custom equality checks, or your own memoization function. I've played with it in the past - it's pretty cool.

 

It took me a while to understand memoization. I wish I'd had this guide!!

 

I did too. If this helps one person struggle less than I did to understand this this post was worth the time. :)

 

Thank you so much. I am new in functional world

 

Thanks for the great post malik. Awesome points about memory leaks!

Very small correction: Lehman's terms should be layman's terms

 

ahhhh, thank you! i made the correction. :)

 
 

No problem, glad you liked it.

 

Ah this is fantastic, great write-up! I've always meant to learn more about this.

 

Haha same here, figured I'd just write a blog post while I learned more about it too! :)