# Making Algorithms Faster with Memoization

### rileyflynn Nov 28

This post is in response to this post by Ivan.

I really enjoyed working on that problem, and I ended up arriving at a solution that wasn't discussed in that blog post. In this post, I will discuss how I arrived at my solution and how memoization can be used to make an algorithm much faster. Hopefully it will also serve as a relatively simple introduction to memoization for anyone who is new to the concept.

Let's start with a recap of the problem:

This problem was asked by Uber.

Given an array of integers, return a new array such that each element at index i of the new array is the product of all the numbers in the original array except the one at i.

For example, if our input was [1, 2, 3, 4, 5], the expected output would be [120, 60, 40, 30, 24]. If our input was [3, 2, 1], the expected output would be [2, 3, 6].

Follow-up: what if you can't use division?

The most obvious solution is to just loop through all the elements in the array, and for each element, loop through the array *again* and multiply all the other elements except for the current one. Something like this:

(There are some fancy functional array methods I could use here instead, but this is the basic idea)

```
let inputArray = [1, 2, 3, 4, 5];
let resultArray = [];
inputArray.forEach((item, index) => {
let total = 1;
inputArray.forEach((innerItem, innerIndex) => {
// make sure to compare by index because we cannot guarantee that every item is unique!
if (index !== innerIndex) {
total = innerItem * total;
}
});
resultArray.push(total);
});
console.log(resultArray) // output: [120, 60, 40, 30, 24]
```

Now, this isn't a great because we have a nested for-loop. In this case, it means our algorithm will have a time complexity of O(n²). If we have an array of length 5, we end up having to do 25 (5²) operations. That's not a big deal for small arrays, but what if the array is much larger? If our array has 1,000 items, we'd have to do 1 million operations. If our array has 10,000 items, we'd have to do 100 million operations. Not good. Just to drive the point home, here's a graph of time complexity, if you're not familiar. Note the difference between O(n) and O(n²):

A much better solution would be to just multiply all the items in the array together a single time, store the total product in a variable, and then for each item in the array, just divide the total product by that item:

```
let inputArray = [1, 2, 3, 4, 5];
let total = 1;
inputArray.forEach((item) => {
total = total * item
});
let resultArray = [];
inputArray.forEach((item) => {
resultArray.push(total/item);
});
console.log(resultArray) // output: [120, 60, 40, 30, 24]
```

Hopefully it makes sense why this is the case. (e.g. 1*2*3*4*5/3 is the same as 1*2*4*5 ).

With the above solution we've successfully removed the nested loop, and our algorithm is no longer in O(n²) time.

Now Uber is going to throw us for a loop. As a follow-up, they ask for a solution where you can't use division. Hmmm, our great solution doesn't work anymore. However, we can just use our initial solution, right? ...I guess so, but as we discussed above, it's not really a great solution because it's in O(n²) time. If you're ever doing a coding problem in an interview for a company like Uber, Google, or Facebook, and they ask you a question where the obvious answer is in O(n²) time, you can be pretty confident that there's a better solution you should be looking for.

Here's where memoization comes in. What is memoization? Here's the wikipedia definition:

In computing, memoization or memoisation is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again.

Basically if there's any information that we need to calculate more than once, we can store it in a variable the first time, and then just refer to that variable later rather than calculating again.

How does this apply to the problem at hand? Well, let's consider that in order to get our result for a particular item i in the array, we can take the product of all the numbers *preceding* i, and multiply it by the product of all the numbers *following* i.

So let's say we're calculating the result for the third item in our array [1, 2, 3, 4, 5]. We can think about this as (1*2) * (4*5). Remember our second algorithm, where we started off by calculating the total product of *all* the items in the array? Since we're already calculating that result step by step, why don't we just store the result at each step of the sequence so that we can refer to it later, rather than recalculating.

```
let inputArray = [1, 2, 3, 4, 5];
let total = 1;
let cache = [];
inputArray.forEach((item, index) => {
total = item * total
cache[index] = total
});
console.log(cache) // output: [1, 2, 6, 24, 120]
// note that 1 = 1, 1*2 = 2, 1*2*3 = 6, 1*2*3*4 = 24, 1*2*3*4*5 = 120
```

Great, we've successfully stored our values. However, you might notice a problem here. The above code doesn't give us the product of the numbers *following* our current item i. So, if we're looking at the 3rd item in the inputArray, we need (1*2) * (4*5). Our cache gives us the result of 1*2, but it doesn't give us the result of (4*5). To solve this, as we loop through our inputArray, we also need to calculate the total product in *reverse* order. This way, we can have a complementary cache showing 5, 5*4, 5*4*3, etc. This is a bit more work for our algorithm, but not significantly so.

```
let inputArray = [1, 2, 3, 4, 5];
let frontTotal = 1;
let backTotal = 1;
let frontCache = [];
let backCache = [];
inputArray.forEach((item, index) => {
// calculate the total starting from the beginning of the array
frontTotal = item * frontTotal;
frontCache[index] = frontTotal;
// calculate the total starting from the end of the array
backTotal = inputArray[inputArray.length - 1 - index] * backTotal;
backCache[inputArray.length - 1 - index] = backTotal;
});
console.log(frontCache) // output: [1, 2, 6, 24, 120]
console.log(backCache) // output: [120, 120, 60, 20, 5]
```

You could also store the cached values in an object, or in a two-dimensional array, or something else, but exactly how your store the cached values isn't super important here. What is important is that you can retrieve them in an orderly way later.

Now we're all set up to finish the problem. All we have to do is loop through our input array, and for each item i, multiply the product of all numbers before i (from frontCache) by the product of all numbers after i (from backCache) We'll use JavaScript's map function to call a simple multiplication function for each item in the array. The only caveat here is that for the very first item in the array, we only consult the backCache, and for the very last item in the array, we only consult the frontCache.

```
// our code from before...
let inputArray = [1, 2, 3, 4, 5];
let frontTotal = 1;
let backTotal = 1;
let frontCache = [];
let backCache = [];
inputArray.forEach((item, index) => {
frontTotal = item * frontTotal;
frontCache[index] = frontTotal;
backTotal = inputArray[inputArray.length - index - 1] * backTotal;
backCache[inputArray.length - index - 1] = backTotal;
})
// new code...
let resultArray = inputArray.map((item, index) => {
if (index == 0 ) {
return backCache[index + 1];
} else if (index === inputArray.length - 1) {
return frontCache[index - 1];
} else {
return frontCache[index - 1] * backCache[index + 1];
}
});
console.log(resultArray) // output: [120, 60, 40, 30, 24]
```

All we're doing here is taking the value from the frontCache at the position *before* the current index and multiplying it by the value of the backCache at the position *after* the current index. As I stated above, it's not super important how you're storing and accessing these values, what's important is that you're not evaluating array[0]*array[1]...*array[n] etc. over and over for each and every item in the array.

Let's pull it all together and see how much memoization improves our performance:

```
// build a large array with random numbers 1-5
let inputArray = [];
let j = 1;
for (i = 1; i <= 10000; i++) {
inputArray[i-1] = Math.floor(Math.random() * (5)) + 1;
}
const slowApproach = (inputArray) => {
let resultArray = [];
inputArray.forEach((item, index) => {
let total = 1;
inputArray.forEach((innerItem, innerIndex) => {
if (index !== innerIndex) {
total = innerItem * total;
}
});
resultArray.push(total);
});
return resultArray;
};
// faster, with memoization
const fastApproach = (inputArray) => {
let frontTotal = 1;
let backTotal = 1;
let frontCache = [];
let backCache = [];
inputArray.forEach((item, index) => {
frontTotal = item * frontTotal;
frontCache[index] = frontTotal;
backTotal = inputArray[inputArray.length - 1 - index] * backTotal;
backCache[inputArray.length - 1 - index] = backTotal;
})
let resultArray = inputArray.map((item, index) => {
if (index == 0 ) {
return backCache[index + 1];
} else if (index === inputArray.length - 1) {
return frontCache[index- 1];
} else {
return frontCache[index- 1] * backCache[index + 1];
}
});
return resultArray
}
// compare performance
let t0 = performance.now();
slowApproach(inputArray);
let t1 = performance.now();
console.log("Call to slowApproach " + (t1 - t0) + " milliseconds.")
let t2 = performance.now();
fastApproach(inputArray);
let t3 = performance.now();
console.log("Call to fastApproach took " + (t3 - t2) + " milliseconds.")
```

So, on my computer, with an array of 10,000 items, the slow approach takes ~1500 milliseconds while the fast approach with memoization takes ~9 milliseconds.

Even better, for an array of 100,000 items, the slow approach takes ~90,000 milliseconds (1.5 minutes!) while the fast approach takes ~60 milliseconds (~1/20th of a second). Not bad!

And of course, as your array size increases, that gap will only widen.

Let me know if you have any comments on this. I'm sure there are ways to optimize this even further.

Thanks, for reading! You can find me on twitter here

(open source and free forever ❤️)

Hello rileyflynn,

What about skipping the calculation for both

`frontCache[lastKey]`

and`backCache[0]`

as those result are never used. It should be slightly faster:Quick point: This isn't memoization, or at least it's not of a form that I know of. Memoization is where you compute a result, store it, and reuse it when requested for that result again. As you use each of the intermediate results exactly once, this isn't memoization. It is of course quite a clever transformation of a problem from something that is naively O(n

^{2}) to O(n), and on a technical level it likely introduces quite a lot of SIMD parallelism as well - but it's not memoization.Thanks for reading and commenting. I was definitely thinking exact same thing as I was writing the post, and I was somewhat hesitant to use the term "memoization" here. The code does seem to be slightly outside the exact definition of memoization, insofar as I'm not caching the results of a pure function for given inputs. However, I do think what's happening here is still fundamentally the same—speeding up a program by caching the results of expensive operations, and then using those later rather than calculating the same value again. I suppose due to how I was approaching the problem, with the total product being calculated incrementally across a for-loop, the cache also had to be calculated incrementally, so it didn't really lend itself to a more functional approach, where you would call a function with a single input, and cache the result for that input.

I think we could meet the exact formal definition of memoization by doing essentially the same thing, but doing it recursively?:

Then later in the problem, when we're at, for example, myArray[4] we could call something like:

getTotalProduct(myArray, 3) * getReverseTotalProduct(myArray, 5)

and we would just get values from our cache, because we've already calculated results for those particular inputs...

This isn't about a "formal definition" of memoization, or indeed caching, as such. The basic misunderstanding is that memoisation cannot be applied to a problem where each request is unique. Your second version isn't memoization - it could be counted as pre-computing your arrays, but that's it. Fundamentally, you still only read each element of the cache once.

As I said before, the speedup here is from the way the problem has been transformed to being O(n) rather than O(n

^{2}). But for this particular problem, memoization isn't useful, because each answer is required exactly once. Because each query is unique, you cannot memoize.Memoization might be of use in a slightly different problem specification, where there's a user of the system who asks for different indicies and you cannot precompute the entire array; in that case, as the user may request the same index multiple times memoization is viable. But for this one, there just isn't anything that can be memoized.

Not sure I follow, to be honest. In the problem, we need to calculate, for example, the product of 1*2*3*4 two separate times, meaning that that particular result is required exactly twice. The second time, we don't have to recalculate it because we've already cached it. Each request is not unique—it happens twice.

It's true that we only read from the cache once per item, but does it make a difference whether we're reading each element from the cache once or 500 times?

Well, yes, I think it kind of does, because it governs what's an appropriate technique. Certainly in the case of the forward direction, all you've actually implemented is an accumulator, and in a particularly inefficient way as far as memory is concerned. And perhaps I got a bit muddled because you only presented the forward direction - the backwards direction probably does require memoisation (Edit: No, it doesn't).

The real point to this interview question, I believe, is that the most efficient solution requires different approaches for the forward and backward directions.

OK, I've had a proper chance to think about this (when I'm not posting from work/on the move), and Memoization is definitely not an ideal optimisation technique here. The problem is that what you really want is to not have to recompute things all the time, and so you want to have partial results. However, because of the form of these partial results, there's no requirement to actually store them in either a precomputed array or by memoization. It's trivial to use a pair of accumulators, as in the following code (Python, because that's what I normally use)

This version is likely to be as fast as you can get the algorithm, is pretty simple, and uses the bare minimum of additional memory (for clarity the two accummulators are seperate; they can be combined, meaning that the overhead needed for this algorithm is a single integer + the output array).

If you don't object, I might write this up into an object lesson on how to tell if the optimizations people choose to apply are actually appropriate optimizations. As I was saying, if you're only reading your memoized value once, it's unlikely to be the correct technique.

Thanks for all your insights, David. This is a really awesome solution. It didn't quite occur to me that I could have two separate accumulator loops (forward and reverse), and then combine the results like this.

You're definitely welcome to write this up in a post. That would be super interesting.

Awesome! I'll try the same exercise in other languages to compare times.

You're really just trading memory usage for speed. Classic example of where you can choose between fast, simple and low memory usage. Pick two. 😁