DEV Community

Robin Lejeune

Iterating on an array: should you loop with .map, .forEach or for?

TL;DR? Check at least the sections "Time for the comparisons!" and "But wait! There's more" ;)

Table of Contents

1. The Problem
2. How Do We Tackle That?
3. My First Thoughts
4. Improve on The First Try
5. Comparison Between .forEach(), .map() and for
6. Time For The Comparisons!
7. But Wait! There's More
8. Conclusion


I was recently asked to complete a coding exercise before a job interview. Out of the 3 questions, one felt particularly easy, as I coded the answer in 5 minutes. But then... Then a problem occurred.

The Problem

My code passed 13 out of 15 tests. For the last two, it would not produce an output in the allotted time. It was not optimized enough (it wasn't optimized at all, if you ask me). Unfortunately, I only had one minute left, and I couldn't find a proper solution in time. But afterwards I thought some more, imagined a few possible optimized solutions, and ran a few tests of my own to verify them.

Here is the exercise: from a given array filled with numbers, you should return all the possible combinations whose sum matches a given target (but return each possible combination only once).
For instance, from arr = [1, 5, 2, 4, 4] with target = 6, you should return [1, 5], [2, 4] (or '4,2', '5,1': the presentation and the order don't matter). Easy enough.
Of course, said array could be much bigger (say 20000 elements). This is when optimization kicks in.


How Do We Tackle That?

What do we have to solve that kind of problem?

  • Array.map() and Array.flatMap(), which take a callback and create a new array from the existing one. (.flatMap() returns, as its name suggests, a flattened array.)
  • Array.forEach(), which also takes a callback, loops through the array, and runs whatever you ask it to run (say, .push() the result in another empty array).
  • Or a plain, basic for loop. In the end, for what we are after, it works a bit like a .forEach(): you first create an empty array, then push whatever the loop returns.
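
As a quick illustration (my own minimal example, not part of the exercise), here are the three approaches building the same doubled array:

```javascript
const nums = [1, 2, 3];

// .map(): builds a new array from the callback's return values
const withMap = nums.map(n => n * 2);

// .forEach(): runs the callback for its side effects, so we push manually
const withForEach = [];
nums.forEach(n => withForEach.push(n * 2));

// for: same idea, with explicit index bookkeeping
const withFor = [];
for (let i = 0; i < nums.length; i++) {
    withFor.push(nums[i] * 2);
}

console.log(withMap, withForEach, withFor); // [2, 4, 6] three times
```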

From here, we don't have so many options. We have one array, where each element must be associated with every other element exactly once if the sum is correct. What this means is:

  • We need a new array
  • We need to loop over the array, to get each element
  • And we need to loop a second time, to get the second element

A loop inside a loop, that's never ideal. But at this point I couldn't find a better solution.


There is actually a way to keep a linear-time method
Thanks to Vincent A. Cicirello in the comments, who pointed this out to me.

If you:

  • generate a Set A out of the initial array,
  • then iterate over each value and store y = target - value in another array or Set B,
  • then iterate over B and compare each y to the values in A, and store the existing pairs [target - y, y]
  • and finally check whether the initial array contains target / 2 at least twice (because if your target is 6, [3, 3] is a valid answer; but if you only have one 3 in the array, you should remove [3, 3] from the pairs)

Your function takes far less time. That approach is better and smarter in every respect than what I develop below.
I won't linger on it in this article, though, because I want to compare .forEach(), .map() and for in a similar environment, which here means quadratic time.



My First Thoughts

Ok, we know where we're heading.

As we are dealing with an array, my first thought was to use a .forEach(). (Don't ask me why I didn't start with a .flatMap(), which seemed called for since we are basically returning a new array. My excuse is that I was stressed.)

Here's my code, commented for the beginners among you:

function combine(arr, target) {
    // I initiate an empty array
    const pairs = []

    // I loop once over arr
    // I keep in memory the element (e) and its index (i)
    arr.forEach((e, i) => {
        // I loop a second time over arr, 
        // and similarly keep in memory the element `el` and its index `j`
        arr.forEach((el, j) => {
            // I compare `i` and `j` to make sure I'm not 
            // combining the same element with itself,
            // and I check that the sum meets the target
            if (i !== j && e + el === target) {
                // By using a ternary operator to assign a value,
                // I make sure each pair is presented 
                // with the smallest number first
                const pair =
                    e <= el
                        ? [e, el]
                        : [el, e]
                // Two similar arrays [1, 2] are two distinct objects.
                // To compare them, I use a trick with `JSON.stringify()`,
                // because... Well, because strings are comparable.
                pairs.push(JSON.stringify(pair))
            }
        })
    })

    // Finally, I return a set from the array,
    // to make sure I have each combination only once
    return new Set(pairs)
}

Improve on The First Try

You might see a few problems with my function already.

For instance, why stringify the combined array? We could just build a string directly. Remember: the form doesn't matter. That's a good guess: removing JSON.stringify() removes a function call, so less computing time. It is not the crucial point, though.

The true problem here is that I loop over elements which have already been used up. Take [1, 2, 3]: once I have returned [1, 2], I don't need to return [2, 1]. Instead of looping over the whole array the second time, I can start at i + 1, the index of the current element plus one. Let's use .slice(i + 1) to do it.
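
To make the slicing concrete, here is (in a toy example of my own) what each element gets paired with:

```javascript
const sample = [1, 2, 3];
const seen = [];

// With .slice(i + 1), each element only "sees" the elements after it,
// so every unordered pair is produced exactly once
sample.forEach((e, i) => {
    seen.push([e, sample.slice(i + 1)]);
});

console.log(seen);
// [ [ 1, [ 2, 3 ] ], [ 2, [ 3 ] ], [ 3, [] ] ]
```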

One could even argue that the Set is now useless. That's true if all the elements in the array are different, but not if you have duplicates. So let's keep it.
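
Why the Set still matters with duplicates, in a small sketch of my own:

```javascript
const dupes = [3, 3, 3];
const pairs = [];

// Even with .slice(i + 1), duplicate values still produce the same
// pair several times
dupes.forEach((e, i) => {
    dupes.slice(i + 1).forEach(el => {
        if (e + el === 6) pairs.push(`${e}, ${el}`);
    });
});

console.log(pairs);               // [ '3, 3', '3, 3', '3, 3' ]
console.log(new Set(pairs).size); // 1
```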

Here's the better version:

function combine(arr, target) {
    const pairs = []

    arr.forEach((e, i) => {
        arr.slice(i + 1).forEach(el => {
            if (e + el === target) {
                const pair =
                    e <= el
                        ? `${e}, ${el}`
                        : `${el}, ${e}`
                pairs.push(pair)
            }
        })
    })

    return new Set(pairs)
}

Comparison Between .forEach(), .map() and for

That's better, but it wasn't good enough to pass the last 2 tests. I didn't have time to do better, but after the timer stopped, I ran some checks.

So what do you think? Which one of .forEach(), .map() or for is the fastest, and by how much?

Let's try. We will run test1(), test2() and test3() to see the difference.

First, we initiate our array:

// An array of 10000 elements, each corresponding to its index
const arr = Array.from({length: 10000}, (e,i) => i)
// Let's double the array for fun (and to create duplicates)
const pirate = [...arr, ...arr]
const target = 10000
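Before timing anything, a quick sanity check of my own on what any correct combine() should find with this input:

```javascript
// Same setup as above
const arr = Array.from({ length: 10000 }, (e, i) => i);
const pirate = [...arr, ...arr];
const target = 10000;

// Pairs [a, b] with a <= b and a + b === 10000: a runs from 1
// (b = 9999) to 5000 (b = 5000, valid only because the doubled
// array contains two 5000s)
const expected = new Set();
for (let a = 1; a <= target / 2; a++) {
    if (pirate.includes(a) && pirate.includes(target - a)) {
        expected.add(`${a}, ${target - a}`);
    }
}

console.log(expected.size); // 5000
```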

The first test will check the time taken for .forEach().

function test1(arr, target) {
    let startTime = new Date()

    function combine(arr, target) {
        const pairs = []

        arr.forEach((e, i) => {
            arr.slice(i + 1).forEach(el => {
                if (e + el === target) {
                    const pair =
                        e <= el
                            ? `${e}, ${el}`
                            : `${el}, ${e}`
                    pairs.push(pair)
                }
            })
        })

        return new Set(pairs)
    }

    combine(arr, target)
    return new Date() - startTime
}

Then, we'll check the time taken by .map().

function test2(arr, target) {
    let startTime = new Date()

    function combine(arr, target) {
        const pairs = arr.flatMap((e, i) =>
            arr
                .slice(i + 1)
                .map(el => {
                    if (e + el === target) {
                        return e <= el
                            ? `${e}, ${el}`
                            : `${el}, ${e}`
                    }
                })
                .filter(e => e)
        )

        return new Set(pairs)
    }

    combine(arr, target)
    return new Date() - startTime
}

Finally, we'll test the simple for loop.

function test3(arr, target) {
    let startTime = new Date()

    function combine(arr, target) {
        let pairs = []

        for (let i = 0; i < arr.length; i++) {
            for (let j = i + 1; j < arr.length; j++) {
                if (arr[i] + arr[j] === target) {            
                    const pair = arr[i] <= arr[j]
                        ? `${arr[i]}, ${arr[j]}`
                        : `${arr[j]}, ${arr[i]}`

                    pairs.push(pair)
                }
            }
        }

        return new Set(pairs)
    }

    combine(arr, target)
    return new Date() - startTime
}

Time For The Comparisons!

It's time for the results. But first: what are you expecting? What would your podium be?

To smooth out outliers, I ran each test 10 times and calculated the average duration (in ms). Which gives us:
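
The averaging itself is just a small wrapper (my own helper, not from the exercise):

```javascript
// Run a test function several times and average the returned durations
function average(testFn, arr, target, runs = 10) {
    let total = 0;
    for (let i = 0; i < runs; i++) {
        total += testFn(arr, target);
    }
    return total / runs;
}

// e.g. average(test1, pirate, target) with the tests defined above
```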

.forEach(): 3212.9 ms
.map():     16618.1 ms
for:        787.2 ms

So it seems we have a winner. A really, really clear winner.

Are you as astonished as I was?

I mean, OK, for a seasoned developer, the victory of the for loop won't be much of a surprise. It is well documented after all. But to this extent? Boy, was I not expecting that!

It turns out the for loop is more than 4 times faster than .forEach() and 21 times faster than .map()!

In fact, the margin is so big that even when I test the solutions without slicing the array at the current element in the second loop, the for loops remain faster than the other options. In the end, from fastest to slowest:

  1. for (with j = i + 1)
  2. for (with j = 0)
  3. .forEach() with .slice()
  4. .forEach() without .slice()
  5. .map() with .slice()
  6. .map() without .slice()

Update on the linear time solution
It doesn't make much sense to compare this to the previous results, but just for fun, how long does a truly optimized function take?
Let's code the method suggested by Vincent A. Cicirello in the comments:
function combine(arr, target) {
    const set = new Set(arr)
    let pairs = new Set()
    let potentialPairings = new Set()

    set.forEach(e => potentialPairings.add(target - e))

    potentialPairings.forEach(pairing => {
        if (set.has(pairing)) {
            const pair =  pairing > (target - pairing) 
                    ? `${target - pairing}, ${pairing}` 
                    : `${pairing}, ${target - pairing}`
            pairs.add(pair)
        }
    })

    if (arr.filter(n => n === (target / 2)).length < 2) pairs.delete((`${target / 2}, ${target / 2}`))

    return pairs
}

With 10000 items duplicated, this takes a whopping 5 ms to compute.
And what do we learn from this, kids? That you should really, reaaaally put some effort into keeping your function linear-time. Don't follow my lead.


But Wait! There's More

What more could there be to say? We have a podium, and this article is already too long!

Well, as I was writing this, I found out that the results are very, very different if you remove the target. Basically, instead of returning every pair that matches a target, you return every single possible pair. That obviously takes a lot more time to compute.

Because I like my laptop, I tested with an array of 3000 instead of 10000. Here are the results:

.forEach(): 9147.1 ms
.map():     9097.4 ms
for:        8892.2 ms

The results are much tighter. for remains the winner, but in this case all three options are fairly close, and .map() is consistently, although slightly, faster than .forEach().


Conclusion

Two things are certain:

  1. If you can find a way, remove those nested loops.
  2. for is your go-to if you need an optimization boost and you cannot reduce the number of nested loops. Additionally, the fewer matches it finds, the faster for gets, and the bigger the gap between for and the rest. So a very specific condition at the start of the second loop will trim the computing time.
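
That last point can be sketched like this; the guard assumes all values are non-negative (my assumption, not part of the original exercise):

```javascript
function combineTrimmed(arr, target) {
    const pairs = new Set();

    for (let i = 0; i < arr.length; i++) {
        // If arr[i] alone already exceeds the target, no arr[j] can
        // complete a pair (assuming non-negative values): skip the
        // whole inner loop
        if (arr[i] > target) continue;
        for (let j = i + 1; j < arr.length; j++) {
            if (arr[i] + arr[j] === target) {
                pairs.add(
                    arr[i] <= arr[j]
                        ? `${arr[i]}, ${arr[j]}`
                        : `${arr[j]}, ${arr[i]}`
                );
            }
        }
    }

    return pairs;
}

console.log(combineTrimmed([1, 5, 2, 4, 4, 12], 6)); // Set(2) { '1, 5', '2, 4' }
```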

For the rest, to be honest, I'm still trying to figure out where the huge differences in results come from: both the difference between the 3 options in the first tests and the difference between the first tests and the second ones. If you have an answer or an assumption, please leave a comment, I'll take it! :)

Top comments (3)

Vincent A. Cicirello

There's another potential optimization that may have a meaningful impact. If there can be a large number of duplicates then your pairs may grow large creating unnecessary overhead. Instead of forming the Set at the end, try initializing an empty Set before your loops, and then instead of adding to pairs, add directly to the set.

Yet another optimization, even more substantial, would be to generate a Set of the numbers in the original array so that you condense it to the unique integers that it contains. Then iterate over the unique integers x in that Set, compute y=target-x, and check if the Set contains y. A lookup in a Set is constant time. In this way, you'd need 1 linear pass over the original array, followed by 1 linear pass over the unique integers. You'd end up with a linear time solution instead of your current quadratic time solution.

Robin Lejeune • Edited

Oh thanks!
Your first suggestion is painfully obvious, I'm a bit ashamed I did not see it. ^^' Although after a closer look, with an array of 20000 (10000 duplicated), it's actually sliiiightly slower (~ +50 ms, whatever the method). I also tried with an array of 10 replicated 2000 times, with the same result. Maybe it starts having effects well beyond those measures? [Edit: nope, doesn't seem to be the case. With 200,000 items and a for loop, it's several seconds slower to initiate pairs as a Set.]

Anyway, I really didn't think about your second suggestion, although I tried to find a way to avoid looping inside a loop. For the simple reason that you might have one number twice, which combination is valid ([3, 3] to get 6), so a Set didn't seem to be the solution. But that was silly of me, as there's a simple way to check for this case. With your idea I made this function and tested it:

function combine(arr, target) {
    const set = new Set(arr)
    let pairs = new Set()
    let potentialPairings = new Set()

    set.forEach(e => potentialPairings.add(target - e))

    potentialPairings.forEach(pairing => {
        if (set.has(pairing)) {
            const pair = pairing > (target - pairing) ?  `${target - pairing}, ${pairing}` : `${pairing}, ${target - pairing}`
            pairs.add(pair)
        }
    })

    if (arr.filter(n => n === (target / 2)).length < 2) pairs.delete((`${target / 2}, ${target / 2}`))

    console.log(pairs.size)
}

And this takes a whopping 5 ms to run (with 10000 items duplicated).
I think I'm going to edit my article, thanks. :)

Vincent A. Cicirello • Edited

You're welcome. I'm guessing that if your original solution got all but a couple of their tests due to time, that they probably had at least one test large enough that you needed to have a linear time solution like this one. They probably weren't concerned with the for vs forEach.