Solving Puzzles With High-Performance JavaScript

Andrew Healey ・7 min read

Premature optimization is the root of all evil. It's also the root of this article.

I like programming puzzles. I also like to go fast. We're going to take some LeetCode problems and solve them a few times, first improving runtime complexity in broad strokes and then looking for minor optimizations. We're after these wonderful words:

faster than 100.00% of JavaScript online submissions

The environment we're targeting is Node.js 10.15.0 with --harmony (source). The online judge system uses relatively small inputs for test cases as far as I can tell.

First problem

771. Jewels and Stones ~ You're given strings J representing the types of stones that are jewels, and S representing the stones you have. Each character in S is a type of stone you have. You want to know how many of the stones you have are also jewels.

A naive solution here is to loop through our stones, looping through the jewels for every stone. We'll be using standard for loops in this article as they are generally the fastest way of iterating data in JavaScript.

var numJewelsInStones = function(J, S) {
    let myJewels = 0;
    // Jewels
    for (var i = 0; i < J.length; i++) {
        // Stones
        for (var j = 0; j < S.length; j++) { // Nested!
            if (J[i] === S[j]) {
                myJewels++;
            }
        }
    }
    return myJewels;
};

The runtime is quadratic, O(N^2). Their online judge won't actually accept this solution! We get a big fat Time Limit Exceeded. Lesson? Nested for-loops should be avoided where possible.

Let's grab a Set to get rid of one of the loops, reducing our runtime down to linear, O(N). Looking up a value in a JavaScript Set is constant time on average, O(1).

var numJewelsInStones = function(J, S) {
    const jewels = new Set(J); // Set accepts an iterable object
    let myJewels = 0;
    for (var i = 0; i < S.length; i++) {
        if (jewels.has(S[i])) {
            myJewels++;
        }
    }
    return myJewels;
};

For this effort, we're rewarded with faster than 97.84%. I'm happy with this code. It's efficient and readable. If I needed drastically better performance, I might reach for a different technology than JavaScript. We have to walk the length of both strings at least once and there's no getting around that. We can't beat O(N) but we can make optimizations.

The stones and jewels are defined as letters. So a-z and A-Z. This means there are just 52 different buckets our values can fall into! We can use a boolean array instead of a Set. To convert an alphabetical letter into a number, we'll use its ASCII code point via charCodeAt. We'll set an index to true to represent a jewel.

However, there aren't boolean arrays in JavaScript. We could use a standard array and initialize it to length 52. Or we could use Int8Array and allow the compiler to make additional optimizations. The typed array was ~6% faster when benchmarked with a range 0-52 of random characters entered as J and S.

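Did you spot that our length is wrong? This is something I forgot as I was testing. Uppercase and lowercase letters are not contiguous in ASCII: there are six other characters between Z and a on the code chart, so the array needs at least 58 slots. The code below allocates 59, which leaves one to spare.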
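To see why 52 slots are not enough, it can help to print the index that each boundary letter maps to once 65 (the char code of 'A') is subtracted. This is only an illustrative sketch; the helper name `slot` is made up for it and is not part of the solution.

```javascript
// Hypothetical helper: the array index a letter maps to after subtracting
// the char code of 'A' (65).
function slot(letter) {
    return letter.charCodeAt(0) - 65;
}

console.log(slot('A')); // 0
console.log(slot('Z')); // 25
console.log(slot('a')); // 32
console.log(slot('z')); // 57
```

The largest index is 57, so any typed array of length 58 or more has room for every letter. The slots from 26 to 31 are simply never used.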

ASCII table

var numJewelsInStones = function(J, S) {
    const jewels = new Int8Array(59);
    for (var i = 0; i < J.length; i++) {
        jewels[J.charCodeAt(i)-65] = 1;
    }
    let myJewels = 0;
    for (var i = 0; i < S.length; i++) {
        if (jewels[S.charCodeAt(i)-65] === 1) {
            myJewels++;
        }
    }
    return myJewels;
};

Et voilà, our 100% fastest submission. In my tests, this was actually twice as fast as the Set version. Other optimizations I skipped testing were caching lengths, using a while loop instead of a for loop, and placing the increment operator before the number (++myJewels vs myJewels++).
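For reference, here is roughly what those untested variants could look like when combined. This is a sketch only: the function name `numJewelsInStonesVariant` is made up, and whether any of these changes is measurably faster is exactly what the article did not test.

```javascript
// Sketch of the untested variants: cached lengths, while loops, prefix increment.
// The behavior is intended to match the Int8Array solution above.
var numJewelsInStonesVariant = function(J, S) {
    const jewels = new Int8Array(59);
    const jLength = J.length; // cached length
    let i = 0;
    while (i < jLength) {
        jewels[J.charCodeAt(i) - 65] = 1;
        ++i;
    }
    const sLength = S.length; // cached length
    let myJewels = 0;
    i = 0;
    while (i < sLength) {
        if (jewels[S.charCodeAt(i) - 65] === 1) {
            ++myJewels;
        }
        ++i;
    }
    return myJewels;
};

console.log(numJewelsInStonesVariant('aA', 'aAAbbbb')); // 3
```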

Second problem

345. Reverse Vowels of a String ~ Write a function that takes a string as input and reverse only the vowels of a string.

A naive solution for this might be to loop through the string twice, replacing on the second loop. Let's try that out first.

var reverseVowels = function(s) {
    const vowels = new Set(['a','e','i','o','u', 'A', 'E', 'I', 'O', 'U']);
    const reversed = [];
    let vowelsFound = [];
    // Find any vowels
    for (var i = 0; i < s.length; i++) {
        if (vowels.has(s[i])) {
            vowelsFound.push(s[i]);
        }   
    }
    // Build the final string
    for (var i = 0; i < s.length; i++) {
        if (vowels.has(s[i])) {
            reversed.push(vowelsFound.pop());
        } else {
            reversed.push(s[i]);
        }
    }
    return reversed.join('');
};

This nets us faster than 97.00%. The runtime is linear, O(2N) -> O(N), and it reads well, but I can't help but think we're looping the string one more time than we have to. Let's try a two-pointer approach: walking in, step-by-step, from the front and back at the same time, swapping any vowels we see. If there's a middle vowel we just leave it.

var reverseVowels = function(s) {
    const vowels = new Set(['a','e','i','o','u', 'A', 'E', 'I', 'O', 'U']);
    s = s.split('');
    let front = 0;
    let back = s.length - 1;
    while (front < back) {
        if (!vowels.has(s[front])) {
            front++;
            continue;
        }
        if (!vowels.has(s[back])) {
            back--;
            continue;
        }
        let temp = s[front];
        s[front] = s[back];
        s[back] = temp;
        front++;
        back--;
    }
    return s.join('');
};

We've removed a full iteration! This gets us faster than 98.89%, and it's at this point that we need to remember that LeetCode's benchmarks are neither conclusive nor consistent. It's not feasible for them to run a large number of iterations with a mixture of test cases. If you're practicing your puzzle solving, stop at 97% and up. But that's not the point of this article, and, reader, I'm going to get that 100% for you.
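Because the judge's percentages swing so much, a local timing loop can give a steadier (though still rough) comparison. The sketch below is an assumption about how one might do that, not the harness used for this article; `bench` is a hypothetical helper, and `Date.now()` only has millisecond resolution, so it needs many iterations to say anything.

```javascript
// Hypothetical micro-benchmark helper: runs fn(input) `iterations` times
// and returns the elapsed wall-clock time in milliseconds plus the last result.
function bench(fn, input, iterations) {
    const start = Date.now();
    let last;
    for (let i = 0; i < iterations; i++) {
        last = fn(input);
    }
    const elapsed = Date.now() - start;
    // Returning the last result keeps the computed value in use.
    return { elapsed, last };
}

// Example: time a trivial string reversal on a repeated input.
const sample = 'leetcode '.repeat(50);
const result = bench(s => s.split('').reverse().join(''), sample, 10000);
console.log(result.elapsed + ' ms');
```

Running each candidate several times with the same input and comparing the spread is more informative than a single number.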

First I threw out the Set. The number of vowels is constant and we don't need all that hashing going on. I tried a switch statement but then found a chained if statement was faster. I discovered that in-lining this logic was faster than a function. I then reduced this down to an expression. What I'm trying to say is: the code coming up is gross. It's close-down-your-IDE-and-take-a-walk gross. But... it's faster than 100.00%.

var reverseVowels = function(s) {
    s = s.split('');
    let front = 0;
    let back = s.length - 1;
    while (front < back) {
        if (s[front] !== 'a' &&
            s[front] !== 'e' &&
            s[front] !== 'i' &&
            s[front] !== 'o' &&
            s[front] !== 'u' &&
            s[front] !== 'A' &&
            s[front] !== 'E' &&
            s[front] !== 'I' &&
            s[front] !== 'O' &&
            s[front] !== 'U') {
            front++;
            continue;
        }
        if (s[back] !== 'a' &&
            s[back] !== 'e' &&
            s[back] !== 'i' &&
            s[back] !== 'o' &&
            s[back] !== 'u' &&
            s[back] !== 'A' &&
            s[back] !== 'E' &&
            s[back] !== 'I' &&
            s[back] !== 'O' &&
            s[back] !== 'U') {
            back--;
            continue;
        }
        let temp = s[front];
        s[front++] = s[back];
        s[back--] = temp;
    }
    return s.join('');
};

(I'm sorry).

Third problem

509. Fibonacci Number ~ Calculate the nth Fibonacci number.

This is a common puzzle and it was the hardest to improve the runtime for because there are so few moving parts in the final solution. I'm sure some RNG was involved with LeetCode's grading too. Let's get the naive solution out of the way. The Fibonacci sequence is often used to teach recursion. However, the algorithm that is used has a runtime of O(2^n) (very slow).

I actually crashed a browser tab by trying to calculate the 50th term with this function.

var fib = function(N) {
    if (N < 2) {
        return N;
    }
    return fib(N - 1) + fib(N - 2);
}

We get faster than 36.63% for this answer. Ouch. In production, this is the kind of puzzle that can be solved by memoization (caching some of the work for later). This is the best solution there because we compute each value up to the one we need only once, in linear time O(N), and any later call for a term at or below that limit is a constant-time O(1) lookup.

const memo = [0, 1];
var fib = function(N) {
    if (memo[N] !== undefined) {
        return memo[N];
    }
    const result = fib(N - 1) + fib(N - 2);
    memo[N] = result;
    return result;
};

faster than 94.25%. LeetCode doesn't store data between each run-through of our code so we'll have to try something different. We're interested in calculating one number of the sequence just once. I think we can throw away that array. Let's look at the iterative solution.

var fib = function(N) {
    if (N < 2) {
        return N;
    }
    let a = 1;
    let b = 1;
    for (let i = 3; i <= N; ++i) {
        a = a + b;
        b = a - b;
    }
    return a;
};

If this looks a little different to other iterative versions you might have seen, it's because I avoided the third temporary variable that we would normally use in JavaScript to swap values (there are other methods as well but they're too slow). I did some benchmarks and I found using arithmetic instead was... faster than 100.00%.
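To check that the arithmetic update really is equivalent to the usual temporary-variable version, the two can be put side by side. The names `fibArithmetic` and `fibTemp` are made up for this sketch.

```javascript
// The article's arithmetic update, without a temporary variable.
function fibArithmetic(N) {
    if (N < 2) return N;
    let a = 1;
    let b = 1;
    for (let i = 3; i <= N; ++i) {
        a = a + b; // a becomes the next term
        b = a - b; // b becomes the previous value of a
    }
    return a;
}

// The more common version with a temporary variable, for comparison.
function fibTemp(N) {
    if (N < 2) return N;
    let a = 1;
    let b = 1;
    for (let i = 3; i <= N; ++i) {
        const next = a + b;
        b = a;
        a = next;
    }
    return a;
}

console.log(fibArithmetic(10), fibTemp(10)); // 55 55
```

After `a = a + b`, subtracting the old `b` from the new `a` recovers the old `a`, which is exactly the value the temporary variable would have held.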



Posted on May 19 '19 by Andrew Healey (@healeycodes)

📚 🌵 🍕 Love blogging, open-source, and helping out. https://healeycodes.com

Discussion

 

For the Fibonacci puzzle, here is a solution that runs in constant time using a little math 😄

// O(1) time complexity
var fib = function(N) {
    return Math.round(
        (Math.pow((1 + Math.sqrt(5)) / 2, N) - 
        Math.pow(-2 / (1 + Math.sqrt(5)), N)) / 
        Math.sqrt(5)
    );
}
 

This solution is faster for large N, slower for small N.
jsben.ch/mOl4l
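One caveat worth checking with the closed form: it relies on floating-point arithmetic, so beyond some N the rounding error grows too large and the result stops matching the exact integer. The sketch below compares it against a plain iterative version; the function names are made up, and where the first mismatch appears depends on the engine's Math.pow, so no specific value is claimed for larger limits.

```javascript
// Closed-form (Binet) version, equivalent to the formula in the comment above.
function fibClosedForm(N) {
    const sqrt5 = Math.sqrt(5);
    return Math.round(
        (Math.pow((1 + sqrt5) / 2, N) - Math.pow((1 - sqrt5) / 2, N)) / sqrt5
    );
}

// Plain iterative version used as the reference (exact while values stay below 2^53).
function fibIterative(N) {
    let a = 0;
    let b = 1;
    for (let i = 0; i < N; i++) {
        const next = a + b;
        a = b;
        b = next;
    }
    return a;
}

// Returns the first N in [0, limit] where the two disagree, or -1 if there is none.
function firstMismatch(limit) {
    for (let n = 0; n <= limit; n++) {
        if (fibClosedForm(n) !== fibIterative(n)) return n;
    }
    return -1;
}

console.log(firstMismatch(30)); // -1: the two agree for small N
console.log(firstMismatch(78)); // engine-dependent, so not stated here
```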


/**
 * @param {number} N
 * @return {number}
 */

const pow = function (base, n) {
    if (n==0)
        return 1

    if (n==1)
        return base

    if (n&1) 
        return base*pow(base, n-1)
    else {
        const half = pow(base, Math.floor(n/2))
        return half*half
    }
}

var fib = function(N) {
    const sqrt5 = 2.23606797749979
    const num = 1.618033988749895
    const num2 = -0.6180339887498948
    return Math.round(
        ( pow(num, N) - pow(num2, N) ) / sqrt5
    )
};

fib(NUM);

This uses precomputed values for the constants and exponentiation by squaring (the idea behind fast matrix exponentiation) to calculate a^b in O(log n) multiplications.
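The same technique can also be written without recursion. The following is a minimal sketch, assuming a non-negative integer exponent; `powBySquaring` is a hypothetical name.

```javascript
// Iterative exponentiation by squaring for a non-negative integer exponent.
function powBySquaring(base, n) {
    let result = 1;
    while (n > 0) {
        if (n & 1) {
            result *= base; // the lowest bit of the exponent is set
        }
        base *= base;       // square the base for the next bit
        n = Math.floor(n / 2);
    }
    return result;
}

console.log(powBySquaring(2, 10)); // 1024
console.log(powBySquaring(3, 5));  // 243
```

Each pass through the loop halves the exponent, which is where the logarithmic number of multiplications comes from.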

 

Came here to say this! Did you also learn this from a discrete mathematics class? 😄

 

I did, a long time ago, and I had forgotten it until I saw this function somewhere (it might have been on dev.to). It stuck in my brain, so every time I see a Fibonacci problem I get a flash of this solution. 😄

 

Thanks for sharing this cool solution! I actually found this was slower on LeetCode than the post’s iterative solution. Crazy, right? They must use low values of N in the test cases. I wonder if it’s quicker in other languages 🤔.

 

Hmm, I've run some benchmark tests (I ran them online in Chrome, so I don't know how reliable they are) and I got two different results on two different sites. On the first site (jsben.ch/) it seems that for smaller numbers the iterative way is faster, and the real power of the pure math function shows when N >= 200. On the other site (perf.zone/), the math function is way faster even for smaller numbers. I got these results on perf.zone/quick:

  • for N = 20, math -> 705,738,258 op/second; iterative -> 66,518,777 op/second
  • for N = 50, math -> 725,140,230 op/second; iterative -> 16,581,587 op/second
  • for N = 100, math -> 700,447,370 op/second; iterative -> 5,875,773 op/second
  • for N = 200, math -> 706,141,355 op/second; iterative -> 2,562,412 op/second

But anyway, I wonder what time complexity JavaScript's Math functions have 🤔.

The variance is very interesting. I wonder if Chrome throttles certain calculations.

 

Nice article. Could you please elaborate more on this sentence: "...Looking up a Set in JavaScript is constant time, O(1)...".
As far as I know if you implement Set using hash-table you could reduce most of the operations to the O(1) in an average-case but O(n) in the worst-case. Moreover in ECMAScript2015 specification for the Set.prototype.has it says:

.
4. Let entries be the List that is the value of S’s [[SetData]] internal slot.
5. Repeat for each e that is an element of entries,
If e is not empty and SameValueZero(e, value) is true, return true.
.

So I'm not sure that we could say that Set.prototype.has is an O(1) time complexity.

 

Thanks 😊. Yes, in the worst case the look-up time can be O(N). I was describing the average case, where a Set look-up is O(1). I assumed that Chromium would be using an optimal data structure even though the spec only mandates sublinear access time.

Map object must be implemented using either hash tables or other mechanisms that, on average, provide access times that are sublinear on the number of elements in the collection.

 

I know I'm late to the party, but thanks for this reply; the links you provide helped me understand why it "should be around O(1)".

If you reach this reply, how do you know/learn when to use a Set instead of a nested loop?

Edit: what if, in the first example, instead of the nested loop, we use jewels.indexOf(S[i]) !== -1? Is that exactly like using jewels.has(S[i])?

(Last time I checked) indexOf uses linear search to check for an element. So it's like doing a for loop and checking every element until it's found. It doesn't benefit from the fast look-up time of a hashtable.

I learned the benefits and use cases of different data structures by reading a book called Grokking Algorithms and watching Introduction to Algorithms on the MIT website (MIT Course Number 6.006). Textbooks can help, as well as programming puzzle websites like HackerRank and LeetCode, where you can test the performance of your code and read people's discussions of solutions. Looking up (and learning) what 'Big O' is helps too!
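To make the indexOf question concrete, here is a sketch of both approaches side by side. They return the same count; the difference is only in how much work each membership check does. The function names are made up for illustration.

```javascript
// Linear lookup: indexOf scans the jewels string for every stone.
function countWithIndexOf(J, S) {
    let count = 0;
    for (let i = 0; i < S.length; i++) {
        if (J.indexOf(S[i]) !== -1) count++;
    }
    return count;
}

// Hash-based lookup: build the Set once, then test membership per stone.
function countWithSet(J, S) {
    const jewels = new Set(J);
    let count = 0;
    for (let i = 0; i < S.length; i++) {
        if (jewels.has(S[i])) count++;
    }
    return count;
}

console.log(countWithIndexOf('aA', 'aAAbbbb')); // 3
console.log(countWithSet('aA', 'aAAbbbb'));     // 3
```

With at most 52 distinct letters as jewels the difference is small in practice, but it grows with the size of the collection being searched.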

 

Possible idea with the jewels thing:

var numJewelsInStones = function(J, S) {
    const jewels = new Uint8Array(8);
    for (let i = 0, l = J.length; i < l;) {
        const j = J.charCodeAt(i++) - 0x41;
        jewels[j >>> 3] |= 1 << (j & 7);
    }
    let myJewels = 0 >>> 0;
    for (let i = 0, l = S.length; i < l;) {
        const j = S.charCodeAt(i++) - 0x41;
        jewels[j >>> 3] & (1 << (j & 7)) && ++myJewels;
    }
    return myJewels;
};

This makes heavy use of CPU caching, so should be quite fast for longer strings.
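The bit arithmetic in that snippet can be read as "which byte, which bit". The helpers below are hypothetical and exist only to spell that out.

```javascript
// Hypothetical helpers mirroring the bit arithmetic in the comment above:
// bit number j lives in byte (j >>> 3), i.e. j divided by 8,
// at position (j & 7), i.e. j modulo 8.
function setBit(bytes, j) {
    bytes[j >>> 3] |= 1 << (j & 7);
}

function hasBit(bytes, j) {
    return (bytes[j >>> 3] & (1 << (j & 7))) !== 0;
}

const bits = new Uint8Array(8); // 8 bytes = 64 bits, enough for indices 0..57
setBit(bits, 'a'.charCodeAt(0) - 0x41); // index 32 -> byte 4, bit 0

console.log(hasBit(bits, 32)); // true
console.log(hasBit(bits, 33)); // false
console.log(bits[4]);          // 1
```

Packing the flags this way shrinks the lookup table from 59 bytes to 8.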

 

Could also use A and B types

var numJewelsInStones = function(J, S) {
    let a = 0|0, b = 0|0, j = 0|0;
    for (let i = 0, l = J.length; i < l;) {
        j = J.charCodeAt(i++);
        if (j < 0x60) a |= 1 << (j - 0x41);
        else b |= 1 << (j - 0x61)
    }
    let myJewels = 0 >>> 0;
    for (let i = 0, l = S.length; i < l;) {
        j = S.charCodeAt(i++);
        if (j < 0x60) a & (1 << (j - 0x41)) && ++myJewels;
        else b & (1 << (j - 0x61)) && ++myJewels;
    }
    return myJewels;
};

This can be done entirely on the stack and registers, so avoids other possible slowdowns related to allocation. Only issue is that of shifting...

 

Another, even faster:

var numJewelsInStones = function(J, S) {
    if (!J || !S) return 0; // empty string, fast return
    let a = 0|0, b = 0|0, j = 0|0, i = J.length, l = S.length, myJewels = 0 >>> 0;
    do {
        j = J.charCodeAt(--i);
        if (j < 0x60) a |= 1 << (j - 0x41);
        else b |= 1 << (j - 0x61);
    } while (i);
    do {
        j = S.charCodeAt(--l);
        if (j < 0x60) a & (1 << (j - 0x41)) && ++myJewels;
        else b & (1 << (j - 0x61)) && ++myJewels;
    } while (l);
    return myJewels;
};

This is basically just the fast ASM.js compatible form I guess. Not too sure if it will work though...

Edit: leetcode.com/submissions/detail/23...

tl;dr not as fast

A similar thing works perfectly fine in rust, though.

leetcode.com/submissions/detail/23...

impl Solution {
    pub fn num_jewels_in_stones(j: String, s: String) -> i32 {
        let mut m = 0u64;
        for i in j.bytes() {
            m |= 1 << (i - 0x41);
        }
        let mut w = 0i32;
        for i in s.bytes() {
            let g = 1 << (i - 0x41);
            if g & m != 0 { w += 1 };
        }
        w
    }
}

These are all so cool! I’m going to have to pick through them later 👍

This is probably because all the optimisations you do in JS land may be thrown in the bin once the code touches native code. In Rust you have more control over the memory layout.

 

Thanks for the illustration of how to solve problems iteratively: from the basic solution to the most performant.

Before reading this post I'd have solved the "jewels and stones" problem by creating a JS object and populating it with "jewels" keys. But using a Set collection with the has method is a more elegant and straightforward solution, thanks 😄

 

Glad you found it helpful! I always appreciate when articles build up to the optimal solution 👍

Sets are also very cool 😎

 

Great post!

For the second problem, I'm wondering if "caching" the char helps:

        var c = s[front];
        if (c !== 'a' &&
            c !== 'e' &&
            c !== 'i' &&
            c !== 'o' &&
            c !== 'u' &&
            c !== 'A' &&
            c !== 'E' &&
            c !== 'I' &&
            c !== 'O' &&
            c !== 'U') {
            front++;
            continue;
        }

Rather than accessing s[front] for each test, especially when the char is not a vowel.
Maybe the JS engine already optimizes your code 🤔.
Sure, there's an extra assignment, which costs time, but s[front] has to be "computed" each time. It's a direct memory access, but first an addition is needed to add the front index to the address of the s array, right?
So which is faster:

  • 10 additions + 10 reads
  • or 1 write + 10 reads?

The difference is probably marginal, but I wonder how much the assembly code changes, and whether the JIT optimizes that.

 

Interesting proposal. I ran it ten times on LeetCode and didn't get a faster answer (not that that means anything!).

I ran both functions one million times on a 445-char lorem ipsum text in Chrome latest. It seems that caching the variable is marginally faster 😊

 

Fun article, thank-you!

I wrote a few solutions of my own before reading over yours just to do a comparison, and while I didn't hit 100% on any of them, they were fairly performant. However, something that was really interesting is just how inconsistent leetcode's solution runner is. My reverseVowels solution would swing between the 98th and 50th percentile with no changes to the code itself.

Did you run your solutions multiple times and experience a similar variation, or were they consistently in the 100th percentile?

 

Yes, I experienced a lot of variation! Since the only way I could compare my answers to existing answers was the recorded 100th percentile, that’s what I chose to measure my own.

 

"close-down-your-IDE-and-talk-a-walk gross"
Are you kidding? It was wonderful! Simple, readable, all cases covered wonderful! Loved your gross solution. Gross optimization for the masses!
Jokes aside, thanks for your solutions! You implemented some interesting new ways to approach some problems, which made me realize there are things I'm not yet aware of! In short: Thanks!

 

Thanks for the kind words, Yuri!

 

You used myJewels as the count of how many jewels there are, which is kind of fuzzy. When other programmers read the code and see myJewels, is it an array? Is it a string, or is it a number? Sure, you can infer it from the context, but once the code grows from 10 lines to 20 and then to 30, it becomes difficult to track.

I'd suggest using count or countJewels to denote exactly what it means. Otherwise Peter has a myJewels that is an array and Michael has a myJewels that is a string. count has very little chance of confusion.

 

Hm... with LeetCode, this time it could be faster than 96%, then I tried again and it became faster than 42%, and then 10 seconds later I tried again and it was faster than 98%...

As for using an ASCII map: what if the program is given 8-bit ISO-8859-1 characters, or Unicode?
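One possible answer to the character-set question, sketched under the assumption that raw speed matters less than generality: fall back to the Set-based approach and iterate by Unicode code point, so the input is no longer limited to A-Z and a-z. The function name `countJewelsUnicode` is made up for this example.

```javascript
// Sketch: a Set-based count that iterates by Unicode code point,
// so it is not limited to the A-Z / a-z range the lookup table assumes.
function countJewelsUnicode(J, S) {
    const jewels = new Set(J); // iterating a string yields whole code points
    let count = 0;
    for (const stone of S) {
        if (jewels.has(stone)) count++;
    }
    return count;
}

// Jewels: U+00E9 (e with acute accent) and U+1F48E (gem stone emoji).
console.log(countJewelsUnicode('\u00e9\u{1F48E}', 'a\u00e9\u{1F48E}\u{1F48E}z')); // 3
```

The fixed-size typed array in the article only works because the puzzle guarantees letters; outside that guarantee its indices would fall out of range.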

 

What about this solution for jewels problem?

function findTotal(J, S) {
    return Array.from(S).filter(item => J.indexOf(item) !== -1).length;
}

 

That's a great solution 👍. However, as we see here, indexOf has a complexity of O(N). This means that for every stone, we might have to check the whole jewels string rather than a data structure we've constructed.

For example, say we are passed 100 stones and 1000 jewels. With indexOf, the rough number of operations is stones * jewels, or 100 * 1000. What we really want is stones * the cost of Set#has, or 100 * 1.