Premature optimization is the root of all evil. It's also the root of this article.
I like programming puzzles. I also like to go fast. We're going to take some LeetCode problems and solve them a few times, first improving runtime complexity in broad strokes and then looking for minor optimizations. We're after these wonderful words:
faster than 100.00% of JavaScript online submissions
The environment we're targetting is nodejs 10.15.0
with --harmony
(source). The online judge system uses relatively small inputs for test cases as far as I can tell.
First problem
771. Jewels and Stones ~ You're given strings J
representing the types of stones that are jewels, and S
representing the stones you have. Each character in S
is a type of stone you have. You want to know how many of the stones you have are also jewels.
A naive solution here is to loop through our stones, looping through the jewels for every stone. We'll be using standard for loops in this article as they are generally the fastest way of iterating data in JavaScript.
var numJewelsInStones = function(J, S) {
let myJewels = 0;
// Jewels
for (var i = 0; i < J.length; i++) {
// Stones
for (var j = 0; j < S.length; j++) { // Nested!
if (J[i] === S[j]) {
myJewels++;
}
}
}
return myJewels;
};
The runtime is quadratic, O(N^2)
. Their online judge won't actually accept this solution! We get a big fat Time Limit Exceeded. Lesson? Nested for-loops should be avoided where possible.
Let's grab a Set to get rid of one of the loops. Reducing our runtime down to linear, O(N)
. Looking up a Set in JavaScript is constant time, O(1)
.
var numJewelsInStones = function(J, S) {
const jewels = new Set(J); // Set accepts an iterable object
let myJewels = 0;
for (var i = 0; i < S.length; i++) {
if (jewels.has(S[i])) {
myJewels++;
}
}
return myJewels;
};
For this effort, we're rewarded with faster than 97.84%
. I'm happy with this code. It's efficient and readable. If I needed drastically better performance, I might reach for a different technology than JavaScript. We have to walk the length of both strings at least once and there's no getting around that. We can't beat O(N)
but we can make optimizations.
The stones and jewels are defined as letters. So a-z
and A-Z
. This means there are just 52 different buckets our values can fall into! We can use a boolean array instead of a Set. To convert an alphabetical letter into a number, we'll use its ASCII code point via charCodeAt. We'll set an index to true
to represent a jewel.
However, there aren't boolean arrays in JavaScript. We could use a standard array and initialize it to length 52
. Or we could use Int8Array and allow the compiler to make additional optimizations. The typed array was ~6% faster when benchmarked with a range 0-52
of random characters entered as J
and S
.
Did you spot that our length is wrong? This is something I forgot as I was testing. There are seven characters between z
and A
on the ASCII code chart so the length required is actually 59.
var numJewelsInStones = function(J, S) {
const jewels = new Int8Array(59);
for (var i = 0; i < J.length; i++) {
jewels[J.charCodeAt(i)-65] = 1;
}
let myJewels = 0;
for (var i = 0; i < S.length; i++) {
if (jewels[S.charCodeAt(i)-65] === 1) {
myJewels++;
}
}
return myJewels;
};
Et voila, our 100% fastest
submission. In my tests, this was actually twice as faster as the Set version. Other optimizations I skipped testing were caching lengths, using a while loop instead of a for loop, and placing the incrementor before the number (++myJewels
vs myJewels++
).
Second problem
345. Reverse Vowels of a String ~ Write a function that takes a string as input and reverse only the vowels of a string.
A naive solution for this might be to loop through the array twice, replacing on the second loop. Let's try that out first.
var reverseVowels = function(s) {
const vowels = new Set(['a','e','i','o','u', 'A', 'E', 'I', 'O', 'U']);
const reversed = [];
let vowelsFound = [];
// Find any vowels
for (var i = 0; i < s.length; i++) {
if (vowels.has(s[i])) {
vowelsFound.push(s[i]);
}
}
// Build the final string
for (var i = 0; i < s.length; i++) {
if (vowels.has(s[i])) {
reversed.push(vowelsFound.pop());
} else {
reversed.push(s[i]);
}
}
return reversed.join('');
};
This nets us faster than 97.00%
. The runtime is linear, O(2N) -> O(N)
, and it reads well but I can't help but think we're looping the string one more time than we have to. Let's try a two-pointer approach. Walking in, step-by-step, from the front and back at the same time, swapping any vowels we see. If there's a middle vowel we just leave it.
var reverseVowels = function(s) {
const vowels = new Set(['a','e','i','o','u', 'A', 'E', 'I', 'O', 'U']);
s = s.split('');
let front = 0;
let back = s.length - 1;
while (front < back) {
if (!vowels.has(s[front])) {
front++;
continue;
}
if (!vowels.has(s[back])) {
back--;
continue;
}
let temp = s[front];
s[front] = s[back];
s[back] = temp;
front++;
back--;
}
return s.join('');
};
We've reduced a full iteration! This gets us faster than 98.89%
and it's at this point that we need to remember that LeetCode's benchmarks aren't conclusive nor are they consistent. It's not feasible for them to run a large number of iterations with a mixture of test cases. If you're practicing your puzzle solving, stop at 97%
and up. But that's not the point of this article, and, reader, I'm going to get that 100%
for you.
First I threw out the Set. The number of vowels is constant and we don't need all that hashing going on. I tried a switch statement but then found a chained if statement was faster. I discovered that in-lining this logic was faster than a function. I then reduced this down to an expression. What I'm trying to say is: the code coming up is gross. It's close-down-your-IDE-and-talk-a-walk gross. But .. it's faster than 100.00%
.
var reverseVowels = function(s) {
s = s.split('');
let front = 0;
let back = s.length - 1;
while (front < back) {
if (s[front] !== 'a' &&
s[front] !== 'e' &&
s[front] !== 'i' &&
s[front] !== 'o' &&
s[front] !== 'u' &&
s[front] !== 'A' &&
s[front] !== 'E' &&
s[front] !== 'I' &&
s[front] !== 'O' &&
s[front] !== 'U') {
front++;
continue;
}
if (s[back] !== 'a' &&
s[back] !== 'e' &&
s[back] !== 'i' &&
s[back] !== 'o' &&
s[back] !== 'u' &&
s[back] !== 'A' &&
s[back] !== 'E' &&
s[back] !== 'I' &&
s[back] !== 'O' &&
s[back] !== 'U') {
back--;
continue;
}
let temp = s[front];
s[front++] = s[back];
s[back--] = temp;
}
return s.join('');
};
(I'm sorry).
Third problem
509. Fibonacci Number ~ Calculate the nth Fibonacci number.
This is a common puzzle and it was the hardest to improve the runtime for because there are so few moving parts in the final solution. I'm sure some RNG was involved with LeetCode's grading too. Let's get the naive solution out of the way. The Fibonacci sequence is often used to teach recursion. However, the algorithm that is used has a runtime of O(2^n)
(very slow).
I actually crashed a browser tab by trying to calculate the 50th term with this function.
var fib = function(N) {
if (N < 2) {
return N;
}
return fib(N - 1) + fib(N - 2);
}
We get faster than 36.63%
for this answer. Ouch. In production, this is the kind of puzzle that can be solved by memoization (caching some of the work for later). This is the best solution because we only calculate up to the values that we need in linear time O(N)
and then running the algorithm again for a term under that limit is constant time O(1)
.
const memo = [0, 1];
var fib = function(N) {
if (memo[N] !== undefined) {
return memo[N];
}
const result = fib(N - 1) + fib(N - 2);
memo[N] = result;
return result
};
faster than 94.25%
. LeetCode doesn't store data between each run-through of our code so we'll have to try something different. We've interested in calculating one number of the sequence just once. I think we can throw away that array. Let's look at the iterative solution.
var fib = function(N) {
if (N < 2) {
return N;
}
let a = 1;
let b = 1;
for (let i = 3; i <= N; ++i) {
a = a + b;
b = a - b;
}
return a;
};
If this looks a little different to other iterative versions you might have seen, it's because I avoided the third temporary variable that we have to use in JavaScript to swap values (there are other methods as well but they're too slow). I did some benchmarks and I found using arithmetic instead was.. faster than 100.00%
.
Join 150+ people signed up to my newsletter on programming and personal growth!
I tweet about tech @healeycodes.
Top comments (30)
For the fibonacci puzzle, here is a solution that runs in a constant time using little math π
This solution is faster for large N, slower for small N.
jsben.ch/mOl4l
Using precomputed values for constants and using matrix expo for calculating ab (which is log(n))
Thanks for sharing this cool solution! I actually found this was slower on LeetCode than the postβs iterative solution. Crazy right? They must use low values of
N
in the test cases. I wonder if itβs quicker in other languages π€.Hmm, I've run some benchmark tests (I ran them online on Chrome so I don't know how reliable they are) and I got two different results on two different sites. On the first site (jsben.ch/) it seems that for smaller numbers, the iterative way is faster, the real power of the pure math function is when the N >= 200. While on the other site (perf.zone/), the math function is way faster even for smaller numbers. I got this results on the perf.zone/quick:
But anyway I wonder what time complexity does Javascript Math functions have π€.
The variance is very interesting. I wonder if Chrome throttles certain calculations.
I think it does.
Came here to say this! Did you also learn this from a discrete mathematics class? π
I did, long time ago and it was forgotten until I saw this function somewhere (might be on dev.to) and that stuck in my brain, so every time I see a fibonacci I get a flash of this solution.π
Nice article. Could you please elaborate more on this sentence: "...Looking up a Set in JavaScript is constant time, O(1)...".
As far as I know if you implement
Set
usinghash-table
you could reduce most of the operations to theO(1)
in an average-case butO(n)
in the worst-case. Moreover in ECMAScript2015 specification for theSet.prototype.has
it says:So I'm not sure that we could say that
Set.prototype.has
is an O(1) time complexity.Thanks π. Yes, in the worst case the look-up time can be
O(N)
. I was using Big O notation, where a Set look-up can be reduced down toO(1)
. I assumed that Chromium would be using an optimal data structure even though the spec only mandatessublinear access time
.I know I'm late to the party, but thanks for this reply, as you provide some links that make me understand why it "should be around O(1)" ^
If you reach this reply, how do you know/learn how to use the Set instead of doing a nested loop ?
Edit : what if in the first exemple; instead of the nested tab, we use jewels.indexOf(S[i]) !== -1 ? Exactly like using jewels.has(S[i]) ?
(Last time I checked)
indexOf
uses linear search to check for an element. So it's like doing a for loop and checking every element until it's found. It doesn't benefit from the fast look-up time of a hashtable.I learned the benefits and use cases of different data-structures by reading a book called Grokking Algorithms and watching Introduction to Algorithms on the MIT website (MIT Course Number 6.006). Textbooks can help as well as programming puzzle websites like hackerrank and leetcode where you can test the performance of your code and read peoples' discussions of solutions. Looking up (and learning) what 'Big O' is helps too!
Possible idea with the jewels thing:
This makes heavy use of CPU caching, so should be quite fast for longer strings.
Could also use A and B types
This can be done entirely on the stack and registers, so avoids other possible slowdowns related to allocation. Only issue is that of shifting...
Another, even faster:
This is basically just the fast ASM.js compatible form I guess. Not too sure if it will work though...
Edit: leetcode.com/submissions/detail/23...
tl;dr not as fast
A similar thing works perfectly fine in rust, though.
leetcode.com/submissions/detail/23...
These are all so cool! Iβm going to have to pick through them later π
This is probably because all the optimisations you may do in JS land may be thrown to the bin once it touches native code. I Rust you have more control over the memory layout.
Great post!
For the second problem, i'm wondering if "caching" the char helps:
Rather than accessing
s[front]
for each test, especially when the char is not a vowel.Maybe the js engine already optimizes your code π€.
Sure there's an extra affectation which cost time, but
s[front]
has to be "computed" each time. It's a direct memory access, but first an addition is in order to add front index to the s array adress, right ?So what is faster:
Difference is probably marginal, but wondering how much the assembler code changes. And if the jit optimizes that.
Interesting proposal. I ran it ten times on LeetCode and didn't get a faster answer (not that that means anything!).
I ran both functions one million times on a 445-char lorem ipsum text in Chrome latest. It seems that caching the variable is marginally faster π
Fun article, thank-you!
I wrote a few solutions of my own before reading over yours just to do a comparison, and while I didn't hit 100% on any of them, they were fairly performant. However, something that was really interesting is just how inconsistent leetcode's solution runner is. My
reverseVowels
solution would swing between the 98th and 50th percentile with no changes to the code itself.Did you run your solutions multiple times and experience a similar variation, or were they consistently in the 100th percentile?
Yes, I experienced a lot of variation! Since the only way I could compare my answers to existing answers was the recorded 100th percentile, thatβs what I chose to measure my own.
Thanks for the illustration how to solve problems iteratively: from the basic solution to the most performant.
Before reading this post I'd solve the "jewels and stones" problem by creating a js object and populating it with "jewels" keys. But using a Set collection with
has
method is more elegant and straightforward solution, thanks πGlad you found it helpful! I always appreciate when articles build up to the optimal solution π
Sets are also very cool π
you used
myJewels
as the count of how many jewels... that is kind of fuzzy... when 10 other programmer read other programmers' code, when they see...myJewels
... is it an array? Is it a string, or it is a number. Sure, you can imply it from the context, but sometimes when the code expand from 10 lines to 20 and to 30, it becomes difficult to track down.I'd suggest using
count
orcountJewels
to exactly denote what it means. Otherwise Peter has amyJewels
and is an array, Michael has amyJewels
and it is a string...count
has very little chance of confusion."close-down-your-IDE-and-talk-a-walk gross"
Are you kidding? It was wonderful! Simple, readable, all cases covered wonderful! Loved your gross solution. Gross optimization for the masses!
Jokes aside, thanks for your solutions! You implemented some interesting new ways to approach some problems, which made me realize there are things I'm not yet aware of! In short: Thanks!
Thanks for the kind words, Yuri!
Hm... with LeetCode, this time it could be 96% faster, and then I tried again, it became 42% faster, and then 10 seconds later, I tried again, it was 98% faster...
using a ASCII map... what if the program is subject to 8-bit ISO-8859-1 characters... and then to unicode?
What about this solution for jewels problem?
function findTotal(J, S) {
return Array.from(S).filter(item => J.indexOf(item) !== -1).length;
}
That's a great solution π. However, as we see here,
indexOf
has a complexity ofO(N)
. This means that for every stone, we might have to check the whole jewels string rather than a data structure we've constructed.For example, say we are passed 100 stones and 1000 jewels. With
indexOf
, the rough number of operations isstones * jewels
, or100 * 1000
. What we really want isstones * Set#has
, or100 * 1
. Set#has cost.