For the longest time, I thought that JavaScript generators perform better than chained array methods in terms of execution time.
- Surely, I thought, the superior memory efficiency of generators (due to lazy computation) implied better overall execution time as well.
- Surely, in terms of time complexity, the single-pass nature of generators was superior to that of chained array methods.
All of the heuristics I was taught in computer science point to generators being the undisputed champion. After many years of personal presumption and dogma, it's finally time to put that to the test, with statistical rigor of course! So strap on your seatbelts as we finally answer the question:
Are generators actually faster than chained array methods?
The Setup
When we chain array methods, we typically perform a `filter` first, then a `map`. The main idea is that it is more efficient to `filter` out the bad data first so that we don't have to `map` as many values later on. This will be our primary experiment: a basic `filter` + `map` operation.
Let's start with generating 32 KiB of random data.
```typescript
// Generate 32 KiB of "cryptographically strong random values".
const bytes = crypto.getRandomValues(new Uint8Array(1 << 15));
// Then save the buffer into a file.
// Deno is just an implementation detail.
Deno.writeFileSync('random.bin', bytes);
```
Below is a histogram of the generated bytes. Indeed, we see a fairly uniform distribution of byte values in the `random.bin` file. For the sake of reproducibility and independence, we will use this binary dump for all experiments.
The goal is to simply prune out all of the bytes greater than 127 (roughly half of `random.bin`) and then normalize the data so that the remaining bytes are between 0.0 and 1.0 (inclusive). Let's now discuss the four cases that we will be testing.
Case 1: raw-for-loop
We presume the raw `for` loop construction to be the fastest, so it serves as our baseline. Here, we iterate over all values in `nums` and push each mapped value into the `output` array.
```typescript
function rawForLoop(nums: number[]) {
  const output = [] as number[];
  for (let i = 0; i < nums.length; ++i) {
    const num = nums[i];
    if (num <= 127)
      output.push(num / 127);
  }
  return output;
}
```
Case 2: for-of-loop
A slight variation of the `raw-for-loop` case is the `for-of-loop` case, which instead uses the `for...of` loop to iterate over `nums`. Everything else remains the same.
```typescript
function forOfLoop(nums: number[]) {
  const output = [] as number[];
  for (const num of nums)
    if (num <= 127)
      output.push(num / 127);
  return output;
}
```
Case 3: array-method
Now we enter the territory of functional programming. Here, we implement the same logic as a two-pass `filter` + `map` chain.
```typescript
function arrayMethod(nums: number[]) {
  return nums.filter(num => num <= 127).map(num => num / 127);
}
```
Case 4: generator
Finally, we implement the equivalent logic with generators.
```typescript
function* _generatorImpl(nums: number[]) {
  for (const num of nums)
    if (num <= 127)
      yield num / 127;
}

function generator(nums: number[]) {
  return Array.from(_generatorImpl(nums));
}
```
The Experiment
Before we run the experiments, let's get the hardware specifications out of the way. Note that future work should reproduce this experiment on native Linux systems.
- OS: Windows 10 22H2 (Build 19045.4842)
- CPU: Intel® Core™ i7-10750H CPU @ 2.60GHz (12 CPUs)
- RAM: 8192 MB
Now we implement a CLI for running each experiment case. Here is a high-level overview of the harness logic:
- Select one of the four implementations: either `raw-for-loop`, `for-of-loop`, `array-method`, or `generator`.
- Read the 32-KiB `random.bin` file for the cached random bytes.
- Perform 2048 warm-up iterations to allow the just-in-time compiler to optimize the code.
- Finally, run the actual experiment with high-resolution sub-millisecond timing for 16384 iterations.
- When the standard output is piped into a file, the result is 16384 lines of timing data.
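The core of such a harness can be sketched as follows. Note that the function names here are my own, not necessarily the repository's:

```typescript
// Minimal sketch of the benchmarking loop described above.
// `workload` stands in for one of the four implementations.
function benchmark(workload: (nums: number[]) => number[], nums: number[]) {
  const WARMUP = 2048;
  const SAMPLES = 16384;

  // Warm-up phase: let the just-in-time compiler observe and optimize the hot path.
  for (let i = 0; i < WARMUP; ++i) workload(nums);

  // Measurement phase: record one high-resolution timing per iteration.
  const timings = new Float64Array(SAMPLES);
  for (let i = 0; i < SAMPLES; ++i) {
    const start = performance.now();
    workload(nums);
    timings[i] = performance.now() - start;
  }
  return timings;
}
```

Printing each timing to standard output (one per line) then yields the 16384 lines of timing data per run.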
I'll spare you the gory details of the harness implementation. For those who are interested in the argument parser, the file loader, the implementation selector, and the experiment runner, you may visit the GitHub repository. Please do leave a star while you're there!
BastiDood / js-generators-vs-array-methods
The repository for all the scripts, notebooks, results, and analyses related to my experiments on JavaScript generators and chained array methods.
Generating Random Bytes
The first step is to generate 32 KiB of random data. There are many ways to do this, but I opted to generate them via the `Crypto#getRandomValues` function in JavaScript. A Deno script is already available at `random/random.js`.
```bash
# This will overwrite the already existing `random.bin` file!
deno run random/random.js
```
Running the Experiments
```bash
# Node.js
function run () { node --experimental-default-type=module benchmark/bench.node.js $1 > notebook/node/$1.csv; }
run raw-for-loop
run for-of-loop
run array-method
run generator
```

```bash
# Deno
function run () { deno --quiet --allow-read --allow-hrtime benchmark/bench.deno.js -- $1 > notebook/deno/$1.csv; }
run raw-for-loop
```
Next, on which runtimes shall we run the experiments? I've selected the three most prominent JavaScript runtimes as of writing.
- Node.js (v22.8.0) is written in C++ and powered by V8.
- Deno (1.46.3) is written in Rust and powered by V8.
- Bun (1.1.26) is written in Zig and powered by JavaScriptCore.
The Results
With 2048 warm-up iterations and 16384 actual samples, I believe it is fair to assume that the sample sizes in the experiment are sufficiently large to omit the analysis of p-values. After all, with so many samples available, the observed data as a whole is hardly a fluke at that point.
Instead, the following results and discussion will focus on the pairwise effect size of the sample distributions. Given two sample distributions (e.g., `raw-for-loop` versus `array-method`), the effect size quantifies how "far apart" the mean of `raw-for-loop` is when compared to that of `array-method`.

For example, a large negative effect size implies that `raw-for-loop` has a significantly lower execution time (good!) than `array-method`. In practice, we are effectively quantifying the "impact" of the `array-method` on our baseline (i.e., `raw-for-loop`).
To keep things simple, we will use Cohen's d as our metric for effect size. As a general rule of thumb, Cohen suggests (but warns against the dogmatization of) the following interpretations for d:
- We say that the effect size is small when |d| ≤ 0.2.
- But if |d| ≤ 0.5, then we consider that to be a medium effect.
- But if |d| ≤ 0.8, then we have a large effect.
- Otherwise, the effect is huge.
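For reference, here is a small sketch of how Cohen's d can be computed from two samples using the pooled standard deviation (one common formulation; the notebooks in the repository may differ in detail):

```typescript
function mean(xs: number[]) {
  return xs.reduce((acc, x) => acc + x, 0) / xs.length;
}

// Cohen's d: the standardized difference between two sample means.
function cohensD(a: number[], b: number[]) {
  const meanA = mean(a);
  const meanB = mean(b);
  // Sample variances, each weighted by its degrees of freedom when pooling.
  const varA = a.reduce((acc, x) => acc + (x - meanA) ** 2, 0) / (a.length - 1);
  const varB = b.reduce((acc, x) => acc + (x - meanB) ** 2, 0) / (b.length - 1);
  const pooled = Math.sqrt(
    ((a.length - 1) * varA + (b.length - 1) * varB) / (a.length + b.length - 2),
  );
  return (meanA - meanB) / pooled;
}
```

A negative d here means the first sample's mean is lower, matching the sign convention used in the result tables.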
Now that we're all on the same page, we can examine the results of the experiment.
Node.js Results
Right off the bat, the Node.js results demolish my long-held presumptions. Not only am I wrong, but it turns out that generators are the slowest of all the cases!
This histogram still sends shivers down my spine as I recall the countless number of times that I've written generators in my code under the presumption of optimization.
But how bad is it really? Let's take a look at the numbers.
| Experiment | Mean | Mode | Standard Deviation |
| --- | --- | --- | --- |
| `raw-for-loop` | 0.3309 | 0.2828 | 0.1975 |
| `for-of-loop` | 0.3368 | 0.2873 | 0.2031 |
| `array-method` | 0.4360 | 0.3557 | 0.2404 |
| `generator` | 1.1024 | 1.0010 | 0.2134 |
By eyeball statistics, we can see right away that the differences between `raw-for-loop` (baseline) and `for-of-loop` are insignificant. In fact, we can exactly quantify that with Cohen's d. The table below summarizes how the `raw-for-loop` compares against the other experiment cases.
| Experiment | Cohen's d vs. `raw-for-loop` |
| --- | --- |
| `for-of-loop` | -0.0296 |
| `array-method` | -0.4778 |
| `generator` | -3.7518 |
Let's take a look at some key observations here.
- Comparing `raw-for-loop` and `for-of-loop`, we obtain d = -0.0296. That means `raw-for-loop` is only marginally faster than `for-of-loop` up to a very small effect, so small in fact that it is an injustice to say that there is one!
- We get into the medium territory between `raw-for-loop` and `array-method` with d = -0.4778. That is to say, the `raw-for-loop` is faster than the `array-method` up to a non-trivial medium effect. These results are consistent with findings by various authors and blog posts over the years.
- Perhaps the most shocking result is the extreme effect of `generator` on `raw-for-loop` with d = -3.7518! Given these plots, it is fairly conclusive that generators significantly penalize execution time.
Going back to the original question, though: how badly do generators compare to chained array methods? It turns out that the effect size between `array-method` and `generator` lands us, once again, in extreme territory.
Deno Results
Hmm... that plot looks eerily familiar. As a matter of fact, we have already seen this plot in the previous section on Node.js!
Because Node.js and Deno both share V8 as their underlying JavaScript engine, the performance differences between the two are not statistically significant. From this point forward, we can thus refer to Node.js and Deno interchangeably as if they fall under the one umbrella of V8.
For the sake of completeness, here are the same tables shown in the Node.js section, but in the context of Deno instead.
| Experiment | Mean | Mode | Standard Deviation |
| --- | --- | --- | --- |
| `raw-for-loop` | 0.3267 | 0.2828 | 0.1857 |
| `for-of-loop` | 0.3440 | 0.2963 | 0.1987 |
| `array-method` | 0.4221 | 0.3462 | 0.3949 |
| `generator` | 1.0946 | 1.0180 | 0.1843 |
| Experiment | Cohen's d vs. `raw-for-loop` |
| --- | --- |
| `for-of-loop` | -0.0898 |
| `array-method` | -0.3092 |
| `generator` | -4.1505 |
Note that the effect size between `array-method` and `generator` is yet another extreme effect on display.
Node.js and Deno: One and the Same
Let's go back to the observation earlier that Node.js and Deno exhibit the same performance characteristics. The following figure superimposes the Deno histogram over the Node.js histogram. The overlap is quite impressive to say the least.
Now let's quantify exactly how much they overlap by considering the effect sizes of corresponding experiment cases.
| Experiment | Cohen's d (Node.js vs. Deno) |
| --- | --- |
| `raw-for-loop` | 0.0219 |
| `for-of-loop` | -0.0356 |
| `array-method` | 0.0426 |
| `generator` | 0.0388 |
In all cases, there is little to no effect between Node.js and Deno. If we must draw a conclusion, though, we can see that Node.js is marginally slower than Deno except in the `for-of-loop` case. Nevertheless, this is already well within the realm of noise, so let's not read too much into it.
Bun Results
As the JavaScriptCore engine enters the scene, the Bun histogram takes on a completely different shape. I can already hear the Zig fans rejoicing as their results practically blow Node.js and Deno out of the water.
A cursory inspection of the histograms already shows more stable performance characteristics in JavaScriptCore than in V8. The curves are closer together, but still exhibit the same patterns as in Node.js and Deno—albeit at a smaller magnitude. The table below summarizes the relevant statistics from the experiments.
| Experiment | Mean | Mode | Standard Deviation |
| --- | --- | --- | --- |
| `raw-for-loop` | 0.2786 | 0.2626 | 0.2749 |
| `for-of-loop` | 0.2789 | 0.2644 | 0.2718 |
| `array-method` | 0.3608 | 0.2707 | 0.3274 |
| `generator` | 0.4919 | 0.4712 | 0.2651 |
Rather than rehashing the discussion as in the Node.js and Deno sections, let's instead examine the Bun data based on its similarities and differences from the previous two runtimes.
- Like Node.js and Deno, the performance degradation with respect to the `raw-for-loop` baseline is ordered as `for-of-loop`, `array-method`, and `generator`. This phenomenon where `generator` performs worst (much to my dismay and chagrin) is consistent across all engines.
- Unlike Node.js and Deno, the Bun distributions are closer together, indicating more stable performance characteristics.
- Bun (JavaScriptCore) optimizes the underlying state machines of the `generator` implementation better than Node.js and Deno (V8) do. By how much, you may ask? The effect sizes between Node.js and Bun and between Deno and Bun are both extreme. That is to say, both Node.js and Deno perform extremely worse than Bun.
The next table summarizes the effect sizes of each experiment on the `raw-for-loop` baseline.
| Experiment | Cohen's d vs. `raw-for-loop` |
| --- | --- |
| `for-of-loop` | -0.0012 |
| `array-method` | -0.2722 |
| `generator` | -0.7898 |
- Like Node.js and Deno, the `for-of-loop` (d = -0.0012) is practically equivalent in performance characteristics to the `raw-for-loop` baseline.
- Like Node.js and Deno, a similarly small effect size (d = -0.2722) is observed between `raw-for-loop` and `array-method`. Indeed, the performance overhead is small but certainly non-zero.
- Unlike Node.js and Deno, the overhead of the `generator` case (d = -0.7898) only tiptoes between the medium and large effect sizes, whereas the former two exhibited extreme effect sizes. This result attests to the previously discussed stability of Bun's performance characteristics.
Now how do array methods and generators compare in Bun? Between `array-method` and `generator`, the effect size is medium.
Interpretations and Conclusions
The most glaring conclusion thus far is that I was comically wrong about generators and chained array methods. In all experiments across all runtimes, the `generator` case consistently performed the worst. To add salt to the wound, I was not even wrong by a small margin, but by an extreme margin! Truly a hard lesson learned: measure, measure, measure.
The Importance of Spatial and Temporal Locality
I suspect that these results are due to the heavily optimized C++ implementations that underpin `filter` and `map`. Specifically, since arrays are contiguous data structures, low-level caches and memory prefetchers (in hardware and software) exploit the spatial and temporal locality of the array operations. In other words, a tight loop over 32768 integers becomes light work for hardware that can fetch and process 32 KiB in bulk without invalidating its internal caches.

Consider the alternative: generators. Generators are implicit state machines that tick forward after every `yield`. There is no notion of contiguity here because the values that will be processed are yet to be yielded by the generator. That alone fundamentally hampers the spatial and temporal locality of array operations. The state machine must necessarily tick one iteration at a time. This is opposed to the C++ code of `filter` and `map`, which can zoom through the ready data in one fell swoop (albeit in two iterations).
In practice, this means that the O(2n) time complexity of `filter` + `map` does not tell the whole story. Although generators are single-pass O(n) constructs, sacrificing spatial and temporal locality incurs an extreme overhead that more than makes up for the two-pass bulk operation of `filter` + `map`.
As we chain more array methods (which is rarely necessary), we anticipate `array-method` eventually surpassing `generator` in execution times.1 Until then, we have shown that the overhead of generators (due to their implicit state machines) proves to be too costly.
Some part of me is quite relieved that this is the case. I must admit that `filter` + `map` chains read more elegantly than generators (even with helpers from `itertools`). At least I can now rest assured that generator gymnastics need not apply.
V8 versus JavaScriptCore
An unexpected storyline that unfolded as I performed the experiments was the lopsided battle between V8 and JavaScriptCore. I had heard about Bun's supposedly superior performance versus Node.js and Deno, but I have only experienced it in the making of this article. The effect sizes are less impressive in the `raw-for-loop`, `for-of-loop`, and `array-method` cases. Where Bun really shined was with the `generator` case and the overall stability of its performance characteristics.
Frankly, I am quite impressed by Bun and especially JavaScriptCore. To beat the formidable V8 engine by extreme margins is no small feat. For that, I will continue to keep a close eye on Bun. I look forward to seeing how far we can push the boundaries of JavaScript.
So... when to use generators then?
Generators still have their place where lazy computation is a necessity due to memory constraints. One notable example is file streaming, where it's impractical to load an entire file or response into memory (as is typically the case in cloud environments).
More generally speaking, generators shine in environments that are bottlenecked by idle times due to I/O (e.g., file system, networks, user inputs, streams, etc.). Here, CPU performance is less relevant because the CPU would be idle waiting on I/O to respond anyway. The lazy computation keeps memory usage low while data streaming ensures that clients receive data as soon as possible.
However, there is another compelling reason to use generators: composability. Because generators are lazily executed, operations on the resulting data stream are easily composable as items are processed one at a time. The popular `itertools` package, which features a plethora of iteration utilities, is a testament to this fact.
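As a sketch of that composability (the helper names here are mine, not `itertools`'), generator-based stages chain freely without materializing intermediate arrays:

```typescript
// Each stage consumes any iterable and lazily yields transformed items,
// so stages compose and only run as items are pulled.
function* filterLazy<T>(it: Iterable<T>, pred: (x: T) => boolean) {
  for (const x of it) if (pred(x)) yield x;
}

function* mapLazy<T, U>(it: Iterable<T>, fn: (x: T) => U) {
  for (const x of it) yield fn(x);
}

// The same filter + map pipeline as before, with no intermediate array.
const nums = [250, 3, 127, 200, 64];
const pipeline = mapLazy(filterLazy(nums, n => n <= 127), n => n / 127);
console.log(Array.from(pipeline)); // yields the values 3/127, 1, and 64/127
```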
In fact, the developer experience around iteration utilities is so great that there is an active Stage 3 ECMAScript proposal right now (as of writing) that aims to officially add lazy iterator helpers to the language. When this is more widely supported, expect a follow-up article to this one that investigates the would-be C++-backed helpers.
Future Work
As thorough as I have been in this research, there is still some more work to be done. Here are a few bullets that were previously mentioned in the article.
- Reproduce these experiments on a native Linux machine.
- Investigate whether a sufficiently long chain of array methods eventually surpasses the execution time of the equivalent generator.
- Compare the performance characteristics of the upcoming iterator helpers.
The repository for all the scripts, notebooks, results, and analyses is available below.
BastiDood / js-generators-vs-array-methods
The repository for all the scripts, notebooks, results, and analyses related to my experiments on JavaScript generators and chained array methods.
While you're there, please don't forget to star the repository!
1. However, this is yet to be shown experimentally as it is already beyond the scope of this article. ↩