DEV Community

Andrey Germanov


Efficient string building in JavaScript

Almost everything we see in a browser, apart from images and videos, is a string. If you work with strings wisely, you can dramatically improve the performance of your web applications on both the frontend and the backend.

What should you know about strings in programming? A string is a primitive data type that holds a sequence of characters. Values of primitive data types are immutable, so a string's value cannot be changed after it is created. This is true for many programming languages, including JavaScript. But wait, when you do this:

let hello = "Hello";
hello += " world";
console.log(hello);

Obviously, you'll see Hello world on the console, which means that the value of the hello variable has changed. How is that possible? How can JavaScript change the value of a string variable and keep the string immutable at the same time?

It happens because JavaScript does not add the second string to the first one in place. Instead, it creates a third, empty string, copies the contents of both strings into it, and finally reassigns the hello variable to this third string. This way the value of the third string is set only once, and the two initial strings stay unchanged, which satisfies the immutability rule. This is how the whole string concatenation process looks:

[Image: diagram of the string concatenation steps]

Do you see any problem here? What can be said about the performance of this operation? It seems to perform up to five times more operations than necessary, and in step 3 it uses twice as much memory to hold the same data.
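The immutability part of this is easy to observe directly. Below is a minimal sketch; the variable names original and greeting are illustrative and not from the example above. It only shows that += rebinds the variable to a new string and leaves the old one untouched. It does not show how a particular engine performs the copy internally.

```javascript
// Two names that start out referring to the same string value.
const original = "Hello";
let greeting = original;

// += builds a new string and rebinds `greeting`; `original` is not modified.
greeting += " world";

console.log(original); // Hello
console.log(greeting); // Hello world
```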

On the one hand, this is not a big issue if we just want to concatenate two strings, because computers can do millions of operations per second. However, the problem becomes more serious if we need to build long strings. Let's say we need to construct a large chunk of HTML from an external data array in a loop. The HTML string can become huge during this process, and JavaScript will create a copy of it on each iteration of the loop.

As an example, let's look at code that builds a huge string in a loop by appending to the initial string a hundred million times.

let str = "Hello";

console.log("START",new Date().toUTCString());

for (let index=0;index<100000000;index++) {
    str += "!";
}

console.log("END",new Date().toUTCString());
console.log(str.length);

This code appends the "!" symbol to the string a hundred million times. In a real-world scenario, instead of the "!" symbol it could be real data from an external source that should be displayed later.

Also, this code outputs the current date and time before and after the loop which helps to measure how long it takes. Finally it displays the length of the constructed string.

When I ran this in Google Chrome, it took a while to complete. Eventually it displayed the following on the JavaScript console:

[Image: screenshot of the console output]

As you can see, it took 1 minute 26 seconds and output the correct length of the concatenated string. However, when I ran this on another computer, this code crashed the browser and I saw the following output:

[Image: screenshot of the output after the crash]

If you remember the basic string concatenation algorithm described above, it should be clear why this can happen. The default string concatenation algorithm is inefficient and wastes a lot of memory. In this example, it copies anywhere from one to a hundred million characters on each of a hundred million loop iterations. The amount of memory this can consume is hard to even imagine. Whether the code crashes therefore depends on the amount of free memory available and on how the garbage collector of a particular JavaScript engine reclaims unused temporary strings.
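To put a rough number on this, here is a small sketch that counts the characters copied under the simplified "copy everything on every append" model described above. The function names copiedByConcat and copiedByJoin are hypothetical, and the figures describe the model only, not what any real engine does.

```javascript
// Characters copied when one character is appended `appends` times and every
// append copies the whole new string (the simplified model from this article).
// BigInt is used so that the large intermediate values stay exact.
function copiedByConcat(initialLength, appends) {
    // Append number i produces a string of initialLength + i characters.
    // Summing that for i = 1..appends gives the formula below.
    return initialLength * appends + (appends * (appends + 1n)) / 2n;
}

// With an array and a single join, each character is copied once at the end.
function copiedByJoin(initialLength, appends) {
    return initialLength + appends;
}

const initialLength = 5n;            // length of "Hello"
const appendCount = 100_000_000n;    // iterations of the loop above

console.log(copiedByConcat(initialLength, appendCount)); // 5000000550000000n
console.log(copiedByJoin(initialLength, appendCount));   // 100000005n
```

Under this model the loop copies about five quadrillion characters, compared with about a hundred million for a single final join.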

The JavaScript string concatenation algorithm we discussed above does not claim to be academically accurate. Various implementations of JavaScript engines may use different string handling optimizations and memory handling mechanisms.

But you should not count on the fact that your code will always run in such engines.

For example, in the latest version of Google Chrome at the time of this writing, string concatenation worked as shown in the screenshots above. So the purpose of this article is to show how to work with strings more efficiently, regardless of how it is implemented by default.

Clearly, we need a way to append a string in a single operation, without copying everything built so far. Many other programming languages that also use immutable strings provide a tool for this, such as StringBuilder in Java or strings.Builder in Go. This is a helper object that accumulates pieces in a mutable buffer and constructs the final string from them on demand. However, JavaScript does not have this feature built in. So let's return to the beginning and fix this flaw.

You can write the same string in a different way:

let hello = ["Hello"];

This is not a string but an array containing a string. Unlike strings, arrays are mutable, so you can change them by adding items. This means that if you run this:

hello.push(" world");

JavaScript will just mutate the array by appending the " world" item to the end. This is done in a single operation, after which the array contains the following:

["Hello"," world"]

This way you can append as many strings to this array as you need at a very low cost. Finally, to create the string from it, run the join operation on the array:

hello = hello.join("");
console.log(hello);

After this, the hello variable contains the "Hello world" string. The join operation also creates a new string and copies the items from the array into it. However, this happens only once, instead of on every concatenation.

This approach dramatically increases string concatenation speed in a loop. Let's change the loop example to use the array instead of string:

let str = ["Hello"];

console.log("START",new Date().toUTCString());

for (let index=0;index<100000000;index++) {
    str.push("!");
}

str = str.join("");

console.log("END",new Date().toUTCString());
console.log(str.length);

After running this in the same browser, I received the following output:

[Image: screenshot of the console output]

As you can see, the same result was achieved in 8 seconds, which is about 10 times faster than regular string concatenation.
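If you want to repeat the comparison yourself with finer granularity than timestamps rounded to the second, the sketch below may help. It uses performance.now(), a monotonic high-resolution timer available in browsers and recent Node.js versions. The timeIt helper and the reduced COUNT are assumptions made for this sketch so that it finishes quickly. The numbers depend heavily on the engine and the input size, so measure in the environment you care about.

```javascript
// Runs fn once and reports how long it took and how long the result is.
function timeIt(label, fn) {
    const start = performance.now();
    const result = fn();
    const elapsedMs = performance.now() - start;
    console.log(`${label}: ${elapsedMs.toFixed(1)} ms, length ${result.length}`);
    return result;
}

// Much smaller than the loop in the article, so the sketch finishes quickly.
const COUNT = 1_000_000;

const viaConcat = timeIt("concatenation", () => {
    let str = "Hello";
    for (let index = 0; index < COUNT; index++) {
        str += "!";
    }
    return str;
});

const viaJoin = timeIt("push + join", () => {
    const parts = ["Hello"];
    for (let index = 0; index < COUNT; index++) {
        parts.push("!");
    }
    return parts.join("");
});

// Both approaches must produce exactly the same string.
console.log(viaConcat === viaJoin); // true
```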

In this way we built a simple custom StringBuilder for JavaScript that can only append strings. As homework, you can extend it with methods to append, insert, or remove strings in the array. For example, you could create a class that encapsulates the array and contains functions to manipulate the strings in it and to construct the final string when required.
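One possible shape for such a class is sketched below. The class name, the method names, and the choice to address pieces by their position in the array (not by character offset) are all assumptions made for this illustration. This is not a built-in JavaScript API.

```javascript
// A hypothetical StringBuilder that stores pieces in a private array
// and only creates the final string when toString() is called.
class StringBuilder {
    #parts = [];

    // Adds a piece to the end.
    append(value) {
        this.#parts.push(String(value));
        return this; // returning `this` allows chained calls
    }

    // Inserts a piece before the piece at position `index`.
    insert(index, value) {
        this.#parts.splice(index, 0, String(value));
        return this;
    }

    // Removes `count` pieces starting at position `index`.
    remove(index, count = 1) {
        this.#parts.splice(index, count);
        return this;
    }

    // Joins all pieces into a single string.
    toString() {
        return this.#parts.join("");
    }
}

const builder = new StringBuilder()
    .append("Hello")
    .append("!")
    .insert(1, " world");

console.log(builder.toString()); // Hello world!
```

Note that insert and remove in the middle of a large array are not free, because the array has to shift its elements. The approach pays off mostly for append-heavy workloads.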

When adding elements to an array, keep in mind the limit on the number of array elements. If you exceed it, you may encounter the "RangeError: Invalid array length" error. You can learn more about the limits here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Errors/Invalid_array_length

If the number of strings to be added in the loop exceeds these limits, you will have to periodically flush the array into temporary string buffers and then merge these buffers.
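A minimal sketch of this periodic flushing could look like the following. The function name buildInChunks and the default chunk size are hypothetical values chosen for illustration. The engine's maximum string length still applies to the final result.

```javascript
// Collects pieces in a small array and flushes it into a string
// every `chunkSize` pieces, so the array never grows without bound.
function buildInChunks(pieces, chunkSize = 100_000) {
    const buffers = []; // temporary strings, one per flushed chunk
    let parts = [];     // pieces collected since the last flush

    for (const piece of pieces) {
        parts.push(piece);
        if (parts.length >= chunkSize) {
            buffers.push(parts.join(""));
            parts = [];
        }
    }

    // Flush whatever is left after the loop.
    if (parts.length > 0) {
        buffers.push(parts.join(""));
    }

    // Merge the temporary buffers into the final string.
    return buffers.join("");
}

// A generator stands in for an external data source of unknown size.
function* exclamations(count) {
    for (let index = 0; index < count; index++) {
        yield "!";
    }
}

const built = buildInChunks(exclamations(10), 4);
console.log(built);        // !!!!!!!!!!
console.log(built.length); // 10
```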

There are also more advanced string handling algorithms that can help you work with strings even more efficiently.

One of the fastest is based on a data structure called a "rope", which was invented to handle operations on huge strings efficiently: https://en.wikipedia.org/wiki/Rope_(data_structure). It is more complex than the method discussed above, but you can start by reusing one of the JavaScript implementations of the rope in your projects:

https://github.com/component/rope
https://github.com/josephg/jumprope
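To give a feel for the idea, here is a deliberately tiny rope sketch written from scratch. It is not the API of either library listed above, and it leaves out the tree balancing that a real implementation needs. The helper names leaf, concat, charAt, and ropeToString are hypothetical.

```javascript
// A leaf holds actual text; a branch only points at two smaller ropes.
const leaf = (text) => ({ text, length: text.length });
const concat = (left, right) => ({ left, right, length: left.length + right.length });

// Finds a character by walking down the tree without flattening it.
function charAt(rope, index) {
    let node = rope;
    while (node.text === undefined) {
        if (index < node.left.length) {
            node = node.left;
        } else {
            index -= node.left.length;
            node = node.right;
        }
    }
    return node.text[index];
}

// Flattens the rope into one string with an explicit stack (no recursion).
function ropeToString(rope) {
    const parts = [];
    const stack = [rope];
    while (stack.length > 0) {
        const node = stack.pop();
        if (node.text !== undefined) {
            parts.push(node.text);
        } else {
            // Push right first so that the left side is visited first.
            stack.push(node.right);
            stack.push(node.left);
        }
    }
    return parts.join("");
}

// Concatenation creates one small branch object and copies no characters.
const rope = concat(concat(leaf("Hello"), leaf(" ")), leaf("world"));

console.log(rope.length);        // 11
console.log(charAt(rope, 6));    // w
console.log(ropeToString(rope)); // Hello world
```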

Thus, by changing just three lines of code, you can significantly increase the performance of your data processing pipeline. You can use this method when building strings in a loop from external data streams of variable size in JavaScript: add the strings to an array one by one, then join them into a single string before output. In other programming languages, the recommended tool for this kind of concatenation is a built-in StringBuilder or StringBuffer object.
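As a closing illustration, here is how the same pattern might look when building HTML from a data array, the scenario mentioned earlier. The renderList function and the products data are made up for this sketch, and it assumes the values are already safe to place into HTML.

```javascript
// Builds an HTML list from an array of records using push + join.
// Assumes item.name and item.price are already HTML-escaped.
function renderList(items) {
    const html = ["<ul>"];
    for (const item of items) {
        html.push("<li>", item.name, ": ", String(item.price), "</li>");
    }
    html.push("</ul>");
    return html.join("");
}

// Hypothetical data standing in for an external source.
const products = [
    { name: "Pen", price: 2 },
    { name: "Notebook", price: 5 },
];

console.log(renderList(products));
// <ul><li>Pen: 2</li><li>Notebook: 5</li></ul>
```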

In my own practice, I had a client whose website was slowing down because of inefficient string handling. He tried to resolve it by caching content in Cloudflare and seriously considered moving to AWS to increase data throughput. In the end, a code review was enough to fix it.

Good luck and happy coding guys!

Feel free to connect and follow me on social networks, where I publish announcements about my upcoming software, articles similar to this one, and other software development news:

LinkedIn: https://www.linkedin.com/in/andrey-germanov-dev/
Facebook: https://web.facebook.com/AndreyGermanovDev
Twitter: https://twitter.com/GermanovDev

My online services website: https://germanov.dev
