Get the Byte Size of a String in JavaScript

#javascript #100daysofcode #codenewbie #codequality

Hello folks, welcome back to a new episode of the series JavaScript Useful Snippets. In this series, I talk about short snippets and useful JavaScript functions that can make your development faster and more efficient. Stay tuned till the end to learn something new… 😊

JavaScript Useful Snippets: byteSize()

As we all know, a byte is a unit of digital information, and during development, keeping track of the size of variables, records, and files is an important task. There are various ways to do that, but this function makes it easy. The byteSize() snippet takes a string as input and returns the byte size of that string when it is encoded as UTF-8. Let’s look at the syntax…

const byteSize = str => new Blob([str]).size;

Here we use the Blob web API to get the byte size. Blobs let you construct file-like objects; we pass our string in an array to create one, and then return its size property, which is the size in bytes. Let’s look at some results for a better understanding…

Result One:

const result = byteSize("Hello World") // output: 11

Result Two:

const result = byteSize("😃") // output: 4

As both results show, for a plain ASCII string the byte size matches the string's length, while the emoji takes 4 bytes even though its length is 2. (For reference: UTF-8 is a variable-width encoding that uses 1 to 4 bytes per character. ASCII characters take 1 byte each, while characters such as this emoji need 4.)
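To make the difference between length and byte size concrete, here is a minimal sketch that reports three counts for the same strings. The helper names utf8ByteSize and codePointCount are hypothetical, and the sketch uses TextEncoder (which always encodes to UTF-8) in place of Blob, so treat it as an illustration rather than the snippet from this post.

```javascript
// Hypothetical helpers for illustration; TextEncoder always encodes to UTF-8.
const utf8ByteSize = (str) => new TextEncoder().encode(str).length;
// Spreading a string iterates by code point, not by UTF-16 code unit.
const codePointCount = (str) => [...str].length;

const samples = [
  "Hello World", // ASCII only
  "\u00e9",      // "é" as a single precomposed code point
  "\u{1F603}",   // the smiley emoji used above
];

const report = samples.map((text) => ({
  text,
  codeUnits: text.length,            // what .length returns
  codePoints: codePointCount(text),  // number of Unicode code points
  utf8Bytes: utf8ByteSize(text),     // bytes when encoded as UTF-8
}));

console.log(report);
```

Under these assumptions, only the ASCII string should show the same value in all three columns.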

Thank you for watching/reading, folks. If you found this informative and want me to make more content like this, please support me on Patreon.

In the next episode, I’m going to share a function to get the difference between two arrays, so follow/subscribe to get notified…

Subscribe on YouTube: https://www.youtube.com/channel/UCvNjso_gPQIPacA6EraoZmg
Facebook: https://www.facebook.com/KatharotiyaRajnish/
Twitter: https://twitter.com/tutorial_spot

Top comments (1)

Jack Arrington • Dec 6 '22 • Edited

This solution is not strictly correct. JavaScript strings use UTF-16, while Blob encodes strings as UTF-8. The two encodings frequently use different numbers of bytes to represent the same string.

The most basic example would be an ASCII character like "a": 1 byte in UTF-8, 2 bytes in UTF-16. A slightly more advanced example would be "🏳️‍🌈", which is a sequence of four code points (two emoji joined by a variation selector and a zero-width joiner). It's 14 bytes in UTF-8, but 12 in UTF-16. "😃" is 4 bytes in both.

So, if you care about the number of bytes that a JavaScript string is taking up in memory, you want to know the UTF-16 byte length. If you're writing to a UTF-8 encoded file or somesuch, you may want the UTF-8 byte length. In that light, a more correct/thorough solution:

// .length returns the number of UTF-16 code units, each of which is 2 bytes long.
const byteLengthUtf16 = (str) => str.length * 2
const byteLengthUtf8 = (str) => new Blob([str]).size
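The byte counts quoted in this comment can be checked with a short sketch. The helper names utf16Bytes and utf8Bytes are hypothetical, and TextEncoder stands in for Blob here because it also always produces UTF-8; the strings are written with escape sequences so that invisible code points are not lost when copying.

```javascript
// Hypothetical helper names; TextEncoder is used here instead of Blob.
const utf16Bytes = (str) => str.length * 2; // 2 bytes per UTF-16 code unit
const utf8Bytes = (str) => new TextEncoder().encode(str).length;

// Rainbow flag: white flag + variation selector-16 + zero-width joiner + rainbow.
const rainbowFlag = "\u{1F3F3}\uFE0F\u200D\u{1F308}";

const cases = { a: "a", smiley: "\u{1F603}", rainbowFlag };

for (const [name, str] of Object.entries(cases)) {
  console.log(name, "utf8:", utf8Bytes(str), "utf16:", utf16Bytes(str));
}
```

Which of the two numbers matters depends on the use case: the UTF-8 count is the relevant one when writing to a UTF-8 file or sending text over the network, while the UTF-16 count approximates the string's in-memory representation.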