RaymondX

Binary in JS

JavaScript has many concepts related to binary data, such as Buffer, TypedArray, ArrayBuffer, Blob, Stream and so on. How do these concepts relate to each other, and when should each one be used? That is the focus of this article.

TypedArray

First, let's introduce TypedArray. A TypedArray is a specialized array for handling numeric data (not arbitrary types). ArrayBuffer is one of the concepts it builds on.

History of TypedArray

Typed arrays were first used in WebGL, which is a port of OpenGL ES 2.0. Early versions of WebGL had a performance problem caused by the mismatch between JavaScript arrays and native arrays.

Numbers in JavaScript arrays are stored in double-precision floating-point format (IEEE 754 64-bit), but graphics driver APIs generally don't expect values in that format. So every time an array is passed between the JavaScript runtime and WebGL, WebGL needs to allocate a new array in the target environment, iterate over the source array in its current format, and convert each value into the proper format in the new array, which takes a lot of time.

To solve the above problem, Mozilla implemented CanvasFloatArray. It provides a JavaScript interface to C-style arrays of floating-point values. Eventually that type became Float32Array, one of the types for TypedArray.

ArrayBuffer

ArrayBuffer is the basis of all TypedArrays. It is a block of memory that holds a fixed number of bytes, known as a "byte array" in other languages. Creating an ArrayBuffer is similar to calling malloc() in C to allocate memory, except that there is no need to specify the data type contained in the memory block.

let buffer = new ArrayBuffer(10); // allocate 10 bytes in memory

Note: an ArrayBuffer created this way has a fixed size and cannot be resized afterwards.

An allocated block of memory is useless by itself. To read or write its contents, we also need to create a view.

The ArrayBuffer is a region of memory, and a view is the interface used to operate on that memory. Views can manipulate a whole array buffer or a subset of it, reading and writing data according to one of the numeric data types.
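
As a minimal sketch of the buffer/view relationship described above, the snippet below creates two views over one buffer. It uses Uint8Array, one of the typed array views introduced later in this article; the variable names are only illustrative.

```javascript
// Allocate 8 bytes; a fresh ArrayBuffer is zero-filled.
const buffer = new ArrayBuffer(8);

// A view over the whole buffer, one byte per element.
const bytes = new Uint8Array(buffer);

// A second view over a subset: 4 bytes starting at byte offset 4.
const tail = new Uint8Array(buffer, 4, 4);

// Writing through one view is visible through the other,
// because both views share the same underlying memory.
tail[0] = 255;

console.log(buffer.byteLength); // 8
console.log(bytes[4]);          // 255
console.log(tail.length);       // 4
```

The buffer itself has no read or write methods for individual bytes; all access goes through a view.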

DataView

The first view that allows reading and writing of ArrayBuffer is DataView, which is a universal array buffer view. This view is designed for file I/O and network I/O, and its API supports a high degree of control over buffered data, but performance is poorer than other types of views.

An example of use is as follows:

let buffer = new ArrayBuffer(10)
let view = new DataView(buffer)

DataView has the following properties:

  • buffer: the ArrayBuffer bound by the view;
  • byteOffset: the offset, in bytes, from the start of the buffer at which the view begins. It is set by the second parameter of the DataView constructor and defaults to 0;
  • byteLength: the length of the view in bytes. It is set by the third parameter of the DataView constructor and defaults to the number of bytes from byteOffset to the end of the buffer.
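
A short sketch of how these three properties respond to the constructor arguments (the variable names are illustrative):

```javascript
const buffer = new ArrayBuffer(10);

// No offset or length given: the view covers the whole buffer.
const whole = new DataView(buffer);
console.log(whole.byteOffset, whole.byteLength); // 0 10

// Offset only: the view runs from byte 2 to the end of the buffer.
const fromTwo = new DataView(buffer, 2);
console.log(fromTwo.byteOffset, fromTwo.byteLength); // 2 8

// Offset and length: the view covers bytes 2 through 5.
const middle = new DataView(buffer, 2, 4);
console.log(middle.byteOffset, middle.byteLength); // 2 4

// All three views are bound to the same ArrayBuffer.
console.log(middle.buffer === buffer); // true
```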

DataView makes no assumption about the type of the data stored in the buffer. Its API requires developers to specify an ElementType on every read and write, and DataView then converts according to that type. The 8 ElementTypes below are supported by DataView (newer engines also add BigInt64 and BigUint64):

type | bytes | description
Int8 | 1 | 8-bit signed integer
Uint8 | 1 | 8-bit unsigned integer
Int16 | 2 | 16-bit signed integer
Uint16 | 2 | 16-bit unsigned integer
Int32 | 4 | 32-bit signed integer
Uint32 | 4 | 32-bit unsigned integer
Float32 | 4 | 32-bit IEEE-754 floating point number
Float64 | 8 | 64-bit IEEE-754 floating point number

Each of the above types exposes a get and a set method, such as getInt8(byteOffset) and setFloat32(byteOffset, value, littleEndian). The littleEndian argument is optional and only matters for types wider than one byte. For a more detailed introduction, see: DataView.
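
The following sketch packs three values of different types into one buffer and reads them back. The record layout (offsets 0, 1 and 3) is made up for illustration only.

```javascript
const buffer = new ArrayBuffer(7);
const view = new DataView(buffer);

view.setUint8(0, 7);            // 1 byte at offset 0
view.setInt16(1, -2, true);     // 2 bytes at offset 1, little-endian
view.setFloat32(3, 1.5, true);  // 4 bytes at offset 3, little-endian

console.log(view.getUint8(0));         // 7
console.log(view.getInt16(1, true));   // -2
console.log(view.getFloat32(3, true)); // 1.5

// The same two bytes read as an unsigned type give a different number.
console.log(view.getUint16(1, true));  // 65534
```

Because the type is chosen per call, DataView can read values at any byte offset, including offsets that are not a multiple of the value's size.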

TypedArray

A TypedArray is another kind of ArrayBuffer view. Unlike the generic DataView, each TypedArray is bound to one specific data type, which it uses for every read and write of the underlying buffer. TypedArrays follow the platform's native endianness.

There are several types of TypedArray:

constructor name | bytes | description
Int8Array | 1 | 8-bit signed integer
Uint8Array | 1 | 8-bit unsigned integer
Uint8ClampedArray | 1 | 8-bit unsigned integer (clamped)
Int16Array | 2 | 16-bit signed integer
Uint16Array | 2 | 16-bit unsigned integer
Int32Array | 4 | 32-bit signed integer
Uint32Array | 4 | 32-bit unsigned integer
Float32Array | 4 | 32-bit IEEE-754 floating point number
Float64Array | 8 | 64-bit IEEE-754 floating point number
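
As a rough illustration of the constructors listed above, a typed array can be created from a length, from an array of numbers, or on top of an existing ArrayBuffer. The variable names are illustrative.

```javascript
// From a length: 4 zero-initialised 16-bit elements (8 bytes in total).
const a = new Int16Array(4);
console.log(a.length, a.byteLength, Int16Array.BYTES_PER_ELEMENT); // 4 8 2

// From an array of numbers: values are converted to the element type.
const b = new Float32Array([1, 2.5, 3]);
console.log(b[1]); // 2.5

// From an existing ArrayBuffer: the views share the buffer's memory.
const buffer = new ArrayBuffer(8);
const c = new Uint32Array(buffer); // 2 elements of 4 bytes each
const d = new Uint8Array(buffer);  // 8 elements of 1 byte each
c[0] = 0xffffffff;                 // sets all four bytes of element 0
console.log(c.length, d.length);   // 2 8
console.log(d[0], d[3], d[4]);     // 255 255 0
```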

Uint8ClampedArray is roughly the same as Uint8Array. The only difference is how out-of-range values are handled on write: if a value being stored is less than 0 or greater than 255, Uint8ClampedArray converts it to 0 or 255 respectively. For example, -1 becomes 0 and 300 becomes 255.
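
The difference is easiest to see side by side. In this small sketch the same three values are written into both array types; Uint8Array wraps out-of-range values modulo 256, while Uint8ClampedArray clamps them.

```javascript
const wrapped = new Uint8Array(3);
const clamped = new Uint8ClampedArray(3);

const values = [-1, 300, 128];
values.forEach((value, i) => {
  wrapped[i] = value; // out-of-range values wrap around (modulo 256)
  clamped[i] = value; // out-of-range values are clamped to 0..255
});

console.log(Array.from(wrapped)); // [255, 44, 128]
console.log(Array.from(clamped)); // [0, 255, 128]
```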

According to Brendan Eich, the father of JavaScript: "Uint8ClampedArray is completely a historical relic of the HTML5 canvas element. Unless you are really doing canvas related development, don't use it."

For more detailed usage of TypedArray, see the documentation: TypedArray.

Endianness

Typed arrays let you view the same byte sequence as 8-, 16-, 32-, or 64-bit values. This raises the issue of "endianness". Endianness is the byte-order convention used by a computer system. There are two kinds, big-endian and little-endian:

  • Big-endian: The high-order byte comes first, and the low-order byte follows. This is the way humans read and write values.
  • Little-endian: the low-order byte comes first, and the high-order byte follows. For example, the value 0x2211 occupies two bytes: the high-order byte is 0x22 and the low-order byte is 0x11, so in little-endian order the bytes are stored as 0x11 0x22.
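
To make the two layouts concrete, the sketch below stores 0x2211 in both byte orders and prints the raw bytes. It relies on the optional littleEndian argument of the DataView setters; toHex is a hypothetical helper defined only for this example.

```javascript
const buffer = new ArrayBuffer(2);
const view = new DataView(buffer);
const bytes = new Uint8Array(buffer);

// Hypothetical helper: format each byte as a two-digit hex string.
const toHex = (arr) => Array.from(arr, (b) => b.toString(16).padStart(2, "0"));

view.setUint16(0, 0x2211, false); // big-endian: high-order byte first
const bigEndianBytes = toHex(bytes);
console.log(bigEndianBytes); // ["22", "11"]

view.setUint16(0, 0x2211, true); // little-endian: low-order byte first
const littleEndianBytes = toHex(bytes);
console.log(littleEndianBytes); // ["11", "22"]
```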

The endianness of the underlying platform can be determined using the following code:

// If the integer 0x00000001 is laid out in memory as 01 00 00 00,
// the platform uses little-endian byte order.
// On a big-endian platform it would be 00 00 00 01.
let littleEndian = new Int8Array(new Int32Array([1]).buffer)[0] === 1

Most common CPUs currently on the market run in little-endian mode. Many network protocols and some binary file formats, however, require big-endian byte ordering.

For efficiency, TypedArrays use the native endianness of the underlying hardware. The DataView described above does not follow this convention. DataView is a neutral interface to a piece of memory and uses whatever endianness you specify. All DataView methods default to big-endian; passing true as the littleEndian argument switches them to little-endian.

const buf = new ArrayBuffer(2)
const view = new DataView(buf)

// read Uint16 in little endian byte order
view.getUint16(0, true)
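
The sketch below ties the two kinds of view together: a value written through a TypedArray (native byte order) is read back through a DataView with an explicit byte order. It reuses the endianness check shown earlier in this section, so the result should hold on either kind of platform.

```javascript
// Detect the platform byte order, as in the earlier snippet.
const littleEndian = new Int8Array(new Int32Array([1]).buffer)[0] === 1;

const buffer = new ArrayBuffer(2);
new Uint16Array(buffer)[0] = 0x1234; // written in native byte order

const view = new DataView(buffer);

// Reading with the platform's own byte order returns the original value.
const sameOrder = view.getUint16(0, littleEndian);
console.log(sameOrder === 0x1234); // true

// Reading with the opposite byte order returns the two bytes swapped.
const oppositeOrder = view.getUint16(0, !littleEndian);
console.log(oppositeOrder === 0x3412); // true
```

When exchanging data with a network protocol or file format, passing the byte order explicitly to DataView avoids depending on the machine the code happens to run on.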

Stream

The Streams API was created so that web applications can consume information in small, ordered chunks rather than as one large block. This capability is useful in the following scenarios:

  • Large chunks of information may not be available all at once: responses to network requests are a typical example. Network payloads are delivered in sequential packets, and streaming allows applications to consume data as soon as it arrives, rather than waiting for all the data to be loaded.
  • Large chunks of data may need to be processed in smaller parts. Such as video processing, data compression, etc.

The Streams API directly addresses network requests and disk reads and writes. It defines three kinds of streams:

  • Readable Streams: A stream that reads chunks of data through a public interface. Data enters the stream internally from the underlying source and is then processed by the consumer;
  • Writable Streams: a stream that writes data blocks through a public interface. The producer writes data to the stream, and the data is passed internally to the underlying data sink;
  • Transform Streams: It consists of two streams, the writable stream is used to receive data (writable end), and the readable stream is used to output data (readable end). Between these two streams is a transformer that inspects and modifies the stream content as needed.

The basic unit of a stream is a chunk. A chunk can be of any data type, but is usually a TypedArray. Each chunk is a discrete stream fragment that can be processed as a whole. Chunks are not fixed in size and do not necessarily arrive at regular intervals.
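
As a minimal sketch of how the three kinds of streams fit together, the example below pipes Uint8Array chunks from a readable stream, through a transform stream, into a writable stream. The helper names (makeSource, makeInverter, run) and the byte-inverting transform are made up for illustration. It assumes an environment where ReadableStream, TransformStream and WritableStream are available as globals, such as modern browsers or recent Node.js versions.

```javascript
// Readable stream: the underlying source is a fixed list of chunks.
function makeSource(chunks) {
  return new ReadableStream({
    start(controller) {
      for (const chunk of chunks) controller.enqueue(chunk);
      controller.close();
    },
  });
}

// Transform stream: inverts every byte of each chunk passing through.
function makeInverter() {
  return new TransformStream({
    transform(chunk, controller) {
      controller.enqueue(chunk.map((byte) => byte ^ 0xff));
    },
  });
}

// Writable stream: the underlying sink collects bytes into an array.
async function run() {
  const received = [];
  const sink = new WritableStream({
    write(chunk) {
      received.push(...chunk);
    },
  });

  const source = makeSource([new Uint8Array([1, 2]), new Uint8Array([3])]);
  await source.pipeThrough(makeInverter()).pipeTo(sink);
  return received;
}

const done = run();
done.then((bytes) => console.log(bytes)); // [254, 253, 252]
```

Each chunk is processed as soon as it is enqueued, so the consumer never needs the whole payload in memory at once.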

Blob

Blob is related to file reading. In some cases, we need to read part of a file instead of the whole file. For this purpose, the File object provides a slice() method, which returns a Blob instance. The File interface is based on Blob: it inherits Blob's functionality and extends it to support files on the user's system.

Blob stands for binary large object, which is JavaScript's wrapper type for immutable binary data. A Blob can be created from an array containing strings, ArrayBuffers, ArrayBufferViews, and even other Blobs. Its data can be read as text or in binary form, and it can also be converted into a ReadableStream for processing.

Blobs have two properties:

  • Blob.prototype.size: the size, in bytes, of the data contained in the Blob object;
  • Blob.prototype.type: a string indicating the MIME type of the data contained in the Blob object. If the type is unknown, the value is an empty string.

The instance methods of Blob are as follows:

  • Blob.prototype.arrayBuffer(): returns a promise that resolves to an ArrayBuffer containing all the contents of the Blob as binary data;
  • Blob.prototype.slice(): Returns a new Blob object that contains the data within the specified range of the source Blob object;
  • Blob.prototype.stream(): returns a ReadableStream that can read the contents of the Blob;
  • Blob.prototype.text(): returns a promise that resolves to a UTF-8 string containing all the contents of the blob.
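
The properties and methods above can be tried out with a small sketch like the following. The function name inspectBlob is hypothetical, and the example assumes an environment with a global Blob (browsers and recent Node.js versions).

```javascript
async function inspectBlob() {
  // Build a Blob from a string part and a typed-array part ("world").
  const blob = new Blob(["hello ", new Uint8Array([119, 111, 114, 108, 100])], {
    type: "text/plain",
  });

  const text = await blob.text();             // whole content as a string
  const head = await blob.slice(0, 5).text(); // first 5 bytes as a new Blob
  const bytes = new Uint8Array(await blob.arrayBuffer());

  return { size: blob.size, type: blob.type, text, head, firstByte: bytes[0] };
}

const info = inspectBlob();
info.then((result) => {
  console.log(result.size);      // 11
  console.log(result.type);      // "text/plain"
  console.log(result.text);      // "hello world"
  console.log(result.head);      // "hello"
  console.log(result.firstByte); // 104
});
```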

Buffer

Finally, let's talk about Buffer. Unlike the concepts above, Buffer is unique to Node.js. The Buffer class is in fact a subclass of JavaScript's Uint8Array, extended with additional methods.

Buffer instances are also JavaScript Uint8Array and TypedArray instances. All TypedArray methods are available on Buffers. There are, however, subtle incompatibilities between the Buffer API and the TypedArray API.

In particular:

  • While TypedArray.prototype.slice() creates a copy of part of the TypedArray, Buffer.prototype.slice() creates a view over the existing Buffer without copying. This behavior can be surprising, and only exists for legacy compatibility. TypedArray.prototype.subarray() can be used to achieve the behavior of Buffer.prototype.slice() on both Buffers and other TypedArrays and should be preferred.
  • buf.toString() is incompatible with its TypedArray equivalent.
  • A number of methods, e.g. buf.indexOf(), support additional arguments.
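
A short sketch of these points, runnable in Node.js only (Buffer is not available in browsers). It uses subarray() for the view behavior, as recommended above, rather than calling Buffer.prototype.slice() directly.

```javascript
// Node.js only: Buffer is a global there.
const buf = Buffer.from([1, 2, 3, 4]);
console.log(buf instanceof Uint8Array); // true

// Uint8Array.prototype.slice() copies: the original is unaffected.
const typed = new Uint8Array([1, 2, 3, 4]);
const copy = typed.slice(0, 2);
copy[0] = 9;
console.log(typed[0]); // 1

// subarray() creates a view on both types: writes show through.
const part = buf.subarray(0, 2);
part[0] = 9;
console.log(buf[0]); // 9

// toString() differs: Buffer decodes the bytes as text (UTF-8 by default),
// while Uint8Array joins the numbers with commas.
const asBuffer = Buffer.from([104, 105]).toString();
const asTyped = new Uint8Array([104, 105]).toString();
console.log(asBuffer); // "hi"
console.log(asTyped);  // "104,105"
```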

Summary

These are the main concepts related to binary data in JS. Finally, a picture summarizes the relationships between the concepts mentioned above:

[Image: diagram summarizing the relationships between the concepts above]