In JavaScript, there are many concepts related to binary data, such as Buffer, TypedArray, ArrayBuffer, Blob, Stream, and so on. How do these concepts relate to each other, and what is each one used for? That is the focus of this article.
TypedArray
First, let's introduce TypedArray. A TypedArray is a specialized array for processing numeric data (not all data types); ArrayBuffer is just one of the concepts involved.
History of TypedArray
Typed arrays were first used in WebGL, which is a port of OpenGL ES 2.0 to the web. In early versions of WebGL, a mismatch between JavaScript arrays and native arrays caused performance problems.
JavaScript numbers are stored in double-precision floating-point format (IEEE 754 64-bit), but graphics driver APIs generally don't expect values in that format. So every time an array is passed between WebGL and the JavaScript runtime, WebGL has to allocate a new array in the target environment, iterate over the original array, and convert each value into the proper format in the new array, which takes a lot of time.
To solve this problem, Mozilla implemented CanvasFloatArray, which provides a JavaScript interface to C-style arrays of floating-point values. Eventually that type became Float32Array, one of the TypedArray types.
ArrayBuffer
ArrayBuffer is the basis of all TypedArrays. It is a region of memory that holds a fixed number of bytes, called a "byte array" in other languages. Creating an ArrayBuffer is similar to calling malloc() in C to allocate memory, except that there is no need to specify the data type the memory block contains.
let buffer = new ArrayBuffer(10); // allocate 10 bytes in memory
Note: An ArrayBuffer created this way has a fixed size and cannot be resized.
A block of storage is useless on its own: we need to write data to it, and for that we need to create a view.
The ArrayBuffer is the block of memory, and a view is the interface used to operate on that memory. A view can cover a whole buffer or a subset of it, and it reads and writes data as one of the numeric data types.
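As a minimal sketch of the buffer/view relationship, the following creates two views over one buffer, one covering the whole buffer and one covering a subset. It uses Uint8Array (a TypedArray, described later in this article) purely for illustration; the variable names are assumptions, not part of any API.

```javascript
// Allocate 8 bytes; every byte starts out as 0.
const buffer = new ArrayBuffer(8);

// A view over the whole buffer, reading and writing unsigned 8-bit integers.
const whole = new Uint8Array(buffer);

// A view over a subset: 4 bytes starting at byte offset 2.
const part = new Uint8Array(buffer, 2, 4);

// Writing through one view is visible through the other,
// because both operate on the same underlying memory.
part[0] = 0xab;

console.log(buffer.byteLength); // 8
console.log(part.length);       // 4
console.log(whole[2]);          // 171 (0xab)
```

Neither view copies any data; the buffer is the only place the bytes live.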
DataView
The first view that allows reading and writing an ArrayBuffer is DataView, a general-purpose array buffer view. It is designed for file I/O and network I/O. Its API gives a high degree of control over the buffer data, but its performance is poorer than that of other view types.
An example of use is as follows:
let buffer = new ArrayBuffer(10)
let view = new DataView(buffer)
DataView has the following properties:
- buffer: the ArrayBuffer bound to the view;
- byteOffset: the offset (in bytes) of the view from the start of the buffer, set by the second parameter of the DataView constructor; the default is 0;
- byteLength: the length (in bytes) of the view, set by the third parameter of the DataView constructor; the default is the buffer's byteLength minus byteOffset.
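The three properties can be checked with a short sketch like the one below. The variable names (`full`, `tail`, `slice`) are illustrative assumptions.

```javascript
const buffer = new ArrayBuffer(10);

// No offset or length given: the view covers the whole buffer.
const full = new DataView(buffer);
console.log(full.byteOffset, full.byteLength); // 0 10

// Offset only: the view runs from byte 4 to the end of the buffer.
const tail = new DataView(buffer, 4);
console.log(tail.byteOffset, tail.byteLength); // 4 6

// Offset and length: the view covers bytes 4 and 5 only.
const slice = new DataView(buffer, 4, 2);
console.log(slice.byteOffset, slice.byteLength); // 4 2

// All three views are bound to the same ArrayBuffer.
console.log(slice.buffer === buffer); // true
```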
DataView has no default for the data type stored in the buffer. Its API requires developers to specify an ElementType when reading and writing, and DataView then converts according to the specified type. The table below lists 8 ElementTypes supported by DataView (newer JavaScript engines additionally support BigInt64 and BigUint64):
type | bytes | description |
---|---|---|
Int8 | 1 | 8-bit signed integer |
Uint8 | 1 | 8-bit unsigned integer |
Int16 | 2 | 16-bit signed integer |
Uint16 | 2 | 16-bit unsigned integer |
Int32 | 4 | 32-bit signed integer |
Uint32 | 4 | 32-bit unsigned integer |
Float32 | 4 | 32-bit IEEE-754 floating point number |
Float64 | 8 | 64-bit IEEE-754 floating point |
Each of the above types exposes get and set methods, such as getInt16(byteOffset, littleEndian) and setFloat32(byteOffset, value, littleEndian). The single-byte methods, such as getInt8(byteOffset), take no littleEndian argument. For a more detailed introduction, see: DataView.
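Here is a small sketch of the get and set methods in use. The offsets and values are arbitrary choices made for illustration.

```javascript
const buffer = new ArrayBuffer(8);
const view = new DataView(buffer);

view.setInt8(0, -1);           // 1 byte at offset 0
view.setUint16(1, 0x1234);     // 2 bytes at offset 1 (big-endian by default)
view.setFloat32(4, 1.5, true); // 4 bytes at offset 4, little-endian

console.log(view.getInt8(0));          // -1
console.log(view.getUint8(0));         // 255 (the same byte read as unsigned)
console.log(view.getUint16(1));        // 4660 (0x1234)
console.log(view.getFloat32(4, true)); // 1.5
```

Note that the same byte yields -1 or 255 depending on the ElementType used to read it: the buffer stores only bytes, and the interpretation comes from the method you call.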
TypedArray
A TypedArray is another kind of ArrayBuffer view. Unlike the generic DataView object, each TypedArray is bound to one specific data type, which it uses for every read and write on the buffer. TypedArrays follow the platform's native endianness.
There are several types of TypedArray:
constructor name | bytes | description |
---|---|---|
Int8Array | 1 | 8-bit signed integer |
Uint8Array | 1 | 8-bit unsigned integer |
Uint8ClampedArray | 1 | 8-bit unsigned integer (clamped) |
Int16Array | 2 | 16-bit signed integer |
Uint16Array | 2 | 16-bit unsigned integer |
Int32Array | 4 | 32-bit signed integer |
Uint32Array | 4 | 32-bit unsigned integer |
Float32Array | 4 | 32-bit IEEE floating point number |
Float64Array | 8 | 64-bit IEEE floating-point numbers |
Uint8ClampedArray is roughly the same as Uint8Array. The only difference is that if a value written to the array is less than 0 or greater than 255, Uint8ClampedArray converts it to 0 or 255 respectively. For example, -1 becomes 0 and 300 becomes 255.
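The difference is easiest to see side by side. The sketch below writes the same out-of-range values into both array types; the `inputs` values are chosen only for illustration.

```javascript
const plain = new Uint8Array(3);
const clamped = new Uint8ClampedArray(3);

// Write the same values, two of them out of range, into both arrays.
const inputs = [-1, 300, 128];
inputs.forEach((value, i) => {
  plain[i] = value;   // out-of-range values wrap around modulo 256
  clamped[i] = value; // out-of-range values are clamped to 0..255
});

console.log(Array.from(plain));   // [255, 44, 128]
console.log(Array.from(clamped)); // [0, 255, 128]
```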
According to Brendan Eich, the father of JavaScript: "Uint8ClampedArray is completely a historical relic of the HTML5 canvas element. Unless you are really doing canvas-related development, don't use it."
For more detailed usage of TypedArray, see the documentation: TypedArray.
Endianness
Typed arrays make it possible to view the same byte sequence as 8-, 16-, 32-, or 64-bit values. This raises the issue of "endianness". Endianness is the byte order convention a computer system uses when storing multi-byte values. There are two kinds, big-endian and little-endian:
- Big-endian: the high-order byte comes first, and the low-order byte follows. This is the way humans read and write values.
- Little-endian: the low-order byte comes first, and the high-order byte follows.
For example, the value 0x2211 is stored in two bytes: the high-order byte is 0x22 and the low-order byte is 0x11. In big-endian order the bytes are stored as 0x22 0x11, and in little-endian order as 0x11 0x22.
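The byte layout of 0x2211 can be inspected directly. This sketch writes the value with DataView in each byte order and reads the raw bytes back through a Uint8Array over the same buffer.

```javascript
const buffer = new ArrayBuffer(2);
const view = new DataView(buffer);
const bytes = new Uint8Array(buffer);

// Format each raw byte as a hexadecimal string.
const toHex = () => Array.from(bytes, (b) => b.toString(16));

view.setUint16(0, 0x2211, false); // write in big-endian order
const bigEndianBytes = toHex();
console.log(bigEndianBytes); // ["22", "11"]

view.setUint16(0, 0x2211, true); // write in little-endian order
const littleEndianBytes = toHex();
console.log(littleEndianBytes); // ["11", "22"]
```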
The endianness of the underlying platform can be determined using the following code:
// If the integer 0x00000001 is laid out in memory as 01 00 00 00,
// the platform uses little-endian byte order.
// On a big-endian platform the layout would be 00 00 00 01.
const littleEndian = new Int8Array(new Int32Array([1]).buffer)[0] === 1
Most common CPUs on the market today are little-endian. Many network protocols and some binary file formats, however, require big-endian byte order.
For efficiency, TypedArrays use the native endianness of the underlying hardware. The DataView mentioned above does not follow this convention. DataView is a neutral interface to a piece of memory and follows the endianness you specify. All DataView API methods default to big-endian, but little-endian can be selected by passing true as the littleEndian argument.
const buf = new ArrayBuffer(2)
const view = new DataView(buf)
// read Uint16 in little endian byte order
view.getUint16(0, true)
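To show why the big-endian default suits network data, here is a sketch that parses a made-up packet header. The 6-byte layout (a 2-byte message type followed by a 4-byte payload length) and the `parseHeader` helper are hypothetical, invented only for this example.

```javascript
// Hypothetical header: 2-byte message type, then 4-byte payload length,
// both stored in big-endian ("network") byte order.
const packet = new Uint8Array([0x00, 0x02, 0x00, 0x00, 0x01, 0x00]);

function parseHeader(bytes) {
  // Respect the offset and length of the incoming view, so the function
  // also works when `bytes` covers only part of a larger buffer.
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return {
    type: view.getUint16(0),   // no littleEndian argument: big-endian
    length: view.getUint32(2), // big-endian as well
  };
}

const header = parseHeader(packet);
console.log(header); // { type: 2, length: 256 }

// Reading the same two bytes as little-endian gives a different number.
const wrongType = new DataView(packet.buffer).getUint16(0, true);
console.log(wrongType); // 512
```

Because DataView ignores the platform's native byte order, this code gives the same result on little-endian and big-endian machines.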
Stream
The Streams API was created so that web applications can consume information in small chunks, in order, rather than in one large block. This capability is useful in the following scenarios:
- Large chunks of information may not be available all at once: responses to network requests are a typical example. Network payloads are delivered as a sequence of packets, and streaming allows applications to consume data as soon as it arrives, rather than waiting for all the data to be loaded.
- Large chunks of data may need to be processed in smaller parts, as in video processing or data compression.
The Streams API is aimed most directly at handling network requests and reading and writing to disk. It defines three kinds of streams:
- Readable Streams: A stream that reads chunks of data through a public interface. Data enters the stream internally from the underlying source and is then processed by the consumer;
- Writable Streams: a stream that writes data blocks through a public interface. The producer writes data to the stream, and the data is passed internally to the underlying data sink;
- Transform Streams: It consists of two streams, the writable stream is used to receive data (writable end), and the readable stream is used to output data (readable end). Between these two streams is a transformer that inspects and modifies the stream content as needed.
The basic unit of a stream is a chunk. A chunk can be of any data type, but chunks are usually TypedArrays. Each chunk is a discrete stream fragment that can be processed as a whole. Chunks are not fixed in size and do not necessarily arrive at regular intervals.
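The three kinds of streams can be wired together in a few lines. The sketch below assumes a runtime where ReadableStream and TransformStream are globals (modern browsers and Node.js 18 or later); the helper names `makeByteStream`, `makeInverter`, and `readAllBytes` are made up for this example.

```javascript
// A readable stream whose underlying source emits Uint8Array chunks.
function makeByteStream(chunks) {
  return new ReadableStream({
    start(controller) {
      for (const chunk of chunks) controller.enqueue(chunk);
      controller.close();
    },
  });
}

// A transform stream whose transformer inverts every byte of each chunk.
function makeInverter() {
  return new TransformStream({
    transform(chunk, controller) {
      controller.enqueue(chunk.map((byte) => byte ^ 0xff));
    },
  });
}

// The consumer: read the stream chunk by chunk and collect all bytes.
async function readAllBytes(stream) {
  const reader = stream.getReader();
  const bytes = [];
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    bytes.push(...value);
  }
  return bytes;
}

const source = makeByteStream([new Uint8Array([0x00, 0x0f]), new Uint8Array([0xff])]);
const resultPromise = readAllBytes(source.pipeThrough(makeInverter()));
resultPromise.then((bytes) => console.log(bytes)); // [255, 240, 0]
```

The chunks here have different sizes (two bytes, then one byte), which matches the point above that chunk size is not fixed.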
Blob
Blob is related to file reading. In some cases, we need to read part of a file instead of the whole file. For this purpose, the File object provides a method called slice(), which returns a Blob instance. The File interface is based on Blob: it inherits Blob's functionality and extends it to support files on the user's system.
Blob stands for binary large object; it is JavaScript's wrapper type for immutable binary data. A Blob can be created from an array containing strings, ArrayBuffers, ArrayBufferViews, and even other Blobs. Its data can be read as text or as binary, and it can also be converted into a ReadableStream for further processing.
Blobs have two properties:
- Blob.prototype.size: the size (in bytes) of the data contained in the Blob object;
- Blob.prototype.type: a string indicating the MIME type of the data the Blob contains. If the type is unknown, the value is an empty string.
The instance methods of Blob are as follows:
- Blob.prototype.arrayBuffer(): returns a promise that resolves to an ArrayBuffer containing all the contents of the Blob as binary data;
- Blob.prototype.slice(): returns a new Blob object that contains the data within the specified range of the source Blob object;
- Blob.prototype.stream(): returns a ReadableStream that can read the contents of the Blob;
- Blob.prototype.text(): returns a promise that resolves to a UTF-8 string containing all the contents of the Blob.
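The properties and methods above can be tried out with a small sketch. It assumes a runtime where Blob is a global (modern browsers and Node.js 18 or later); the contents of the Blob and the `inspect` helper are made up for illustration.

```javascript
// Build a Blob from a string and a Uint8Array (the bytes spell "world").
const blob = new Blob(["hello, ", new Uint8Array([119, 111, 114, 108, 100])], {
  type: "text/plain",
});

console.log(blob.size); // 12
console.log(blob.type); // "text/plain"

// slice() returns a new Blob covering bytes 0 to 4 of the original.
const firstFive = blob.slice(0, 5);

async function inspect() {
  const text = await blob.text();          // whole Blob decoded as UTF-8
  const partText = await firstFive.text(); // only the sliced range
  const buffer = await blob.arrayBuffer(); // the same data as raw bytes
  return { text, partText, byteLength: buffer.byteLength };
}

const inspected = inspect();
inspected.then((result) => console.log(result));
// { text: "hello, world", partText: "hello", byteLength: 12 }
```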
Buffer
Finally, let's talk about Buffer. Unlike the types above, Buffer is unique to Node.js. The Buffer class is in fact a subclass of JavaScript's Uint8Array that extends it with additional methods.
Buffer instances are also JavaScript Uint8Array and TypedArray instances. All TypedArray methods are available on Buffers. There are, however, subtle incompatibilities between the Buffer API and the TypedArray API.
In particular:
- While TypedArray.prototype.slice() creates a copy of part of the TypedArray, Buffer.prototype.slice() creates a view over the existing Buffer without copying. This behavior can be surprising, and only exists for legacy compatibility. TypedArray.prototype.subarray() can be used to achieve the behavior of Buffer.prototype.slice() on both Buffers and other TypedArrays and should be preferred.
- buf.toString() is incompatible with its TypedArray equivalent.
- A number of methods, e.g. buf.indexOf(), support additional arguments.
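The first two incompatibilities can be observed with the following sketch, which is meant to run in Node.js. The variable names are illustrative. Note that the Node.js documentation marks Buffer.prototype.slice() as deprecated in favor of subarray(); it is called here only to show the behavior described above.

```javascript
// Buffer.prototype.slice() returns a view: writes reach the original.
const buf = Buffer.from([1, 2, 3, 4]);
const bufSlice = buf.slice(0, 2);
bufSlice[0] = 99;
console.log(buf[0]); // 99

// TypedArray.prototype.slice() returns a copy: the original is untouched.
const u8 = new Uint8Array([1, 2, 3, 4]);
const u8Slice = u8.slice(0, 2);
u8Slice[0] = 99;
console.log(u8[0]); // 1

// subarray() returns a view on Buffers and other TypedArrays alike.
const u8View = u8.subarray(0, 2);
u8View[1] = 77;
console.log(u8[1]); // 77

// A Buffer is a Uint8Array, but toString() behaves differently.
console.log(buf instanceof Uint8Array);             // true
console.log(Buffer.from([104, 105]).toString());    // "hi" (decoded as UTF-8)
console.log(new Uint8Array([104, 105]).toString()); // "104,105"
```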
Summary
These are some of the concepts related to binary data in JS. Finally, here is a picture that summarizes the relationships between them: