Because I use binary so infrequently, it seems I have to re-learn it almost every time I use it. This is an attempt to finally put all of that learning in one place.
To get you to a practical understanding of binary, we will first go through some concepts and then work through a couple of bit manipulation examples. Some notable topics we will cover:
- Binary to Decimal conversion
- Hexadecimal
- ASCII table
- Big Endian/Little Endian
Introduction to Binary
What is binary - Binary is a base 2 numbering system using only 1s and 0s. Base 2 means there are only 2 possible values for each digit. Our standard numbering system is base 10, which means there are 10 possible values for each digit (0-9).
Why Binary - Binary is used in computers because it is what can be implemented in a circuit. 1s and 0s represent open and closed circuits, implemented initially with vacuum tubes and now with transistors.
Binary to Decimal Conversion
In a series of binary digits (ex: 1010), counting the digit positions starts at the rightmost digit. That position is 0, and the positions count upward as you move to the left.
The digit value can only be 0 or 1.
To convert each digit to its corresponding decimal value, we multiply the digit value by 2 ^ (digit position), where ^ stands for exponent and 2 comes from the fact that binary is a base 2 numbering system. Summing those values across all digit positions gives the decimal number.
The binary number 1010 = 10 in decimal. Here is how that works: 1 * 2^3 + 0 * 2^2 + 1 * 2^1 + 0 * 2^0 = 8 + 0 + 2 + 0 = 10.
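If you want to double-check a conversion in code, JavaScript's built-in parseInt (with a radix argument) and toString (with a radix argument) will do it. A minimal sketch:
var bin = '1010';
console.log(parseInt(bin, 2)); // 10 - parse the string as a base 2 number
console.log((10).toString(2)); // '1010' - format a number as a base 2 string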
The Windows calculator is a great tool for playing with these values. (Click the 3-line drop down in the upper left, or just press <alt+4>, to switch to Programmer mode.)
This screenshot shows the number 1010 in a large font on the right and its conversions to decimal (10), HEX (A), and OCT (12) on the left. Clicking one of the bases on the left changes which representation is shown in the large font on the right and which values are then converted to on the left.
Definitions
bit - the smallest unit of memory in a computer. A bit stores either a 1 or a 0.
byte - a byte contains 8 bits.
word - a word contains multiple bytes. Most commonly a word is 16 bits (2 bytes) or 32 bits (4 bytes), but it depends on the architecture of the system in question.
int32 - a common integer type. This is an integer with 32 bits, or 4 bytes. There are other sizes of integers (int8, int64). The most significant bit (leftmost bit) of an int32 represents the sign of the number, where 0 = positive and 1 = negative.
uint32 - an unsigned integer type. The letter "u" indicates the most significant bit does NOT represent a sign. As a result, a uint32 has one extra bit available for the value and can hold a larger maximum number than an int32.
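To see the difference the sign bit makes, here is a minimal sketch (assuming a NodeJS environment, since it uses the Buffer object covered later) that reads the same four bytes as a signed and an unsigned 32 bit integer:
var buff = Buffer.from([0xFF, 0xFF, 0xFF, 0xFF]); // all 32 bits set to 1
console.log(buff.readInt32BE(0));  // -1 (most significant bit treated as a sign)
console.log(buff.readUInt32BE(0)); // 4294967295 (all 32 bits treated as value)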
Hexadecimal (Hex)
Hexadecimal and binary often go hand in hand. Hexadecimal simply packs 16 values into one digit instead of just 10. To count up to 15 with one digit, hex uses the letters A-F, where A = 10, B = 11, C = 12, D = 13, E = 14, and F = 15.
Hexadecimal is often denoted with a '0x' in front, like this: 0xF, which means a hexadecimal F, or a decimal 15.
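JavaScript understands the 0x notation directly, and toString/parseInt can convert between hex strings and numbers. A quick sketch:
console.log(0xF);                // 15 - a hex literal
console.log((255).toString(16)); // 'ff' - decimal to a hex string
console.log(parseInt('ff', 16)); // 255 - a hex string back to decimal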
Summary of Binary, Hex, and Decimal
Format | allowed characters | # possible values in one digit |
---|---|---|
Decimal | 0-9 | 10 |
Binary | 0,1 | 2 |
Hex | 0-9, A-F | 16 |
Efficiency of binary storage
In computers, one character is typically stored in one byte. In the standard decimal notation we are all used to, each digit is sent as a character, so the maximum value one byte can hold in decimal notation is 9. A raw byte, however, is 8 bits, and 8 bits can store 256 different values, the numbers 0 through 255. In other words:
Format | allowed characters | max value in one byte |
---|---|---|
Decimal (one digit character per byte) | 0-9 | 9 |
Binary (raw bits, 8 per byte) | 0,1 | 255 |
Hex (one digit character per byte) | 0-9, A-F | 15 |
Integer values possible in one byte
- decimal = 1 character = 0 - 9
- hex = 1 character = 0 - 15
- binary = 8 bits = 0 - 255
Because of this, for large data it is more efficient to send data in binary or hex format. This is why NodeJS streams support the Buffer format (in addition to strings and objects).
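As a rough illustration, here is a minimal sketch comparing the same number sent as a decimal string versus packed into a 4 byte buffer (the value is just an example):
var asText = Buffer.from('879365933'); // one byte per character
console.log(asText.length);            // 9 bytes

var asBinary = Buffer.alloc(4);
asBinary.writeUInt32BE(879365933);     // the whole number fits in 4 bytes
console.log(asBinary.length);          // 4 bytes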
ASCII Table
Before we get too far into binary bit manipulations, it is important to mention the ASCII table.
But first some motivation.
var x = 'Hello World';
var buff = Buffer.from(x);
for(var ctr = 0; ctr < buff.length; ctr++){
console.log(buff[ctr])
}
/* outputs
72
101
108
108
111
32
87
111
114
108
100
*/
What are all these numbers? They are the ASCII decimal equivalents of the provided characters.
Since computers only deal in numbers, they need a way to store and manipulate letters. The ASCII (American Standard Code for Information Interchange) standard solves this problem by providing a mapping of characters to numbers. Looking at the ASCII table below, you can see the output above is simply a translation from the characters (listed in the 'Char' column) to their corresponding decimal equivalents (listed in the 'DEC' column).
From: https://www.asciitable.com/
Or, in summary:
Our inputted char | converted decimal value |
---|---|
'H' | 72 |
'e' | 101 |
'l' | 108 |
'l' | 108 |
'o' | 111 |
Space | 32 |
'W' | 87 |
'o' | 111 |
'r' | 114 |
'l' | 108 |
'd' | 100 |
A way to directly convert from a char to its ASCII code is to use JavaScript's charCodeAt method, like below.
Char to ASCII code =>
var x = '1';
console.log(x.charCodeAt()); //49
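Going the other direction, JavaScript's String.fromCharCode method converts an ASCII code back into its character.
ASCII code to char =>
var code = 72;
console.log(String.fromCharCode(code)); // 'H'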
Little Endian/Big Endian
This is probably one of the most confusing of all binary related topics.
What is endianness? - Endianness refers to the order in which bytes are stored. The most common types of endianness are little endian and big endian.
Why are there different types of endianness? - This comes from computer architectures that were developed independently and chose to order bytes differently.
Basically, little vs big endianness comes down to which byte is stored first in memory: the most significant byte or the least significant byte? The way to remember it is that big endian stores the most significant byte first; little endian stores the least significant byte first.
There are lots of pieces here, and I can't think of a better way to visualize this than with a diagram. Hopefully this helps.
At the end of the day, the number is always represented with the least significant bit on the right and the most significant bit on the left. The process of saving and retrieving the number in memory may swap the bytes one way or the other, but once the number is in your operational workspace, the top part of the above diagram is the way to think about and address the bytes. This threw me off for a while, so separate the storage and retrieval of the number data from operating on and manipulating the numbers. Either you run an endian swap routine on the data when you load it from memory or you don't, but don't start changing your bit and byte manipulations trying to account for this down the line.
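To make this concrete, here is a minimal sketch (using NodeJS Buffers, with 0x12345678 as an arbitrary example value) that writes the same 32 bit number in both byte orders so you can see the bytes swap:
var n = 0x12345678;

var be = Buffer.alloc(4);
be.writeInt32BE(n);
console.log(be); // <Buffer 12 34 56 78> - most significant byte first

var le = Buffer.alloc(4);
le.writeInt32LE(n);
console.log(le); // <Buffer 78 56 34 12> - least significant byte first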
NodeJS buffer manipulations
In NodeJS you will be using the Buffer object to handle your conversions and manipulations. https://nodejs.org/api/buffer.html#buffer
NodeJS buffer object key commands:
buffer.[command]
Command | description |
---|---|
alloc(N) | allocate N bytes of space with 0s in the buffer. |
write(x) | writes the string x to the buffer |
writeInt32BE(x) | writes the integer x to the buffer in big endian format |
readInt32BE(y) | reads an int32 from the buffer, starting y bytes from the start, in big endian format |
subarray(start,end) | pull out a slice of the buffer into a new buffer with start and end being the byte indexes where the end is not inclusive. |
newBuff = Buffer.concat([buff1, buff2]) | combine buffer 1 and buffer 2 into one new buffer (note the array argument). |
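Here is a small sketch exercising a few of the commands from the table (the string 'Hi' is just an example value):
var buff = Buffer.alloc(8);       // 8 bytes, all set to 0
buff.write('Hi');                 // writes the ASCII codes for 'H' (72) and 'i' (105)
console.log(buff);                // <Buffer 48 69 00 00 00 00 00 00>

var slice = buff.subarray(0, 2);  // the first 2 bytes (the end index is not inclusive)
console.log(slice);               // <Buffer 48 69>

var combined = Buffer.concat([slice, Buffer.from('!')]);
console.log(combined.toString()); // 'Hi!'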
The Buffer object is a subclass of JavaScript's Uint8Array. Therefore, when you examine the array contents of a Buffer object, each index represents 8 bits, or 1 byte.
A large number, 879,365,933 for example, requires 4 bytes to store.
879,365,933 = 0x346A0F2D. To input this into a buffer object, we do the following:
var y = 879365933;
var y_buff = Buffer.alloc(4);
y_buff.writeInt32BE(y)
if(y === y_buff.readInt32BE()){
console.log("success");
} //prints: success
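Continuing the example, logging the buffer itself shows the four bytes of 0x346A0F2D stored in big endian order:
console.log(y_buff); // <Buffer 34 6a 0f 2d>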
NodeJS bit manipulations
If we start with the binary value '10' (= 2 in decimal), doing bit manipulation means taking the individual digits (bits) of the binary number and changing them.
To change the number 2 ('10') to the number 4 ('100'), we need to move the 1 one position to the left. This is called bit shifting.
To bit shift to the left in NodeJS, you use the '<<' operator.
result = number to be shifted << number of bits to shift
var x = 2; // '10' in binary
var y = x << 1;
console.log(y); //4 ('100')
In this example a 0 is added as the rightmost bit and the leftmost bit falls off. If you got your number from a Buffer using readInt32BE, the leftmost bit would be the bit in position 31 (counting from 0).
To bit shift back from 4 to 2 we would use the '>>' right shift operator.
var y = 4;
var x = y >> 1;
console.log(x);// 2
In this example a 0 is added as the leftmost bit and the rightmost bit falls off. (For negative numbers, '>>' fills with the sign bit instead; the '>>>' operator always fills with 0.)
For example, what happens if we bit shift 5 ('101') once to the left? We get '1010' which is 10. If we bit shift 5 to the left by 2, we get '10100' which is 20.
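Here is that example in code, using toString(2) to show the binary form of each result:
var x = 5;
console.log(x.toString(2));        // '101'
console.log(x << 1);               // 10
console.log((x << 1).toString(2)); // '1010'
console.log(x << 2);               // 20
console.log((x << 2).toString(2)); // '10100'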
There are other topics we could cover, but this should be a good starting point. Let me know if there is anything I missed that you would like to see included.