Resources:
- Some plain English fundamentals about bits/bytes/words
- Fundamentals of data representation (multi page resource, so make sure to click through each topic)
- Binary numbers + bit shifting overview (see links at bottom of the article)
- Two's Complement
- Sign & Magnitude
- Why two's complement is preferred
- Video on bitwise operators
- Another video on bitwise
- 1st bitmasks video
- 2nd bitmasks video
Other resources used: Geeksforgeeks, backtobackswe
(Also worth mentioning is this article about a creative use of bitwise operators)
Takeaways:
- Binary numbers are base-two. The number system most of us are familiar with is base-ten or decimal. Binary numbers are composed of only 1s and 0s.
- The first bit in a binary number is known as the most-significant-bit, the last bit in a binary number is known as the least-significant-bit (MSB & LSB).
- 1010 in decimal is one thousand and ten. 1010 in binary is ten. That is because each bit in binary is a sequential power of two, instead of a power of ten. To understand this better, see this article
- Negative numbers in binary can be represented in a few ways. The most common way is called two's complement.
- In two's complement we use the first bit of a number to tell us what the sign is (- or +). If the first bit is a 1 then the number is negative. If it's 0 then the number is positive. So 11010, using two's complement, is -6 in decimal (-16 + 8 + 2). Written as 01010, it is ten (8 + 2).
- Another way to represent negative numbers in binary is called Sign & Magnitude
- Sign & magnitude is similar to two's complement, where the first bit indicates if the number is positive or negative. However, we never count the first bit towards the number's value.
- Using sign & magnitude: 1010 is -2 in decimal. 0010 is 2 in decimal
- Fractions in binary are represented in one of two ways: fixed point & floating point.
- For fixed point fractions, take an 8-bit number: 10101000. The first 4 bits represent the number left of the decimal point, the trailing 4 bits represent the fractions after the decimal point. So 10101000 can be visualized as 1010.1000 or 10.5. Why .5? well, the four bits after the decimal point represent (in order): 0.5, 0.25, 0.125, 0.0625. So .1000 = 0.5. See here for a more thorough explanation
- For a floating point number there are 3 components: a sign, exponent, and mantissa. The sign (most-significant-bit) denotes the sign of the number. The exponent portion is after the sign and is the number we should multiply the mantissa by. And the mantissa is the remaining bits. For a more thorough understanding of floating point binary I have found this to be the best explanation
- We can manipulate bits in a binary number using bitwise operations
- The bitwise operators are: AND (&), OR (|), Exclusive OR (^) (also referred to as XOR), Left Shift (<<), Right Shift (>>), and NOT (~) (also referred to as one's complement).
- & takes two bits and returns 1 if both are 1.
- | takes two bits and returns 1 if either or both are set.
- ^ takes two bits and returns 1 only if either is set - not both.
- ~ inverts bits. Example: ~1010 = 0101.
- << moves bits to the left by a specified amount and appends 0 at the right side. A single << is the same as multiplying by two. Example: 8 << 1 = 10000 (16).
- >> does the opposite to <<, it shifts bits to the right, and a single right shift is the equivalent to dividing by two. Example: 8 >> 1 = 0100 (4).
- One important thing about right shifts (>>) is that there are two types: Logical right shift & Arithmetic right shift.
- Arithmetic right shifts preserve the most-significant-bit (it is copied and the copy is retained at the original position). This is important for two's complement (also for sign & magnitude), as it preserves the number's sign.
- Logical right shifts do not retain the most-significant-bit, and instead just pad/insert 0s.
- Arithmetic right shift example: 11000 >> 1 = 11100
- Logical right shift example: 11000 >> 1 = 01100
- In C# right shifts (>>) are arithmetic, unless the integer is unsigned - then they are logical. In Java there are two right shift operators: >> is arithmetic, and >>> is logical.
- An unsigned integer is always positive and we calculate it's value based on all of its bits. Meaning we always include the most-significant-bit when summing the set bits (1s). Unsigned integers, therefore, have a greater possible positive value than a signed integer.
- A signed integer uses the first bit (most-significant-bit) to identify the integer as positive or negative (usually using two's complement). This means signed integers can be negative or positive, but their maximum positive value is less than that of an unsigned integer.
- This is a helpful SO post on signed vs unsigned integers
- A common tool used when manipulating bits is something called a bitmask
- A bitmask is just another binary number that is used to help retain/toggle/clear/set bits in another number. Please see the related resources, or the below code sample for some examples that show the use of bitmasks.
Below you will find some common bit manipulations. I have included comments to explain the operations + provide context:
As always, if you found any errors in this post please let me know!
Top comments (3)
Suggest more info on MSB and LSB, image pointing to bits, also there are different memory formats that can cause the bit order to be inverted depending on the perspective. E.g. binary files and communications.
@andrewharpin thanks for the suggestions and you are absolutely right - there is much more to cover/explain.
For this post (and my week learning the material) I only covered the basics and tried to focus on things that might come up in a FAANG style SWE interview.
Only ensuring those learning understand that there is still more to learn. :-)