Today, a close friend of mine asked me a compelling question:
How the F*** did we make tiny patterns of electricity do all of THIS?
I answered with a tangent about the foundations of computer systems. This post is that tangent, written up a little more eloquently.
All jokes aside, welcome to the world of computing! It's the closest thing we have to magic in this world; it's just that we use numbers instead of mana... lots of numbers.
Circuitry
Everything complex starts from something simple. Everything simple hides a deep complexity. Computers are built from electronic circuits, which are in turn made of smaller components such as logic gates.
Logic gates allow electrical signals to make decisions. An "AND" gate is referred to as such because it receives two signals and produces a signal if given both signal 1 AND signal 2. A "NOT" gate will produce the opposite of what it is given. And so on.
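To make that concrete, here is a minimal sketch of the two gates in Python, treating a signal as the integer 1 (present) or 0 (absent). The function names and_gate and not_gate are made up for illustration; real gates are circuits, not functions.

```python
def and_gate(a: int, b: int) -> int:
    """Produce a signal (1) only when both inputs carry a signal."""
    return 1 if (a == 1 and b == 1) else 0


def not_gate(a: int) -> int:
    """Produce the opposite of the input signal."""
    return 0 if a == 1 else 1


# Print the full truth table for AND, then for NOT.
for a in (0, 1):
    for b in (0, 1):
        print(f"{a} AND {b} = {and_gate(a, b)}")
for a in (0, 1):
    print(f"NOT {a} = {not_gate(a)}")
```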
Complex devices, such as latches and multiplexers, can be formed through logic gates. Logic gates and signals are, in fact, at the core of how computers work with numbers and arithmetic.
Numbers and Arithmetic
As you're likely aware, most of us count in the decimal number system, also known as "base 10." This means we use ten unique digits to represent our numbers: 0 to 9. Computers, on the other hand, use binary, or "base 2." They only have 0 and 1.
The reason is simple. Think about an electrical signal in a circuit: it is either present or absent, on or off. That is a binary state, and it maps neatly onto 1 and 0.
In binary, we form larger numbers by joining bits. Each bit position is worth a power of two (1, 2, 4, 8, and so on), and we add together the values of every "present" bit. The least significant bit, which is worth one when present, is the rightmost bit; therefore, we read from right to left. For example, 101 is 4 + 0 + 1, which is 5.
With computers, we group bits into octets, groups of eight, and we refer to these as bytes. A byte is typically the smallest unit of information a computer can save or retrieve on its own, but not the only unit. By grouping bytes, we can represent larger numbers.
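The right-to-left reading described above can be sketched in a few lines of Python. The helper name bits_to_number is hypothetical, and the bits are passed as a string of "0" and "1" characters purely for readability.

```python
def bits_to_number(bits: str) -> int:
    """Add up the value of every present bit, starting from the rightmost one."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            # Position 0 is worth 1, position 1 is worth 2, then 4, 8, ...
            total += 2 ** position
    return total


print(bits_to_number("101"))       # 4 + 0 + 1
print(bits_to_number("11111111"))  # the largest value a single byte can hold
```

Python's built-in int("101", 2) does the same conversion; the loop is only there to show the arithmetic.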
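As a rough illustration of grouping bytes, the sketch below joins two bytes into one 16-bit number. Which byte counts as the "high" one is a convention (known as byte order); the high-byte-first choice here is an assumption made for the example, and combine_bytes is a made-up name.

```python
def combine_bytes(high: int, low: int) -> int:
    """Treat two bytes as one 16-bit number, with `high` as the weightier byte."""
    # Each step up in byte position is worth 256 times more (2 ** 8).
    return high * 256 + low


print(combine_bytes(1, 0))      # one past the largest single-byte value
print(combine_bytes(255, 255))  # the largest value two bytes can hold

# Python's standard library performs the same grouping:
print(int.from_bytes(bytes([1, 0]), "big"))
```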
Since our numbers are bits, we can manipulate them like electrical signals. Through logic gates, we can create adders. Adders, as the name implies, add bits.
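Here is one way an adder could be sketched in Python, using only gate-like operations on single bits (& for AND, | for OR, ^ for XOR). The names half_adder, full_adder, and add_bytes are illustrative, and chaining eight full adders like this is one common textbook design (a "ripple-carry" adder), not necessarily what any given processor uses.

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    """Add two bits. Returns (sum bit, carry bit)."""
    return a ^ b, a & b


def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Add two bits plus an incoming carry. Returns (sum bit, carry out)."""
    partial_sum, carry_1 = half_adder(a, b)
    total, carry_2 = half_adder(partial_sum, carry_in)
    return total, carry_1 | carry_2


def add_bytes(x: int, y: int) -> tuple[int, int]:
    """Add two bytes one bit at a time, from the least significant bit upward."""
    result = 0
    carry = 0
    for position in range(8):
        bit_x = (x >> position) & 1
        bit_y = (y >> position) & 1
        sum_bit, carry = full_adder(bit_x, bit_y, carry)
        result |= sum_bit << position
    # A final carry of 1 means the true sum did not fit in one byte.
    return result, carry


print(add_bytes(1, 1))
print(add_bytes(200, 100))
```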
Components of a Computer
- Central Processing Unit (CPU), or processor: The CPU is the brain and nervous system of the computer. It does most of the complex work and coordinates most components together.
- Motherboard: The motherboard is a large circuit board that serves as the "body" of the computer. It allows parts to communicate with one another, like nerves, and it provides system information and firmware.
- Firmware is a bundle of code given to the processor by the motherboard when the computer powers on. It sets up the machine so the operating system, e.g. Windows or macOS, can begin operating.
- Random Access Memory (RAM) or memory: Cards that comprise the computer's working memory. Memory is a huge list of bytes, where each byte has a position, also known as an "address," like a person's particular home address. When visiting certain, special addresses, signals may be redirected to other components by the motherboard. This is known as "memory-mapped input/output" and is how the processor communicates with much of the system. RAM is volatile, meaning when it loses power, its contents are lost.
- Storage Disk, or disk: Disks contain memory, similar to RAM, but with the added benefit that their memory is persistent. However, disks are usually much slower for the processor to access than RAM, largely because of how they store data and how they connect to the rest of the system.
- Graphics Processing Unit (GPU), or graphics processor: The GPU helps display graphics on your screens by performing many operations simultaneously. It isn't as sophisticated as the CPU, but it is faster at simple operations. It has its own memory, "video RAM."
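The idea of memory as a list of addressed bytes, with a few special addresses redirected elsewhere, can be sketched as follows. Everything here is hypothetical: the Bus class, the 256-byte size, and the choice of address 0xFF as a "screen" are assumptions for illustration, not how any real motherboard is laid out.

```python
SCREEN_ADDRESS = 0xFF  # hypothetical special address wired to a display


class Bus:
    """A toy stand-in for memory plus one memory-mapped device."""

    def __init__(self, size: int = 256) -> None:
        self.ram = bytearray(size)          # every byte has a position (address)
        self.screen_output: list[int] = []  # what the pretend display received

    def write(self, address: int, value: int) -> None:
        if address == SCREEN_ADDRESS:
            # The write never reaches RAM; it is redirected to the device.
            self.screen_output.append(value)
        else:
            self.ram[address] = value

    def read(self, address: int) -> int:
        return self.ram[address]


bus = Bus()
bus.write(0x10, 42)           # an ordinary write lands in RAM
bus.write(SCREEN_ADDRESS, 7)  # a write to the special address goes to the screen
print(bus.read(0x10), bus.screen_output)
```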
Instructions
For a computer to be useful, we need the ability to give instructions to the processor. To do so, we provide a group of bytes subject to a specialized format known as an instruction set architecture. For educational purposes, I'll be working with a fictional, simplified architecture.
At the beginning of our instruction, we have a byte specifying the instruction we're dealing with. This is known as the "opcode." Let's introduce the basic opcodes:
0 = Add ("add")
1 = Subtract ("sub")
2 = Multiply ("mul")
3 = Divide ("div")
Next, these instructions need some numbers to work with. Addition is a binary operation, meaning it requires two numbers. Let's put them directly after the opcode:
"1 + 1" correlates to: [0, 1, 1]
We can now provide this to our processor and it will handle our instruction. Where does the sum go, though? Simple!
Registers
The processor has several "registers," which are memory cells capable of keeping a single value. When the processor performs an operation, the result of that operation is likely to go into a register. Some registers are also used for special purposes, but we'll cover those more soon.
Let's say that the sum of our previous addition operation ended up in an "accumulator" register. We'll label it ACC. We can use registers within instructions, extracting their values.
"2 * ACC" correlates to: [2, 2, ACC]
Since the product is stored in ACC, ACC is now 4.
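The two instructions so far can be run through a small sketch of the fictional processor. The opcodes match the table above; the execute function, the registers dictionary, and the use of the string "ACC" as a stand-in for "read the accumulator" are assumptions for the example.

```python
ACC = "ACC"  # marker meaning "use the accumulator's current value"

# Opcode table from the fictional architecture: 0 add, 1 sub, 2 mul, 3 div.
OPERATIONS = {
    0: lambda a, b: a + b,
    1: lambda a, b: a - b,
    2: lambda a, b: a * b,
    3: lambda a, b: a // b,  # whole-number division, an assumption
}


def execute(instruction: list, registers: dict) -> None:
    """Run one [opcode, left, right] instruction and store the result in ACC."""
    opcode, left, right = instruction
    # Swap any ACC marker for the value currently held in the register.
    left = registers["ACC"] if left == ACC else left
    right = registers["ACC"] if right == ACC else right
    registers["ACC"] = OPERATIONS[opcode](left, right)


registers = {"ACC": 0}
execute([0, 1, 1], registers)    # 1 + 1
execute([2, 2, ACC], registers)  # 2 * ACC
print(registers["ACC"])
```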
If we want to store more than a few values, such as a long run of consecutive bytes, we'll have to use something with a much higher capacity, like memory.
Memory
Let's say we have a person and this person has a name and age. We can represent this person with bytes by breaking them down into a "name" and "age" value.
First, to store their name, we need a text encoding. A text encoding is a format of storing letters, numbers, symbols, etc. as bytes for use in computer programs.
Commonly, we use ASCII or Unicode, but let's just say 0 to 25 represent the letters of the Latin alphabet, where a byte corresponds to a letter, i.e. 0 = A, 1 = B, and so on. This series of bytes is usually referred to as an "array."
Additionally, we need some way to say how long our text is. So, let's just store the length directly as a prefix!
"MAOW" correlates to [4, 12, 0, 14, 22]
For the person's age, let's just use a normal byte for simplicity.
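The length-prefixed name encoding described above can be sketched like this. The function names encode_name and decode_name are made up, and the sketch assumes the name contains only the letters A to Z.

```python
def encode_name(name: str) -> list[int]:
    """Turn letters into 0-25 values and put the length in front."""
    letters = [ord(character) - ord("A") for character in name.upper()]
    return [len(letters)] + letters


def decode_name(encoded: list[int]) -> str:
    """Read the length prefix, then turn that many values back into letters."""
    length = encoded[0]
    return "".join(chr(value + ord("A")) for value in encoded[1:1 + length])


print(encode_name("MAOW"))
print(decode_name([4, 12, 0, 14, 22]))
```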
And so, with all that in mind, if we were to store the person in memory we would just store their name and then subsequently their age, right? Correct. Let's give the age the label of "age" and the name "name."
But wait... if we have the address of "name," the byte at its beginning, then we actually have the memory address of the person, since name is at the beginning of the person.
This means that this person we've been talking about is a struct. A struct, short for structure, is a group of fields, which are named values. By composing structs, we can represent complex concepts with bytes!
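As a sketch of that layout, the code below writes a person into a small block of pretend memory and reads it back using nothing but the person's address. The 32-byte memory, the address 8, the age 30, and both function names are arbitrary choices for the example.

```python
def store_person(memory: bytearray, address: int, name: list[int], age: int) -> None:
    """Write the length-prefixed name at `address`, then the age right after it."""
    memory[address:address + len(name)] = bytes(name)
    memory[address + len(name)] = age


def load_person(memory: bytearray, address: int) -> tuple[list[int], int]:
    """Find both fields starting from the address of the person alone."""
    length = memory[address]  # the name's length prefix sits at the very start
    letters = list(memory[address + 1:address + 1 + length])
    age = memory[address + 1 + length]  # the age follows the last letter
    return letters, age


memory = bytearray(32)
store_person(memory, 8, [4, 12, 0, 14, 22], 30)  # "MAOW", aged 30
print(load_person(memory, 8))
```

Because the name comes first, the address of the person and the address of the name are the same number.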
Jumping!
If the instructions we give to a processor are bytes, then does that mean we can store instructions in memory? Yes, absolutely. In fact, operating systems copy-paste programs into memory and then provide the processor with their addresses, allowing the processor to read instruction by instruction like a list.
Tangent: Any data that the program might need to operate correctly is also stored as part of the program if not as a separate file. The data just lives outside of the "code section" of the file, which has our instructions, and instead in the "data section." This is what an .exe file facilitates, if you've ever wondered.
Each instruction begins at a certain address, and we can use that to our advantage by introducing new "jump" instructions. A jump simply tells the processor to jump forward or backwards to a different instruction using the address of that instruction.
4 = Jump ("jmp")
5 = Jump If Not Zero ("ifnz")
6 = Jump If Zero ("ifz")
7 = Test ("test")
Hold on, wait. What're those "if" and "test" instructions?
Let's back up just a bit (badum tss). First, we covered logic gates a while ago, but I neglected to mention we can use operations like "AND" with bytes. These are referred to as "bitwise operations," since they work with the bits of the byte, rather than the byte itself. AND-ing a byte with another byte simply lines the bytes up and applies the AND to each pair of bits.
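To show what "lining the bytes up" means, here is a small sketch that ANDs two bytes one bit pair at a time. The name and_bytes is hypothetical; in practice Python's & operator does this in a single step.

```python
def and_bytes(x: int, y: int) -> int:
    """AND two bytes by applying the AND gate to each pair of bits."""
    result = 0
    for position in range(8):
        bit_x = (x >> position) & 1
        bit_y = (y >> position) & 1
        if bit_x == 1 and bit_y == 1:
            result |= 1 << position  # keep this bit only if both inputs had it
    return result


#   11001010
#   10101100
print(format(and_bytes(0b11001010, 0b10101100), "08b"))
```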
Secondly, one of the special registers that a processor might have is the FLAGS register, which states a bunch of useful "flags" usually related to an operation or the processor itself. To set these flags, we'll be using the test instruction.
The test instruction ANDs two numbers and produces flags from the result of that AND. If we provide the test instruction with the same number twice, we can simply get the flags for that number as the AND won't affect it.
The main flag we're after is the zero flag, which states whether the result of the test was zero. This is how "jump if not zero" works! That instruction will only jump to the specified instruction address if the zero flag is not set. This is called a conditional branch, or just condition, and is essential for complex tasks.
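Putting test and the jumps together, here is a sketch of a countdown loop running on the fictional processor. Several details are assumptions the text does not specify: every instruction is padded to three bytes, jump targets are byte addresses within the program, the byte 255 means "use ACC" as an operand, and the program ends when it runs past its last instruction. Multiply and divide are left out to keep it short.

```python
ACC_OPERAND = 255  # hypothetical byte meaning "use the ACC register here"


def run(program: bytes) -> dict:
    """Execute three-byte instructions until the program counter runs off the end."""
    acc = 0
    zero_flag = False
    pc = 0     # program counter: address of the next instruction
    steps = 0  # how many instructions were executed

    def value(operand: int) -> int:
        return acc if operand == ACC_OPERAND else operand

    while pc < len(program):
        opcode, a, b = program[pc:pc + 3]
        pc += 3
        steps += 1
        if opcode == 0:    # add
            acc = (value(a) + value(b)) & 0xFF
        elif opcode == 1:  # sub
            acc = (value(a) - value(b)) & 0xFF
        elif opcode == 4:  # jmp
            pc = a
        elif opcode == 5:  # ifnz: jump only when the zero flag is not set
            if not zero_flag:
                pc = a
        elif opcode == 6:  # ifz: jump only when the zero flag is set
            if zero_flag:
                pc = a
        elif opcode == 7:  # test: AND the operands, keep only the flag
            zero_flag = (value(a) & value(b)) == 0
        else:
            raise ValueError(f"opcode {opcode} is not handled in this sketch")
    return {"ACC": acc, "ZERO": zero_flag, "PC": pc, "steps": steps}


countdown = bytes([
    0, 3, 0,      # address 0: add 3, 0     -> ACC = 3
    1, 255, 1,    # address 3: sub ACC, 1   -> ACC = ACC - 1
    7, 255, 255,  # address 6: test ACC, ACC
    5, 3, 0,      # address 9: ifnz 3       -> loop back while ACC is not zero
])
print(run(countdown))
```

The loop body at address 3 runs three times before the zero flag is finally set and the ifnz falls through.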
(Sub)routines
Somewhere in our journey, we migrated from the realm of circuitry to the realm of software. Everything you run on your computer is software, including your operating system, which helps manage your applications and system resources. As defined before, a program is a list of instructions, but the proper term for this would be a "routine."
Therefore, what's a subroutine?
A subroutine is a routine that lives within a routine, usually to isolate some code so we can jump to it and re-use it at any time, rather than repeating the same code over and over.
Some subroutines expect us to provide them with values prior to running them, much like instructions do, and some subroutines will provide us with values once they've finished running.
To provide these values we store them either in general-purpose processor registers or in memory. Once a subroutine is done, it will do the same with its return values and then jump back to where we jumped from.
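The "jump there, then jump back" pattern can be sketched with a toy machine. The mnemonics set, double, call, ret, and halt are invented for this example and are not part of the opcode table above; remembering the return address on a stack is one common approach, assumed here for illustration.

```python
def run(program: list[tuple]) -> int:
    """Run (mnemonic, argument) pairs; a register carries values in and out."""
    register = 0       # holds the subroutine's input and, afterwards, its result
    return_stack = []  # remembers where each call came from
    pc = 0
    while pc < len(program):
        mnemonic, argument = program[pc]
        pc += 1
        if mnemonic == "set":
            register = argument
        elif mnemonic == "double":
            register *= 2
        elif mnemonic == "call":
            return_stack.append(pc)  # the instruction right after the call
            pc = argument            # jump into the subroutine
        elif mnemonic == "ret":
            pc = return_stack.pop()  # jump back to where we jumped from
        elif mnemonic == "halt":
            break
    return register


PROGRAM = [
    ("set", 5),        # 0: provide the value in the register
    ("call", 4),       # 1: run the subroutine
    ("call", 4),       # 2: re-use it without repeating its code
    ("halt", None),    # 3: stop before falling into the subroutine
    ("double", None),  # 4: the subroutine body starts here
    ("ret", None),     # 5: return to the caller
]
print(run(PROGRAM))
```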
Programming Languages
To sum them up briefly: Code written in a language is simply specially formatted text (itself just bytes) that gets read by another program, a compiler or interpreter, which turns it into something the processor can actually execute. By representing the concepts we've covered, such as structs, conditions, and subroutines, in text, we make them much more approachable and understandable for humans, as well as potentially safer and faster.
All programming languages are different in some way, just like all natural languages.
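As a tiny taste of that translation step, here is a sketch of a program that turns text into the fictional instruction bytes from earlier. The one-instruction-per-line text format and the assemble function are assumptions made for the example; real languages and their translators are far more involved.

```python
# Mnemonics and opcodes from the fictional architecture used in this post.
OPCODES = {"add": 0, "sub": 1, "mul": 2, "div": 3}


def assemble(source: str) -> list[list[int]]:
    """Translate lines like 'add 1 1' into [opcode, left, right] instructions."""
    program = []
    for line in source.strip().splitlines():
        mnemonic, left, right = line.split()
        program.append([OPCODES[mnemonic], int(left), int(right)])
    return program


print(assemble("add 1 1\nmul 2 3"))
```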
Closing
Aren't computers really f***ing cool? We've gotten to a point in history where we don't even realize everything happening in our metal boxes is just electrical signals.
These signals represent numbers, which are used to represent abstract concepts and data. Images and sounds become numbers, which we can transmit to people hundreds of kilometers away through the internet.
Hope you had fun :D