Bhushitha Hashan

Posted on Jun 13

Why Everything Inside Your Computer Is Just Numbers

When you power on your computer, it might seem like magic: a machine with flashy lights, crisp graphics, and powerful AI all running seamlessly. But underneath all of that, your computer is really just a highly sophisticated number-cruncher. It doesn’t "understand" anything in a human sense; instead, it processes raw numbers stored as bytes.

In this article, we’ll explore:

How the CPU wakes up and starts executing instructions
What instructions look like in memory
How the CPU knows where instructions start and end
The role of architecture in instruction size and format
The Von Neumann architecture’s influence on how programs and data coexist in memory

Understanding CPU Instructions in Memory

At its core, a CPU instruction like:

mov eax, 1

is just a sequence of bytes stored in memory. The CPU does not see "mov" or "eax" as words; it sees raw binary numbers.

In machine code, that instruction is represented as the following five bytes (in hexadecimal):

B8 01 00 00 00

Each byte occupies one memory address. So in memory, this sequence is laid out like this:

At some starting address (say, 0x00000000), the byte B8
At 0x00000001, the byte 01
At 0x00000002, 00
At 0x00000003, 00
At 0x00000004, 00

The first byte (B8) is the opcode, which tells the CPU: “Move an immediate 32-bit value into the EAX register.” The next four bytes (01 00 00 00) are the actual 32-bit little-endian value to move (in this case, the number 1).
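
As a minimal sketch of the layout above, the following Python snippet splits the five bytes into the opcode and the little-endian immediate. The variable names and the starting address 0 are illustrative assumptions, not part of any real tool.

```python
# The five bytes of "mov eax, 1" as they would sit in memory.
code = bytes.fromhex("B8 01 00 00 00")

# Print each byte next to an (assumed) address starting at 0.
for offset, value in enumerate(code):
    print(f"0x{offset:08X}: {value:02X}")

# First byte: the opcode. Next four bytes: a 32-bit little-endian immediate.
opcode = code[0]
immediate = int.from_bytes(code[1:5], byteorder="little")

print(f"opcode = 0x{opcode:02X}, immediate = {immediate}")
# opcode = 0xB8, immediate = 1
```

Little-endian means the least significant byte comes first, which is why the value 1 appears as 01 00 00 00 rather than 00 00 00 01.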

Memory Is Just Bytes With Addresses

Think of memory as a huge row of lockers, each holding a single byte: a number from 0 to 255. Every locker (byte) has an address, so the CPU knows where to find data or instructions.

For example:

Address 0x00000000 might hold 123
Address 0x00000001 might hold 65
Address 0x00000002 might hold 255

And so on.

All data, whether numbers, text, images, or instructions, is ultimately just bytes stored in these lockers. The meaning depends on how software interprets those bytes.

Storing Data and Instructions as Numbers in Memory

At the most basic level, a computer’s memory is a sequence of bytes: small storage units that each hold a number between 0 and 255. Everything stored in a computer, whether text, numbers, images, or instructions, is ultimately just these numbers.

How Is Text Stored?

Text is stored using character encoding standards such as ASCII. For example, the string "HELLO" is stored as the sequence of bytes:

[72, 69, 76, 76, 79]

Each number corresponds to the ASCII code for a letter (72 for 'H', 69 for 'E', etc.).
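
A quick way to check this mapping yourself is shown in the sketch below, which uses Python's built-in ASCII codec. The variable names are only illustrative.

```python
# Encode the string into its ASCII byte values.
encoded = list("HELLO".encode("ascii"))
print(encoded)  # [72, 69, 76, 76, 79]

# Going the other way: the same numbers decode back to the text.
decoded = bytes(encoded).decode("ascii")
print(decoded)  # HELLO
```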

How Are Numbers Stored?

Numbers can be stored in two primary ways:

As text: The characters '1', '0', '0' are stored as the ASCII codes [49, 48, 48].
As raw numeric values: The integer 100 is stored directly as a single byte with the value 100.

This distinction is important because the CPU only processes bytes. The meaning of those bytes depends entirely on context: whether they represent text, numbers, instructions, or other types of data.
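
The difference can be sketched in a few lines of Python. This is only an illustration of the two representations; the variable names are assumptions made for the example.

```python
# The number one hundred stored as text: three ASCII characters.
as_text = list("100".encode("ascii"))
print(as_text)  # [49, 48, 48]

# The same number stored as a raw value: a single byte.
as_number = list((100).to_bytes(1, byteorder="little"))
print(as_number)  # [100]

# Context matters: the byte 100 read as ASCII text is the letter 'd'.
reinterpretation = bytes([100]).decode("ascii")
print(reinterpretation)  # d
```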

If Everything Is Just Numbers, How Does the CPU Know What to Do?

This raises a fundamental question: if memory is just numbers, how does the CPU know which numbers are instructions to execute and which are data?

The answer lies in the way the CPU is designed and how software prepares the system.

CPU Instructions Are Numbers Too

Every instruction that a CPU executes, such as MOV, ADD, or JMP, is encoded as a binary number. The part that identifies the operation is known as the opcode. We’ll talk about opcodes in detail in a bit.

For example, on x86 processors:

Opcode (Hex)   Instruction
------------   -----------
B8             MOV EAX, immediate
05             ADD EAX, immediate
C3             RET (return)

The CPU contains a decoder that interprets these binary patterns as specific instructions.

How Does the CPU Identify Instructions?

Press the power button, and your CPU begins with no knowledge of its surroundings.
The CPU does not inherently distinguish instructions from data. Instead, it keeps the address of the next instruction in a register called the instruction pointer, and at power-on that register is set to a fixed, hardware-defined address (often called the reset vector). This address maps to a special, read-only area, traditionally a ROM (Read-Only Memory) chip and on modern machines usually flash memory, where your BIOS or UEFI firmware lives. The CPU fetches its first instructions from there.

At this address, the CPU reads bytes and interprets them as instructions in sequence. If the bytes at the instruction pointer correspond to valid opcodes, the CPU executes them accordingly.

It’s important to note that the RAM (random access memory) is uninitialized and doesn’t contain meaningful data at this early stage.

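Later, once the firmware has handed control to a bootloader and then to an operating system, that software loads a program into RAM and sets the instruction pointer to the start of the program’s code segment, guiding the CPU to interpret that memory region as executable instructions.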

What’s the Difference Between Code and Data?

At the hardware level, there is no intrinsic difference between code and data—both are just numbers stored in memory. The distinction arises from how the system uses those numbers:

If the CPU executes the bytes, they are treated as code.
If the program reads or writes the bytes, they are treated as data.

This concept is central to the Von Neumann architecture, where code and data share the same memory space but are used differently depending on context.
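
To make this concrete, here is a small sketch that reads the same five bytes two ways: as ASCII text, and as 32-bit x86 instructions. It assumes 32-bit mode, where the bytes 0x40 to 0x47 encode inc and 0x48 to 0x4F encode dec on a register (in 64-bit mode those bytes are prefixes instead). The function names and the tiny decoding table are hypothetical, written only for this example.

```python
# Register numbering used by 32-bit x86 encodings.
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def as_text(data):
    """Treat the bytes as data: ASCII characters."""
    return data.decode("ascii")

def as_code_32bit(data):
    """Treat the bytes as code, using a toy table that only knows
    the one-byte inc/dec register instructions of 32-bit x86."""
    out = []
    for b in data:
        if 0x40 <= b <= 0x47:
            out.append(f"inc {REGS[b - 0x40]}")
        elif 0x48 <= b <= 0x4F:
            out.append(f"dec {REGS[b - 0x48]}")
        else:
            out.append(f"db 0x{b:02X}")  # outside this toy table
    return out

data = bytes([72, 69, 76, 76, 79])
print(as_text(data))        # HELLO
print(as_code_32bit(data))  # ['dec eax', 'inc ebp', 'dec esp', 'dec esp', 'dec edi']
```

Nothing in the bytes themselves says which reading is the right one. That depends only on whether the CPU is pointed at them as instructions or a program reads them as data.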

What Happens If Code and Data Get Confused?

If the CPU attempts to execute data as code, it will misinterpret the bytes, often causing crashes or unpredictable behavior. Modern operating systems and CPUs reduce such risks with memory protection: code and data are placed in distinct regions of memory, and data regions can be marked non-executable so that the CPU refuses to run them.

How CPU Architecture Influences Instruction Size and Format

Instruction size depends heavily on the CPU architecture. For example:

On modern x86 or x86_64 CPUs, instructions are variable length — they can range from 1 to 15 bytes.
The instruction mov eax, 1 takes 5 bytes because it includes an opcode byte plus four bytes for the 32-bit immediate value.
On simpler 8-bit CPUs, instructions are generally shorter: 1 to 3 bytes on the 6502, and 1 to 4 bytes on the Z80.

What Is an Opcode?

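An opcode (operation code) is the part of an instruction that encodes the operation type. On x86 it is usually a single byte, although some instructions use two or three opcode bytes. It tells the CPU what to do next.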

For example:

B8 means “move an immediate 32-bit value into EAX”
89 means “move a register into memory or another register”
C3 means “return from subroutine”
90 means “no operation” (NOP)

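When the CPU reads an opcode, it learns which further bytes to expect for operands (like registers, immediate values, or memory addresses). For simple instructions such as B8, the opcode alone fixes the count; for others, the opcode says that a ModR/M byte follows, and that byte in turn determines whether displacement bytes come after it.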

How Does the CPU Know Where Instructions End?

The CPU follows a strict fetch-decode-execute cycle:

It fetches one byte from the current instruction pointer.
It decodes that byte by looking it up in an internal opcode table.
The opcode tells the CPU whether more bytes need to be fetched and how many.
The CPU fetches the remaining bytes (operands).
It then executes the instruction.
The instruction pointer is updated to the next instruction’s first byte.
The cycle repeats.
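
The steps above can be sketched as a toy interpreter in Python. This is a simplified model under stated assumptions, not how real hardware is built: it knows only four single-byte opcodes from this article, the `OPCODES` table and `run` function are hypothetical names, and `ret` is treated as "stop and report EAX" because the toy has no stack.

```python
# opcode -> (mnemonic, number of operand bytes that follow the opcode)
OPCODES = {
    0xB8: ("mov eax, imm32", 4),
    0x05: ("add eax, imm32", 4),
    0x90: ("nop", 0),
    0xC3: ("ret", 0),
}

def run(memory, ip=0):
    """Fetch, decode, and execute until a ret is reached; return EAX."""
    eax = 0
    while True:
        opcode = memory[ip]                        # fetch the opcode byte
        mnemonic, operand_len = OPCODES[opcode]    # decode via the table
        operand = memory[ip + 1 : ip + 1 + operand_len]  # fetch operands
        imm = int.from_bytes(operand, "little")    # 0 when there are none

        if opcode == 0xB8:                         # execute
            eax = imm
        elif opcode == 0x05:
            eax = (eax + imm) & 0xFFFFFFFF         # wrap like a 32-bit register
        elif opcode == 0xC3:
            return eax                             # toy stand-in for "return"

        ip += 1 + operand_len                      # advance to next instruction

# mov eax, 1 ; add eax, 2 ; nop ; ret
program = bytes.fromhex("B8 01 00 00 00 05 02 00 00 00 90 C3")
print(run(program))  # 3
```

Note how the instruction pointer advances by a different amount for each instruction: 5 bytes for the mov and add, 1 byte for the nop. The loop never needs a marker between instructions.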

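Because of this design, instructions are self-describing in length. No separator is stored between them: the length of each instruction can be worked out from its own bytes. For simple instructions the first opcode byte is enough; on x86, prefixes and the ModR/M and SIB bytes can also contribute to the final size and format.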

For example, the instruction mov [eax + 4], ebx translates to the bytes:

89 58 04

Here:

89 is the opcode for a register/memory move
58 is a ModR/M byte that specifies both operands: the source register (ebx) and the memory location (eax plus an 8-bit displacement)
04 is the displacement value
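
The ModR/M byte packs three bit fields into one byte: mod (2 bits), reg (3 bits), and r/m (3 bits). The sketch below pulls them apart for the byte 58. The helper name `decode_modrm` is an assumption made for this example.

```python
# Register numbering used by 32-bit x86 encodings.
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def decode_modrm(byte):
    """Split a ModR/M byte into its (mod, reg, rm) bit fields."""
    mod = (byte >> 6) & 0b11    # top 2 bits: addressing mode
    reg = (byte >> 3) & 0b111   # middle 3 bits: register operand
    rm = byte & 0b111           # low 3 bits: register or memory base
    return mod, reg, rm

mod, reg, rm = decode_modrm(0x58)
print(mod, reg, rm)         # 1 3 0
print(REGS[reg], REGS[rm])  # ebx eax
```

Here mod = 1 means "memory operand with an 8-bit displacement", which is why exactly one more byte (04) follows. reg = 3 selects ebx and rm = 0 selects eax as the base register.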

A Quick Overview of Instruction Components

Many x86 instructions are composed of multiple parts:

Optional prefix bytes for repeat or segment overrides
The opcode itself (the main instruction code)
Optional ModR/M byte specifying addressing modes and registers
Optional SIB (scale-index-base) byte for complex memory addressing
Optional displacement bytes for memory offsets
Optional immediate bytes for literal values

The CPU’s decoder logic knows how to parse these parts step-by-step as it reads the instruction stream.

Von Neumann Architecture and Shared Memory

All of this ties back to the Von Neumann architecture concept: program instructions and data share the same memory.

This means that instructions like mov eax, 1 are stored as numbers, just like data values.

The CPU doesn’t inherently distinguish code from data — it relies on the instruction pointer to fetch bytes in sequence and interpret them as instructions.

This architecture offers flexibility but can also introduce vulnerabilities, such as when data is mistakenly executed as code.

Wrapping Up

The CPU begins execution from a fixed ROM address containing firmware.
Memory is a continuous sequence of bytes, each with a unique address.
Instructions are sequences of bytes, starting with an opcode byte that tells the CPU how to decode the rest.
Instruction size varies based on architecture and specific instruction.
The CPU uses internal decoding logic and lookup tables to fetch and execute instructions correctly.
The Von Neumann architecture explains why instructions and data coexist in the same memory space.

This article is designed to give you some understanding of what’s happening inside your CPU when it starts running code beneath the surface of all the software and graphics you see.

Thanks for reading!

References: Programming from the Ground Up by Jonathan Bartlett
