Modern processors feel instantaneous, yet behind every program lies a tightly orchestrated sequence of microscopic operations. Whether you're building systems, learning low-level programming, or just curious about how machines truly “think,” understanding the CPU execution cycle is foundational.
This article breaks down how a CPU fetches instructions, accesses RAM, decodes operations, and executes them step by step. By the end, you'll have a clear picture of what actually happens when your code runs.
1. What Is the CPU Execution Cycle?
The CPU (Central Processing Unit) processes instructions using a repeated loop called the Instruction Cycle or Fetch–Decode–Execute (FDE) cycle. Every instruction—whether it's ADD, MOV, LOAD, a function call, or a branch—passes through this cycle.
The three major phases are:
- Fetch: Get the instruction from memory (RAM or cache).
- Decode: Understand what the instruction means.
- Execute: Perform the operation (ALU math, memory access, branch, etc.).
This cycle repeats billions of times per second.
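The loop above can be sketched as a toy interpreter in Python. This is a simplified model for illustration only: real hardware implements the cycle in circuitry, and the tuple-based "instruction format" here is invented for readability.

```python
# A toy fetch-decode-execute loop. "Memory" holds (opcode, operands)
# tuples instead of binary machine code, purely for readability.
memory = [
    ("LOAD", "R1", 5),     # R1 = 5
    ("LOAD", "R2", 7),     # R2 = 7
    ("ADD",  "R1", "R2"),  # R1 = R1 + R2
    ("HALT",),
]
registers = {"R1": 0, "R2": 0}
pc = 0  # Program Counter

while True:
    instruction = memory[pc]           # Fetch: read the instruction at PC
    pc += 1                            # ...and advance the PC
    opcode, *operands = instruction    # Decode: split opcode from operands
    if opcode == "HALT":               # Execute: act on the opcode
        break
    elif opcode == "LOAD":
        reg, value = operands
        registers[reg] = value
    elif opcode == "ADD":
        dst, src = operands
        registers[dst] += registers[src]

print(registers["R1"])  # 12
```

Each pass through the `while` loop is one instruction cycle; a real CPU runs this loop billions of times per second.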
2. CPU Architecture at a Glance
Before diving deeper, here are the main components involved in the instruction cycle:
Registers
Small, extremely fast storage units inside the CPU:
- PC (Program Counter) – holds the address of the next instruction.
- IR (Instruction Register) – holds the instruction being decoded/executed.
- MAR (Memory Address Register) – holds memory addresses.
- MDR (Memory Data Register) – holds data being transferred to/from memory.
- General Registers – AX, BX, RAX, RBX, etc., depending on architecture.
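The registers above can be modeled as a small record. This dataclass is purely illustrative (the field names mirror the text; real registers are fixed-width hardware latches, not Python attributes):

```python
from dataclasses import dataclass, field

@dataclass
class CPUState:
    pc: int = 0          # Program Counter: address of the next instruction
    ir: tuple = ()       # Instruction Register: instruction being executed
    mar: int = 0         # Memory Address Register: address on the bus
    mdr: object = None   # Memory Data Register: data in transit to/from memory
    regs: dict = field(default_factory=lambda: {"R1": 0, "R2": 0})

cpu = CPUState()
cpu.mar = cpu.pc  # PC -> MAR: the first micro-step of every fetch
```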
ALU (Arithmetic Logic Unit)
Performs math (add, subtract, multiply) and logical operations (AND, OR, XOR).
Control Unit
Directs the entire execution cycle, orchestrating fetch/decode/execute.
Cache
Very fast memory layers (L1, L2, L3) that speed up access to instructions and data.
RAM (Main Memory)
Where instructions and program data live while a program runs.
3. Step-by-Step Guide to the Instruction Cycle
Step 1: Fetch
Fetching means retrieving the next instruction from memory.
PC → MAR
The Program Counter contains the address of the next instruction.
That address is copied into the Memory Address Register.
CPU requests memory read from RAM (or cache)
The address is placed on the address bus.
Memory returns the instruction → MDR
Data from memory goes into the Memory Data Register.
MDR → IR
The fetched instruction is placed in the Instruction Register.
PC increments
PC = PC + size_of_instruction (typically +1, +4, etc.).
At this point, the CPU has loaded the instruction and is ready to decode.
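The micro-steps above can be written out as a function, continuing the toy model where MAR, MDR, and IR are plain variables (a sketch, not a hardware description):

```python
def fetch(memory, state):
    """One fetch: PC -> MAR -> memory -> MDR -> IR, then advance the PC."""
    state["MAR"] = state["PC"]           # PC -> MAR
    state["MDR"] = memory[state["MAR"]]  # memory read -> MDR
    state["IR"] = state["MDR"]           # MDR -> IR
    state["PC"] += 1                     # PC increments (word-addressed here)
    return state["IR"]

memory = [("ADD", "R1", "R2"), ("HALT",)]
state = {"PC": 0, "MAR": 0, "MDR": None, "IR": None}
instr = fetch(memory, state)
print(instr, state["PC"])  # ('ADD', 'R1', 'R2') 1
```

Note the increment is `+1` only because this toy memory is addressed one instruction per slot; a byte-addressed machine would add the instruction size (e.g. +4 on many RISC architectures).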
Step 2: Decode
In this step, the CPU’s Control Unit interprets what the instruction means.
Examples:
- If the instruction is ADD R1, R2, the control unit activates the ALU.
- If the instruction is LOAD R1, [0x1000], the CPU prepares for a memory read.
- If the instruction is a branch, the control unit checks conditions and prepares to modify the PC.
The CPU breaks the instruction into:
- Opcode (operation: add, move, jump)
- Operands (register numbers, memory addresses, constants)
- Addressing modes (direct, indirect, immediate, indexed)
No data processing happens yet—only interpretation.
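A decoder for the toy instruction format might look like this. The bracket-and-integer conventions are invented for this example; real decoders read fixed bit fields out of the binary instruction word instead of inspecting strings:

```python
def decode(instruction):
    """Split a toy instruction into opcode, operands, and addressing modes."""
    opcode, *operands = instruction
    # Crude addressing-mode detection for this toy format:
    # bracketed strings are memory addresses, bare ints are immediates,
    # everything else is a register name.
    modes = []
    for op in operands:
        if isinstance(op, str) and op.startswith("["):
            modes.append("direct")
        elif isinstance(op, int):
            modes.append("immediate")
        else:
            modes.append("register")
    return opcode, operands, modes

print(decode(("LOAD", "R1", "[0x1000]")))
# ('LOAD', ['R1', '[0x1000]'], ['register', 'direct'])
```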
Step 3: Execute
Now the CPU performs the action:
Case 1: Arithmetic / Logical Operation
- The ALU performs the operation.
- Result is stored in a register (e.g., R1 = R1 + R2).
Case 2: Memory Read (LOAD)
This is where the CPU gets data from RAM.
- The effective memory address is calculated (base + offset).
- The address is placed into MAR.
- The control unit sends a read request to memory.
- Data from RAM moves over the data bus into MDR.
- MDR → Register (e.g., R1).
Case 3: Memory Write (STORE)
- CPU puts data into MDR.
- CPU loads address into MAR.
- CPU signals memory to write MDR to that RAM address.
Case 4: Branch
- If condition is true: PC is updated to a new address.
- Otherwise: PC continues normally.
Case 5: I/O
- Control signals activate I/O controllers.
After execution, the cycle restarts with the next instruction.
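The execute-stage cases above (minus I/O) can be sketched as a dispatch function. Here `pc` is assumed to be the already-incremented value from the fetch step, so a taken branch overrides it and everything else falls through:

```python
def execute(opcode, operands, regs, ram, pc):
    """Dispatch on opcode; return the (possibly updated) program counter."""
    if opcode == "ADD":                  # Case 1: ALU op, result to a register
        dst, src = operands
        regs[dst] = regs[dst] + regs[src]
    elif opcode == "LOAD":               # Case 2: memory read, RAM -> register
        dst, addr = operands
        regs[dst] = ram[addr]
    elif opcode == "STORE":              # Case 3: memory write, register -> RAM
        src, addr = operands
        ram[addr] = regs[src]
    elif opcode == "JNZ":                # Case 4: branch if register is nonzero
        reg, target = operands
        if regs[reg] != 0:
            return target                # condition true: PC jumps
    return pc                            # otherwise: PC continues normally

regs = {"R1": 2, "R2": 3}
ram = {0x1000: 40}
pc = execute("LOAD", ("R1", 0x1000), regs, ram, 5)
print(regs["R1"], pc)  # 40 5
```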
4. How the CPU Actually Gets Data from RAM
Let’s focus specifically on memory access, since this is the heart of performance.
Step 1: CPU checks cache
Before going to RAM, the CPU checks:
- L1 cache
- L2 cache
- L3 cache
If the requested data is in cache (cache hit):
- The CPU reads it almost immediately (within a few nanoseconds).
If the data is not in cache (cache miss):
- The CPU must ask RAM for it.
- This is much slower.
Step 2: If not found → CPU sends address to RAM
The address is placed on the Address Bus via the MAR.
Step 3: RAM responds
RAM places the requested data on the Data Bus.
Step 4: Data arrives in MDR
CPU pulls the data into the Memory Data Register.
Step 5: CPU copies data into target register
Example:
LOAD R1, [0x5000]
The final step:
- MDR → R1
This completes the memory read operation.
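The whole lookup path can be sketched as a hierarchy of dictionaries. This is a deliberately naive model (no eviction, no cache lines, and only the L1 is filled on a miss), but it captures the hit/miss flow described above:

```python
def read_memory(address, caches, ram):
    """Check L1, then L2, then L3; fall back to RAM on a miss everywhere.
    Returns the data and the level that served it."""
    for level, cache in caches.items():   # insertion order: L1, L2, L3
        if address in cache:
            return cache[address], level  # cache hit: fast path
    data = ram[address]                   # cache miss: much slower RAM access
    caches["L1"][address] = data          # fill L1 so the next read hits
    return data, "RAM"

caches = {"L1": {}, "L2": {0x5000: 99}, "L3": {}}
ram = {0x5000: 99, 0x6000: 7}

print(read_memory(0x5000, caches, ram))  # (99, 'L2')
print(read_memory(0x6000, caches, ram))  # (7, 'RAM')
print(read_memory(0x6000, caches, ram))  # (7, 'L1') -- now cached
```

The third call illustrates why caches pay off: the first access to 0x6000 had to go all the way to RAM, but the repeat access is served from L1.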
5. Putting It All Together: A Real Example
Consider this simple instruction:
ADD R1, [0x2000]
Here is the complete flow:
- Fetch the instruction at PC.
- Decode the opcode (ADD) and operand types (register + memory).
- Execute:
- Calculate effective address: 0x2000.
- MAR = 0x2000.
- RAM → MDR (memory read).
- ALU adds R1 + MDR.
- Store result back into R1.
Everything happens in a few CPU cycles.
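The flow above can be traced line by line in the toy model (the initial values R1 = 12 and memory[0x2000] = 30 are made up for the example):

```python
ram = {0x2000: 30}
regs = {"R1": 12}
program = {0: ("ADD", "R1", 0x2000)}
pc, MAR, MDR = 0, None, None

IR = program[pc]; pc += 1   # 1. Fetch the instruction at PC
opcode, dst, addr = IR      # 2. Decode: ADD, register + memory operands
MAR = addr                  # 3a. Effective address 0x2000 -> MAR
MDR = ram[MAR]              # 3b. RAM -> MDR (memory read)
result = regs[dst] + MDR    # 3c. ALU adds R1 + MDR
regs[dst] = result          # 3d. Store result back into R1

print(regs["R1"])  # 42
```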
6. Pipeline and Parallelism (Modern CPUs)
Modern CPUs no longer execute instructions strictly one at a time. They use:
- Instruction pipelining
- Superscalar execution
- Out-of-order execution
- Branch prediction
This means:
- Multiple instructions are in different stages at the same time.
- The CPU may reorder instructions for better efficiency.
- The hardware tries to guess what the next instruction will be.
These optimizations allow CPUs to reach billions of operations per second.
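Pipelining can be visualized with a tiny simulation: at each clock cycle, several instructions occupy different stages at once. This sketch ignores hazards entirely; a real pipeline must stall or forward results when one instruction depends on another:

```python
instructions = ["I1", "I2", "I3", "I4"]
stages = ["Fetch", "Decode", "Execute"]

# At clock cycle t, instruction i sits in stage (t - i), if that is a
# valid stage index. Each instruction enters the pipeline one cycle
# after its predecessor, so the stages overlap.
for t in range(len(instructions) + len(stages) - 1):
    active = []
    for i, instr in enumerate(instructions):
        s = t - i
        if 0 <= s < len(stages):
            active.append(f"{instr}:{stages[s]}")
    print(f"cycle {t}: " + "  ".join(active))
```

From cycle 2 onward the pipeline is full: one instruction finishes executing every cycle, even though each individual instruction still takes three cycles end to end.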
7. Summary
Here is the entire execution cycle in one sequence:
Fetch:
PC → MAR → Memory → MDR → IR → PC++
Decode:
Control Unit interprets opcode and operands.
Execute:
ALU operations
Memory Load/Store
Branching
I/O operations
Memory Access:
Checks cache first, falls back to RAM on a miss; data then flows through the MDR into the target register.
Understanding this cycle gives you a deep view into how your code really runs at the hardware level. Whether you’re studying system design, assembly, compilers, or performance engineering, these fundamentals apply across all platforms.