CPU (Central Processing Unit) — Complete Deep-Dive Guide for Developers
The CPU (Central Processing Unit) is the core execution engine of every computer system. Every line of code you write, whether JavaScript, Python, C++, SQL, or Assembly, is ultimately executed as CPU instructions, either directly after compilation or through an interpreter or runtime.
To write efficient software, understand performance, debug low-level issues, or learn Assembly and System Design, you must understand the CPU deeply.
This article explains the CPU from silicon logic to machine instructions, step by step.
1. What Is a CPU?
A CPU is a programmable electronic circuit that:
- Fetches instructions from memory
- Decodes them
- Executes them
- Stores results
This process is called the Instruction Cycle.
At a high level:
Source Code → Compiler → Assembly → Assembler → Machine Code → CPU
2. High-Level CPU Architecture
A modern CPU is composed of these major subsystems:
+--------------------------------------------------+
|                       CPU                        |
|                                                  |
|  +-----------+  +-----------+  +-------------+   |
|  | Control   |  | Execution |  | Registers   |   |
|  | Unit (CU) |  | Units     |  |             |   |
|  +-----------+  +-----------+  +-------------+   |
|                                                  |
|  +------------- Cache (L1/L2/L3) -------------+  |
|                                                  |
+--------------------------------------------------+
Core Components
- Control Unit (CU)
- Arithmetic Logic Unit (ALU)
- Registers
- Cache Memory
- Clock
- Instruction Decoder
- Execution Units
- Bus Interface
3. CPU Clock and Timing
The clock synchronizes all CPU operations.
- Measured in GHz
- 3.5 GHz = 3.5 billion cycles per second
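In a simplified, non-pipelined model, each instruction passes through several stages and takes multiple clock cycles.
Example:
add rax, rbx
In this simplified model it may take:
- Fetch: 1 cycle
- Decode: 1 cycle
- Execute: 1–3 cycles
- Write-back: 1 cycle
Pipelined CPUs overlap these stages across instructions, so sustained throughput can be far better than one instruction every four to six cycles.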
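To make the clock arithmetic concrete, here is a minimal C sketch that converts a clock rate into the duration of one cycle. The function names `cycle_time_ns` and `cycles_to_ns` are illustrative, not part of any library, and the model ignores frequency scaling and turbo modes.

```c
#include <assert.h>

/* Duration of one clock cycle in nanoseconds.
   1 GHz = 1e9 cycles per second, so one cycle lasts 1/GHz nanoseconds. */
double cycle_time_ns(double clock_ghz) {
    assert(clock_ghz > 0.0);
    return 1.0 / clock_ghz;
}

/* Time in nanoseconds that a given number of cycles takes at that clock rate. */
double cycles_to_ns(unsigned cycles, double clock_ghz) {
    return cycles * cycle_time_ns(clock_ghz);
}
```

At 3.5 GHz, `cycle_time_ns(3.5)` is roughly 0.29 ns, so even a four-cycle instruction finishes in a little over one nanosecond.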
4. Instruction Cycle (Fetch–Decode–Execute)
Every CPU follows this loop:
Step 1: Fetch
- Instruction Pointer (IP / RIP) points to memory
- Instruction loaded into Instruction Register
Step 2: Decode
- Control Unit interprets opcode
- Determines operands and execution unit
Step 3: Execute
- ALU / FPU / SIMD executes instruction
Step 4: Write Back
- Result stored in register or memory
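The loop above can be sketched as a toy interpreter in C. This is an illustration only: the two-byte instruction format, the opcodes `OP_LOAD`, `OP_ADD`, and `OP_HALT`, and the single accumulator are all invented for this example and do not correspond to any real ISA.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy encoding: every instruction is 2 bytes, [opcode][operand]. */
enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2 };

/* Runs the program and returns the accumulator when it halts. */
int32_t toy_run(const uint8_t *program, size_t len) {
    assert(program != NULL);
    size_t  ip  = 0;  /* instruction pointer, plays the role of RIP */
    int32_t acc = 0;  /* single accumulator register */

    while (ip + 1 < len) {
        /* Fetch: read the instruction the pointer refers to, then advance. */
        uint8_t opcode  = program[ip];
        uint8_t operand = program[ip + 1];
        ip += 2;

        /* Decode: select the operation. Execute and write back: update acc. */
        switch (opcode) {
        case OP_LOAD: acc = operand;  break;
        case OP_ADD:  acc += operand; break;
        case OP_HALT: return acc;
        default:      return acc;  /* unknown opcode: stop */
        }
    }
    return acc;
}
```

Running it on the byte sequence `{ OP_LOAD, 10, OP_ADD, 20, OP_HALT, 0 }` walks through three fetch, decode, execute iterations and leaves 30 in the accumulator. A real CPU performs the same steps in hardware, overlapped across many instructions.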
5. Registers (Fastest Storage)
Registers sit inside the CPU core and are the fastest storage available, faster than any cache level.
General-Purpose Registers (x86-64)
| Register | Purpose |
|---|---|
| RAX | Accumulator |
| RBX | Base |
| RCX | Counter |
| RDX | Data |
| RSI | Source Index |
| RDI | Destination Index |
| RSP | Stack Pointer |
| RBP | Base Pointer |
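| R8–R15 | Additional general-purpose registers |
| RIP | Instruction Pointer (special-purpose, not a general-purpose register) |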
Example (Assembly)
mov rax, 10
mov rbx, 20
add rax, rbx ; rax = 30
6. Arithmetic Logic Unit (ALU)
The ALU performs:
- Addition
- Subtraction
- Bitwise AND/OR/XOR
- Shifts
- Comparisons
Example:
cmp rax, rbx
jg greater
ALU sets CPU flags:
- Zero Flag (ZF)
- Carry Flag (CF)
- Sign Flag (SF)
- Overflow Flag (OF)
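As a rough model of how these flags come out of a comparison, the C sketch below computes the four flags for a 64-bit `cmp a, b` (which subtracts b from a and discards the result) and then evaluates the condition that `jg` tests. The `Flags` struct and function names are made up for this example; real hardware derives the flags inside the ALU in the same cycle as the subtraction.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct { bool zf, cf, sf, of; } Flags;

/* Flags produced by `cmp a, b`, i.e. by computing a - b on 64-bit values. */
Flags cmp_flags(uint64_t a, uint64_t b) {
    uint64_t r = a - b;  /* unsigned wraparound is well-defined in C */
    Flags f;
    f.zf = (r == 0);                         /* result is zero */
    f.cf = (a < b);                          /* unsigned borrow */
    f.sf = (r >> 63) & 1;                    /* top bit of the result */
    /* Signed overflow: a and b differ in sign, and the result's sign differs from a's. */
    f.of = (((a ^ b) & (a ^ r)) >> 63) & 1;
    return f;
}

/* jg (jump if greater, signed) is taken when ZF == 0 and SF == OF. */
bool jg_taken(Flags f) {
    return !f.zf && f.sf == f.of;
}
```

For example, `jg_taken(cmp_flags(5, 3))` is true, while comparing 3 with 5 sets both CF and SF and the jump is not taken.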
7. Floating Point Unit (FPU)
The FPU handles:
- Floating-point arithmetic
- IEEE-754 operations
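Example (on x86-64, scalar floating-point math normally uses SSE2 instructions and XMM registers rather than the legacy x87 FPU):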
movsd xmm0, [a]
addsd xmm0, [b]
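One practical consequence of IEEE-754 arithmetic is that results are rounded to the nearest representable value, so exact equality tests can surprise you. The sketch below assumes `double` is an IEEE-754 binary64 type, which holds on mainstream x86-64 and ARM64 toolchains; `add_double` and `nearly_equal` are illustrative helper names.

```c
#include <stdbool.h>

/* On x86-64 a compiler would typically turn this into a single addsd. */
double add_double(double a, double b) {
    return a + b;
}

/* Compare two doubles with an absolute tolerance instead of ==. */
bool nearly_equal(double a, double b, double tolerance) {
    double diff = a - b;
    if (diff < 0.0)
        diff = -diff;
    return diff <= tolerance;
}
```

Values such as 0.5 and 0.25 are exactly representable, so their sum is exact. Values such as 0.1 and 0.2 are not, which is why comparing `add_double(0.1, 0.2)` against 0.3 is safer with `nearly_equal` than with `==`.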
8. SIMD & Vector Units
SIMD = Single Instruction, Multiple Data
Used for:
- Graphics
- AI
- Video
- Scientific computing
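x86 SIMD extensions:
- SSE (128-bit XMM registers)
- AVX (256-bit YMM registers)
- AVX-512 (512-bit ZMM registers)
Example (AVX, eight single-precision floats per register):
vmovaps ymm0, ymm1 ; copy eight packed floats
vaddps ymm0, ymm0, ymm2 ; ymm0 = ymm0 + ymm2, lane by lane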
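In scalar terms, a single `vaddps` on YMM registers does the work of the loop below. This is a plain C model of the lane-wise semantics, not intrinsics code; `add_ps8` and `LANES` are names chosen for the example. An optimizing compiler targeting AVX may turn a loop like this into one vector instruction, but that depends on compiler flags and is not guaranteed.

```c
#include <stddef.h>

enum { LANES = 8 };  /* a 256-bit YMM register holds 8 single-precision floats */

/* Lane-wise add: out[i] = a[i] + b[i] for all 8 lanes. */
void add_ps8(const float a[LANES], const float b[LANES], float out[LANES]) {
    for (size_t i = 0; i < LANES; ++i)
        out[i] = a[i] + b[i];
}
```

The key point is that all eight additions are independent, which is what lets the hardware perform them in a single instruction.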
9. CPU Cache Hierarchy
CPU cache reduces memory latency.
Cache Levels
| Level | Location | Relative speed | Typical size |
|---|---|---|---|
| L1 | Per core | Fastest | 32–128 KB |
| L2 | Per core | Fast | 256 KB–1 MB |
| L3 | Shared | Slower | 8–64 MB |
Memory hierarchy, from fastest and smallest to slowest and largest:
Registers → L1 → L2 → L3 → RAM
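Access order matters because of this hierarchy. The sketch below sums the same matrix twice, once row by row and once column by column. Both functions return the same value, but the row-major version touches adjacent addresses and so tends to hit in cache far more often on large matrices. The function names and the flat-array layout are assumptions made for the example, and the actual speed difference depends on the machine and the matrix size.

```c
#include <stddef.h>

/* Row-major traversal: consecutive iterations read adjacent memory. */
long sum_row_major(const int *m, size_t rows, size_t cols) {
    long total = 0;
    for (size_t r = 0; r < rows; ++r)
        for (size_t c = 0; c < cols; ++c)
            total += m[r * cols + c];
    return total;
}

/* Column-major traversal of the same layout: each read jumps `cols` elements ahead. */
long sum_col_major(const int *m, size_t rows, size_t cols) {
    long total = 0;
    for (size_t c = 0; c < cols; ++c)
        for (size_t r = 0; r < rows; ++r)
            total += m[r * cols + c];
    return total;
}
```

Timing both functions on a matrix much larger than the L3 cache is a simple way to observe cache effects on your own hardware.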
10. Cache Lines and Cache Coherency
Caches move data in fixed-size blocks called cache lines (usually 64 bytes).
The MESI protocol keeps caches consistent across cores. Each cache line is in one of four states:
- Modified
- Exclusive
- Shared
- Invalid
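The sketch below models both ideas in C: which cache line an address falls into, and how one cache's copy of a line changes state under MESI. It is a simplified textbook state machine, not a description of any specific CPU. The 64-byte line size, the event names, and the `others_have_copy` parameter are assumptions made for the example, and bus traffic such as write-backs and invalidation messages is only noted in comments.

```c
#include <stdbool.h>
#include <stdint.h>

enum { CACHE_LINE_BYTES = 64 };  /* assumed line size; check your target CPU */

/* Index of the cache line that contains a given byte address. */
uint64_t cache_line_index(uint64_t addr) {
    return addr / CACHE_LINE_BYTES;
}

typedef enum { MESI_MODIFIED, MESI_EXCLUSIVE, MESI_SHARED, MESI_INVALID } MesiState;
typedef enum { EV_LOCAL_READ, EV_LOCAL_WRITE, EV_REMOTE_READ, EV_REMOTE_WRITE } MesiEvent;

/* Next state of one cache's copy of a line.
   others_have_copy only matters for a local read that misses (state Invalid). */
MesiState mesi_next(MesiState s, MesiEvent ev, bool others_have_copy) {
    switch (ev) {
    case EV_LOCAL_READ:
        if (s == MESI_INVALID)
            return others_have_copy ? MESI_SHARED : MESI_EXCLUSIVE;
        return s;                    /* hit: state unchanged */
    case EV_LOCAL_WRITE:
        return MESI_MODIFIED;        /* other copies are invalidated */
    case EV_REMOTE_READ:
        /* A Modified line is written back before it is shared. */
        return (s == MESI_INVALID) ? MESI_INVALID : MESI_SHARED;
    case EV_REMOTE_WRITE:
        return MESI_INVALID;         /* another core now owns the line */
    }
    return s;
}
```

Two variables whose addresses map to the same `cache_line_index` share a line, so writes to one from a different core invalidate the other core's copy of both. This effect is commonly called false sharing.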
11. Instruction Set Architecture (ISA)
ISA defines:
- Instructions
- Registers
- Addressing modes
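- Instruction encodings (the ABI, which covers calling conventions and binary layout, is defined per platform on top of the ISA)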
Common ISAs
- x86-64 (Intel / AMD)
- ARM64
- RISC-V
Example x86 instruction:
add rax, rbx
12. CISC vs RISC
CISC (x86)
- Complex instructions
- Variable length
- Fewer instructions per program
RISC (ARM, RISC-V)
- Simple instructions
- Mostly fixed length (4 bytes on ARM64; RISC-V adds optional 2-byte compressed instructions)
- More instructions
13. Pipelines
Modern CPUs use instruction pipelines:
Fetch → Decode → Execute → Memory → Write Back
Multiple instructions are in flight at the same time, each in a different stage.
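The benefit of pipelining can be estimated with a small calculation. The sketch below assumes an ideal pipeline in which every stage takes one cycle and there are no stalls, hazards, or mispredictions, so it gives a best case rather than a measurement. The function names are illustrative.

```c
/* Cycles to finish n instructions on an ideal pipeline with the given number of stages.
   The first instruction needs `stages` cycles; each later one completes one cycle after
   its predecessor. */
unsigned long pipelined_cycles(unsigned long n, unsigned long stages) {
    return n == 0 ? 0 : stages + n - 1;
}

/* Cycles for the same work if each instruction must finish before the next starts. */
unsigned long unpipelined_cycles(unsigned long n, unsigned long stages) {
    return n * stages;
}
```

For 100 instructions on the five-stage pipeline shown above, the ideal pipelined count is 104 cycles versus 500 without overlap, which approaches a fivefold speedup as the instruction count grows.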
14. Superscalar Execution
A superscalar CPU can issue and execute more than one instruction per cycle by using several execution units in parallel.
Example:
- ALU + FPU + Load unit all working simultaneously
15. Out-of-Order Execution
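The CPU reorders instructions internally to keep its execution units busy, while making the results appear in program order.
mov rax, [a] ; load, may wait on memory
mov rbx, [b] ; load, may wait on memory
add rcx, rdx ; independent of both loads
The CPU can execute the add before the loads complete, because its operands (rcx and rdx) do not depend on them.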
16. Branch Prediction
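Conditional branches can stall the pipeline, because the CPU does not know which instruction comes next until the condition is resolved.
To avoid waiting, the CPU predicts the outcome of branches such as:
if (x > 0)
A wrong prediction forces a pipeline flush, which discards the work done on the wrong path and costs performance.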
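One classic textbook prediction scheme is the 2-bit saturating counter, sketched below in C. Real branch predictors are far more elaborate and their designs are mostly undocumented, so treat this as an illustration of the idea rather than of any actual CPU. The function name and the initial counter value are arbitrary choices for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simulates a 2-bit saturating counter on a sequence of branch outcomes
   (true = taken) and returns how many were mispredicted.
   Counter values 0 and 1 predict not taken; 2 and 3 predict taken. */
size_t count_mispredictions(const bool *outcomes, size_t n) {
    unsigned counter = 2;  /* start at "weakly taken" (arbitrary choice) */
    size_t misses = 0;

    for (size_t i = 0; i < n; ++i) {
        bool predicted_taken = counter >= 2;
        if (predicted_taken != outcomes[i])
            ++misses;

        /* Move toward the actual outcome, saturating at 0 and 3. */
        if (outcomes[i]) {
            if (counter < 3) ++counter;
        } else {
            if (counter > 0) --counter;
        }
    }
    return misses;
}
```

A loop branch that is taken four times and then falls through is mispredicted only once, at the exit. A branch that alternates between taken and not taken is much harder for this scheme, which is one reason unpredictable data-dependent branches are expensive.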
17. Speculative Execution
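The CPU executes instructions along the predicted path before the branch outcome is known, and discards the results if the prediction was wrong.
Observable side effects of this speculation, such as changes to cache state, led to vulnerabilities including: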
- Spectre
- Meltdown
18. Memory Addressing Modes
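Examples:
mov rax, [rbx] ; register indirect: address = rbx
mov rax, [rbx + 8] ; base + displacement: address = rbx + 8
mov rax, [rbx + rcx*4] ; base + scaled index: address = rbx + rcx*4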
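These modes map naturally onto common C constructs. The sketch below shows one plausible correspondence; the `Pair` struct and function names are invented for the example, and the exact instructions a compiler emits depend on the compiler, flags, and element type (for 4-byte elements it would typically load into a 32-bit register such as eax).

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int64_t first;   /* byte offset 0 */
    int64_t second;  /* byte offset 8 */
} Pair;

/* Pointer dereference: roughly [rbx]. */
int64_t load_direct(const int64_t *p) {
    return *p;
}

/* Struct field access: roughly [rbx + 8]. */
int64_t load_offset(const Pair *p) {
    return p->second;
}

/* Array indexing with 4-byte elements: roughly [rbx + rcx*4]. */
int32_t load_indexed(const int32_t *base, size_t i) {
    return base[i];
}
```

The scale factor in the indexed mode (1, 2, 4, or 8 on x86-64) matches the common element sizes, which is why array indexing usually needs no separate multiply instruction.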
19. Stack and Stack Frame
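The stack stores:
- Function arguments that are not passed in registers
- Local variables
- Return addresses
- Saved registers
A typical function prologue sets up a stack frame:
push rbp ; save the caller's frame pointer
mov rbp, rsp ; start this function's frame
sub rsp, 32 ; reserve 32 bytes for local variables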
20. Interrupts and Exceptions
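Interrupts and exceptions divert the CPU from normal execution to a handler in the kernel:
- Hardware interrupts (keyboard, timer)
- Exceptions (page fault, division by zero)
- Software interrupts (for example, int 0x80 for system calls on 32-bit Linux)
On x86-64, system calls normally use the dedicated syscall instruction rather than a software interrupt:
syscall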
21. Privilege Levels (Rings)
| Ring | Access |
|---|---|
| Ring 0 | Kernel |
| Ring 3 | User programs |
System calls switch from Ring 3 to Ring 0 and back. Rings 1 and 2 exist on x86 but are rarely used by modern operating systems.
22. Multi-Core CPUs
Modern CPUs have:
- Multiple cores
- Shared last-level cache (typically L3)
- Hardware threads (SMT / Hyper-Threading)
23. CPU and Operating System
OS responsibilities:
- Scheduling
- Context switching
- Memory protection
Context switch saves:
- Registers
- Flags
- Instruction pointer
24. From C Code to CPU
C Code
int add(int a, int b) {
return a + b;
}
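Assembly (one possible x86-64 output; under the System V calling convention a arrives in edi, b in esi, and the result is returned in eax)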
mov eax, edi
add eax, esi
ret
Machine Code
89 f8
01 f0
c3
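To see how those bytes encode the instructions, note that in `89 f8` and `01 f0` the first byte is the opcode (0x89 is a 32-bit mov, 0x01 a 32-bit add, both with a register or memory destination) and the second is a ModRM byte naming the operands. The C sketch below splits a ModRM byte into its three fields. It covers only this register-to-register case; the struct and function names are chosen for the example.

```c
#include <stdint.h>

typedef struct {
    unsigned mod;  /* bits 7-6: 3 means both operands are registers */
    unsigned reg;  /* bits 5-3: source register for opcodes 0x89 and 0x01 */
    unsigned rm;   /* bits 2-0: destination register when mod == 3 */
} ModRM;

/* Split a ModRM byte into its mod, reg, and rm fields. */
ModRM decode_modrm(uint8_t byte) {
    ModRM m;
    m.mod = (unsigned)(byte >> 6);
    m.reg = (unsigned)((byte >> 3) & 7);
    m.rm  = (unsigned)(byte & 7);
    return m;
}

/* 32-bit register names in x86 encoding order. */
const char *reg32_name(unsigned n) {
    static const char *const names[8] =
        { "eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi" };
    return names[n & 7];
}
```

Decoding 0xf8 gives mod 3, reg 7 (edi), rm 0 (eax), which is `mov eax, edi`. Decoding 0xf0 gives reg 6 (esi) and rm 0 (eax), which is `add eax, esi`. The final byte c3 is the one-byte `ret` opcode.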
25. Performance Factors
CPU performance depends on:
- Clock speed
- IPC (Instructions per cycle)
- Cache hit rate
- Branch prediction accuracy
- Memory latency
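The first two factors combine into a simple first-order estimate: execution time is the instruction count divided by IPC times clock rate. The sketch below expresses that relationship in C. It is a back-of-the-envelope model under the assumption of a constant average IPC; cache misses, branch mispredictions, and memory latency show up in practice as a lower effective IPC. The function name is illustrative.

```c
#include <assert.h>

/* Estimated run time in seconds: instructions / (IPC * clock rate in Hz). */
double exec_time_seconds(double instructions, double ipc, double clock_hz) {
    assert(ipc > 0.0 && clock_hz > 0.0);
    return instructions / (ipc * clock_hz);
}
```

For example, 7 billion instructions at an IPC of 2 on a 3.5 GHz core take about one second, and halving the IPC doubles the time even though the clock speed is unchanged.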
26. Why CPU Knowledge Matters for Developers
Understanding the CPU helps you:
- Write faster code
- Debug performance issues
- Learn Assembly & ABI
- Understand compilers
- Build system-level software
- Master system design
27. Summary
The CPU is not a black box.
It is a highly optimized instruction execution engine built from:
- Registers
- ALUs
- Pipelines
- Caches
- Predictors
- Schedulers
Every abstraction eventually collapses into CPU instructions.
If you understand the CPU, you understand computing.