Most developers know how to write C code.
Far fewer know what actually happens after they press Run.
A common mental model looks like this:
Write Code
↓
Click Run
↓
Get Output
Simple. Intuitive. Completely wrong.
Between your source code and the CPU sits an entire toolchain. Before a processor executes a single instruction, your code passes through preprocessing, compilation, assembly, linking, and loading.
Understanding this pipeline explains:
- Why compiler errors happen
- Why linker errors happen
- What object files are
- How libraries work
- What GCC is actually doing
Let's walk through the complete journey.
The Complete Pipeline
A C program typically moves through the following stages:
main.c
↓
Preprocessor ───► main.i
↓
Compiler ───────► main.s
↓
Assembler ──────► main.o
↓
Linker ─────────► app
↓
Loader ─────────► Memory
↓
CPU ────────────► Execution
Each stage has one job.
Each stage produces output consumed by the next.
Stage 1: The Preprocessor
The compiler is not the first tool that touches your code.
The preprocessor runs before compilation begins.
Consider:
#include <stdio.h>
#define PI 3.14
int main() {
    printf("Hello World");
    return 0;
}
The preprocessor handles:
- Header inclusion (#include)
- Macro expansion (#define)
- Comment removal
- Conditional compilation (#ifdef)
For example:
#define PI 3.14
float radius = PI;
becomes:
float radius = 3.14;
before compilation even starts.
Output:
main.i
View it yourself:
gcc -E main.c -o main.i
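Macro expansion and conditional compilation can also be observed from inside a program. The sketch below is only an illustration: the names STR, XSTR, ENABLE_GREETING, pi_as_text, and greeting are made up for this example and do not come from any library. It uses the stringification operator to capture the text that the preprocessor substitutes for PI, and an #ifdef to keep only one of two function bodies.

```c
#define PI 3.14

/* Two-level stringification: XSTR(PI) first expands PI to 3.14,
   then STR turns those tokens into the string literal "3.14". */
#define STR(x) #x
#define XSTR(x) STR(x)

/* Hypothetical feature flag, defined here so the #ifdef branch is taken. */
#define ENABLE_GREETING

/* Returns the text the preprocessor substituted for PI. */
const char *pi_as_text(void) {
    return XSTR(PI);
}

/* Only one of these two return statements survives preprocessing. */
const char *greeting(void) {
#ifdef ENABLE_GREETING
    return "Hello World";
#else
    return "";
#endif
}
```

Calling pi_as_text() yields the string "3.14", which shows that the substitution is purely textual. Running gcc -E on this file should show that the #else branch is gone before the compiler proper ever sees it.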
Stage 2: The Compiler
The compiler's job is much more than translation.
It performs:
- Lexical analysis
- Syntax analysis
- Semantic analysis
- Optimization
- Code generation
Example:
x = y + z;
might become:
MOV AX, y
ADD AX, z
MOV x, AX
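Lexical analysis, the first phase in the list above, splits the source text into tokens. The following is a minimal sketch of that idea, not how GCC implements it: the Token type and the functions next_token and count_tokens are hypothetical names for this example, and the lexer only knows identifiers, integers, and single-character punctuation.

```c
#include <ctype.h>
#include <stddef.h>

/* Token categories for this sketch; a real C lexer has many more. */
typedef enum { TOK_IDENT, TOK_NUMBER, TOK_PUNCT, TOK_END } TokenKind;

typedef struct {
    TokenKind kind;
    const char *start; /* first character of the token in the input */
    size_t length;     /* number of characters in the token */
} Token;

/* Reads one token starting at *cursor and advances *cursor past it. */
Token next_token(const char **cursor) {
    const char *p = *cursor;
    while (isspace((unsigned char)*p)) p++; /* skip whitespace */

    Token tok = { TOK_END, p, 0 };
    if (*p == '\0') {
        *cursor = p;
        return tok;
    }
    if (isalpha((unsigned char)*p) || *p == '_') {
        tok.kind = TOK_IDENT;
        while (isalnum((unsigned char)*p) || *p == '_') p++;
    } else if (isdigit((unsigned char)*p)) {
        tok.kind = TOK_NUMBER;
        while (isdigit((unsigned char)*p)) p++;
    } else {
        tok.kind = TOK_PUNCT; /* single-character operator or separator */
        p++;
    }
    tok.length = (size_t)(p - tok.start);
    *cursor = p;
    return tok;
}

/* Counts the tokens in a source string. */
size_t count_tokens(const char *source) {
    size_t count = 0;
    while (next_token(&source).kind != TOK_END) count++;
    return count;
}
```

For the statement x = y + z; this sketch produces six tokens: x, =, y, +, z, and the semicolon. The later phases (syntax and semantic analysis) work on such tokens rather than on raw characters.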
The compiler also catches errors such as:
printf("Hello"
- Missing parentheses, as in the line above
- Type mismatches
- Undeclared variables
Output:
main.s
Generate it:
gcc -S main.c -o main.s
Stage 3: The Assembler
Assembly language is readable by humans.
CPUs cannot execute it directly.
The assembler converts assembly instructions into machine code.
Example:
ADD AX, BX
becomes something closer to:
00000011 11000011
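The bit pattern above can be reproduced by hand. The sketch below assumes 16-bit x86 and the register-to-register form of ADD whose opcode byte is 0x03, followed by a ModRM byte that names the two registers. The function name encode_add_reg16 and the REG_ constants are made up for this example, and a real assembler handles far more instruction forms (x86 also has a second valid encoding for the same instruction).

```c
#include <stdint.h>

/* 16-bit register numbers as they appear in the ModRM byte. */
enum { REG_AX = 0, REG_CX = 1, REG_DX = 2, REG_BX = 3 };

/* Encodes "ADD dst, src" for two 16-bit registers.
   Returns the two bytes packed as (opcode << 8) | modrm. */
uint16_t encode_add_reg16(unsigned dst, unsigned src) {
    uint8_t opcode = 0x03; /* ADD r16, r/m16 */
    /* ModRM: top two bits 11 mean "register operand",
       the next three bits hold dst, the low three bits hold src. */
    uint8_t modrm = (uint8_t)(0xC0 | ((dst & 7u) << 3) | (src & 7u));
    return (uint16_t)((opcode << 8) | modrm);
}
```

encode_add_reg16(REG_AX, REG_BX) gives 0x03C3, which is 00000011 11000011 in binary.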
Output:
main.o
Generate it:
gcc -c main.s -o main.o
At this point you have machine code, but not a complete executable.
Stage 4: The Linker
This is where many developers get confused.
Suppose your code contains:
printf("Hello");
The compiler only checks that printf() has been declared; the declaration comes from stdio.h.
The assembler emits machine code that refers to printf by name and leaves its address unresolved.
But where is the actual implementation?
The linker finds it.
It combines:
- Your object files
- Standard libraries
- Third-party libraries
and produces a complete executable.
Output:
app
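The core of symbol resolution can be modelled in a few lines. This is a toy model under heavy simplification, not how a real linker such as ld is written: the ObjectFile type, the function first_unresolved, and the sample data are all hypothetical, and each input is reduced to two lists, the symbols it defines and the symbols it still needs.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *name;           /* label used only for readability */
    const char *const *defined; /* symbols this input provides */
    size_t defined_count;
    const char *const *needed;  /* symbols this input refers to */
    size_t needed_count;
} ObjectFile;

/* Returns 1 if any input defines the symbol, 0 otherwise. */
static int is_defined(const ObjectFile *inputs, size_t count, const char *symbol) {
    for (size_t i = 0; i < count; i++)
        for (size_t j = 0; j < inputs[i].defined_count; j++)
            if (strcmp(inputs[i].defined[j], symbol) == 0)
                return 1;
    return 0;
}

/* Returns the first needed symbol that no input defines, or NULL if all resolve. */
const char *first_unresolved(const ObjectFile *inputs, size_t count) {
    for (size_t i = 0; i < count; i++)
        for (size_t j = 0; j < inputs[i].needed_count; j++)
            if (!is_defined(inputs, count, inputs[i].needed[j]))
                return inputs[i].needed[j];
    return NULL;
}

/* Sample inputs: main.o calls printf, the C library provides it. */
const char *const main_defines[] = { "main" };
const char *const main_needs[]   = { "printf" };
const char *const libc_defines[] = { "printf", "puts" };

const ObjectFile without_libc[] = {
    { "main.o", main_defines, 1, main_needs, 1 },
};
const ObjectFile with_libc[] = {
    { "main.o", main_defines, 1, main_needs, 1 },
    { "libc", libc_defines, 2, NULL, 0 },
};
```

first_unresolved(without_libc, 1) reports printf, while first_unresolved(with_libc, 2) returns NULL because every reference now has a definition. A real linker also assigns addresses and patches the machine code, which this model leaves out.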
Why "Undefined Reference" Happens
One of the most common C errors:
undefined reference to 'myFunction'
This is not a compiler error.
It is a linker error.
Typical causes:
- Function declared but not defined
- Source file not compiled
- Library not linked
- Incorrect linker flags
A useful rule:
The compiler checks correctness. The linker checks completeness.
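The difference between a declaration and a definition is what separates the two kinds of error. In this small sketch the function names square and sum_of_squares are made up for illustration, and everything sits in one file so that it builds on its own.

```c
/* Declaration only: enough for the compiler to type-check calls.
   In a multi-file project this line would usually come from a header. */
int square(int value);

/* Compiles on the strength of the declaration alone. */
int sum_of_squares(int a, int b) {
    return square(a) + square(b);
}

/* Definition: the implementation the linker has to find somewhere.
   If this were missing from every linked object file and library,
   compilation would still succeed and linking would stop with an
   "undefined reference" to square. */
int square(int value) {
    return value * value;
}
```

Moving the definition of square into a second source file and then leaving that file out of the gcc command is a quick way to trigger the error on purpose.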
Stage 5: The Loader
After linking, you finally have an executable.
But the CPU still cannot execute a file sitting on disk.
The operating system loader:
- Allocates memory
- Loads instructions into RAM
- Maps shared libraries
- Initializes the process
Only then does execution begin.
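One visible result of this setup is that global variables already hold their initial values when the first line of your code runs. The sketch below relies only on what the C language guarantees; the comments about where each value comes from describe typical ELF systems such as Linux and are an assumption about the platform. All names are hypothetical.

```c
/* No initializer: guaranteed to start at 0. On typical systems this
   lives in a zero-filled region that takes no space in the file. */
static int zero_filled_counter;

/* Explicit initializer: the value 42 is stored in the executable
   and mapped into memory when the program is loaded. */
static int preset_answer = 42;

/* Constant data, usually placed in a read-only region. */
static const char banner[] = "ready";

int counter_at_startup(void) { return zero_filled_counter; }
int answer_at_startup(void) { return preset_answer; }
const char *banner_text(void) { return banner; }
```

No code in the program assigns these values at run time. They are in place because the executable was mapped into memory before execution started.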
What GCC Actually Does
Most developers use:
gcc main.c -o app
Behind that single command, GCC runs multiple stages.
You can execute them manually:
[Preprocessor]
gcc -E main.c -o main.i
[Compiler]
gcc -S main.i -o main.s
[Assembler]
gcc -c main.s -o main.o
[Linker]
gcc main.o -o app
Running the stages one at a time shows exactly which stage a failing build stops in, which makes debugging much easier.
Compiler Error vs Linker Error
| Type | Stage | Meaning |
|---|---|---|
| Compiler Error | Compilation | Invalid code |
| Linker Error | Linking | Missing implementation |
Examples:
Compiler Error:
printf("Hello"
Linker Error:
undefined reference to 'test'
Remember:
Compiler = correctness
Linker = completeness
Key Takeaways
- The CPU cannot understand C directly.
- The preprocessor handles directives.
- The compiler analyzes and transforms code.
- The assembler generates machine code.
- The linker builds a complete executable.
- The loader places the program into memory.
- undefined reference is a linker error.
- GCC automates the entire pipeline.
The next time you press Run, you'll know exactly what happens before your program reaches the CPU.
What part of the compilation pipeline confused you the most when you started learning C?
Originally published on Moksh eLearning Blog.
Full article: C Compilation Process
Additional content on C Programming, Java, Spring Boot, and Software Engineering:
https://mokshelearning.blogspot.com