“What you don’t use, you don’t pay for. And what you do use, you couldn’t hand-code better.” — Bjarne Stroustrup
The Philosophy at the Core
When Bjarne Stroustrup articulated the zero-overhead principle for C++ — the idea Rust calls zero-cost abstractions — he meant two things:
- You don’t pay for what you don’t use
- You can’t do better by hand
Rust takes that philosophy and turns it into a guarantee, not just an aspiration. It’s baked into the design of the language itself.
But what does that actually mean in practice?
Most explanations stop at “the compiler is smart.” That’s not enough. Let’s break it down — from source code to machine code.
What “Zero-Cost” Actually Means
Let’s clarify what zero-cost abstractions do NOT mean:
- Your program runs instantly
- You never need to think about performance
- All Rust code is automatically fast
What it does mean:
- The abstraction mechanism adds no runtime overhead
- Iterators compile to the same code as manual loops
- Generics compile to specialized versions (no dynamic cost)
- Closures compile to inline logic
The cost exists at compile time:
- Longer builds
- Larger binaries (due to monomorphization)
At runtime? Zero tax.
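For instance, a closure passed to a generic function is statically dispatched: each closure gets its own (usually zero-sized) type, and the call typically inlines to the same code as a plain function. A minimal sketch:

```rust
fn apply<F: Fn(i64) -> i64>(f: F, x: i64) -> i64 {
    // F is a distinct, usually zero-sized type per closure;
    // the call is statically dispatched and typically inlined away.
    f(x)
}

fn main() {
    let double = |x| x * 2;
    assert_eq!(apply(double, 21), 42);
}
```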
Iterators vs Loops — The Classic Example
Manual loop
fn sum_squares_loop(data: &[i64]) -> i64 {
    let mut total = 0;
    let mut i = 0;
    while i < data.len() {
        let x = data[i];
        if x % 2 == 0 {
            total += x * x;
        }
        i += 1;
    }
    total
}
Iterator version
fn sum_squares_iter(data: &[i64]) -> i64 {
    data.iter()
        .filter(|&&x| x % 2 == 0)
        .map(|&x| x * x)
        .sum()
}
The iterator version reads like English:
“Filter even numbers → square them → sum the results.”
Why it's just as fast (or faster)
1. Bounds-check elimination
The compiler can prove accesses are safe and removes checks entirely.
2. Loop fusion
.filter(), .map(), .sum() → one loop, not three.
3. Auto-vectorization
The compiler may use SIMD instructions automatically.
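Conceptually, loop fusion means the three adapters collapse into a single pass. This sketch shows roughly what the optimizer produces (not literal compiler output):

```rust
fn sum_squares_fused(data: &[i64]) -> i64 {
    // One pass: filter, map, and sum collapsed into a single loop body.
    let mut total = 0;
    for &x in data {
        if x % 2 == 0 {
            total += x * x;
        }
    }
    total
}
```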
What the Compiler Actually Does
Stage 1: Source code
data.iter().filter(...).map(...).sum()
Stage 2: MIR (Mid-level IR)
- Iterators become state machines
- No heap allocations
- Everything gets inlined
Stage 3: LLVM optimizations
- Functions + closures inlined
- Operations fused into a single loop
- Bounds checks removed
- SIMD applied where possible
Stage 4: Assembly
What remains is highly optimized machine code — often processing multiple values at once.
Generics: Monomorphization
fn largest<T: PartialOrd>(list: &[T]) -> &T {
    let mut largest = &list[0];
    for item in list.iter() {
        if item > largest {
            largest = item;
        }
    }
    largest
}
What actually happens:
largest(&[3_i32, 1, 7, 2]); // generates largest_i32
largest(&[3.1_f64, 1.0]); // generates largest_f64
Each type gets its own specialized version.
Result:
- No virtual dispatch
- No runtime overhead
- Same performance as handwritten code
Tradeoff: larger binaries, longer compile times.
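Both monomorphized copies can be exercised side by side; the function is repeated here only so the sketch compiles on its own:

```rust
fn largest<T: PartialOrd>(list: &[T]) -> &T {
    let mut largest = &list[0];
    for item in list {
        if item > largest {
            largest = item;
        }
    }
    largest
}

fn main() {
    // Each call site instantiates its own specialized copy.
    assert_eq!(*largest(&[3_i32, 1, 7, 2]), 7);   // largest::<i32>
    assert_eq!(*largest(&[3.1_f64, 1.0]), 3.1);   // largest::<f64>
}
```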
Traits: Static vs Dynamic Dispatch
trait Drawable {
    fn draw(&self);
}
Static dispatch (zero-cost)
fn render_static<T: Drawable>(shape: &T) {
    shape.draw();
}
- Fully known at compile time
- Inlined
- No vtable
Dynamic dispatch (runtime cost)
fn render_dynamic(shape: &dyn Drawable) {
    shape.draw();
}
- Method resolved through a vtable lookup at runtime
- Typically a few nanoseconds of overhead per call
- Usually cannot be inlined across the call
👉 In Rust, dynamic dispatch is always explicit (dyn Trait).
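The two dispatch paths side by side, as a self-contained sketch (here `draw` returns a string instead of printing, purely so the result is easy to check):

```rust
trait Drawable {
    fn draw(&self) -> &'static str;
}

struct Circle;

impl Drawable for Circle {
    fn draw(&self) -> &'static str {
        "circle"
    }
}

// Static dispatch: monomorphized per concrete type, call can inline.
fn render_static<T: Drawable>(shape: &T) -> &'static str {
    shape.draw()
}

// Dynamic dispatch: resolved through the vtable behind &dyn Drawable.
fn render_dynamic(shape: &dyn Drawable) -> &'static str {
    shape.draw()
}

fn main() {
    let c = Circle;
    assert_eq!(render_static(&c), "circle");
    assert_eq!(render_dynamic(&c), "circle");
}
```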
Real-World Example
use std::collections::HashMap;

// parse_service extracts the service name from a log line (defined elsewhere).
fn process_logs(raw: &str) -> HashMap<String, usize> {
    raw.lines()
        .filter(|l| l.contains("ERROR"))
        .filter_map(parse_service)
        .fold(HashMap::new(), |mut acc, service| {
            *acc.entry(service).or_insert(0) += 1;
            acc
        })
}
What this gives you:
- Single pass over data
- No intermediate allocations
- No extra memory overhead
Readable AND fast.
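To run the example end to end, here is the same chain with a hypothetical `parse_service` filled in (an assumption: the real parser would depend on your log format):

```rust
use std::collections::HashMap;

// Hypothetical parser (assumption): takes lines like
// "ERROR [billing] payment failed" and extracts "billing".
fn parse_service(line: &str) -> Option<String> {
    let start = line.find('[')? + 1;
    let end = line.find(']')?;
    (start < end).then(|| line[start..end].to_string())
}

fn process_logs(raw: &str) -> HashMap<String, usize> {
    raw.lines()
        .filter(|l| l.contains("ERROR"))
        .filter_map(parse_service)
        .fold(HashMap::new(), |mut acc, service| {
            *acc.entry(service).or_insert(0) += 1;
            acc
        })
}

fn main() {
    let logs = "ERROR [db] timeout\nINFO [db] ok\nERROR [db] retry\nERROR [api] 500";
    let counts = process_logs(logs);
    assert_eq!(counts.get("db"), Some(&2));
    assert_eq!(counts.get("api"), Some(&1));
}
```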
Performance Summary
| Task | Approach | Performance |
|---|---|---|
| Sum of squares | Iterators | Same as loop |
| Filter + map | Iterators | Same or faster |
| Generics | Monomorphized | Same |
| Dynamic dispatch | dyn Trait | Small overhead |
Where You Can Still Go Wrong
Zero-cost abstractions don’t mean zero thinking.
Common pitfalls:
1. clone() in hot loops
→ Each call may allocate
2. .collect() mid-chain
→ Breaks fusion + allocates memory
3. Box<dyn Trait> in hot paths
→ Heap allocation + dynamic dispatch
4. Large values copied repeatedly
→ Prefer references or Arc<T>
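Pitfall 2 in miniature: the intermediate `.collect()` allocates a `Vec` and splits the work into two passes, while the fused chain produces the same result in one pass with no allocation.

```rust
fn sum_collected(data: &[i64]) -> i64 {
    // Pitfall: the intermediate Vec allocates and forces a second pass.
    let evens: Vec<i64> = data.iter().copied().filter(|x| x % 2 == 0).collect();
    evens.into_iter().map(|x| x * x).sum()
}

fn sum_fused(data: &[i64]) -> i64 {
    // Same result, one pass, no intermediate allocation.
    data.iter().copied().filter(|x| x % 2 == 0).map(|x| x * x).sum()
}
```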
How This Changes Your Coding Style
1. Prefer clarity first
Readable iterator chains are usually also the fastest.
2. Trust the compiler
Don’t assume loops are faster — measure first.
3. Make costs explicit
- dyn Trait → runtime cost
- Box, Vec → allocation
4. Embrace iterator chains
They give the compiler more optimization opportunities.
The Bigger Picture
Zero-cost abstractions are more than a performance feature.
They represent a shift in how we think about programming:
You shouldn’t have to choose between readability and performance.
Rust proves you can have both.
Key Takeaways
- Prefer iterators over loops: clearer and just as fast
- Use generics and impl Trait by default (static dispatch is free)
- Closures are inlined: no overhead
- Avoid unnecessary .collect() calls
- Profile before optimizing
Final Thought
Rust’s biggest promise isn’t just memory safety.
It’s this:
High-level code that compiles down to low-level performance — without compromise.