“What you don’t use, you don’t pay for. And what you do use, you couldn’t hand-code better.” — Bjarne Stroustrup
The Philosophy at the Core
When Bjarne Stroustrup articulated the zero-overhead principle for C++ — the idea Rust calls zero-cost abstractions — he meant two things:
- You don’t pay for what you don’t use
- You can’t do better by hand
Rust takes that philosophy and turns it into a guarantee, not just an aspiration. It’s baked into the design of the language itself.
But what does that actually mean in practice?
Most explanations stop at “the compiler is smart.” That’s not enough. Let’s break it down — from source code to machine code.
What “Zero-Cost” Actually Means
Let’s clarify what zero-cost abstractions do NOT mean:
- Your program runs instantly
- You never need to think about performance
- All Rust code is automatically fast
What it does mean:
- The abstraction mechanism adds no runtime overhead
- Iterators compile to the same code as manual loops
- Generics compile to specialized versions (no dynamic cost)
- Closures compile to inline logic
The cost exists at compile time:
- Longer builds
- Larger binaries (due to monomorphization)
At runtime? Zero tax.
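For instance, a closure passed to a generic function is statically dispatched: each closure gets its own (usually zero-sized) type, and the call typically inlines to the same code as a plain function. A minimal sketch:

```rust
fn apply<F: Fn(i64) -> i64>(f: F, x: i64) -> i64 {
    // F is a distinct, usually zero-sized type per closure;
    // the call is statically dispatched and typically inlined away.
    f(x)
}

fn main() {
    let double = |x| x * 2;
    assert_eq!(apply(double, 21), 42);
}
```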
Iterators vs Loops — The Classic Example
Manual loop
fn sum_squares_loop(data: &[i64]) -> i64 {
    let mut total = 0;
    let mut i = 0;
    while i < data.len() {
        let x = data[i];
        if x % 2 == 0 {
            total += x * x;
        }
        i += 1;
    }
    total
}
Iterator version
fn sum_squares_iter(data: &[i64]) -> i64 {
    data.iter()
        .filter(|&&x| x % 2 == 0)
        .map(|&x| x * x)
        .sum()
}
The iterator version reads like English:
“Filter even numbers → square them → sum the results.”
Why it's just as fast (or faster)
1. Bounds-check elimination
The compiler can prove accesses are safe and removes checks entirely.
2. Loop fusion
.filter(), .map(), .sum() → one loop, not three.
3. Auto-vectorization
The compiler may use SIMD instructions automatically.
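Conceptually, loop fusion means the three adapters collapse into a single pass. This sketch shows roughly what the optimizer produces (not literal compiler output):

```rust
fn sum_squares_fused(data: &[i64]) -> i64 {
    // One pass: filter, map, and sum collapsed into a single loop body.
    let mut total = 0;
    for &x in data {
        if x % 2 == 0 {
            total += x * x;
        }
    }
    total
}
```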
What the Compiler Actually Does
Stage 1: Source code
data.iter().filter(...).map(...).sum()
Stage 2: MIR (Mid-level IR)
- Iterators become state machines
- No heap allocations
- Everything gets inlined
Stage 3: LLVM optimizations
- Functions + closures inlined
- Operations fused into a single loop
- Bounds checks removed
- SIMD applied where possible
Stage 4: Assembly
What remains is highly optimized machine code — often processing multiple values at once.
Generics: Monomorphization
fn largest<T: PartialOrd>(list: &[T]) -> &T {
    let mut largest = &list[0];
    for item in list.iter() {
        if item > largest {
            largest = item;
        }
    }
    largest
}
What actually happens:
largest(&[3_i32, 1, 7, 2]); // generates largest_i32
largest(&[3.1_f64, 1.0]); // generates largest_f64
Each type gets its own specialized version.
Result:
- No virtual dispatch
- No runtime overhead
- Same performance as handwritten code
Tradeoff: larger binaries, longer compile times.
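Both monomorphized copies can be exercised side by side; the function is repeated here only so the sketch compiles on its own:

```rust
fn largest<T: PartialOrd>(list: &[T]) -> &T {
    let mut largest = &list[0];
    for item in list {
        if item > largest {
            largest = item;
        }
    }
    largest
}

fn main() {
    // Each call site instantiates its own specialized copy.
    assert_eq!(*largest(&[3_i32, 1, 7, 2]), 7);   // largest::<i32>
    assert_eq!(*largest(&[3.1_f64, 1.0]), 3.1);   // largest::<f64>
}
```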
Traits: Static vs Dynamic Dispatch
trait Drawable {
    fn draw(&self);
}
Static dispatch (zero-cost)
fn render_static<T: Drawable>(shape: &T) {
    shape.draw();
}
- Fully known at compile time
- Inlined
- No vtable
Dynamic dispatch (runtime cost)
fn render_dynamic(shape: &dyn Drawable) {
    shape.draw();
}
- Method resolved through a vtable lookup at runtime
- Typically a few nanoseconds of overhead per call
- Usually cannot be inlined across the call
👉 In Rust, dynamic dispatch is always explicit (dyn Trait).
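The two dispatch paths side by side, as a self-contained sketch (here `draw` returns a string instead of printing, purely so the result is easy to check):

```rust
trait Drawable {
    fn draw(&self) -> &'static str;
}

struct Circle;

impl Drawable for Circle {
    fn draw(&self) -> &'static str {
        "circle"
    }
}

// Static dispatch: monomorphized per concrete type, call can inline.
fn render_static<T: Drawable>(shape: &T) -> &'static str {
    shape.draw()
}

// Dynamic dispatch: resolved through the vtable behind &dyn Drawable.
fn render_dynamic(shape: &dyn Drawable) -> &'static str {
    shape.draw()
}

fn main() {
    let c = Circle;
    assert_eq!(render_static(&c), "circle");
    assert_eq!(render_dynamic(&c), "circle");
}
```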
Real-World Example
use std::collections::HashMap;

// parse_service extracts the service name from a log line (defined elsewhere).
fn process_logs(raw: &str) -> HashMap<String, usize> {
    raw.lines()
        .filter(|l| l.contains("ERROR"))
        .filter_map(parse_service)
        .fold(HashMap::new(), |mut acc, service| {
            *acc.entry(service).or_insert(0) += 1;
            acc
        })
}
What this gives you:
- Single pass over data
- No intermediate allocations
- No extra memory overhead
Readable AND fast.
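To run the example end to end, here is the same chain with a hypothetical `parse_service` filled in (an assumption: the real parser would depend on your log format):

```rust
use std::collections::HashMap;

// Hypothetical parser (assumption): takes lines like
// "ERROR [billing] payment failed" and extracts "billing".
fn parse_service(line: &str) -> Option<String> {
    let start = line.find('[')? + 1;
    let end = line.find(']')?;
    (start < end).then(|| line[start..end].to_string())
}

fn process_logs(raw: &str) -> HashMap<String, usize> {
    raw.lines()
        .filter(|l| l.contains("ERROR"))
        .filter_map(parse_service)
        .fold(HashMap::new(), |mut acc, service| {
            *acc.entry(service).or_insert(0) += 1;
            acc
        })
}

fn main() {
    let logs = "ERROR [db] timeout\nINFO [db] ok\nERROR [db] retry\nERROR [api] 500";
    let counts = process_logs(logs);
    assert_eq!(counts.get("db"), Some(&2));
    assert_eq!(counts.get("api"), Some(&1));
}
```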
Performance Summary
| Task | Approach | Performance |
|---|---|---|
| Sum of squares | Iterators | Same as loop |
| Filter + map | Iterators | Same or faster |
| Generics | Monomorphized | Same |
| Dynamic dispatch | dyn Trait | Small overhead |
Where You Can Still Go Wrong
Zero-cost abstractions don’t mean zero thinking.
Common pitfalls:
1. clone() in hot loops
→ Each call may allocate
2. .collect() mid-chain
→ Breaks fusion + allocates memory
3. Box<dyn Trait> in hot paths
→ Heap allocation + dynamic dispatch
4. Large values copied repeatedly
→ Prefer references or Arc<T>
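Pitfall 2 in miniature: the intermediate `.collect()` allocates a `Vec` and splits the work into two passes, while the fused chain produces the same result in one pass with no allocation.

```rust
fn sum_collected(data: &[i64]) -> i64 {
    // Pitfall: the intermediate Vec allocates and forces a second pass.
    let evens: Vec<i64> = data.iter().copied().filter(|x| x % 2 == 0).collect();
    evens.into_iter().map(|x| x * x).sum()
}

fn sum_fused(data: &[i64]) -> i64 {
    // Same result, one pass, no intermediate allocation.
    data.iter().copied().filter(|x| x % 2 == 0).map(|x| x * x).sum()
}
```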
How This Changes Your Coding Style
1. Prefer clarity first
Readable iterator chains are usually also the fastest.
2. Trust the compiler
Don’t assume loops are faster — measure first.
3. Make costs explicit
- dyn Trait → runtime cost
- Box, Vec → allocation
4. Embrace iterator chains
They give the compiler more optimization opportunities.
The Bigger Picture
Zero-cost abstractions are more than a performance feature.
They represent a shift in how we think about programming:
You shouldn’t have to choose between readability and performance.
Rust proves you can have both.
Key Takeaways
- Prefer iterators over loops: clearer and just as fast
- Use generics and impl Trait by default (static dispatch is free)
- Closures are inlined: no overhead
- Avoid unnecessary .collect() calls
- Profile before optimizing
Final Thought
Rust’s biggest promise isn’t just memory safety.
It’s this:
High-level code that compiles down to low-level performance — without compromise.