Programming in JS: Let's just check every element until we find what the user asked for.
Programming in Rust: Hmm, do I really need this extra byte in my struct?
So I spent some time trying to understand how much dynamic dispatch actually costs me.
I have a service that should choose between different trait implementations depending on the input, which means there are two options:
struct BackendB<'a, T: MyProcessor> {
    data: &'a mut T,
    // But we will struggle to instantiate this when the choice is made at runtime
}
or
struct BackendA<'a> {
    data: Box<&'a mut dyn MyProcessor>,
}
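Both options assume roughly the following trait and a trivial implementation that I use in the benchmarks below. This is a minimal sketch; the real MyProcessor in my service does more than increment a counter:

trait MyProcessor {
    fn inc(&mut self);
}

struct ImplMyProcessor {
    data: usize,
}

impl MyProcessor for ImplMyProcessor {
    // The benchmark only needs some cheap, observable work per call
    fn inc(&mut self) {
        self.data += 1;
    }
}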
Backend should internally call some MyProcessor methods, so I will be sad if these calls are slow. To the benchmarks!
impl<'a> BackendA<'a> {
    #[inline(never)] // This is very important, or the compiler
    // will work too well and replace the calls with constants
    fn inc(&mut self, how_much: usize) {
        // This simulates the internal calls
        for _i in 0..how_much {
            self.data.inc(); // the vtable lookup should happen here
        }
    }
}
impl<'a, T: MyProcessor> BackendB<'a, T> {
    #[inline(never)]
    fn inc(&mut self, how_much: usize) {
        for _i in 0..how_much {
            self.data.inc();
        }
    }
}
And here are the benchmark entry points:
fn criterion_benchmark(c: &mut Criterion) {
    c.bench_function("Static dispatch", |b| b.iter(|| {
        let mut d = ImplMyProcessor { data: 0 };
        let mut b = BackendB::<ImplMyProcessor> { data: &mut d };
        b.inc(black_box(500_000));
    }));
    c.bench_function("Dynamic dispatch", |b| b.iter(|| {
        let mut d = ImplMyProcessor { data: 0 };
        let mut b = BackendA { data: Box::new(&mut d) };
        b.inc(black_box(500_000));
    }));
}
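The snippets above also assume the usual criterion scaffolding, roughly:

use criterion::{black_box, criterion_group, criterion_main, Criterion};

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);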
But it gives almost the same timings:
Static dispatch    time: [1.1774 ms 1.1916 ms 1.2068 ms]
Dynamic dispatch   time: [1.2390 ms 1.2625 ms 1.2908 ms]
OK, let's check what happens when the vtable is actually involved on every call:
fn criterion_benchmark(c: &mut Criterion) {
    c.bench_function("Static dispatch", |b| b.iter(|| {
        let mut d = ImplMyProcessor { data: 0 };
        let mut b = BackendB::<ImplMyProcessor> { data: &mut d };
        for _i in 0..500_000 {
            b.inc(black_box(1));
        }
    }));
    c.bench_function("Dynamic dispatch", |b| b.iter(|| {
        let mut d = ImplMyProcessor { data: 0 };
        let mut b = BackendA { data: Box::new(&mut d) };
        for _i in 0..500_000 {
            b.inc(black_box(1));
        }
    }));
}
Now we see the difference:
Static dispatch    time: [1.6077 ms 1.6248 ms 1.6431 ms]
Dynamic dispatch   time: [2.6543 ms 2.7012 ms 2.7553 ms]
So the test is valid: we can actually see the regression from dynamic dispatch in Rust. But the good news is that it happens only once per (non-inlined) function call, not once per inner iteration (the optimizer does its job!). And in my case that means once per API call, so I can stop worrying.
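To make that takeaway concrete, here is a hypothetical wrapper (not my real API) showing how the dyn backend is used once per request, so the indirect-call cost is paid roughly once per API call:

// Hypothetical sketch built on the post's BackendA and MyProcessor:
// one backend is built per request and one coarse call is made on it.
fn handle_request(processor: &mut dyn MyProcessor, work: usize) {
    let mut backend = BackendA { data: Box::new(processor) };
    backend.inc(work); // the per-iteration dyn calls inside are optimized well
}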