Inspired by this video, which demonstrates how branchless techniques can improve the performance of certain operations, I decided to give it a quick try with Dart. Although it is considered to be a low-level one, why not Dart? After all, speculative execution and branching instructions are everywhere, no matter the language you write in.
Branchless programming is a trick that allows to avoid if
statements and conditional operators. This way, you don't use branching instructions of CPUs and do not engage branch prediction (that's when the CPU tries to guess what instructions are next). Failed predictions can have a significant performance tall.
Challenges with Dart
You can find plenty of articles on the internet. Most of the examples are in C/C++. There're a few missing features in Dart that make direct translation impossible:
- C implicitly casts bool to int, not in Dart. E.g. you can't have this expression in Dart:
(a > b) +1
- There's no unsigned int in Dart
Dart code
You can find the full example at https://github.com/maxim-saplin/branchless_dart.
min()
The first optimization I tried was a function that returns the minimum of the 2 ints.
Variant 1:
int minBranchless(int a, int b) {
return a + ((b - a) & ((b - a) >> 63));
}
And variant 2:
int minBranchless2(int a, int b) {
var m1 = (a - b) >>> 63;
var m2 = m1 ^ 1;
return a * m1 + b * m2;
}
They use bitwise operators to avoid conditional operators. Here're the versions I tested them against:
int min(int a, int b) {
if (a < b) {
return a;
} else {
return b;
}
}
int min2(int a, int b) {
return a < b ? a : b;
}
And of course, there was a test for standard Math.min()
.
abs()
The second function was absolute value:
int absBranchless(int a) {
int mask = a >>> 63;
a ^= mask;
a -= mask;
return a;
}
Tested it against int.abs()
and this trivial implementation:
int abs(int a) {
return a >= 0 ? a : -a;
}
Results
Tested with Dart 3.1.0, on macOS 13.5 with M1 Pro and Windows 11 with Intel Core i5-8257U:
Function | macOS/ARM Time (ms) | Windows/x86 Time (ms) |
---|---|---|
min | 1176 | 1372 |
min2 | 1178 | 1368 |
mathMin | 1180 | 1379 |
branchlessMin | 2351 | 4058 |
branchlessMin2 | 2395 | 4571 |
testAbs | 3142 | 4033 |
testIntAbs | 3133 | 4794 |
testAbsBranchless | 2351 | 4744 |
The branchless version was a winner only once and only on Mac/ARM - absBrnachless()
. In other cases branchless were either slower (min) or the same.
A curious finding is that the trivial implementation of abs()
with if statements was a winner on x86.
Conclusion
As a brainteaser trying branchless programming in Dart can be entertaining. Though I don't see any practical reasons to seriously go that route.
Platform specificity plays a huge role in the performance and you should test for each configuration (OS, CPU) and there're numerous.
Secondly, there can be better ways to optimize critical parts (you always start with algorithm and higher level optimizations which give the most benefits).
Thirdly, bitwise operators do not work well in the Web, branchless optimisations will fail there.
Top comments (3)
Good to know and interesting experiment. Thanks for sharing.
Thanks for the fun article, although I would set code readability and higher-level optimizations (e.g. a simple cache for network caches) as priorities. I presume very few Dart programs/apps would actually benefit from these kind of optimizations, maybe some edge cases but that's probably it 😅
Exactly)