Originally published on my blog in 2014
I still remember an interview I had around February 2001, in which an embedded firmware engineer talked about how his team wrote code:
We write stuff in Assembler, because we're too lazy to write stuff in C.
Wait...what? I thought the whole purpose of C was to be portable assembly, so you could control the bare metal correctly? I did get an inkling that, if you were that good, assembly could be seductive in letting you do whatever you want.
This came to mind again when a former colleague of mine posed a similar question on Facebook the other night:
Pop quiz: When you run this, what prints out?
Basically, the above is a quiz to determine if you understand loops, expressions versus statements, and the pre-decrement operator (--). Pre-decrement specifies that the value of the expression is the current value minus one, and that the variable is left holding that decremented value. Post-decrement has the same effect on the variable (it is decremented), but the value of the expression is the PREVIOUS value.
As is my wont, I got the above wrong, but that's not the point. :-D
To check my answer, I sucked it into a quick C program using vim:
Compiling that program and using the Mac's otool to dump the assembly gives you this:
Unoptimized version
Some things to note in the above:
- The compiler has done a faithful job of translating the program to assembler exactly as written:
  - We load the variables in lines 9 and 10
  - We have the first loop in lines 11-22
  - The second loop (despite being a no-op) still exists, in lines 24-29
Compiler-optimized version
Things get slightly more interesting when you pass the -O (optimize) flag:
Some things to note:
- This looks nothing like the C code. There are no loops (or indeed, branch instructions) at all.
- The compiler determined the second loop to be a no-op, and compiled it away completely.
- Our stack variables are gone. The compiler is using x64 CPU registers exclusively.
- The compiler has analyzed the loop and unrolled it into discrete callq instructions, one for each call to the printf function.
Lastly: The answer to the quiz is in the assembly if you look hard enough:
5 9
5 8
5 7
5 6
5 5
5 4
5 3
5 2
5 1
Pretty cool....I never get to look at assembly in my day-job, so getting this close to the CPU was a neat
Oldest comments (2)
The compiler also does this to sum(1,...n) as n + n(n-1)/2. Pretty intelligent compiler devs.
AMD has a great guide on writing optimized C/C++ code.
Personally I think of C as kinda pseudocode for assembly: I can
write stuff out without having to think as much, and it allows me
to mess around without having to write as much code.