Hello everyone, welcome to week 6 of the SPO600 (Software Portability and Optimization) reflection and extra exploration blog! We have been learning assembly language for 5 weeks. This week we are going to talk about how C programs are built, compiled and optimized.
If you are a beginner C programmer, are you curious about how source code becomes a binary executable file? That is where we start today.
// hello.c
#include <stdio.h>
int main(){
printf("hello world!\n");
}
We start with the simplest program, hello world, which is the first program for almost every programmer.
To compile this source code on a Linux system, all we need to do is:
$ gcc hello.c # compile
$ ./a.out # execute
hello world!
Running code like this is very common, but is the process as simple as it looks above? The answer is no. Let's see what gcc actually does during the compile step above.
gcc runs through 5 steps: preprocessing, compilation, optimization, assembly, and linking. Optimization does not have a command of its own; it happens inside the compilation step and is controlled by flags such as -O2.
To show this more concretely, here is another example.
// mymath.h
#ifndef MYMATH_H
#define MYMATH_H
int add(int a, int b);
int sub(int a, int b);
#endif
// mymath.c
int add(int a, int b){
return a+b;
}
int sub(int a, int b){
return a-b;
}
// test.c
#include <stdio.h>
#include "mymath.h"
int main(){
int a = 2;
int b = 3;
int sum = add(a, b);
printf("a=%d, b=%d, a+b=%d\n", a, b, sum);
}
This program simply computes the sum of two integers.
Preprocessing
Preprocessing replaces each #include directive with the content of the header file it names, expands macros, and strips comments. After preprocessing you get a much larger file. You can use the command below to preprocess test.c:
$ gcc -E -I./inc test.c -o test.i
or use the cpp command:
$ cpp test.c -I./inc -o test.i
-E tells gcc to stop once preprocessing is completed.
-I adds a folder to the header file search path (here ./inc).
-o sets the name of the output file.
In this example test.i is 17691 bytes, while test.c is only 146 bytes.
Compilation
Compilation translates the preprocessed code into assembly code. You can use the command below:
$ gcc -S -I./inc test.c -o test.s
The content of test.s is below
.file "test.c"
.section .rodata
.LC0:
.string "a=%d, b=%d, a+b=%d\n"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushl %ebp
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
movl %esp, %ebp
.cfi_def_cfa_register 5
andl $-16, %esp
subl $32, %esp
movl $2, 20(%esp)
movl $3, 24(%esp)
movl 24(%esp), %eax
movl %eax, 4(%esp)
movl 20(%esp), %eax
movl %eax, (%esp)
call add
movl %eax, 28(%esp)
movl 28(%esp), %eax
movl %eax, 12(%esp)
movl 24(%esp), %eax
movl %eax, 8(%esp)
movl 20(%esp), %eax
movl %eax, 4(%esp)
movl $.LC0, (%esp)
call printf
leave
.cfi_restore 5
.cfi_def_cfa 4, 4
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu 4.8.2-19ubuntu1) 4.8.2"
.section .note.GNU-stack,"",@progbits
To be honest, I don't fully understand what this assembly code does yet. I will write another post about it once I figure it out.
Assembly
Assembly converts the assembly code into machine code. This step produces an object file in binary format. You can use:
$ gcc -c test.s -o test.o
This step creates one object file for every source file.
Linking
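Linking combines the object files with the library files that the program needs and creates the final executable file. On Linux the executable does not need an extension (.exe is a Windows convention). For example, assuming mymath.c has also been compiled to mymath.o:
$ gcc test.o mymath.o -o test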
Conclusion & my thoughts
As the procedure above shows, the process is not as simple as we thought before. A source file goes through 5 steps before it becomes an executable file we can run, and none of the steps in the pipeline can be skipped.
However, you don't need to worry too much, because you can still just use:
$ gcc hello.c # compile
$ ./a.out # execute