DEV Community

Alex Dzyoba


gprof and gcov

gprof and gcov are classic profilers that are still in use. Since the dawn of time, hackers have used them to gain insight into their programs at the source code level.

In the examples below, I'll use my block_hasher program to illustrate how the profilers are used, because it's a small and simple utility.


gprof (GNU Profiler) is a simple, easy-to-use profiler that shows how much time your program spends in each routine, both as a percentage and in seconds. gprof relies on instrumentation: the compiler inserts a call to the special mcount function into your functions to gather metrics about your program.

Building with gprof instrumentation

To gather a profile, compile your program with the gcc -pg option, run it so that it writes gmon.out, and then feed that file to gprof. To reduce statistical error, it's recommended to run the profiled program several times.

To build with gprof instrumentation invoke gcc like this:

$ gcc <your options> -pg -g prog.c -o prog

Here is the actual compile command for block_hasher:

$ gcc -lrt -pthread -lcrypto -pg -g block_hasher.c -o block_hasher

As a result, you'll get an instrumented program. To check that it's really instrumented, just grep for the mcount symbol.

$ nm block_hasher | grep mcount
     U mcount@@GLIBC_2.2.5

Profiling block_hasher under gprof

As I said earlier, to collect useful statistics we should run the program several times and accumulate the metrics. To do that, I've written a simple bash script:


#!/bin/bash

if [[ $# -lt 2 ]]; then
    echo "$0 <number of runs> <program with options...>"
    exit 1
fi

# First argument is the number of runs, the rest is the command to profile
RUNS=$1
shift 1
COMMAND="$*"

# Profile name is a program name (first element in args)
PROFILE_NAME="$(echo "${COMMAND}" | cut -f1 -d' ')"

for i in $(seq 1 ${RUNS}); do

    # Run profiled program
    eval "${COMMAND}"

    # Accumulate gprof statistic
    if [[ -e gmon.sum ]]; then
        # Merge this run's gmon.out into the running total in gmon.sum
        gprof -s ${PROFILE_NAME} gmon.out gmon.sum
    else
        # First run: its gmon.out becomes the initial gmon.sum
        mv gmon.out gmon.sum
    fi
done

# Make final profile
gprof ${PROFILE_NAME} gmon.sum > gmon.profile

So, each launch creates a gmon.out that gprof merges into gmon.sum. Finally, gmon.sum is fed to gprof to get a flat text profile and a call graph.

Let’s do this for our program:

$ ./ 10 ./block_hasher -d /dev/sdd -b 1048576 -t 10 -n 1000

When it finishes, the script will have created gmon.profile, a text profile that we can analyze.


The flat profile lists the program's routines and the time spent in each of them.

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ts/call  Ts/call  name
100.24      0.01     0.01                             thread_func
  0.00      0.01     0.00       50     0.00     0.00  time_diff
  0.00      0.01     0.00        5     0.00     0.00  bdev_close
  0.00      0.01     0.00        5     0.00     0.00  bdev_open

gprof's metrics are clear from their names. As we can see, our little program spent almost all of its time in the thread function, BUT look at the actual seconds: only 0.01 seconds out of the whole program execution. gprof only samples time the program spends on the CPU, so this means it's not the thread function itself that is slow but something underneath it. In the case of block_hasher, it's the pread syscall that does the I/O for the block device.
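
To put that 0.01 seconds in perspective: the profile header says each sample counts as 0.01 seconds, so the whole flat profile appears to rest on a single sample. Here is a minimal sketch of that arithmetic (the function name gprof_samples is made up for illustration and is not part of gprof):

```shell
#!/bin/sh
# Convert a time from the flat profile into a number of samples.
# Arguments: <seconds> <sample period in seconds>
# gprof_samples is a hypothetical helper, not a gprof feature.
gprof_samples() {
    # Round to the nearest whole sample
    awk -v t="$1" -v p="$2" 'BEGIN { printf "%d\n", t / p + 0.5 }'
}

# Values taken from the flat profile above: 0.01 s total, 0.01 s per sample
gprof_samples 0.01 0.01
```

With so few samples, the percentage column carries little statistical weight, which is one more hint that the time went somewhere gprof doesn't see.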

The call graph is really not interesting here, so I won't show it, sorry.


gcov (short for GNU Coverage) is a tool that collects execution counts line by line. It's usually used together with gprof to find out exactly which line in a slow function is holding your program back.

Building with gcov instrumentation

Just as with gprof, you need to recompile your program with gcov flags:

$ gcc -fprofile-arcs -ftest-coverage -lcrypto -pthread -lrt -Wall -Wextra block_hasher.c -o block_hasher

There are 2 gcov flags: -fprofile-arcs and -ftest-coverage. The first instruments your program to profile so-called arcs, the paths in the program's control flow. The second makes gcc write the additional notes that arc profiling and gcov itself need.

You can simply use the --coverage option, which implies both -fprofile-arcs and -ftest-coverage along with the linker's -lgcov flag. See GCC debugging options for more info.

Profiling block_hasher under gcov

After instrumenting, we just launch the program and end up with 2 files: block_hasher.gcno, which gcc wrote at compile time, and block_hasher.gcda, which the program writes when it runs. Please don't look inside them; we will transform them into a text profile. To do this, we execute gcov, passing it the source code filename. It's important to remember that you must have both the <filename>.gcda and <filename>.gcno files.

$ gcov block_hasher.c
File 'block_hasher.c'
Lines executed:77.69% of 121
block_hasher.c:creating 'block_hasher.c.gcov'

Finally, we’ll get block_hasher.c.gcov.
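
One quick thing to do with that file is to list the source lines that never ran. gcov marks them with ##### in the count column. This is a small sketch, assuming the usual "count: line number: source" layout of a .gcov file (gcov_uncovered is a made-up helper name):

```shell
#!/bin/sh
# Print the source line numbers that gcov marked as never executed (#####).
# Argument: path to a .gcov file
# gcov_uncovered is a hypothetical helper, not part of gcov.
gcov_uncovered() {
    # Field 1 is the execution count, field 2 is the source line number
    awk -F: '$1 ~ /#####/ { gsub(/[ \t]/, "", $2); print $2 }' "$1"
}

# Run it only if the file produced by the gcov command above is present
if [ -f block_hasher.c.gcov ]; then
    gcov_uncovered block_hasher.c.gcov
fi
```

For a program like block_hasher, these are typically error-handling paths that a normal run never reaches.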


The .gcov file is the result of all that gcov work. Let's look at it. For each of your source files, gcov creates an annotated copy of the source code with runtime coverage. Here is an excerpt from thread_func:

   10: 159: gap = num_threads * block_size; // Multiply here to avoid integer overflow
    -: 160:
    -: 161: // Initialize EVP and start reading
   10: 162: md = EVP_sha1();
   10: 163: mdctx = EVP_MD_CTX_create();
   10: 164: EVP_DigestInit_ex( mdctx, md, NULL );
    -: 165:
   10: 166: get_clock( &start );
10010: 167: for( i = 0; i < nblocks; i++)
    -: 168: {
10000: 169: offset = j->off + gap * i;
    -: 170:
    -: 171: // Read at offset without changing file pointer
10000: 172: err = pread( bdev->fd, buf, block_size, offset );
 9999: 173: if( err == -1 )
    -: 174: {
#####: 175: fprintf(stderr, "T%02d Failed to read at %llu\n", j->num, (unsigned long long)offset);
#####: 176: perror("pread");
#####: 177: pthread_exit(NULL);
    -: 178: }
    -: 179:
 9999: 180: bytes += err; // On success pread returns bytes read
    -: 181:
    -: 182: // Update digest
 9999: 183: EVP_DigestUpdate( mdctx, buf, block_size );
    -: 184: }
   10: 185: get_clock( &end );
   10: 186: sec_diff = time_diff( start, end );
    -: 187:
   10: 188: EVP_DigestFinal_ex( mdctx, j->digest, &j->digest_len );
   10: 189: EVP_MD_CTX_destroy(mdctx);

The leftmost column shows how many times each line of code was executed. As expected, the body of the for loop was executed 10000 times: 10 threads each reading 1000 blocks. Nothing new.
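
The listing shows 10010 for the for line itself (167) but 10000 for the first line of its body (169). One reading that matches these numbers is that each thread evaluates the loop condition one extra time, when it finally fails. Here is a sketch of that arithmetic, with made-up function names:

```shell
#!/bin/sh
# Expected counts for a `for (i = 0; i < n; i++)` loop run by several threads.
# Arguments: <threads> <iterations per thread>
# body_hits and header_hits are hypothetical names used only for this example.

# The body runs once per iteration per thread
body_hits()   { echo $(( $1 * $2 )); }

# The condition is checked once more per thread, when it becomes false
header_hits() { echo $(( $1 * ($2 + 1) )); }

threads=10    # -t 10 in the run above
blocks=1000   # -n 1000 in the run above
echo "body:   $(body_hits "$threads" "$blocks")"
echo "header: $(header_hits "$threads" "$blocks")"
```

This prints 10000 for the body and 10010 for the header, the same values as in the excerpt.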

Although gcov wasn't that useful for me, I'd like to say that it has a really cool feature: branch probabilities. Launch gcov with the -b option:

[root@simplex block_hasher]# gcov -b block_hasher.c
File 'block_hasher.c'
Lines executed:77.69% of 121
Branches executed:100.00% of 66
Taken at least once:60.61% of 66
Calls executed:51.47% of 68
block_hasher.c:creating 'block_hasher.c.gcov'

You'll get nice branch annotations with probabilities. For example, here is the time_diff function:

function time_diff called 10 returned 100% blocks executed 100%
   10: 106:double time_diff(struct timespec start, struct timespec end)
    -: 107:{
    -: 108: struct timespec diff;
    -: 109: double sec;
    -: 110:
   10: 111: if ( (end.tv_nsec - start.tv_nsec) < 0 )
branch 0 taken 60% (fallthrough)
branch 1 taken 40%
    -: 112: {
    6: 113: diff.tv_sec = end.tv_sec - start.tv_sec - 1;
    6: 114: diff.tv_nsec = 1000000000 + end.tv_nsec - start.tv_nsec;
    -: 115: }
    -: 116: else
    -: 117: {
    4: 118: diff.tv_sec = end.tv_sec - start.tv_sec;
    4: 119: diff.tv_nsec = end.tv_nsec - start.tv_nsec;
    -: 120: }
    -: 121:
   10: 122: sec = (double)diff.tv_nsec / 1000000000 + diff.tv_sec;
    -: 123:
   10: 124: return sec;
    -: 125:}

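In 60% of the calls, the if took the branch that calculates the time diff with a borrow. That means end.tv_nsec was smaller than start.tv_nsec, i.e. the measured interval crossed a second boundary. On its own, it doesn't prove that we were executing for more than 1 second.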
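
To see when the borrow branch fires, here is a sketch that mirrors the logic of time_diff with integer shell arithmetic. The timestamps are hypothetical values picked for illustration, not measurements from block_hasher:

```shell
#!/bin/sh
# Mirror of the borrow logic in time_diff, using integer arithmetic.
# Arguments: <start sec> <start nsec> <end sec> <end nsec>
# Prints "<sec> <nsec> <borrow|no-borrow>".
time_diff_parts() {
    sec=$(( $3 - $1 ))
    nsec=$(( $4 - $2 ))
    if [ "$nsec" -lt 0 ]; then
        # Borrow one second, as in the first branch of time_diff
        sec=$(( sec - 1 ))
        nsec=$(( nsec + 1000000000 ))
        echo "$sec $nsec borrow"
    else
        echo "$sec $nsec no-borrow"
    fi
}

# Hypothetical timestamps:
time_diff_parts 1 900000000 2 100000000   # 0.2 s elapsed, crosses a second boundary
time_diff_parts 1 100000000 2 900000000   # 1.8 s elapsed, no borrow needed
```

The first call takes the borrow branch even though only 0.2 seconds elapsed, and the second one doesn't although it lasted 1.8 seconds. The branch tells us where the interval fell relative to a second boundary, not how long it was.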


gprof and gcov are really entertaining tools, even though a lot of people consider them obsolete. On the one hand, these utilities are simple: they implement and automate an obvious method, source code instrumentation, to measure how often functions are hit.

On the other hand, such simple metrics won't help with problems outside of your application, such as in the kernel or a library, although there are ways to use them for an operating system, e.g. for the Linux kernel. Anyway, gprof and gcov are useless when your application spends most of its time waiting for some syscall (pread in my case).

