DEV Community

Discussion on: I'm an Expert in Memory Management & Segfaults, Ask Me Anything!

Jason C. McDonald

Ahh, the joys of a Heisenbug. Unfortunately, you seem to have gotten a particularly shy one. There won't be a way to directly observe it. However, your clever investigative efforts with the core dump have pointed you to the responsible realloc(), and thus the pointer giving you trouble, so there really isn't much more that Valgrind could have told you.

Thinking it through, I wonder whether the free is causing issues because the memory at the original pointer was partially or completely uninitialized. Desk check the code that allocates, initializes, and accesses that pointer, preferably in execution order (or close to it). Remember that a segfault is an operating-system-level error, raised when something attempts to access memory that doesn't belong to the program or, more specifically, tries to use an address in the "protected space" that the OS sets aside around the memory assigned to the process.
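As a rough illustration of what that desk check is looking for, here is a minimal sketch of a buffer whose pointer is always either null or a live allocation before realloc() sees it. The `CoordBuffer` type and its members are hypothetical names for illustration, not taken from your program, and I'm assuming three doubles per element.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical buffer type; the names are assumptions for illustration only.
struct CoordBuffer {
    double* xyz = nullptr;   // starts null, so the first realloc acts like malloc
    std::size_t count = 0;   // number of doubles currently allocated

    CoordBuffer() = default;
    CoordBuffer(const CoordBuffer&) = delete;             // avoid double free on copy
    CoordBuffer& operator=(const CoordBuffer&) = delete;

    // Resize to hold 3*n doubles. Returns false and leaves the buffer untouched on failure.
    bool resize(std::size_t n) {
        if (n == 0 || n > SIZE_MAX / (3 * sizeof(double))) return false;  // size overflow guard
        const std::size_t doubles = 3 * n;
        void* grown = std::realloc(xyz, doubles * sizeof(double));
        if (grown == nullptr) return false;               // old block is still valid here
        xyz = static_cast<double*>(grown);
        for (std::size_t i = count; i < doubles; ++i) xyz[i] = 0.0;  // initialize the new tail
        count = doubles;
        return true;
    }

    ~CoordBuffer() { std::free(xyz); }
};
```

The points to compare against the real code are the null initialization, the temporary for the realloc() result, and the explicit initialization of the newly added elements.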

I hope that helps.

sandlocker

Hello Jason,
Thanks a lot for the reply and the suggestions.

This is the gdb/core dump session:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `executable'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f629d991df8 in mremap_chunk (p=0x7f62938d3000,
    p@entry=0x7f629960c000, new_size=<optimized out>, new_size@entry=3787904)
    at malloc.c:2878
2878 malloc.c: No such file or directory.
(gdb) up
#1  0x00007f629d996a70 in __GI___libc_realloc (oldmem=0x7f629960c010,
    bytes=3787896) at malloc.c:3206
3206 in malloc.c
(gdb) up
#2  0x00007f629c37cadb in ?? ()
   from /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.470.129.06
(gdb) up
#3  0x000056491c746ab6 in MyClass::allocateF (this=0x56491d35bdf0, n=157829)
    at program.cpp:46
46      xyz=(double*)realloc((void*)xyz,3*n*sizeof(double));
(gdb) q

It looks like realloc() goes through the library libnvidia-tls (??), which is strange (at least to me).

So, to test things out, I wrote a myrealloc() in a separate .cpp file that just does malloc, memcpy, and free, and the segfault disappeared. I am not sure, though, whether that really solved the problem or just pushed the bug away for now.
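For reference, a minimal sketch of that kind of replacement is below. It is not my exact code: the name `my_realloc` and the extra `old_bytes` parameter are assumptions made for this illustration, since a malloc/memcpy/free version has to be told how many bytes to copy.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for the myrealloc() described above.
// old_bytes is an assumed extra parameter: the size of the existing block.
void* my_realloc(void* old_ptr, std::size_t old_bytes, std::size_t new_bytes) {
    if (new_bytes == 0) {            // treat size 0 as "just free"
        std::free(old_ptr);
        return nullptr;
    }
    void* fresh = std::malloc(new_bytes);
    if (fresh == nullptr) return nullptr;   // old block is left untouched
    if (old_ptr != nullptr) {
        // Copy only what fits in both the old and the new block.
        std::memcpy(fresh, old_ptr, old_bytes < new_bytes ? old_bytes : new_bytes);
        std::free(old_ptr);
    }
    return fresh;
}
```

One thing worth noting: a version like this always moves the data to a fresh block, so it never reaches the mremap_chunk path shown in frame #0. That may be why the crash vanished, without saying much about whether the underlying problem is gone.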

My first thought was a mismatch between the malloc() that created the pointer in the first place (maybe from a different library than libnvidia-tls) and the realloc() being called here (it is fairly easy to redefine realloc() with a macro). However, I think that would also have produced a segfault under gdb/valgrind.
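One way to probe the mismatch idea at run time is to ask the dynamic linker which shared object provides each symbol. The sketch below assumes Linux with glibc (dlsym with RTLD_DEFAULT, plus dladdr); the helper name `symbol_provider` is made up for this example, and on older glibc versions it may need to be linked with -ldl.

```cpp
#include <dlfcn.h>   // dlsym, dladdr, Dl_info (glibc; g++ defines _GNU_SOURCE by default)
#include <string>

// Hypothetical helper: returns the path of the object that defines `symbol`
// in the global lookup order, or "" if it cannot be determined.
std::string symbol_provider(const char* symbol) {
    void* addr = dlsym(RTLD_DEFAULT, symbol);   // first definition in search order
    if (addr == nullptr) return "";
    Dl_info info{};
    if (dladdr(addr, &info) == 0 || info.dli_fname == nullptr) return "";
    return info.dli_fname;
}
```

Comparing symbol_provider("malloc") with symbol_provider("realloc") inside the failing program would show whether both resolve to the same library. It would not catch a macro redefinition, since that happens at compile time.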

Have you seen something like this before?

Thanks again!