arxstax

Posted on Jan 31, 2021

Using Linker .map Files

#esp32 #embedded #gcc #debugging

This short lesson is for those who are not familiar with .map files and how they can help you debug code in an embedded system.

The Motivation

On systems with high-level operating systems such as Windows or Linux, a programmer who encounters a critical fault while their code is running has some pretty nice debugging tools in their arsenal to figure out what happened. Sometimes these tools will warn the user that an issue will happen before the program is even run for the first time. Even better, the high-level language the programmer is using may not allow these types of errors to be created at all, by accident or on purpose.

However, on an embedded target a programmer may not have these resources available. The hardware target may not allow for breakpoints, or an easy way to debug (JTAG/SWD/ICE/etc.) may not be available. Maybe all the programmer gets is "printf" or something similar.

The programming language used may not be of much assistance either. Embedded programs are often written in C, which will happily allow the programmer to commit serious atrocities. There are contests focused on the "power" of C.

I was recently assisting someone with a problem that they encountered while using an ESP32 target. They noticed that their program crashed as soon as they enabled a certain piece of functionality. All that was produced during the crash was the output of the ESP32's built-in fault handler, which was something like:

Guru Meditation Error: Core  0 panic'ed (StoreProhibited). Exception was unhandled.
Core 0 register dump:
PC      : 0x400d398e  PS      : 0x00060430  A0      : 0x800d399c  A1      : 0x3ffb4e40  
A2      : 0x00000000  A3      : 0x00001800  A4      : 0x3ffae720  A5      : 0x00000000  
A6      : 0x00060023  A7      : 0x00000000  A8      : 0x0000dead  A9      : 0x3ffb4e10  
A10     : 0x3ffb6a44  A11     : 0x000295a4  A12     : 0x3ffb6a40  A13     : 0x00000008  
A14     : 0x3ffaff5c  A15     : 0x3ffaff5c  SAR     : 0x00000014  EXCCAUSE: 0x0000001d  
EXCVADDR: 0x00000000  LBEG    : 0x00000000  LEND    : 0x00000000  LCOUNT  : 0x00000000  

ELF file SHA256: 0a4df3e068964284

Backtrace: 0x400d398b:0x3ffb4e40 0x400d3999:0x3ffb4e60 0x400d39a5:0x3ffb4e80 0x400d39b1:0x3ffb4ea0 0x400d39bd:0x3ffb4ec0 0x400d0aae:0x3ffb4ee0 0x400870a5:0x3ffb4f00

I realize that to the uninformed eye, this information may seem useless and highly cryptic. Cryptic it is. Useless it is not.

For targets like the ESP32 (and basically all other targets) this information holds some insight into what went wrong. It provides some clues on where to look.

Some very useful information is contained in the line:

Backtrace: 0x400d398b:0x3ffb4e40 0x400d3999:0x3ffb4e60 0x400d39a5:0x3ffb4e80 0x400d39b1:0x3ffb4ea0 0x400d39bd:0x3ffb4ec0 0x400d0aae:0x3ffb4ee0 0x400870a5:0x3ffb4f00

This line shows you the path the software took before it crashed. The first address is where the crash occurred, and each address after it shows a location where the CPU called a function. Using these addresses you can build a "chain" of function calls leading up to the point where it croaked.

This is all fine and dandy, but how on earth do you line up the addresses contained in the backtrace to actual functions in your code? The magic decoder ring is a file called the .map file. This file is (or can be) generated when code is being translated into what gets executed by the CPU.

Some Background

Compiled code (like C/C++/Arduino) goes through a few steps on its way to becoming machine code. These steps are performed by the compiler (often gcc) and the compiler's helper tools.

The first phase for languages like C and C++ is preprocessing: looking at header files and performing any substitutions that need to take place (for example, handling #define statements and what they "replace"). This is done for every file (.c or .cpp) that is given to the compiler.

The second phase takes the files made in the first phase and translates them into what is known as object code. If you have seen .o files in your build output, these are the object files. Object files are almost ready for execution. They contain machine code. But usually there is a problem with these files: they aren't 100% complete.

Let's say you have a function in one .c file that calls another function in a different .c file. Each .c file will have a corresponding .o file which is basically ready for the CPU to execute. But since each object file was created on its own, how does one object file "talk" to another object file?

That is handled by the third step - linking. The linker is a special program that takes object files and stitches them all together to make one big happy "executable". The linker resolves the function calls from one .c file to another. It puts the functions all together and "fills in" the function addresses where the functions are called throughout the code.

When the linker does this, it makes a lot of decisions on its own. You can also tell the linker how you want things put together using what's called a linker command file, also known as a linker script (a must for many embedded systems), but that's a topic for another day.

When the linker makes its decisions, it can output what's called a .map file. This file is a "map" of what the linker made - which is the code you are trying to run.

It shows where it put your functions, certain variables you created, and anything else that got pulled in to make your code run on the target.
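To make the cross-file call described above a bit more concrete, here is a minimal sketch. The function names (scale, scaled_sum) and the file names in the comments are made up for illustration. Both halves are shown in one listing so it compiles on its own, but imagine them living in two separate .c files.

```c
/* ---- would live in main.c (hypothetical file name) ---- */

/* Declaration only: tells the compiler that scale() exists somewhere.
 * If main.c were compiled on its own, the calls below would be left
 * as unresolved references in main.o. */
int scale(int value);

int scaled_sum(int x, int y)
{
    return scale(x) + scale(y);
}

/* ---- would live in scale.c (hypothetical file name) ---- */

/* Definition: the linker decides the final address of this function
 * and fills that address in wherever scale() is called. */
int scale(int value)
{
    return value * 2;
}
```

In the two-file version, the addresses chosen for scale and scaled_sum are exactly the kind of thing the .map file records.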

Example

Now you might start to see why this backtrace could be useful. If we look in the .map file for the addresses in the backtrace, we might see that these addresses are where our functions were put by the linker. And hence we can figure out where the crash happened.

To illustrate this, I'd like to show you a simple example. I'll use the ESP32 and its IDF, but you could use any target where you can get both the backtrace information and the .map file.

Here's the example:

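#include <stddef.h>  /* for NULL */

void d(int *ptr)
{
    *ptr = 0xDEAD;  /* write through the pointer; faults when ptr is NULL */
}

void c(int *ptr)
{
    d(ptr);
}

void b(int *ptr)
{
    c(ptr);
}

void a(int *ptr)
{
    b(ptr);
}

void app_main(void)
{
    int *myPointer = NULL;

    a(myPointer);  /* the NULL pointer travels a -> b -> c -> d */
}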

In the above code, execution starts at app_main(). app_main() calls function a, which calls b, which calls c, which calls d. D is for dead. In function d, I intentionally "de-referenced a NULL pointer". This is just a fancy way of saying I tried to write data to a location in memory that I really shouldn't write to (whatever NULL is equal to, likely address zero).

On some targets this might be okay and not cause a crash right away. It could take days of running before a problem like this rears its ugly head. Fortunately for my demo, the ESP32 really doesn't like this and it dies immediately.

What I get over the serial port from the ESP32 is this:

Guru Meditation Error: Core  0 panic'ed (StoreProhibited). Exception was unhandled.
Core 0 register dump:
PC      : 0x400d398e  PS      : 0x00060430  A0      : 0x800d399c  A1      : 0x3ffb4e40  
A2      : 0x00000000  A3      : 0x00001800  A4      : 0x3ffae720  A5      : 0x00000000  
A6      : 0x00060023  A7      : 0x00000000  A8      : 0x0000dead  A9      : 0x3ffb4e10  
A10     : 0x3ffb6a44  A11     : 0x000295a4  A12     : 0x3ffb6a40  A13     : 0x00000008  
A14     : 0x3ffaff5c  A15     : 0x3ffaff5c  SAR     : 0x00000014  EXCCAUSE: 0x0000001d  
EXCVADDR: 0x00000000  LBEG    : 0x00000000  LEND    : 0x00000000  LCOUNT  : 0x00000000  

ELF file SHA256: 0a4df3e068964284

Backtrace: 0x400d398b:0x3ffb4e40 0x400d3999:0x3ffb4e60 0x400d39a5:0x3ffb4e80 0x400d39b1:0x3ffb4ea0 0x400d39bd:0x3ffb4ec0 0x400d0aae:0x3ffb4ee0 0x400870a5:0x3ffb4f00

Now let's go digging for the .map file. It's usually in the "build" directory or the place where the final linked executable is placed. For the ESP32 when using Espressif's IDF, it's in a directory called "build" that is created in your project folder. I'm sure it's in a similar place for Arduino builds, and I'll update this post when I confirm.

Let's take a look at this map file. In it we see several sections. These sections show us what the linker had to do to make our executable. For example, the first section (line 1) starts with this:

Archive member included to satisfy reference by file (symbol)

This shows what libraries (files containing pre-compiled object files, common for things like the C library and the ESP32's IDF) the linker had to reference (and why it referenced them) to make the executable.

The sections in the map file follow one after another. The second section is titled:

Allocating common symbols

The sections can be quite long. In my example, the second section starts on line 374.

For this example, we are really only interested in one of these sections:

Linker script and memory map

Did I mention these files are long? This section starts at line 6101 in my example.

Bringing Things Together

If we look through the section titled

Linker script and memory map

we will start to see things that seem familiar as we get further down the file. In my example, around line 10873, we see:

 .text.d        0x400d3988        0xa esp-idf/main/libmain.a(hello_world_main.c.obj)
                0x400d3988                d
 *fill*         0x400d3992        0x2 
 .text.c        0x400d3994        0xa esp-idf/main/libmain.a(hello_world_main.c.obj)
                                  0xe (size before relaxing)
                0x400d3994                c
 *fill*         0x400d399e        0x2 
 .text.b        0x400d39a0        0xa esp-idf/main/libmain.a(hello_world_main.c.obj)
                                  0xe (size before relaxing)
                0x400d39a0                b
 *fill*         0x400d39aa        0x2 
 .text.a        0x400d39ac        0xa esp-idf/main/libmain.a(hello_world_main.c.obj)
                                  0xe (size before relaxing)
                0x400d39ac                a
 *fill*         0x400d39b6        0x2 
 .text.app_main
                0x400d39b8        0xa esp-idf/main/libmain.a(hello_world_main.c.obj)
                                  0xe (size before relaxing)
                0x400d39b8                app_main
 *fill*         0x400d39c2        0x2

These are the lines we are looking for. This shows where the linker put the functions in our demo. They were placed as follows:

a - Starts at address 0x400d39ac, and has a size of 0xa. So it ends at 0x400d39b6.
b - Starts at address 0x400d39a0, and ends at 0x400d39aa.
c - Starts at address 0x400d3994, and ends at 0x400d399e.
d - Starts at address 0x400d3988, and ends at 0x400d3992.
app_main - Starts at address 0x400d39b8, and ends at 0x400d39c2.
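The arithmetic above (end = start + size) can be sketched in C. This is only an illustration, not a real map-file parser: the struct and function names are my own, and it only handles lines where the section name, address, and size all sit on one line, like the .text.d line in the excerpt. It will also accept the *fill* lines, so a caller would want to filter on the section name.

```c
#include <stdint.h>
#include <stdio.h>

/* One record from the map file (illustrative type, not from any library). */
struct map_entry {
    char     section[64];  /* e.g. ".text.d" */
    uint32_t start;        /* address the linker assigned */
    uint32_t size;         /* size in bytes */
};

/* Parse a line shaped like
 *   " .text.d        0x400d3988        0xa esp-idf/main/libmain.a(...)"
 * Returns 1 on success, 0 if the line does not have that shape. */
int parse_map_line(const char *line, struct map_entry *out)
{
    unsigned long start, size;

    /* %lx accepts an optional 0x prefix; %63s stops at whitespace. */
    if (sscanf(line, " %63s %lx %lx", out->section, &start, &size) != 3)
        return 0;

    out->start = (uint32_t)start;
    out->size  = (uint32_t)size;
    return 1;
}

/* First address past the end of the entry: start + size. */
uint32_t map_entry_end(const struct map_entry *e)
{
    return e->start + e->size;
}
```

Feeding it the .text.d line from the excerpt gives a start of 0x400d3988 and a size of 0xa, so map_entry_end() returns 0x400d3992, matching the list above.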

The backtrace is ordered from the address where the crash occurred to the earliest function call. As I mentioned earlier, it shows the chain of function calls leading up to the point of failure.

Our original backtrace:

Backtrace: 0x400d398b:0x3ffb4e40 0x400d3999:0x3ffb4e60 0x400d39a5:0x3ffb4e80 0x400d39b1:0x3ffb4ea0 0x400d39bd:0x3ffb4ec0 0x400d0aae:0x3ffb4ee0 0x400870a5:0x3ffb4f00

Let's reformat it:

Backtrace: 
0x400d398b:0x3ffb4e40 (where crash happened)
0x400d3999:0x3ffb4e60 
0x400d39a5:0x3ffb4e80 
0x400d39b1:0x3ffb4ea0 
0x400d39bd:0x3ffb4ec0 
0x400d0aae:0x3ffb4ee0 
0x400870a5:0x3ffb4f00 (the first function call made)

Each line of the backtrace has two addresses, separated by a ':'. The address on the left is the address at which the crash occurred or a function call occurred. The address on the right is the stack pointer at the time of crash or when a function call occurred. We can ignore the stack pointer for now. It is useful for seeing other types of problems, which deserves its own writeup.
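If you would rather not split the pairs by hand, the format is simple enough to pull apart in a few lines of C. This is a sketch under the assumption that the line looks exactly like the one above (an optional "Backtrace:" prefix followed by space-separated address:stack-pointer pairs in hex). The struct and function names are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One backtrace entry: code address and stack pointer (illustrative type). */
struct bt_frame {
    uint32_t pc;
    uint32_t sp;
};

/* Split "Backtrace: pc:sp pc:sp ..." into frames.
 * Returns the number of frames stored (at most max_frames). */
size_t parse_backtrace(const char *line, struct bt_frame *frames,
                       size_t max_frames)
{
    static const char prefix[] = "Backtrace:";
    const char *p = line;
    size_t count = 0;

    /* Skip the label if it is present. */
    if (strncmp(p, prefix, sizeof prefix - 1) == 0)
        p += sizeof prefix - 1;

    while (count < max_frames) {
        char *end;

        /* strtoul skips leading spaces and accepts the 0x prefix. */
        unsigned long pc = strtoul(p, &end, 16);
        if (end == p || *end != ':')
            break;              /* no more pairs */
        p = end + 1;

        unsigned long sp = strtoul(p, &end, 16);
        if (end == p)
            break;              /* malformed pair */
        p = end;

        frames[count].pc = (uint32_t)pc;
        frames[count].sp = (uint32_t)sp;
        count++;
    }
    return count;
}
```

For the backtrace in this post it yields seven frames, with frames[0].pc holding the crash address and the last frame holding the earliest call.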

If we examine the function addresses, we see that the most recent address is 0x400d398b.

Looking at our list of functions from the map file, we can see that this address exists within function d(). d() starts at 0x400d3988 and ends at 0x400d3992, and address 0x400d398b is in that range. So the crash occurred in d.

Going down the list, we can start to see the call chain:

Backtrace: 
0x400d398b:0x3ffb4e40 (inside function d(), point of the crash)
0x400d3999:0x3ffb4e60 (inside function c(), address where function d() was called)
0x400d39a5:0x3ffb4e80 (inside function b(), address where function c() was called)
0x400d39b1:0x3ffb4ea0 (inside function a(), address where function b() was called)
0x400d39bd:0x3ffb4ec0 (inside app_main(), address where function a() was called)
0x400d0aae:0x3ffb4ee0 (searching through map file shows this is inside main_task())
0x400870a5:0x3ffb4f00 (searching through map file shows this is inside vPortTaskWrapper())

So looking through our backtrace from the bottom to the top, we can see that the CPU did the following leading up to the crash:

vPortTaskWrapper called main_task
main_task called app_main
app_main called a
a called b
b called c
c called d
the crash happened inside of d, since there are no more addresses in the backtrace.
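The manual lookup used to build this chain boils down to a range check against each function's start and size. Here is a minimal sketch with the five functions from the map excerpt hard-coded into a table. The table and the lookup_function name are illustrative only. main_task and vPortTaskWrapper are left out because their ranges are not shown in the excerpt, so addresses inside them come back as NULL.

```c
#include <stddef.h>
#include <stdint.h>

/* Start address and size of a function, as read from the map file. */
struct func_range {
    const char *name;
    uint32_t    start;
    uint32_t    size;
};

/* Values copied from the map excerpt earlier in this post. */
static const struct func_range funcs[] = {
    { "d",        0x400d3988u, 0xa },
    { "c",        0x400d3994u, 0xa },
    { "b",        0x400d39a0u, 0xa },
    { "a",        0x400d39acu, 0xa },
    { "app_main", 0x400d39b8u, 0xa },
};

/* Return the name of the function whose range contains addr,
 * or NULL if addr is not inside any function in the table. */
const char *lookup_function(uint32_t addr)
{
    for (size_t i = 0; i < sizeof funcs / sizeof funcs[0]; i++) {
        /* Inside when start <= addr < start + size. */
        if (addr >= funcs[i].start && addr - funcs[i].start < funcs[i].size)
            return funcs[i].name;
    }
    return NULL;
}
```

Calling lookup_function() on each backtrace address, from the last entry to the first, reproduces the chain listed above for the functions in the table.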

Assuming we had no idea the problem existed in function d(), we would go and examine the code closely and would likely find our problem.

If we couldn't find our problem, we could also look at the disassembly for function d and see the exact instruction that was executed and caused the crash - a writeup for a different day.
