Under The Hood : Assigning Values to Global Pointers

YJDoc2 ・6 min read

Hello!
This is the fifth post in the series. If you have not read the previous posts, I would recommend you do, as this one builds on them and skips the details they already explain.

I'm learning all this as I write, so if you find any mistakes or have any suggestions or improvements, please let me know in the comments.

In this post, we will see how global pointers with assigned values in a C file are converted to assembly.

Pointers

Pointers are simply variables that hold the address of another variable. One can also chain pointers, which is sometimes used for implementing linked lists in C: we store the location of a variable which itself contains the location of another variable. This is called a 'pointer to a pointer'.

In the previous article, we saw how values assigned to simple global variables like int, float, char etc. are converted to assembly. Here we will see the same for values assigned to pointers.

Simple Global pointer

Let us first look at a simple (incorrect) example

int *a = 5;

This is not valid code: it stores the integer 5 as an address, and that address could refer to anything, perhaps somewhere in the data segment or even the code segment, and is almost certainly not memory the program is allowed to access. If we add a main function that accesses the value through a, the program will most likely crash with a segmentation fault.
But let us see the generated code anyway 😄

    .globl  a
    .data
    .align 8
    .type   a, @object
    .size   a, 8
a:
    .quad   5

Most of this is similar to what we saw for global variables, except the size, which is 8 bytes: the machine I am compiling on is 64-bit, so every address is 64 bits, that is, 8 bytes.
The other difference is how the value of a is stored: .quad 5. The .quad directive stores an eight-byte value in place, here the value we gave it, 5.
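
As a rough cross-check from the C side, the sketch below stores the same integer in a pointer with an explicit cast and reads back the pointer's size and raw value without ever dereferencing it. The helper names pointer_size, stored_address and check_a are made up for illustration, and the value read back assumes a mainstream compiler where integer-to-pointer casts keep the bit pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same idea as `int *a = 5;`, but with an explicit cast so it compiles cleanly. */
int *a = (int *)(uintptr_t)5;

/* Size of the pointer object itself: 8 on a typical 64-bit machine. */
size_t pointer_size(void) {
    return sizeof a;
}

/* The raw address stored in a, read back as an integer (a is never dereferenced). */
uintptr_t stored_address(void) {
    return (uintptr_t)a;
}

/* Hypothetical self-check that a main function could call. */
void check_a(void) {
    assert(pointer_size() == sizeof(void *)); /* assumes all data pointers share one size */
    assert(stored_address() == 5);            /* the value emitted by `.quad 5` */
}
```

Calling check_a() from a main function should pass silently on common 64-bit platforms; dereferencing a instead would most likely crash.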

Now let us see an actual example

int a =5;
int *b = &a;

This generates assembly

    .globl  a
    .data
    .align 4
    .type   a, @object
    .size   a, 4
a:
    .long   5
    .globl  b
    .section    .data.rel.local,"aw"
    .align 8
    .type   b, @object
    .size   b, 8
b:
    .quad   a

The first part, declaring the variable a, is the same as explained in the last article.
The second part starts with .globl b, which declares b as a global symbol.
.section is used to declare a section of the program, which can then be given specific permissions such as read or write. .section .data.rel.local,"aw" tells the assembler to assemble what follows into a section named .data.rel.local, which holds data that needs relocation. This means that in the current assembly file every mention of this data is by label: when the file is linked later, the sections will be moved from their current place to their final place, and only then can the linker calculate and substitute the addresses.
The "aw" specifies the section flags. The a stands for allocatable, meaning that space should be reserved for this section at runtime. A non-allocatable section is one that does not need space at runtime, such as a section containing debugging data. The w indicates that the section is writable, which implicitly means readable as well.
After this, the alignment, type and size of b are defined. The value stored here is .quad a, which is not evaluated by the assembler but by the linker later, when it links this file. At that point the linker decides the address of label a in the complete executable and substitutes it for all occurrences.
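
Seen from C, the `.quad a` entry simply means that b starts out holding the address of a. A minimal sketch of that relationship follows; the helper names read_through_b, write_through_b and check_b are hypothetical and not part of the original example.

```c
#include <assert.h>

int a = 5;
int *b = &a;   /* initialised with the address of a: the `.quad a` entry */

/* Read the value of a through the pointer. */
int read_through_b(void) {
    return *b;
}

/* Write a new value into a through the pointer. */
void write_through_b(int value) {
    *b = value;
}

/* Hypothetical self-check that a main function could call once. */
void check_b(void) {
    assert(b == &a);                /* the address was filled in before the program ran */
    assert(read_through_b() == 5);
    write_through_b(7);
    assert(a == 7);                 /* the write went to a itself */
}
```

Nothing in the C code computes the address at runtime, which is why it has to appear as a label for the later stages to resolve.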

If we define another variable after b,

int a = 5;
int *b = &a;
int c = 5;

in assembly, we switch section again by using .data before declaring variable c :

    .globl  b
    .section    .data.rel.local,"aw"
    .align 8
    .type   b, @object
    .size   b, 8
b:
    .quad   a
    .globl  c
    .data
    .align 4
    .type   c, @object
    .size   c, 4
c:
    .long   5

Pointers to float, double and a single char are generated the same way as above.
When defining a char pointer to a string, that is:

char *a = "Hello";

the assembly changes as :

    .globl  a
    .section    .rodata
.LC0:
    .string "Hello"
    .section    .data.rel.local,"aw"
    .align 8
    .type   a, @object
    .size   a, 8
a:
    .quad   .LC0

Here, after .globl a, there is a section change, .section .rodata, which says that the following data should be placed in the read-only data section. After that there is a label, .LC0.
According to a Stack Overflow answer,

Here is some supplement to @Thomas Pornin's answer.

  • .LC0 local constant, e.g string literal.
  • .LFB0 local function beginning,
  • .LFE0 local function ending,

The suffix of these label is a number, and start from 0.

This is gcc assembler convention.



'LC' stands for Local Constant, and zero is the number of the constant. After that, the string is stored using the .string directive, as seen in the previous article, which means it is terminated with a zero byte by default.

Then we change the section to .data.rel.local again, and store the label of that constant, .LC0, as the value of the pointer.
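
The split between the read-only string and the writable pointer can be sketched in C as below. Note one deliberate change from the example above: the pointer is declared const char * rather than char *, so the compiler rejects writes to the literal instead of letting them crash at runtime. The helper names literal_length and check_literal are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* The pointer a is ordinary writable data; the literal it points to is read-only. */
const char *a = "Hello";

/* Length of the string a currently points to, not counting the zero byte. */
size_t literal_length(void) {
    return strlen(a);
}

/* Hypothetical self-check that a main function could call once. */
void check_literal(void) {
    assert(literal_length() == 5);
    assert(a[5] == '\0');   /* the terminating zero byte added by .string */

    a = "World";            /* re-pointing a is fine: a itself is writable */
    assert(a[0] == 'W');

    /* a[0] = 'w'; is rejected here because of const. With a plain char *
       it compiles, but typically crashes because .rodata is read-only. */
}
```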

Static Global Pointers

A static pointer to a normal variable, a normal pointer to a static variable, and a static pointer to a static variable are all similar to the normal case, except for the missing .globl on the respective variables:

int a = 5;
static int *b = &a;

Will have a missing .globl on b,

static int a = 5;
int *b = &a;

will have .globl missing on a,

static int a = 5;
static int *b = &a;

will have .globl missing from both.

Global Constant pointers

Let us first see a pointer declared with const that points to a non-constant variable:

int a =5;
const int *b = &a;
int g = 5;

Surprisingly (for me) this does not make any changes to the generated assembly, nor does it stop us from writing

void main(){
     b = &g;
}

This compiles and runs without any error.
But this :

void main(){
     *b = g;
}

Will generate a compile time error as :

global_assg.c: In function ‘main’:
global_assg.c:5:18: error: assignment of read-only location ‘*b’
    5 | void main() { *b = g; }
      |  

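This is because const int *b declares a pointer to a const int, not a constant pointer: b itself can be re-pointed, but the value it points to cannot be modified through it. We can change the value of a by using a directly, but not with *b = something;. (A pointer that cannot itself be re-pointed would be declared as int *const b.)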
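
To make the effect of where const is placed concrete, here is a small sketch contrasting the two declarations. The second pointer c and the function check_const_placement are hypothetical additions; only a, b and g come from the example above, and g is given a different value here so the re-pointing is visible.

```c
#include <assert.h>

int a = 5;
int g = 7;

const int *b = &a;   /* pointer to const int: b may change, *b may not */
int *const c = &a;   /* const pointer to int: *c may change, c may not */

/* Hypothetical self-check that a main function could call once. */
void check_const_placement(void) {
    b = &g;              /* allowed: b itself is not const */
    assert(*b == 7);

    *c = 9;              /* allowed: the int that c points to is not const */
    assert(a == 9);

    /* *b = 1;   rejected: assignment of read-only location '*b' */
    /* c = &g;   rejected: c itself is read-only                 */
}
```

Since c can never be re-pointed, the compiler is free to place it in a read-only section; checking the output of gcc -S would confirm where it actually ends up.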

A normal pointer to a constant variable:

const int a =5;
int *b = &a;

will generate a compile time warning:

global_assg.c:2:10: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
    2 | int *b = &a;
      |

but will compile nonetheless. As a itself is const, it is placed in the .rodata section, hence any attempt to change the value of a using *b = something; will typically result in a segmentation fault at runtime.

const int a = 5;
const int *b = &a;

A pointer-to-const pointing at a const variable compiles without any warnings or errors.

Global Array pointers

Now, an array of pointers:

int b,c,d;
int *a[] = {&b,&c,&d};

Will be compiled as :

    .globl  a
    .section    .data.rel.local,"aw"
    .align 16
    .type   a, @object
    .size   a, 24
a:
    .quad   b
    .quad   c
    .quad   d

This follows from both pointers and arrays: it is put in the same section as a pointer, but declared the same way as an array, with the size of 3 pointers, 8*3 = 24 bytes. The values are stored as the labels b, c and d.

Adding static to a will simply remove .globl from the above, and adding const to the array's element type (const int *a[]) will stop any code that assigns a value through the array, such as *a[1] = something;, from compiling. Other combinations of const give results similar to the non-array pointers mentioned above.
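
The size arithmetic and the meaning of each array slot can be checked from C with a sketch like the following. The helper names element_count and check_array are hypothetical, and the 24-byte figure assumes 8-byte pointers as on the 64-bit machine used in this post.

```c
#include <assert.h>
#include <stddef.h>

int b, c, d;                  /* zero-initialised globals */
int *a[] = {&b, &c, &d};      /* three slots: `.quad b`, `.quad c`, `.quad d` */

/* Number of pointers stored in the array. */
size_t element_count(void) {
    return sizeof a / sizeof a[0];
}

/* Hypothetical self-check that a main function could call once. */
void check_array(void) {
    assert(element_count() == 3);
    assert(sizeof a == 3 * sizeof(int *));  /* 24 bytes when pointers are 8 bytes */

    *a[1] = 42;                             /* writes through the second slot */
    assert(c == 42);                        /* so it is c that changes */
}
```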

Pointers To Pointers

Code like

int a =5;
int *b = &a;
int **c = &b;

Will have assembly code for a and b same as before, and for c it will generate :

    .globl  c
    .align 8
    .type   c, @object
    .size   c, 8
c:
    .quad   b

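Here the label b is used as the value of c. There is no new .section directive before c, presumably because it lands in the same .data.rel.local section that b was just placed in.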
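
The chain a, b, c can be followed from C as in this minimal sketch; read_through_c and check_chain are hypothetical helper names added for illustration.

```c
#include <assert.h>

int a = 5;
int *b = &a;     /* holds the address of a */
int **c = &b;    /* holds the address of b: the `.quad b` entry */

/* Follow both links to reach the int. */
int read_through_c(void) {
    return **c;
}

/* Hypothetical self-check that a main function could call once. */
void check_chain(void) {
    assert(*c == b);                 /* one dereference gives the pointer b */
    assert(read_through_c() == 5);   /* two dereferences give the value of a */

    **c = 6;                         /* a write through both links lands in a */
    assert(a == 6);
}
```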

This is how global pointers with assigned values are converted to assembly when compiling from C.

Again, I'm learning all this as I write, so if you find any mistakes or have any suggestions or improvements, please let me know in the comments.

Thank you !

NOTES :

  • This is an explanation of .data.rel section.
  • This is a good explanation on what is a relocatable section.
  • This is a short explanation on allocatable and loadable section.
  • This video has a nice explanation on ELF files and segments and sections in it : https://youtu.be/nC1U1LJQL8o
