While many programmers are familiar with the concept of the stack, a common question arises: Why is it faster than the heap, even though both reside in the same physical memory? This article aims to answer those questions to the very low level to satisfy the thirst for understanding of some curious individuals out there.
Let’s not waste any more time and dive into the depth of the stack memory architecture.
📚What is the stack?
The stack memory, allocated for every local variable and function calls, follows a strict, well-known, Last in first out (LIFO)data structure: the latest item has to be removed first in order to access the previous stack. Each segment of the stack, which corresponds to each allocations and scope, is called a “stack frame”.
(Figure 1: stack data structure illustration)
🧪An Example in C and Assembly.
To better understand the stack memory, we need an example to follow by and explain.
Supposed you call a function - add():
int add(int j, int k) {
int a = 10;
int b = j + k;
return b;
}
Which compiles to the following assembly(I promise it’s not as scary, I will explain them through) looks like:
add:
push {r11, lr}
mov r11, sp
sub sp, sp, #16
str r0, [r11, #-4]
str r1, [sp, #8]
mov r0, #10
str r0, [sp, #4]
ldr r0, [r11, #-4]
ldr r1, [sp, #8]
add r0, r0, r1
str r0, [sp]
ldr r0, [sp]
mov sp, r11
pop {r11, lr}
bx lr
To understand the stack and how it’s allocated, we need to introduce some terminology and conventions first:
📚 The stack frame is a region of memory allocated on the call stack for each function call in a program, which in the lower level, is the area between the frame pointer
fp
and the top stack pointersp
.☝️Frame pointer, together with the stack pointer, are the anchor points which the entire stack will operate relative to. They will never be changed throughout the entire process, and only change when the stack frame change ie. new stack allocation.
⏬The stack grows conventionally downwards, which means that when we allocate a stack on top of the existing one, the memory address of the new stack is lower than the previous one.
📝
r11
refers to a register in this case used as the frame pointer, andsp
refers to the stack pointer
🧱Breaking down the assembly
Every time a stack has to be allocated the computer does the following
push {r11, lr}
mov r11, sp
sub sp, sp, #16
push {r11, lr}
instructs the processor to put anything insider11
which is currently the previous stack’s frame pointer, andlr
(link register used for returning from a function) onto the stack and decrements sp by 4 per each item, 8 in total.mov r11, sp
sets the frame pointer to the stack pointerFinally
sub sp, sp, #16
decrementssp
by 16 to make space for the 16 bytes we need for the function’s local variables and parameters.
(Figure 2: stack allocation step by step illustration)
This concludes the stack allocation. The memory region between the stack pointer (sp
) and the frame pointer (r11
) constitutes the current function’s stack frame.
🔠Variables
On line 7, we can see the assembly associated with initializing the variable a
to 10:
str r0, [sp, #4]
This instruction stores the value from register r0
, which currently holds the value 10 from the previous mov
instruction, into the memory location at an offset of 4 bytes from the stack pointer.
This convoluted way of refering to a place in memory is required for the flexibility of the function to be called whenever and wherever in the code, since all the local variables would be just offsets in the frame itself. Different compilers have different approaches to how to use the stack, but essentially they all use location relative to the stack position for local variables.
You might wonder where do these memories come from? The truth is, all processes on the system get preallocated a stack space it can use without requesting for more memory, by default about 1 - 8MB depends on the operating system. This means that programs already have the stack memory they need already allocated to them, but this also means that the stack is limited to only the space allocated for them at the beginning of the program. This limited preallocation is the reason why using too much of the stack memory (ie. from deep recursive function or an array too big) often leads to a stack overflow error.
The mechanics used to allocate the memory being essentially moving pointers up and down requiring only a few instructions is a major factor contributing to the exceptionally quick allocation of the stack memory, compared to heap memory which requires complicated book keepings of the entire heap area and operating system calls for more memory, making the stack memory far better suited for local small allocation such as variables, which does not need to persist through to other scope of the program.
✅To Summarize
- Stack allocation is fast due to simple pointer arithmetic.
- The CPU provides specific instructions optimized for stack operations.
- Stack memory is preallocated, eliminating the need for dynamic memory requests during execution.
- Stack size is determined at program start, and memory is automatically reclaimed when a function exits.
Top comments (1)
Wow, what a useful article! Thank you for the information.