Stacks and Overflows
Vaidehi Joshi Feb 13 '17
When I was first learning to code, errors used to scare the hell out of me. My gut reaction would be to panic and immediately think that I had done something wrong. And that I had broken something. Or everything!
Well, part of that fear was true: I had usually done something wrong. But it took me a few months to not be so afraid of figuring out what exactly I had done wrong. That’s a lot of what my the early days of learning to code looked like, back when I was freaked out by debugging and would perpetually give up on myself before I had even started.
Before I enjoyed debugging my code, one of the errors that I ran into quite often — and sometimes still run into now — looked like this:
SystemStackError: stack level too deep!
At first, I didn’t really know what it meant, but I knew that it was bad. Stack level too deep? I mean, how deep were we really talking here?
But, as time went on, I started delving deeper into this error message and reading more about this thing called a stack that I seemed to be overflowing. That was the first time that I learned about something that’s referred to as the stack (_or sometimes the _call stack), which is the structure that stores all the things that happen when a program is executed — or all the things that happen during runtime.
It turns out, however, that there’s a lot more to the term stack. It’s actually it’s own abstract data type! And there’s so much to learn about it.
Teach me how to stack
So what is a stack, and how does it work? Well, just like a linked list, a stack is nothing more than a data structure that contains a whole bunch of elements. And, like linked lists, stacks are linear, which means that there is a sequence and an order to how they are constructed and traversed.
The interesting thing about how stacks work, is that they only really have one direction to them. That is to say, if we wanted to add elements to this structure, or if we wanted to remove elements from it, we’d have to do it starting from one place: the top of the stack.
We can think of it like a stack data type just like a stack of books. If we had a stack of fifty books, we’d have to add the fifty-first book to the top of our stack, right? We probably wouldn’t be trying to stick it at the bottom of the pile of books. That’s the same concept that applies to stack structures: the insertion and deletion of elements can only happen from one end of the stack — the top.
This behavior of inserting and deleting from the same location has a name, too — and sometimes, this name is used to refer to stacks as well: last in, first out.
The last in, first out principle (LIFO) refers to the rule that whichever element is the last one to go into the stack will be the very first one that has to go out.
Conversely, this also means that the first element that you put into the stack will always end up being the last one that goes out.
It seems like stacks and linked lists are…eerily similar, no? Well, there’s a good reason for this: stacks are often implemented by using a linked list! Because they can only “grow” in one direction — that is to say, they only allow us to add and remove elements through a single place — they lend themselves easily to behaving exactly like singly linked lists.
If we remember the characteristics of a linked list, we’ll recall that they have a head node, and that the space time complexity of adding elements to the beginning of the linked list is O(1), or constant time. That’s really nice, because no matter how big our stack might get, since we’re only adding and removing from the top (our head node), we can perform that single operation in relatively the same amount of time, even with a large stack.
Of course, a linked list isn’t our only option: we could also implement a stack using an array. But, there’s a drawback to this: arrays are static data structures. They require a set amount of memory and space to be set aside and allocated before they are created. The problem here is that stacks don’t have an upper-limit when it comes to their size: they can grow to be infinitely large!
So in that case, what happens when you implement a stack using an array as our data structure? Well, if we try to add more elements to our stack than our array has space for…things turn real bad, real fast.
Our stack won’t prevent us from adding another element (since it thinks that it can grow as large as it wants), but the array that’s implementing the stack won’t have the space that’s required to allocate enough memory to the new element that we’re adding! All of this leads to — you guessed it — a stack overflow! Which is never good. (And, incidentally, is related to my all-time favorite error:
Stack level too deep!! But more on that later.)
Because linked lists are dynamic data structures, whose memory can be stored anywhere, non contiguously, just as long as each node has a pointer referencing where in memory the next node lives, they are far easier to grow in size. Stack overflows are rather rare when a stack is implemented with a linked list. The only time that would really ever happen is if we ran out of all possible memory on our machines…in which case, we’d have a much bigger problem (like a memory leak) on our hands!
These two reasons — a constant space time complexity and the ability to grow easily in size —are why so many stacks are actually just implementations of linked lists under the hood.
Building up a stack
Cool, now that we know what a stack is, the question remains: how hard is it to build one? Good news! The answer is: pretty easy. Because we can only add and remove elements from one end of the stack (the top), there’s only a limited amount of functionality we really need to build a stack.
Regardless of which language a stack is implemented in, there are a couple functions that will almost always be needed:
- push: the function that’s used to add elements into the stack
- pop: the function that’s used to remove elements from the stack
- top (peek): a function that returns the first value (what’s on top of the stack), but doesn’t remove it
- isEmpty: a function that checks if the stack is empty or not — super helpful when trying to clear all the elements from a stack
- size: a function that returns the number of elements that are in a stack at any given time
And that’s pretty much it — those functions are, for the most part, the only things we’d really need to create a stack!
Which is great. But why do stacks matter? Where do they even show up in our developer lives? (Aside from annoying error messages, of course).
Real life overflows
It’s easy to build a stack and write one in the form of a linked list. But who even does that? For most higher-level programming languages, we don’t really find ourselves ever writing or implementing a stack (or even a linked list). And yet, they’re all around us, if only we took the time to look for them.
There are a few popular examples of stack implementations, and some of them are features that we each use multiple times in a day! For example, when you open a document or text editor, and use the undo/redo functionality, you’re really (on a super low level) pushing and popping “state” into a stack. The state of my article before I added this paragraph is one element in the stack, and the state of my article after I added this paragraph is the next. When I toggle between undo and then redo, I’m popping and then pushing an element onto the top of the stack that contains all states of my article’s “history”.
In a similar vein, when you opened your browser to dev.to’s website, and then clicked on an article, and then clicked the back arrow to navigate back to the dev.to homepage, you were toggling between two states, and popping and pushing states onto a stack! What’s especially cool about the browser here is that it has something called a History object that shows us a little bit of how this abstraction works. The Mozilla Developer Network has a well-documented API that explains how their implementation of browser history works. They actually have methods that reveal the stack that’s probably being used under the hood, for example:
pushState, replaceState, and a
popstate event. Sound familiar?
However, my favorite example of a stack in the wild is, of course, the call stack. The way that a call stack is built up is particularly interesting because this stack creation happens all the time — even when just a single function is called.
Here’s a super simplified example of how a call stack might work:
We have a function (
function_one), which defines a few local variables, and then passes those variables to another function as arguments. Inside of
function_one, the second, internal function (
function_two) does something similar: defines some local variables, and then passes those down to another function (
function_three) as arguments.
In our super simple example, we have only three functions in all, and only three different procedures that need to happen. When the call stack is built up run all of this code, each “procedure” corresponds to an element in a stack. The only big difference here is that things are named slightly differently.
function_one runs, it allocates some memory for its local variables, for any arguments that were passed to it, and the return address for whatever value will eventually be returned by
function_one . That whole section of memory that is allocated to
function_one is referred to as a stack frame.
function_one leans on
function_two ! So, it needs to pass some arguments to it and hand off control and execution power to that nested function. This means that
function_two also needs to allocate memory for any local variables, arguments, and its own return address. So
function_two also gets its own stack frame.
This process of allocating memory as a stack frame continues until we get to the end of the program’s execution. At this point, we’ve built up our stack with stack frame elements! What’s cool about this is that we’ve been pushing stack frames onto our call stack as we built it up.
So what happens when the function has finished running and we have a return value? Well, we need to tear down the stack. Since
function_three is the last element on the stack — on the top of the stack—its stack frame will have to be, according to the LIFO principle, the first one come off. As the stack is destroyed and torn down, each element (stack frame) is popped off the call stack. This process repeats and our call stack keeps handing off control from the internal function’s stack frames until we get back down to the bottom of our stack. At the bottom of our stack is the stack frame for
function_one: now we a return value for it, and that’s the last thing we end up with as we empty our stack.
Pretty amazing, right? Stacks and their elements can be anything — even subsets of entire program processes, just like in our example!
So, hopefully by now we might have some ideas as to what might have lead to my very first stack overflow and that
stack level too deep error all those years ago. My guess is that I probably wrote a recursive function where I named a local variable the same thing as a function, and my call stack kept trying to allocate memory until it completely ran out and…overflowed.
To be completely honest though, I’m kind of glad that it did. Because otherwise, I would have never known how deep the stack really goes.
Do you want to overflow All Of The Things? (Or maybe just read up on stacks and everything there is to know about them?) Check out these awesome resources:
- Stack Applications, Professor Gerald Kruse
- Call Stack, Scope & Lifetime of Variables, ReelLearning
- Stack — Linked-List Implementation, Professor Robert Pitts
- Reverse a string or linked list using stacks, My Code School
- Pushing and Popping with the History API, Mike Robinson
- How does a “stack overflow” occur and how do you prevent it?, StackOverflow