Recursion
In programming, there are two types of looping, recursion and iteration. Iteration is the most common form of looping, using your basic while
and for
loops or more complicated generators. Recursion however is when a function is called within its own definition.
Recursion occurs when an algorithm to solve a problem involves repeating the same steps in a diminishing pattern, until you get to a condition where you can break out of the recursion.
To illustrate, let's consider multiplication as being solvable by recursive addition.
x * y = x + x(y-1) + x(y-2) + . . . x(y-n) where y-n > 1
In any form of multiplication, x * y, the multiplication can be thought of as x plus x times y minus one. Repeat, until y is one. Let's take a look with actual numbers:
x = 3, y = 4
3 * 4
= 3 + 3 * 3
= 3 + 3 + 3 * 2
= 3 + 3 + 3 + 3 * 1
= 3 + 3 + 3 + 3
= 12
Note how it is recursive because you're repeating the same steps over and over again as the value of y diminishes. When y is 1, we have our base case where we break out of the recursion. When this happens, the recursion ends, and we simply add up all the threes.
The Stack
So how does this work programmatically? Well, whenever you call a function, a new frame is added to the stack. The stack is a region in memory that handles your functions. When the function completes, it's output value, if it has one, is returned to the stack that called it. This is why the keyword return is named what it is. We're returning to the lower stack frame.
At the bottom is our problem, 3 * 4. To solve this issue, we add a new frame to the stack and input the next step of the problem, which is the bold part of 3 + 3 + 3 * 3. Meanwhile, the current frame will wait for the frame above to return. That frame, will then do the same, making a new frame and inputting the bold part of 3 + 3 + 3 + 3 * 2. And again, the frame will wait for the frame above to return. This continues until we reach the base case.
If there is no base case, the frames will stack on each other until the stack runs out of memory, which leads to a stack overflow error.
Once we reach the base case, the frames start returning. The top frame will return a 3, and pass it to the frame beneath it, which can then finish its calculation and return a 6, and so on and so forth until the frame at the bottom finally receives the returned value it needs and returns a 12.
In the image above, you can see the whole process laid out. We start at "start" and go up, creating new frames until we reach the base case, then we start going down the stack back to the bottom, returning the value of the function in the frame above it all the way down.
Implementation
Most languages support recursion. As Python is one of the easiest languages to read, I'll show you how this can be implemented in programming.
The first thing to think about is, where does the recursion end? This is called the base case. In the above example of 3 * 4
, our base case was when we reached 3 + 3 + 3 + 3 * 1
. In terms of x * y
, it's when y = 1
.
If x is 3:
3 * y
= 3 + 3 * (y - 1)
= 3 + 3 . . . + 3 * y: y = 1
= 3 + 3 . . . + 3
When y is 1, the recursion is complete, and the stack can collapse to the value we want.
So now we have the base case, we also need to know which part is the recursion. And that part is the x(y - 1)
portion of x + x + ... + x(y - 1)
of our equation. This is the part that repeats in each step. The repetition of x + x...+x is resolved as the stack collapses.
Here's the most basic version of our function, in python:
def mult(x, y):
if y == 1:
return x
return x + mult(x, y - 1)
if y is one, we're at our base case. Because x * 1
is x
, we can just return x
. Otherwise we're going to add the recursive portion of our algorithm, x(y-1)
, to x and return that value. x(y-1)
translates to mult(x, y-1)
in this case.
Now we can just call the function and give it a try:
import sys
def mult(x, y):
if y == 1:
return x
return x + mult(x, y - 1)
if __name__ == '__main__':
x,y = [int(i) for i in sys.argv[1:]]
print(mult(x,y))
Here I have it set up so we can call the function with arguments in the command line. We're using a list comprehension to turn those arguments into integers and unpacking them into variables x and y, which we then pass into the mult()
function, and print to the screen.
We have to slice
sys.argv
because the zeroth value is always the name of the file, we only want the first and second arguments.
We call with python3 multiply.py 3 4
and it will display a 12.
The command might be different on your machine, depending on how python is installed and aliased in your PATH, and how you've named the file. The syntax is
[Python] [filename.py] [x] [y]
When we enter 3 and 4 as x and y, mult(3,4)
is called and placed in a new frame on the stack. This function in turn calls itself with mult(3, 4-1)
and pushes that on the stack. And this repeats until mult(3,1)
is called, where y is 1 and the base case is met.
This final base case will return a 3
to the frame beneath it. The function in that frame can now finally complete its return statement and return a value to the frame beneath it, and so on and so forth, collapsing to the bottom. What we end up with is 3 + 3 + 3 + 3
, which is 12.
Improving our function
However our function only works with positive or zero values of y. Negative values will always result in a stack overflow (or in Python, RecursionError: maximum recursion depth exceeded in comparison
)
We can make our function work with negatives. The logic is the same, we just have to switch some of the signs around because we're going from left to right on the number line now, whereas with positive values, we're going from right to left. x
needs to be made negative and y
needs to increase to positive 1 rather than decrease to positive 1.
3 * -4
= -3 + 3 * -3
= -3 - 3 + 3 * -2
= -3 - 3 - 3 + 3 * -1
= -3 - 3 - 3 - 3
= -12
Or in Python, we add a new if
statement:
import sys
def mult(x, y):
if y == 1:
return x
if y >= 1:
return x + mult(x, y - 1)
else:
return -x + mult(x, y + 1)
if __name__ == '__main__':
x,y = [int(i) for i in sys.argv[1:]]
print(mult(x,y))
Now it works with negative values as well. Negative values in x
are handled by the natural way negatives work in addition and subtraction, so we don't need to worry about that scenario. Plug in any integer and this function should work.
Where Do We Use Recursion?
Of course, this function is useless, Python, and most other programming languages, provides us with a multiplication operator and we don't need to worry about the internal implementation of it.
But recursion itself is useful wherever you see a repeating pattern where the steps for solving a problem involves repeating the steps of an algorithm again and again until diminishing to some base case. Another example is a factorial.
Factorials in math are defined as:
x! = x * (x-1) * (x-2) ... * (x - [x-1])
5! = 5 * (5 -1) * (5 - 2) * (5 - 3) * (5 - 4) = 120
5! = 5 * 4 * 3 * 2 * 1 = 120
And in Python:
def fact(x):
if x == 1:
return x
return x * fact(x - 1)
Here you can see a repeating pattern where the steps to solve are the same and diminishing towards a base case. This makes recursion a good tool to solve. When you can see this pattern, you know recursion is a tool you can use to implement a solution.
In most cases where recursion is used, iteration can also be used. Iteration is often cleaner, easier to maintain and understand, and is usually more performant, which is why I prefer iteration. Sometimes however, recursion really is the simplest solution, even against iteration.
Take for example Minesweeper. In Minesweeper, when you click on a blank square, all the squares next to that blank, that are also blank, need to be uncovered. Those squares in turn need to find their neighbors and if they're blank, they need to be uncovered. And then those squares also need to find their neighbors and and then be uncovered, so on and so forth until all the blanks in a contiguous region are uncovered. If you'll note, the steps of the algorithm are repeated over and over again, each square needs to find its neighbors, and each of those neighbors need to find their neighbors, this is the recursive part of the algorithm. The base case is encountered when no more neighboring squares are found that are blank.
Here it makes the most sense to solve the problem with recursion. This was a problem I solved when I implemented a Minesweeper game in React. You can take a look here: ReactSweeper Context Class. Within the clearBlanks
method in our context, is a getAllAdjacentBlanks
method. It's not a clean method and can clearly use further refactoring, but the point is you can see how it has a base case, if indicesToBeCleared
has no elements, and a recursive case otherwise where it calls itself passing in the new found blank squares.
The point is, where I can see a repeating pattern, I can usually find a solution with recursion, so long as there's a base case that causes the recursion to stop and then collapse back down to the value I'm looking for. In multiplication, that value is the product of two numbers. In factorial, it's the factorial of a number, and in getAdjacentSquare
, it's an array of all the squares that need to be revealed when a blank is clicked.
Where you see this repeating pattern, is where you can start thinking about recursion. But be careful. Interation with loops and generators is usually the simpler and more performant solution and they don't run the risk of stack overflow errors (though they do run the risk of infinite looping.)
Top comments (0)