DEV Community

Cover image for Recursive Backtracking For Combinatorial, Path Finding, and Sudoku Solver Algorithms
Jake Z.
Jake Z.

Posted on • Originally published at algodaily.com

Recursive Backtracking For Combinatorial, Path Finding, and Sudoku Solver Algorithms

This lesson was originally published at https://algodaily.com, where I maintain a technical interview course and write think-pieces for ambitious developers.

Backtracking Made Simple

Backtracking is a very important concept in computer science and is used in many applications. Generally, we use it when all possible solutions of a problem need to be explored. It is also often employed to identify solutions that satisfy a given criterion also called a constraint.

In this tutorial, I will discuss this technique and demonstrate it. We'll achieve understanding through a few simple examples involving enumerating all solutions or enumerating solutions that satisfy a certain constraint.

Let's start on the next step!

    Backtracking and Depth First Search

    In very simple terms, we can think of backtracking as building and exploring a search tree in a depth first manner. The root node of the tree, or the "path" to the leaf node, represents a candidate solution that can be evaluated. So as we traverse through each path, we're testing a solution. So in the diagram below, A -> B -> D is one possible solution.

    If the candidate path qualifies as a working solution, then it is kept as an alternative. Otherwise, the search continues in a depth first manner. In other words, once a solution is found, the algorithm backtracks (goes back a step, and explores another path from the previous point) to explore other tree branches to find more solutions.

    Efficiency Gains

    For constraint satisfaction problems, the search tree is "pruned" by abandoning branches of the tree that would not lead to a potential solution. Thus, we're constantly cutting down the search time and making it more efficient than an exhaustive or complete search. Let's now jump straight into how all of this is done via examples you might see on interview day.

      Combinatorial Problem: Finding N Combinations

      As a first problem, Iet's use a very simple problem from combinatorics-- can you find all possible N combinations of items from a set?

      In other words, given a set {1, 2, 3, 4, 5} and an N value of 3, we'd be looking for all combinations/subsets of length/size 3. In this case, they would be {1, 2, 3}, {1, 2, 4}, and so on.

      Note that the ordering is not important in a combination. so {1, 2, 3} and {3, 2, 1} are considered the same thing.

      Let's now look at the pseudo-code for this N-combination problem:

      routine: combos
      input: set
      output: display N combinations
      assumption: position of the first item is zero and result set is empty at start
      
      base case:
      1. If all combinations starting with items in positions < (size-N) have been printed. Stop
      
      recursive case:
      Combos(set,result)
      1. Repeat for each items i in the set:
          a. Put the item i in the result set
          b. if the result set has N items, display it
              else
              recursively call combos with (the input set without item i) and (the result set)
          c. Remove the item i from result set
      

        Implementation of Combinatorial Solution

        The diagram below shows how this pseudo code works for an input set {1, 2, 3, 4} and N=3.

        Notice how the search tree is built from {} (empty set), to {1} to {1, 2} to {1, 2, 3}.

        When {1, 2, 3} is found, the algorithm backtracks to {1, 2} to find all combinations starting with {1, 2}. Once that is finished the method backtracks to {1} to find other combinations starting with 1.

        In this case, the entire search tree is not stored, but is instead built implicitly. Some paths, where the possibility of finding more combinations is not possible, are abandoned. The method is elegant and its C++ implementation is shown here.

        Notice how in the base case 2 of the code, the exploration of combinations stops early on when the index of the set goes above a certain level. So in the tree above, the solutions {3} and {4} won't be explored. This is what makes the algorithm efficient.

        #include <iostream>
        #include <vector>
        #include <string>
        
        using namespace std;
        
        // helper: prints the vector
        void printVector(vector<int>& arr)
        {
            cout << "\n";
            for (int i = 0; i < arr.size(); ++i)
                cout << arr[i] << " ";
            cout << "\n";
        }
        
        // helper function:
        // prints all possible combinations of N numbers from a set
        void combosN(vector<int>& set, int N, vector<int>& result, int ind)
        {
            // base case 1
            if (ind >= set.size())
                return;
            // base case 2
            if (result.size() == 0 && ind > set.size() - N)
                return;
            for (int i = ind; i < set.size(); ++i) {
                result.push_back(set[i]);
                if (result.size() == N)
                    printVector(result); // print the result and don't go further
                else // recursive case
                    combosN(set, N, result, i + 1);
                result.pop_back();
            }
        }
        
        // To be called by user: all possible combinations of N numbers from a set
        void combosN(vector<int>& set, int N)
        {
            vector<int> result;
            combosN(set, N, result, 0);
        }
        
        int main() {
          vector<int> v = {1, 2, 3, 4};
          combosN(v, 3);
        }
        

          Combinatorial Problem With A Constraint: Finding N Combinations with Sum < S

          Let's now add a constraint to our N combinations problem! The constraint is-- that all sets where sum < S (S being a given parameter) should be printed out.

          All we need to do is modify the combosN code, so that all combinations whose sum exceeds S are not explored further, and other such combinations are not generated. Assuming the array is sorted, it becomes even more efficient.

          We've illustrated backtracking via arrays to keep things simple. This technique would work really well for unordered linked lists, where random access to elements is not possible.

          The tree below shows the abandoned paths {3, 10} and {5, 8}.


          // sum should be less than target of the argument.  Rest is the same as combosN function
          void combosNConstraint(vector<int>& arr, vector<int>& subsets, int ind, int target)
          {
              if (ind == arr.size())
                  return;
              for (int i = ind; i < arr.size(); ++i) {
                  subsets.push_back(arr[i]);
                  // do a recursive call only if constraint is satisfied
                  if (sum(subsets) <= target) {
                      printVector(subsets);
                      combosNConstraint(arr, subsets, i + 1, target);
                  }
                  subsets.pop_back();
              }
          }
          

            Enumerating Paths Through a Square Grid

            Our next combinatorial problem is that of printing all possible paths from a start location to a target location.

            Suppose we have a rectangular grid with a robot placed at some starting cell. It then has to find all possible paths that lead to the target cell. The robot is only allowed to move up or to the right. Thus, the next state is generated by doing either an "up move" or a "right move".

            Backtracking comes to our rescue again. Here is the pseudo-code that allows the enumeration of all paths through a square grid:

            routine: enumeratePaths
            input: Grid m*n
            output: Display all paths
            assumption: result is empty to begin with
            
            Base case 1:
            1. If target is reached then print the path
            Base case 2:
            2. If left or right cell is outside the grid then stop
            
            Recursive case:
            1. Add the current cell to path
            2. Invoke enumeratePaths to find all paths that are possible by doing an "up" move
            3. Invoke enumeratePaths to find all paths that are possible by doing a "right" move
            4. Remove the current cell from path
            

              Square Grid Implementation

              To see how the previous pseudo-code works, I have taken an example of a 3x3 grid and shown the left half of the tree. You can see that from each cell there are only two moves possible, i.e., up or right.

              The leaf node represents the goal/target cell. Each branch of the tree represents a path. If the goal is found (base case 1), then the path is printed. If instead, base case 2 holds true (i.e., the cell is outside the grid), then the path is abandoned and the algorithm backtracks to find an alternate path.

              Note: only a few backtrack moves are shown in the figure. However, after finding the goal cell, the system again backtracks to find other paths. This continues until all paths are exhaustively searched and enumerated.

              The code attached is a simple C++ implementation of enumerating all paths through an m * n grid.

              // helper recursive routine
              void enumeratePaths(int rows, int cols, vector < int > & path, int r, int c) {
              
                path.push_back(c + cols * r);
                // base case 1
                if (r == rows - 1 && c == cols - 1) {
                  printVector(path);
                  return;
                }
                // base case 2
                if (r >= rows) // out of bound. do nothing
                  return;
                // base case 2
                if (c >= cols) // out of bound. do nothing
                  return;
              
                // row up
                enumeratePaths(rows, cols, path, r + 1, c);
                // backtrack
                path.pop_back();
                // column right
                enumeratePaths(rows, cols, path, r, c + 1);
                path.pop_back();
              }
              // to be called by user
              void enumeratePathsMain(int rows, int cols) {
                vector < int > path;
                enumeratePaths(rows, cols, path, 0, 0);
              }
              

                Find Path Through a Maze

                We can extend the prior problem to find the path through the maze. You can think of this problem as the grid problem, but with an added constraint. The constraint is this-- that some cells of the maze are not accessible at all, so the robot cannot step into those cells.

                Let's call these "inaccessible" cell pits, where the robot is forbidden to enter. The paths that go through these cells should then be abandoned earlier on in "the search". The pseudo-code thus remains the same with one additional base case, which is to stop if the cell is a forbidden cell.

                routine: enumerateMaze
                input: Grid m * n
                output: Display all paths
                assumption: result is empty to begin with
                
                Base case 1:
                1. If target is reached then print the path
                Base case 2:
                2. If left or right cell is outside the maze then stop
                Base case 3:
                3. If the cell is a pit then stop
                
                Recursive case:
                1. Add the current cell to path
                2. Invoke enumerateMaze to find all paths that are possible by doing an "up" move
                3. Invoke enumerateMaze to find all paths that are possible by doing a "right" move
                4. Remove the current cell from path
                

                  The figure below shows how paths are enumerated through a maze with pits. I have not shown all the backtracking moves, but the ones shown give a fairly good idea of how things are working. Basically, the algorithm backtracks to either a previous cell to find new paths, or backtracks from a pit to find new paths.

                  The C++ code attached is an implementation of enumerating all paths through a maze, which is represented as a binary 2D array. The main function that we can call is enumerateMazeMain and you can add a function to initialize the maze differently. The main recursive function translated from the above pseudo-code is the enumerateMaze function.


                  class mazeClass {
                      vector<vector<int> > maze;
                  
                      void enumerateMaze(vector<int>& path, int r, int c)
                      {
                  
                          path.push_back(c + maze.size() * r);
                          // base case 1
                          if (r == maze.size() - 1 && c == maze[0].size() - 1) {
                              printVector(path);
                              return;
                          }
                          // base case 2
                          if (r >= maze.size()) // out of bound. do nothing
                              return;
                          // base case 2
                          if (c >= maze.size()) // out of bound. do nothing
                              return;
                          // base case 3
                          if (!maze[r][c])
                              return;
                  
                          // row up
                          enumerateMaze(path, r + 1, c);
                          // backtrack
                          path.pop_back();
                          // column right
                          enumerateMaze(path, r, c + 1);
                          path.pop_back();
                      }
                  
                  public:
                      // set up the maze.  Change arrmaze to define your own
                      void mazeInitialize()
                      {
                          int arrmaze[] = {
                              1,
                              1,
                              1,
                              1,
                              1,
                              0,
                              1,
                              1,
                              1,
                              0,
                              0,
                              1,
                              1,
                              1,
                              1,
                              1
                          };
                          vector<int> temp;
                  
                          int ind = 0;
                          for (int i = 0; i < 4; i++) {
                              temp.clear();
                              for (int j = 0; j < 4; ++j) {
                                  temp.push_back(arrmaze[ind]);
                                  ind++;
                              }
                              maze.push_back(temp);
                          }
                      }
                  
                      // main function to call from outside
                      void enumerateMazeMain()
                      {
                          vector<int> path;
                          if (maze.size() == 0)
                              mazeInitialize();
                          enumerateMaze(path, 0, 0);
                      }
                  };
                  
                  // to call this function use:
                  // mazeClass m;
                  // m.enumerateMazeMain();
                  

                    Solving Sudoku

                    The last example in this tutorial is coming up with a solution to one of my favorite combinatorial games-- Sudoku-- via backtracking!

                    Sudoku is a classic example of a problem with constraints, which can be solved via backtracking. It works like magic! To simplify the problem, let's use an easier version of the sudoku game.

                    We can model the game as an N * N grid, each cell having numbers from 1 .. N.

                    The rule is not to repeat the same number in a column or row. The initial sudoku board has numbers in some cells, and are empty for the rest. The goal of the game is to fill out the empty cells with numbers from 1 .. N, so that the constraints are satisfied. Let us now look at how backtracking can be used to solve a given Sudoku board.

                    Routine: solve
                    Input: Sudoku board
                    Rule: No repetition of a number in the same row or column
                    Assumption: The initial board configuration is according to Sudoku rules
                    
                    Base case:
                    1. If all empty places are filled return success
                    2. If all combinations are tried and the board is invalid, return false
                    
                    Recursive case (returns success or failure):
                    1. Choose an empty cell
                    2. For each candidate number i in the range 1..N
                        a. Place the candidate i in the empty cell
                        b. Check if the board is valid with candidate i.
                            If the board is valide then
                            {   i. result = invoke the solve routine on the next empty cell
                                ii. If result is true then stop and return success
                            }
                            else
                                Continue with the next candidate as given in step 2
                    3. return failure (no possible combination is possible)
                    

                      Results

                      It's pretty awesome that we can actually find a solution to Sudoku via a simple backtracking routine. Let's see this routine in action on a simple 4 x 4 board as shown in the figure below. There are three empty cells. We can see that all combinations of numbers are tried.

                      Once an invalid board configuration is found, the entire branch is abandoned, backtracked, and a new solution is tried. The C++ implementation is provided. You can add your own public function to initialize the board differently.


                      class sudoku {
                          vector<vector<int> > board;
                      
                          void Initialize()
                          {
                              int arrBoard[] = {
                                  2,
                                  1,
                                  3,
                                  4,
                                  1,
                                  3,
                                  -1,
                                  2,
                                  -1,
                                  2,
                                  -1,
                                  3,
                                  -1,
                                  -1,
                                  2,
                                  1
                              };
                              vector<int> temp;
                      
                              int ind = 0;
                              for (int i = 0; i < 4; i++) {
                                  temp.clear();
                                  for (int j = 0; j < 4; ++j) {
                                      temp.push_back(arrBoard[ind]);
                                      ind++;
                                  }
                                  board.push_back(temp);
                              }
                          }
                          // set (r,c) to (0,-1) when calling first time
                          // will search for the next empty slot row wise
                          bool findNextEmpty(int& r, int& c)
                          {
                              int initj = 0;
                              int initi = 0;
                              bool found = false;
                              // start searching from next position
                              if (c == board[0].size()) {
                                  initi = r + 1;
                                  c = 0;
                              }
                              for (int i = r; i < board.size() && !found; ++i) {
                                  if (i == r)
                                      initj = c + 1;
                                  else
                                      initj = 0;
                                  for (int j = initj; j < board[i].size() && !found; ++j) {
                      
                                      if (board[i][j] == -1) {
                                          r = i;
                                          c = j;
                                          found = true;
                                      }
                                  }
                              }
                              return found;
                          }
                      
                          // check if the number candidate valid at cell (r,c)
                          bool checkValid(int candidate, int r, int c)
                          {
                              bool valid = true;
                              // check column
                              for (int i = 0; i < board.size() && valid; ++i) {
                                  if ((i != r) && (board[i][c] == candidate))
                                      valid = false;
                              }
                      
                              // check row
                              for (int j = 0; j < board[0].size() && valid; ++j) {
                                  if ((j != c) && (board[r][j] == candidate))
                                      valid = false;
                              }
                              return valid;
                          }
                      
                          // recursive implementation
                          bool solve(int r, int c)
                          {
                              bool success = false;
                      
                              // base case: no more empty slots
                              if (!findNextEmpty(r, c))
                                  return true;
                      
                              // nxn is size of board
                              int n = board.size();
                              for (int i = 1; i <= n; ++i) {
                                  board[r][c] = i;
                                  if (checkValid(i, r, c)) {
                                      success = solve(r, c); // solve for next empty slot
                                  }
                                  if (success)
                                      break;
                                  else
                                      board[r][c] = -1; // try the next candidate for same slot
                              }
                              return success;
                          }
                      
                      public:
                          void print()
                          {
                              for (int i = 0; i < board.size(); ++i) {
                                  for (int j = 0; j < board[i].size(); ++j)
                                      cout << board[i][j] << " ";
                                  cout << "\n";
                              }
                              cout << "\n";
                          }
                      
                      public:
                          bool solve()
                          {
                              Initialize();
                              return solve(0, -1);
                          }
                      };
                      // how to use:
                      // sudoku s;
                      // s.solve();
                      // s.print();
                      

                        Take Away Lesson

                        Backtracking is a very important principle that every software engineer should be aware of, especially for interviews. You should use it when you need to enumerate all solutions of a problem. Take advantage of it in scenarios where the solutions required have to satisfy a given constraint.

                        But before applying backtracking blindly to a problem, think of other possible solutions and consider how you can optimize your code. As always, work things out on a piece of paper with a pen (or with pseudocode!), rather than directly jumping into code.

                          Quiz Time

                          Given this pseudo-code for the N combinations problem:

                          base case:
                          1. If all combinations starting with items in positions < (size-N) have been printed. Stop
                          
                          recursive case:
                          Combos(set, result)
                          1. Repeat for each items i in the set:
                              a. Put the item i in the result set
                              b. if the result set has N items, display it
                                  else
                                  recursively call combos with (the input set without item i) and (the result set)
                              c. Remove the item i from result set
                          

                          What should you change in the code above if all possible combinations of any size are to be displayed?

                          • Change step a of Combos routine with N items in set
                          • Change step b of Combos and display the set unconditionally
                          • Remove step c of Combos
                          • None of these options

                          Solution: Change step b of Combos and display the set unconditionally

                          Question 2

                          For the path problem through a grid, how many possible paths are there for a 5x5 grid if the start position is (0,0) and the goal is (4,4)?

                          • 64
                          • 70
                          • 32
                          • None of the above

                          Solution: 70

                          Top comments (2)

                          Collapse
                           
                          djamaile profile image
                          Djamaile • Edited

                          What program did you use for the drawings? Nice article! Always good the freshen up on these topics.

                          Collapse
                           
                          jacobjzhang profile image
                          Jake Z.

                          Hey Dhamaile,

                          Excalidraw.io :-) But really, any diagramming software works the same. It's more about choosing the right color pairings and communicating the correct message.