This post is longer than usual & very heavy on problem solving, instead of the usual "learning about a DS/Algo, then writing an example". I don't expect to get back into this style of post again until I have finished the remaining curriculum. It just happens that the best way to really learn this material was to solve some problems after covering theory. I encourage you to stick with each problem even if it doesn't click the first time around.
Sets, Subsets, & Combinations
Resources
Takeaways
- A set is a collection of elements.
- A set with no elements is still a set and is known as an empty set.
- A subset is any combination of elements from a set.
- An empty set is a subset of any set. This means every set has an empty subset.
- Sets are represented using the notation:
{,}
. - Example:
{1,5}
is a set.{}
,{1}
,{5}
, &{1,5}
are all of its possible subsets. - There is a formula for working out how many subsets a set has: A set with
n
elements has2^n
subsets. - Using our previous example (
{1,5}
):n = 2
as there are 2 elements in our set.2^n = 2^2 = 4
. So a set with 2 elements has 4 subsets. A set with 3 elements has2^3
subsets, which is 8. - When dealing with strings, often we refer to subsets as combinations. The idea is the same as subsets:
-
"AB"
has2^2
combinations (as there are 2 characters). So all 4 combinations of"AB"
are:{}
,{'A'}
,{'B'}
,{'A', 'B'}
.
-
Permutations
Resources
- Permutations of a string from the book: Programming Interviews Exposed: Secrets to Landing Your Next Job, 3rd Edition (link to book*)
- Video on permutations of a string with repetitions allowed
- Video on permutations of a string with no repetitions
*You may find that the book is freely available online as a PDF. However, I do not condone getting the book for free as it has been illegally published.
Takeaways
- A permutation is essentially an ordered combination, except the total length of each permutation must equal the original input.
- Finding all permutations of a string is sort of the same as saying "find all anagrams of a string" (except our permutations might not all be real words).
- There are two types of permutation: with repetition & without repetition.
- Simple permutation example:
"AB"
has 2 permutations:"AB"
and"BA"
. - Repetition example:
"AAB"
has 6 permutations:"AAB"
,"ABA"
,"BAA"
,"BAA"
,"ABA"
,"AAB"
. Notice there are repeated characters, this is a permutation with repetition allowed (as each "A" is treated as distinct). Without repetition, the total permutations would be 3:"AAB"
,"ABA"
,"BAA"
. - To calculate the total number of permutations when repetition is allowed use the following formula:
n!
(!
means factorial). - What does
!
(factorial) mean in real terms? Well, we know thatn
is the number of elements, and!
just means we multiply a series of descending numbers. - Lets look at an example of
n!
: Given a string"ABC"
how many characters can be placed in the first position? 3. How many can be placed in the second position? 2 (assume we have already placed a character at the first position). How many can be placed in the last position? 1. Son!
for"ABC"
is simply:3x2x1 = 6
. See how that works? We just take the number of available characters for each position, then multiply. - So how do we work out how many possible permutations there are when repetition is not allowed? Answer:
n!/2!^k
- What is
k
in that formula?k
is the number of repeated elements in the input. Example:"AAB"
hasn!
, or3x2x1
, permutations. So the total permutations are 6. Without repetition allowed the formula is:3x2x1 / 2^1
.k
is 1 as"A"
is the only repeated character. So the total permutations for"AAB"
without repetition is3x2x1 / 2^1 = 6 / 2 = 3
. Here is a helpful example of working out the distinct permutations for "TOFFEE" (hint:k
is 2)
Finding all permutations of a string
Resources (same as the resources already listed for permutations)
- Permutations of a string from the book: Programming Interviews Exposed: Secrets to Landing Your Next Job, 3rd Edition
- Video on permutations of a string with repetitions allowed
- Video on permutations of a string with no repetitions
Takeaways
- So now we know what a permutation of a string is, how do we generate all permutations of a string?
- Start with a function that takes an input string, and a
string
orStringBuilder
calledcurrent
(which represents each permutation), we can distill the operation to the following:- If you are past the last position in the input string:
- print the string and return.
- Otherwise:
- For each letter in the input string:
- If it's marked as used (already in
current
), skip to the next letter - Else, place the letter in
current
- Mark the letter as used
- Recursively call the current function to add the remaining letters to
current
- Mark the letter as unused and remove it from
current
- Continue iterating
- If it's marked as used (already in
- For each letter in the input string:
- If you are past the last position in the input string:
- The above will ensure we permute a string, with repetitions allowed. So how do we do it without repetition? And what if we want to ensure the result is in lexicographically sorted order (order based on the alphabetical order of the permutations component letters)?
- One way to achieve this is to sort the input, and associate each repeated character with the number of times it is repeated.
- For example:
"BAA"
would give us sortedkey:value
s of:{A:2,B:1}
. - Once this is done, the algorithm looks almost identical. The only difference is that we use a modified version of the input string (letters and their counts) and decrement the count of a letter when we have used it (when
count == 0
we skip the letter). - As there are
n!
permutations our time complexity isO(n! n)
, and our space complexity isO(n!)
. - As always, the code for these two variations is at the end of this post. See
PrintPermutations()
andFindPermutationsNoDupes()
.
Finding the next permutation of an integer array
Resources
- Video explanation and solution
- Problem introduction and solutions
- Article explaining the algorithm and solution
Takeaways
- Problem statement: Given an input array of integers, find the next possible permutation and return the result.
- This problem is slightly different to generating all possible permutations. For this problem we only need to find a single permutation, and it must be the next permutation that would come in lexicographical order.
- The brute force approach is to generate all possible permutations, and when we hit a permutation that is the same as the input, return the permutation that follows.
- This is obviously less than ideal: what if the given input is the last possible permutation? That's a lot of wasted effort. Time complexity would be
O(n!)
and space complexity would beO(n)
. - A better way is to first recognize a few key traits that allow us to form a solution:
- For any given input that is in descending order, no next permutation is possible. For example, no next permutation exists for
[9, 5, 4, 3, 1]
- So, if we can find the inversion point of an input - we can calculate the next permutation
- The inversion point is the position in the array after which the elements stop descending (remember we said that if it continually descends, no further permutations are possible)
- Once we find the inversion point, we swap the element at that position with the next element in the array that is larger than it. Then we sort in ascending order, the elements after the inversion point.
- Because everything after the inversion point will inherently be sorted in descending order - we can just reverse this section of the array instead.
- The array will now be arranged such that it is ordered as the next permutation - meaning we can return the result.
- For any given input that is in descending order, no next permutation is possible. For example, no next permutation exists for
- How do we find the inversion point? We iterate over the input, starting at the last element. We know we have reached the inversion point when
input[i] > input[i - 1]
. When this holds true, everything afterinput[i - 1]
(the inversion point) cannot be rearranged to produce a larger permutation, since the elements are in descending order (elements right of inversion point). - This improved approach has a time complexity of
O(n)
and a space complexity ofO(1)
(as the operation is in-place). - See
FindNextPermutation()
at the end of this post for the implementation.
Next permutation visualization
Finding all combinations/subsets of a string/integer array
Resources
- Video explanation and solution
- Combinations of a string in: Programming Interviews Exposed: Secrets to Landing Your Next Job, 3rd Edition
- Generating combinations with no duplicates (link to code)
Takeaways
- This algorithm is very similar to the algorithm for generating permutations.
- The main difference is that there is not the length condition (remember that a subset/combination can be empty and/or have less elements than the input set - whereas a permutation must be equal in length to the input).
- The removal of this condition greatly simplifies our algorithm, as we no longer need to keep track of if we have already used an element/if it needs skipping (at least, in the same way).
- We instead just keep track of/memoize our function's loop's current index and pass it to our our function when recurring.
- Our loop for permutations always started at 0. Our loop for combinations will always start at a given index. This means our combinations function will take an index as an argument.
- To generate all combinations/subsets in lexicographical order, we can first sort the input.
- To prevent duplicates we can add a check right at the start of our loop. If the loop's current index (
i
) is not the same as the index passed into our function, and if element at theith
position is the same as the element ati - 1
(previous) - we skip theith
element (current element). - The algorithm distilled is:
- We start with an empty list (called
current
) that will represent each combination, an index of 0, and an empty result/output collection to return.- Add
current
(representing a combination) to the output/result
- Add
- For each element of the input, from a given index to the final index
- Append the
ith
element tocurrent
- Recursively call the current function, passing it
current
andi + 1
(to move to the next element in the input) - After the recursive call, remove the last element from
current
(which is the one we added before the recursion) and continue with the loop (which repeats the process)
- Append the
- We start with an empty list (called
- Finding all combinations has a space complexity of
O(n)
and a time complexity ofO(2^n)
(as there are2^n
possible combinations). - Try looking at the
FindAllComibinations()
function at the bottom of this post and compare it to thePrintPermutations()
function. You will see the similarities and the differences I have explained, hopefully with greater clarity.
Finding all the letter combinations of a phone number
Resources
- Video overview and solution
- Another video explanation and solution
- Telephone words in: Programming Interviews Exposed: Secrets to Landing Your Next Job, 3rd Edition
Takeaways
- Problem statement: Write a function that takes an input array of integers. Each integer represents a number on a telephone keypad. You will notice each number on a phone's keypad represents a certain number of letters (except 1 and 0). For example: 2 represents
ABC
. Given the input array, return all of the possible "words" or combinations of letters that can be represented by the given input. - Algorithm for solving the above:
- Create a mapping array (
mapping
), where each index of the array represents the letters for that index. For example:["0", "1", "abc", "def", ...]
.mapping[2]
would give us the letters for 2, and so on. - Transform the input array into a string of digits called
digits
. Example:[2,3,4]
would become"234"
. - Our function will take
digits
, an empty string (current
, which will represent each combination), a start index of 0 (i
), and a result list (results
). - Inside our function:
- If
i
is the same length asdigits
:- Add
current
toresults
, and return.
- Add
- Else, find the letters in
mapping
that correspond to the integer atdigits[i]
- For each of the corresponding letters, add the letter to
current
and recursively call our function passing indigits
,current
,i + 1
, andresults
- If
- Create a mapping array (
- At the completion of the above algorithm, all combinations will be in
results
. - Time & space complexity are both
O(3^n * 4^m)
.n
is the number of digits that map to three characters (2,3,4, etc
), andm
is the number of digits that map to four characters (7,9
). - How can we verify that the total number of combinations will be
3^n * 4^m
? Example:"23"
has 9 combinations which are as follows["ad", "ae", "af", "bd", "be", "bf", "cd", "ce", "cf"]
. So,n = 2
which means3^n * 4^m = 3^2 = 9
(we ignore4^m
asm = 0
). - Check out
LetterCombinations()
in the code at the end of this post to see the full implementation.
Word subsets (finding all words that contain a subset of letters)
Resources
Takeaways
- Problem statement: Given two arrays
A
&B
. Each array consists ofn
number of lowercase strings. A wordb
inB
is a subset of a worda
inA
if every letter inb
is present ina
including multiplicity. For example:roo
is a subset ofroom
, but is not a subset ofroar
. We can say a worda
inA
is universal if every word inB
is a subset ofa
. The challenge is to return all of the universal words in arrayA
. - One way to solve this is to transform all of the words in
B
into a single wordb
. - Now we can check every word
a
inA
, and ifa
is a superset of our singleb
(meaningb
is a subset ofa
) thena
is universal. - The way we achieve this in code is to transform our single word
b
into an array of integers. Each position in the array will represent a letter in the alphabet, and each integer in the array will be the maximum number of occurrences of each letter inb
. For example:aab
would be represented as[2,1]
. The first index of our array is where we put the max occurrences of the lettera
. Asa
occurs two times we store two at that index.b
occurs only once, and belongs at the second index. Remember we are storing maximum occurrences, so ifB = ["aab", "aab"]
our resulting array would still be[2,1]
- as we count the occurrences of each word inB
, not the sum of all the occurrences. - Now that we have transformed
B
into a single word (lets call itbMax[]
), represented by an integer array of maximum letter occurrences, we loop overA
and for each worda
do the same transformation (lets call thisaMax[]
). - After transforming
a
we now perform another loop of 26 iterations (the number of letters in the alphabet) and for each iteration do the following comparisonaMax[i] < bMax[i]
. If any of the characters inaMax[]
(which representsa
) appear less times than the same character inbMax[]
(representing our singleb
) then we know the worda
is not universal. Otherwise,a
is universal and we can add it to our results set. - After finishing our loop over every word
a
inA
and doing the above operation each time, we can return our resulting list of universal words. - Check out
WordSubsets()
at the end of this post to see the implementation. - The time complexity of this solution is
O(A + B)
. Space isO(A.Length + B.Length)
.
Subset sum (Find/count all subsets with a given sum)
Resources
- Video explanation and bottom-up solution
- Video explanation of a top-down approach
- Implementation of top-down solution
- Another implementation of the top-down solution
Takeaways
- Problem statement: Given a set of positive integers
A
and an integerS
, find the subsets ofA
whose sum is equal toS
. - Variations of this problem are the same in premise, but can ask you to return
true
if a single subset with sumS
exists, ask you to return all the subsets that sum toS
, or simply ask you to count the number of subsets that sum toS
. In my implementations I either return or count the subsets inA
that sum toS
. - A brute force approach to this problem would be to generate all subsets for the given input, and check if any of them sum to
S
. As there are2^n
subsets the time complexity for this approach isO(2^n * n)
- less than ideal. - A better approach is to use recursion, and for each element in
A
decide between two possibilities:- Include the current item in the
current
subset and recur for the remaining items inA
with the remaining sum. - Or we exclude/skip the current item from the subset and recur for the remaining items, leaving the running total/sum unchanged.
- Include the current item in the
- It's important to note that during the recursion we keep a running total/sum that starts equal to
S
, each time we include a number in thecurrent
subset we subtract it's value from the running total and recur with the remainder. We know we have a complete subset when the running total is0
- meaning ourcurrent
subset's sum is equal toS
. - We know that our
current
subset does not sum toS
if we have run out of elements to recur for, or if our running total/sum is less than 0. - A small optimization we can make to the recursive approach is to always exclude/skip the current item if it is greater than our running total.
- Example: if our running total is
15
and the current item is20
- there is no reason to try and include it in thecurrent
subset, as it will make our running total negative. - A further optimization we can make to our recursive approach is to introduce some memoization.
- Now, every time we recur we will store a
key:value
pair of the results. This means instead of recomputing the same thing over and over, we will instead first check if we have already done the same calculation. If we have, then we will return the memoized computation instead of continuing to recompute. - This saves a decent number of cycles. In my limited testing with an input array
A = [2, 4, 6, 10, 16, 8, 1, 5, 7, 3, 44]
andS = 16
, the memoized recursive version did almost 50% less work than the non-memoized version. Of course, this will change based on the inputs - but it highlights how much more efficient a small change can be. - When solving this problem using recursion and memoization it is known as a top-down approach. Top-down means we start with the larger problem, then break it down into sub problems, ensuring we remember answers to our sub problems along the way (in case the same answer is needed again).
- A final approach to solving this problem is using dynamic programming and a bottom-up approach. Bottom-up means we start by solving the smallest sub-problem first, then work our way up to the actual problem itself.
- A common way to perform bottom-up solutions is using a 2D array as a matrix.
- It is common to use a boolean array, or similar.
- Our matrix will represent a table of our sub-problems and their solutions.
- Dynamic programming is actually an entire other topic I was planning to cover in a later post - but I accidentally got some exposure to it with this problem.
- To try to explain it in words is to do the solution injustice (and would make this post even longer), so I encourage you to instead check out the video in the resources section that explains more about tabulating the sub-problems. And if you are really curious about dynamic programming this is a gentle introduction.
- The bottom-up has a space complexity of
O(n * Sum)
. Both the top-down/bottom-up solutions have a run time ofO(n * Sum)
. - Because our bottom-up approach doesn't have the recursion overhead it's typically more efficient in real terms.
- Check out
FindAllSubsetsWithSum(), CountSumSubsets(), CountSumSubsets2()
at the end of this post for the implementations of recursion, recursion with memoization, and bottom-up dynamic programming (respectively).
Bonus
- SortedDictionary: This is part of the C# standard library. Per Microsoft, it is a binary search tree with O(log n) retrieval. However, as binary search trees aren't typically balanced, there are some questions on exactly what tree data structure it actually uses. It would appear it could actually be using something closer to a Red-Black tree.
- Memoization: Memoization is sort of like caching. We remember the results of a computation so we don't have to recompute the same thing over and over. We instead check the "cache" and see if we have already done the work previously. Here's a good article on memoization.
- Dynamic Programming: Dynamic programming is commonly bottom-up, and is a sort of table-filling algorithm (using a matrix as our table). We fill up our table with solutions to sub-problems, and then use the resulting table to solve the problem at large. Here is a good article on bottom-up. And here is a good article on dynamic programming (which also compares top-down with bottom-up).
Below you will find implementations for all of the problems discussed:
As always, if you found any errors in this post please let me know!
Top comments (0)