The Knapsack Problem is a general optimisation problem. The solution to this problem has something in it for everyone. Before we see how it can be useful, let's break apart the word 'knapsack'.
This word comes from German 'knap', which means 'bite', and 'sack', meaning 'bag'.
The word 'knapper' means 'to eat', and therefore knapsacks are generally eating bags. While a monstrous bag that bites comes to mind, this was merely a name given to a bag used by soldiers to store food. But this slowly evolved to refer to an everyday carrier for ordinary things. But it wasn't soon before the word began to be replaced by the likes of 'backpacks', and one of the only places it stuck, was the knapsack problem.
So what is the knapsack problem? It is a problem where one has to try and fit as many things of value into a knapsack without overloading it. This is generally expressed as a set of objects, each having a particular value and a weight. Objects must be chosen from this set, such that the total value of chosen objects is maximum. But there is a limit on the total weight of the objects. The knapsack problem thus refers to any problem that applies a constraint to you maximising the value of objects. Since I did mention that there's something in it for everyone, let's look at a few ways it can be used:
You’re an art thief. You need to break into a museum. Now you can use one sack to carry all the valuable pieces you've stolen, because you can't really carry more than one while dodging security systems and booby traps, and then jumping over roofs, trying to escape. So you need to figure out how to pick as many valuable items as possible but also making sure you only take as much as your 'knapsack' can carry. Before you carry out the heist, you can sit at home, carefully run the knapsack algorithm on the various pieces of art in the museum, and you'll be guaranteed that you'd choose the most valuable loot among all other criminals who attempt to rob the same museum, with similar sacks. Clearly, this is a bad example, so let's just move on to the next.
You are now a student writing a test. You have 20 questions out of which you choose which ones to answer while making sure that you've attempted for a total of 100 marks. So you now assign a value depending on how easy a question is for you. The weight becomes the maximum marks for that question. Now using these as the parameters for solving the knapsack problem, you can come up with a set of questions that will be the easiest for you, while also making it highly probable that you'll score a complete 100. This could be extremely useful if there are too many questions to choose from with a marks scheme that's equally diverse. But most tests aren't going to be testing you for your knapsack problem-solving skills. If only our tests were this 'meta'. Moving on to slightly better examples.
You could be someone who wants to chart a new diet plan for whatever reason. Anyway, let's assume you want to choose the tastiest dishes for your diet plan, while also making sure you're consuming the least calories. Again the knapsack problem is your go-to approach. Or you could visit a dietician.
Now some of the most important applications of this problem would be deciding where to invest, how to maximise profits, how to make the most out of resources while also reducing waste, and also how to select the perfect team. Yes so hypothetically, if you could buy superheroes and build your team like the Avengers, you could use the knapsack problem to maximise strength, while under a budget. I realised I could implement this when Entertainment Weekly released this very controversial post, asking people who they'd pick with a budget of $15.
It's pretty evident that there is a huge problem with the prices. Scarlet Witch cannot possibly cost less than $15 alone because she’s clearly the most powerful Avenger. I’m not the only one who has issues with this categorisation. Here’s a link to see how crazy the internet is going. Anyway, I modified costs wherever I wasn’t satisfied.
So my ‘knapsack’ has a limit of $15, and I have ‘weights’ (costs) of each ‘item’ (avenger).
The values I assigned to each of them, was a number from 1 to 25, representing how valuable and powerful the avenger seems to me (25 being the highest).
So now I have weights and values for all my ‘items’.
How do we go about doing this? This is a classic 0/1 knapsack problem, where 0/1 means you either take an item or drop it. So this is different from a fractional knapsack problem, where I’m allowed to take only, say, Thor’s eye, and not necessarily him as a whole.
Therefore, 0/1 knapsack means I take an avenger, or I don’t, without being allowed to take parts of the avenger (Hmm, I wonder what happened to Thor’s eye).
I used two techniques- The greedy approach and the dynamic programming approach.
First, let’s look at the greedy approach. As its name suggests, it is a method of choosing the best item at every point, and continue till you’ve taken everything. So for example, if you want to choose items to fit in your knapsack, you sort the items available based on what ‘best’ means to you, and put things into the knapsack in the order of decreasing priority, until it is full.
Let’s look at some ways of defining what’s 'best'.
Value - If you think the higher the value, the better the object, you would sort the objects by value. So you would then choose the highest value object from the set, then the next highest value from the remaining, and so on, until you can no longer fit anything in your knapsack. Obviously, it would be dumb to choose the objects of lowest value first.
Weight - If you think that choosing the lightest object is the best choice, you would sort the objects in increasing order of weight. So at every point, your best choice would be the object with the lightest weight. You would thus continue picking objects in this manner until the knapsack is full. Again, if you choose one heavy object and fill the knapsack right in the beginning, it would be a dumb choice. So here, we choose the lightest items first.
Density - This is a way of defining ‘best’ by taking into account both weight and value. Ideally, you would want high value and low weights to fit as much value as you can into the knapsack. You'd sort the objects according to value divided by weight (or density), and pick from highest density to lowest. Since an increase in the numerator and a decrease in the denominator increases the magnitude of density, high value and low weight imply high density. So it would make no sense to pick the lowest density first because then you’d be picking the worst of both worlds (low value and high weight).
Each of these techniques gives you a different total value because each method would end up with you having the best according to that metric. However, the knapsack problem is generally defined as maximising value while subject to a maximum weight constraint. The problem with the greedy technique is also that it chooses local optimums, and will not necessarily give you the maximum value. Assume you’re greedily picking objects by value. We see that Ironman, Captain Marvel, and Scarlet Witch have a value of 10. So picking by value, the greedy algorithm will pick all of them. But since each of them cost $5, we will also use up our budget of $15. Our team has a total value of 30.
However, if we dropped Captain Marvel, we’d be left with $5, and a total value of 20. But, we could spend $1 on Loki and $4 on Doctor strange, adding a 9+8.5 to the total value. So now we have 37.5, which is a greater value than the previously obtained 30.
We need a better approach. This is where dynamic programming comes in.
The name for this technique was randomly chosen by the person who developed it, because he wanted it to sound non-mathematic, and also so that people wouldn’t know what he was talking about. In this approach, we make use of the solutions to sub-problems to develop the overall solution.
Let’s assume we can use only objects 1,2 out of a given set of objects. Let’s also assume that the maximum weight we can use is 6. The weights and values of each of these two objects are shown below.
So now we see that the maximum total value for the given objects, subject to maximum weight constraint 6, is 10, which is obtained by adding object 1 to the knapsack.
Now assume we now have objects 1,2,3.
The maximum weight is still 6. If the weight of object 3 is greater than 6, we can’t add it to our ‘knapsack’. We see that in this case, the weight of object 3, is lesser than 6. We have two choices. We could use 3 in the knapsack, or we could drop it.
Case 1: We use object 3
If we choose object 3 in the knapsack, the remaining weight is 4 (weight of object 3), subtracted from maximum capacity, which is 6. Thus the remaining capacity of the knapsack is now 2.
Now that we’ve added object 3 already, we have objects 1,2 to choose from and a remaining capacity of 2. So we again find the maximum value that can be obtained using objects 1,2 and a maximum capacity of 2. This is the same as the first case where we had only objects 1 and 2. However, this time, since our maximum capacity is 2, we cannot use object 1. Thus, the maximum value obtained from objects 1 and 2, subject to maximum weight 2, is 8 (value of object 2). We can add object 2 to the knapsack. So 8 + 6 (value of object 3)=14, is the total value in the knapsack.
Case 2: We don't use object 3
If we don’t use 3, the maximum value is the same as the value obtained by choosing from objects 1,2, subject to maximum weight 6. We already did this in the beginning, and the maximum value is 10.
Since we already know the solution for using objects 1,2 only, and we're trying to figure out if adding object 3 can give us a selection of objects with a higher total value, we take the maximum of the above two cases as the solution. Hence, 14 is the maximum value that can be obtained by using objects 1,2,3, with a knapsack of maximum capacity 6.
This might seem like overkill with just 3 objects at hand. However, for a problem that involves larger data sizes, like choosing from 25 superheroes, this approach can be very efficient. We start with solutions for a small set of objects and go on adding objects one by one to it, and obtain the solution at each stage until we build up towards all the objects.
So we can use solutions to subproblems to compute the optimum solution for the original problem. If there is a table from which we can look up solutions to subproblems, we can easily solve the bigger problem by building this table from the bottom up. Let's look at the table for the above set of 3 items.
So this table has solutions for all subproblems, starting from no items and maximum weight 0, to all items, maximum weight 6. Row number i corresponds to having items 1 to i available to us. Column no j corresponds to a maximum weight of j. Thus cell i,j corresponds to the maximum value obtained with objects 1 to i, with weight constraint j. Thus the bottom rightmost cell gives us the final maximum value.
Using this approach, I was able to obtain a maximum value of 51.5 for my team of avengers. To know which heroes were chosen, we backtrack on the table.
For simplicity, let’s look at another table for a few objects, whose weights and values are mentioned on the left side of the table below.
Now we see that the value at (4,7) is 9 and the value at (3,7) is also 9. This means that object 4 was not included, as the value remains the same as when 3 objects are used, even after having 4 objects available to us. Now we see value at (2,7) is not the same as (3,7). This means that object 3 is present in the knapsack, as the maximum value changed when the set of objects available changed from 1,2 to 1,2,3. Since we’ve included object 3 in the knapsack. The remaining capacity of the knapsack is 4(weight of object 3), subtracted from maximum weight 7. So the remaining capacity becomes 3. So we move to (2,3). (2,3) and (1,3) are not the same. Hence item 2 is also in the knapsack. So now the remaining capacity is 3 minus 3 (the weight of object 2) which is 0. Hence the knapsack is full, and we reach column 0.
The elements in the knapsack are object 3 and object 2. The Avengers dynamic programming table contained 26 rows and 16 columns and it is highly impractical to try and explain a large table when you can explain a smaller table the same.
Similarly, I was able to see that my team consisted of Winter Soldier, Hawkeye, Vision, Loki, Gamora, Black Widow, and Ironman. Sounds like a good team.
The code I used is available here. Now I just have to sit and hope these heroes are for sale.