loading...

Algorithms: Generating Combinations #100DaysOfCode

rrampage profile image Raunak Ramakrishnan ・7 min read

In how many different ways can we select r objects from a collection of n objects? In mathematics, this is called combinations.

Combinations of 5 objects, taken 2 at a time

The formula for the number of combinations is:

where, n! denotes the factorial of a number that is the product of all numbers from 1 to n (inclusive).

Prelude : A function for calculating factorial of a number

public static long factorial(int n) {
    long res = 1L;
    for (int i = 1; i <= n; i++) {
        res *= i;
    }
    return res;
}

Calculating Combinations

That was simple! Let us now move on to calculating the number of combinations given n and r

public static long combinations(int n, int r) {
    if (r < 0) {
        return 0;
    }
    long res = 1L;
    if (r > n - r) {
        r = n - r;
    }
    for (int i = 0; i < r; i++) {
        res *= (n - i);
        res /= (i + 1);
    }
    return res;
}

What does this algorithm do? Recall that we need to find n!/r!(n-r)! which will be of the form n(n-1)...(n-r+1)/1.2...r. Similar to factorial, we initialize the result as 1 and multiply by n-i and divide by i+1. Will this result in a fractional number? No. This is because first, we multiply by n and divide by 1. Next, we multiply by n-1 and divide by 2. Now, either n or n-1 have to be even (as they are consecutive numbers). Similarly, next when we divide by 3, one of n,n-1 and n-2 must be divisible by 3.

In the above code, we also make use of the mathematical property that combinations(n,r) = combinations(n,n-r). This way, we can do less number of operations for calculating the combinations.

Generating the combinations

Counting the number of combinations was not so hard! Now, let's generate all the combinations.

  • Given n and r, we will print out all the combinations.
  • For the n objects, we will use the numbers from 0 to (n-1).
  • Additionally, we will generate them in a lexicographical order which is math speak for sorted order.
  • Finally, in a combination containing a and b, if a < b, we will print a b instead of b a. Formally stated, if a[k] and a[k+1] are the kth and (k+1)th elements in a generated combination, a[k] < a[k+1] for all k

For example, given n = 4, r = 2, we have:

0 1
0 2
0 3
1 2
1 3
2 3

i.e 6 combinations.

Notice that we have 0 1 and not 1 0. This is because we are generating each combination in lexicographical order and we take the minimum for each combination.

Generating for a specific value of r (r = 2)

If we have a specific value of r say 2, the code will involve 2 for loops like:

for (int i = 0; i < n-1; i++) {
    for (int j = i+1; j < n; j++) {
        println(i + " " + j);
    }
}

In the code above, our first loop variable i goes from 0 to n-2 and the next variable j goes from i+1 to n-1. Why so? This is because we have a requirement for taking the lexicographical minimum combination, so i < j from our constraints.

If value of r is fixed, we can simply create r for loops. But it is not fixed...

Generalizing!

Now, let's move on to the main goal - generate combinations of n numbers taken r at a time. This section will be a little verbose as I have outlined how I arrived at the correct code. If you are interested in just the algorithm, feel free to skip to the bottom of the article

If we notice our previous code for r = 2, our first combination is always 0 1 as i = 0, j = 1. Similarly, if r was 3, our first combination would be 0 1 2. There is a pattern!

By creating an array a of size r, we can generate the first combination as 0 1 2 .. r-1. We have the first combination ready. What about the rest? Somehow, if we increment elements in this array, we will generate the combinations...

Again, looking at the r = 2 case, notice that the last combination is n-2 n-1. Similarly, for r = 3, it is n-3 n-2 n-1. Thus, for r elements, it will be n-r+1 n-r+2 .. n-1. There is one more insight - there is exactly one combination which starts with n-r+1. If our array's first element reaches n-r+1, we are done!

We now have a termination condition for our function: a[0] == n-r+1

The code we have so far will look like:

int n; // Given
int r; // Given
int[] a = new int[r]; // Initialize array of size r
for (int i = 0; i < r; i++) {
    a[i] = i; // Initialize array with first combination
}

while (a[0] < n-r+1) { // Our termination condition
    // DO SOMETHING!
}

We have a while loop that checks for termination condition. For the loop to terminate, we need to steadily progress from our first combination to the last combination. As we are generating elements in lexicographical order, the last element of the array must be incremented first. Then the second from last element and so on.

In our earlier example of n = 4, r = 2, we had

0 1
0 2
0 3
1 2
1 3
2 3

After 0 3, we get 1 2. This means once the r-1 element (last element) reaches its maximum, we increment r-2 element from 0 to 1 and also reset the value of r-1 element to a[r-2]+1 as it must always be at least 1 greater than the r-2 element (from our constraints). Moving to our pseudo-code, let's add this to the while loop

int i = r-1; // variable i keeps track of which element in array we are incrementing
while (a[0] < n-r+1) { // Our termination condition
    if (i > 0 && a[i] == n-r+1) {
        i = i-1; //If a[i] has reached the max allowable value, decrement i and move to next element in array
    }
    printArray(a); // pseudo-code to print out the combination
    a[i] = a[i]+1; // increment
    if (i < r-1) {
        a[i+1] = a[i]+1; // Reset `i+1` element as previous element + 1, according to our constraints
        i = i+1; // Once you have reset the i+1 element, it is no longer < n-r+i and hence, we can move it back to old value
    }
}

We have an index variable i which we use to check which is the element in the array to be incremented. In the first if in above code, we check if the a[i] has reached its maximum value of n-r+i. If yes, we decrement i as a[i] can no longer be incremented. Moving out of if, we then print the combination and increment a[i]. Now, if i is no longer r-1 i.e it is no longer last element of a, we must reset it to r-1 and also set the value of a[r-1] as a[r-2]+1. This works for r=2. Hooray! We have abstracted out the for loop in the earlier section into a while loop with a few conditionals.

But does this work for r > 2? No... We need a minor change to make it work! Change the if statements inside the loop to while loops and we are done! In case of first loop, we need to find the maximum i which is less than n+r-i. For example n=5, r=3 we have:

0 1 2
0 1 3
0 1 4
0 2 3
0 2 4
0 3 4
1 2 3
1 2 4
1 3 4
2 3 4

As we move from 0 3 4 to 1 2 3, both i = 2 (a[2] = 4) and i = 1 (a[1] = 3) are at their maximum. We need to move to i = 0. Similarly, the second if must be a while loop because once we have incremented the a[i] for minimum i, we need to reset the outer elements of array to maintain our constraints.

Our final code:

int[] a = new int[r];
// initialize first combination
for (int i = 0; i < r; i++) {
    a[i] = i;
}
int i = r - 1; // Index to keep track of maximum unsaturated element in array
// a[0] can only be n-r+1 exactly once - our termination condition!
while (a[0] < n - r + 1) {
    // If outer elements are saturated, keep decrementing i till you find unsaturated element
    while (i > 0 && a[i] == n - r + i) {
        i--;
    }
    printArray(a); // pseudo-code to print array as space separated numbers
    a[i]++;
    // Reset each outer element to prev element + 1
    while (i < r - 1) {
        a[i + 1] = a[i] + 1;
        i++;
    }
}

Proof of termination

Now that we have our algorithm, how can we show that it terminates? In each iteration of our outer while loop, we increment the element of the array with maximum index i which has not reached value n-r+i while maintaining our constraints. Due to the lexicographical ordering, our previous combination is always lesser than our currently generated combination. As there are only a finite number of combinations till we reach our "last" combination, we can say that our algorithm will terminate.

Meta:

I began my 100 days of code challenge today with this problem. I will create a separate post explaining my motivations and plans.

Regarding this problem statement of generating combinations, I had some trouble initially moving from r=2 case to the general one. It took me some time to find the correct termination condition. I am happy that the final algorithm is relatively compact. I also want to do a proof of correctness for this algorithm later.

Posted on by:

rrampage profile

Raunak Ramakrishnan

@rrampage

Passionate about databases, distributed systems and functional programming.

Discussion

markdown guide
 

Hello, thnx for this information. Do you also have the code in Fortran? KR, Marc