loading...

Find whether a given array is a subset of another array

mayankjoshi profile image mayank joshi Originally published at nlogn.in ・2 min read

We are given two arrays, arr1[] and arr2[], that contains n, m number of distinct elements in unsorted order.
We are supposed to find, whether the given array arr2 is a subset of another given array arr1 or not.

Example -

Input:
arr1[] = {1,6,18,20,3,5,7}
int arr2[] = {18,3,7,6}

Output:
the arr2 is a subset of arr1

Input: 
arr1[] = {1,6,18,20,3,5,7}
int arr2[] = {8,3,7,6}

Output:
the arr2 is not a subset of arr1

This problem can be solved using two methods -

  1. The Naive Approach. Time complexity - O(N*M)
  2. The optimized Approach. Time complexity -Β  O(MlogM + NlogN)

Naive Approach

The naive approach would be comparing each element of the arr2 to each and every element of arr1. This would involve interacting over the N elements of arr1, M number of times, where M is the number of elements in arr2. Since this approach involves the nested loop, the time complexity of this approach reaches O(N*M).

Algorithm
is_subset(arr1, arr2, n, m)
{
    for(i=0; i<m; i++)
    {
         for(j=0; j<n; j++)
         {
            if(arr2[j] == arr1[i])
                break;
         }

         if(j==m)
             return false
    }
    return true
}

Optimized Approach

The optimized and fast approach to solving this problem (of finding whether a given array is a subset of another array) will involve, first sorting both the arrays and then comparing whether the arr2 is a subset of arr1 in O(N+M) time as described below -

  1. Initialize both the arrays (arr1, arr2)
  2. Sort both the arrays (arr1, arr2).
  3. Initialize pointer i, j to the first element of arr1, arr2.
  4. If arr1[i] == arr[j], check for next elements by increment both the pointer i,j by 1
  5. Else if arr1[i] < arr2[j], increment only the i_th_ pointer.
  6. Else, break because this symbolizes, the arr2 is not a subset of arr1.

array 1 is a subset of array 2 nlogn.in

Algorithm
issubset(arr1, arr2, n, m)
{
    sort(arr1, arr1+n);
        sort(arr2, arr2+m);

        i=0
        j=0
    while(i<n && j<m)
        {
        if(arr1[i] == arr2[j]){
            i++ 
        j++
        }
            else if(arr1[i] < arr2[j])
        i++
        else
            break
    }

    if(j == m)
        print("arr2[] is a subset of arr1[]")
    else
            print("arr2[] is not a subset of arr1[]")
}
Implementation of the above algorithm
Output
arr2[] is a subset of arr1[]




Time Complexity

The run time complexity of the above approach is O(NlogN + MlognM) β‰ˆ O(NlogN), where N is the number of elements in arr1[] and M is the number of elements in arr2[].
This time complexity arose due to the sorting we did in the beginning. The time complexity of the loop we are using for finding whether the arr2 is a subset of arr1 is O(N + M)

Posted on by:

mayankjoshi profile

mayank joshi

@mayankjoshi

I love system design and most of the time I find myself learning or designing one of them. <Br> I am a moderator at Dev.to πŸ˜€

Discussion

markdown guide
 

If you convert them to sets (O(N+M)), then check one by one (O(M)), the overall complexity is O(N+M), because set lookup is O(1).

Edit: I should add that, while set lookup in theory can deteriorate into O(n), it's highly unlikely that it'll be that bad for every element in arr2, so on average we can still reasonably expect O(M) for all the existence checks. Your algorithm does have a guaranteed O(NlogN+MlogM) though.

 

The first insight here should be that a larger array, given the arrays are of distinct elements, cannot be a subset.

Which means that you only need to test one of them.

The next insight is that one membership test failure is sufficient to disqualify the smaller array as containing a subset of the larger array.

So we can exit early.

The third insight is that we can cache the membership test, for sufficiently large arrays, using a set, although it will be faster to simply use linear probing where N is below some threshold.

Which brings it back to something like:

for (const element of smaller) {
  if (!member(element, larger)) {
    return false;
  }
}