DEV Community

Cover image for Solution: Word Subsets
seanpgallivan
seanpgallivan

Posted on

Solution: Word Subsets

This is part of a series of Leetcode solution explanations (index). If you liked this solution or found it useful, please like this post and/or upvote my solution post on Leetcode's forums.


Leetcode Problem #916 (Medium): Word Subsets


Description:


(Jump to: Solution Idea || Code: JavaScript | Python | Java | C++)

We are given two arrays A and B of words. Each word is a string of lowercase letters.

Now, say that word b is a subset of word a if every letter in b occurs in a, including multiplicity. For example, "wrr" is a subset of "warrior", but is not a subset of "world".

Now say a word a from A is universal if for every b in B, b is a subset of a.

Return a list of all universal words in A. You can return the words in any order.


Examples:

Example 1:
Input: A = ["amazon","apple","facebook","google","leetcode"]
B = ["e","o"]
Output: ["facebook","google","leetcode"]
Example 2:
Input: A = ["amazon","apple","facebook","google","leetcode"]
B = ["l","e"]
Output: ["apple","google","leetcode"]
Example 3:
Input: A = ["amazon","apple","facebook","google","leetcode"]
B = ["e","oo"]
Output: ["facebook","google"]
Example 4:
Input: A = ["amazon","apple","facebook","google","leetcode"]
B = ["lo","eo"]
Output: ["google","leetcode"]
Example 5:
Input: A = ["amazon","apple","facebook","google","leetcode"]
B = ["ec","oc","ceo"]
Output: ["facebook","leetcode"]

Constraints:

  • 1 <= A.length, B.length <= 10000
  • 1 <= A[i].length, B[i].length <= 10
  • A[i] and B[i] consist only of lowercase letters.
  • All words in A[i] are unique: there isn't i != j with A[i] == A[j].

Idea:


(Jump to: Problem Description || Code: JavaScript | Python | Java | C++)

The first major shortcut we can recognize is that if a word in A has to match all entries in B, then we shouldn't have to think of all the entries of B as separate. Our first step should then be to merge all words in B into one master word for which all words in B are subsets of it. In Example 5, for instance, when B = ["ec","oc","ceo"], the master word would be "ceo".

To accomplish this, we'll need to use some kind of frequency map. Since we're dealing with characters, we can use an arraymap of length 26 which should be faster than using a regular map structure. We'll need to have two such arraymaps: one will hold the accumulated data (Bfreq) and the other (check) will be used to temporarily store each word in B before checking it against Bfreq.

Rather than creating a new array for each word, we just need to make sure to reset check to all 0's before the next word.

While we check through the words in B, we should also keep track of how many characters are currently stored in Bfreq (cmax). If cmax goes above 10, then it won't be possible for any word in A to match it due to the constraints upon A.length, so we should return an empty array.

Once we have our master word information stored in Bfreq, we can iterate through the words in A and compare them to Bfreq in a similar manner. First, however, we can easily skip any word that isn't as long as cmax. If we get through the entire word without triggering an early break, we can add the word to our answer array (ans).

Once we're all done iterating through A, we can return ans.


Implementation:

Python here is generally much slower with an arraymap, but can use a normal dict and count() to speed things up a bit.

There's also an example of Python using Counter() and its easy comparisons for some short code, though a slower time.

Java should convert the Strings to char[] before iterating through.


Javascript Code:


(Jump to: Problem Description || Solution Idea)

var wordSubsets = function(A, B) {
    let Bfreq = new Int8Array(26), cmax = 0,
        check = new Int8Array(26), ans = []
    for (let i = 0; i < B.length; i++, check.fill()) {
        let word = B[i]
        for (let j = 0; j < word.length; j++)
            check[word.charCodeAt(j) - 97]++
        for (let j = 0; j < 26; j++) {
            let diff = check[j] - Bfreq[j]
            if (diff > 0) cmax += diff, Bfreq[j] += diff
            if (cmax > 10) return []
        }
    }
    for (let i = 0; i < A.length; i++, check.fill()) {
        let word = A[i], j
        if (word.length < cmax) continue
        for (j = 0; j < word.length; j++)
            check[word.charCodeAt(j) - 97]++
        for (j = 0; j < 26; j++)
            if (check[j] < Bfreq[j]) break
        if (j === 26) ans.push(word)
    }
    return ans
};
Enter fullscreen mode Exit fullscreen mode

Python Code:


(Jump to: Problem Description || Solution Idea)

class Solution:
    def wordSubsets(self, A: List[str], B: List[str]) -> List[str]:
        Bfreq, ans, cmax = {}, [], 0
        for word in B:
            for char in word:
                count = word.count(char)
                if char in Bfreq:
                    diff = count - Bfreq[char]
                    if diff > 0: 
                        Bfreq[char] = count
                        cmax += diff
                else: 
                    Bfreq[char] = count
                    cmax += count
            if cmax > 10: return ans
        print(Bfreq)
        for word in A:
            if len(word) < cmax: continue
            for char in Bfreq:
                if word.count(char) < Bfreq[char]: break
            else: ans.append(word)
        return ans
Enter fullscreen mode Exit fullscreen mode

Python Code w/ Counter:


(Jump to: Problem Description || Solution Idea)

class Solution:
    def wordSubsets(self, A: List[str], B: List[str]) -> List[str]:
        Bfreq = Counter()
        for word in B: Bfreq |= Counter(word)
        if sum(Bfreq.values()) > 10: return []
        return [word for word in A if not Bfreq - Counter(word)]
Enter fullscreen mode Exit fullscreen mode

Java Code:


(Jump to: Problem Description || Solution Idea)

class Solution {
    public List<String> wordSubsets(String[] A, String[] B) {
        int[] Bfreq = new int[26], check = new int[26];
        int cmax = 0;
        List<String> ans = new ArrayList<>();
        for (int i = 0; i < B.length; i++, Arrays.fill(check, 0)) {
            for (char c : B[i].toCharArray())
                check[c - 'a']++;
            for (int j = 0; j < 26; j++) {
                int diff = check[j] - Bfreq[j];
                if (diff > 0) {
                    cmax += diff;
                    Bfreq[j] += diff;
                }
            }
            if (cmax > 10) return ans;
        }
        for (int i = 0; i < A.length; i++, Arrays.fill(check, 0)) {
            int j;
            for (char c : A[i].toCharArray())
                check[c - 'a']++;
            for (j = 0; j < 26; j++)
                if (check[j] < Bfreq[j]) break;
            if (j == 26) ans.add(A[i]);
        }
        return ans;
    }
}
Enter fullscreen mode Exit fullscreen mode

C++ Code:


(Jump to: Problem Description || Solution Idea)

class Solution {
public:
    vector<string> wordSubsets(vector<string>& A, vector<string>& B) {
        int Bfreq[26] = {0}, check[26] = {0};
        int cmax = 0;
        vector<string> ans;
        for (string word : B) {
            for (char c : word) check[c - 'a']++;
            for (int j = 0; j < 26; j++) {
                int diff = check[j] - Bfreq[j];
                if (diff > 0) cmax += diff, Bfreq[j] += diff;
            }
            if (cmax > 10) return ans;
            fill(check, check+26, 0);
        }
        for (string word : A) {
            int j;
            for (char c : word) check[c - 'a']++;
            for (j = 0; j < 26; j++)
                if (check[j] < Bfreq[j]) break;
            if (j == 26) ans.push_back(word);
            fill(check, check+26, 0);
        }
        return ans;
    }
};
Enter fullscreen mode Exit fullscreen mode

Top comments (4)

Collapse
 
seanpgallivan profile image
seanpgallivan

Unfortunately, that doesn't account for multiple occurrences of the same letter.

The instructions state:

Now, say that word b is a subset of word a if every letter in b occurs in a, including multiplicity. For example, "wrr" is a subset of "warrior", but is not a subset of "world".

The code you posted would find "wrr" as a subset of "world", because the set will only compare a single "r" instead of requiring 2 for a match.

It's still a remarkable piece of code, however! One of the things that impresses (and intimidates) me the most about Python is its ability to allow for truly concise code.

Collapse
 
presiv profile image
PresiV • Edited

This is very helpful, thank you for sharing. I sometimes have difficulties with writing the perfect code without any bugs, because I don’t pay attention on writing all the words correctly or even a set of words. And when I check, I kind of skip the content, but rather pay attention to the structure. This is definitely something I need to work on. But yeah, your information is really helpful in this case. I try to improve my English vocabulary skills on a daily basis and hopefully I will get better over time. I actually find a nice website lettersolver.com/4-letter-words-en... that helps me with that. It generates a wide range of words adjusted to your search query.

Collapse
 
marcelsgomz profile image
marcelsgomz

word game always remain interesting game for the last two decays. It help you to make your mind more stronger and healthy.