This is part of a series of Leetcode solution explanations (index). If you liked this solution or found it useful, please like this post and/or upvote my solution post on Leetcode's forums.
Leetcode Problem #916 (Medium): Word Subsets
Description:
(Jump to: Solution Idea || Code: JavaScript | Python | Java | C++)
We are given two arrays
A
andB
of words. Each word is a string of lowercase letters.Now, say that word
b
is a subset of worda
if every letter inb
occurs ina
, including multiplicity. For example,"wrr"
is a subset of"warrior"
, but is not a subset of"world"
.Now say a word
a
fromA
is universal if for everyb
inB
,b
is a subset ofa
.Return a list of all universal words in
A
. You can return the words in any order.
Examples:
Example 1: Input: A = ["amazon","apple","facebook","google","leetcode"]
B = ["e","o"]Output: ["facebook","google","leetcode"]
Example 2: Input: A = ["amazon","apple","facebook","google","leetcode"]
B = ["l","e"]Output: ["apple","google","leetcode"]
Example 3: Input: A = ["amazon","apple","facebook","google","leetcode"]
B = ["e","oo"]Output: ["facebook","google"]
Example 4: Input: A = ["amazon","apple","facebook","google","leetcode"]
B = ["lo","eo"]Output: ["google","leetcode"]
Example 5: Input: A = ["amazon","apple","facebook","google","leetcode"]
B = ["ec","oc","ceo"]Output: ["facebook","leetcode"]
Constraints:
1 <= A.length, B.length <= 10000
1 <= A[i].length, B[i].length <= 10
A[i]
andB[i]
consist only of lowercase letters.- All words in
A[i]
are unique: there isn'ti != j
withA[i] == A[j]
.
Idea:
(Jump to: Problem Description || Code: JavaScript | Python | Java | C++)
The first major shortcut we can recognize is that if a word in A has to match all entries in B, then we shouldn't have to think of all the entries of B as separate. Our first step should then be to merge all words in B into one master word for which all words in B are subsets of it. In Example 5, for instance, when B = ["ec","oc","ceo"], the master word would be "ceo".
To accomplish this, we'll need to use some kind of frequency map. Since we're dealing with characters, we can use an arraymap of length 26 which should be faster than using a regular map structure. We'll need to have two such arraymaps: one will hold the accumulated data (Bfreq) and the other (check) will be used to temporarily store each word in B before checking it against Bfreq.
Rather than creating a new array for each word, we just need to make sure to reset check to all 0's before the next word.
While we check through the words in B, we should also keep track of how many characters are currently stored in Bfreq (cmax). If cmax goes above 10, then it won't be possible for any word in A to match it due to the constraints upon A.length, so we should return an empty array.
Once we have our master word information stored in Bfreq, we can iterate through the words in A and compare them to Bfreq in a similar manner. First, however, we can easily skip any word that isn't as long as cmax. If we get through the entire word without triggering an early break, we can add the word to our answer array (ans).
Once we're all done iterating through A, we can return ans.
Implementation:
Python here is generally much slower with an arraymap, but can use a normal dict and count() to speed things up a bit.
There's also an example of Python using Counter() and its easy comparisons for some short code, though a slower time.
Java should convert the Strings to char[] before iterating through.
Javascript Code:
(Jump to: Problem Description || Solution Idea)
var wordSubsets = function(A, B) {
let Bfreq = new Int8Array(26), cmax = 0,
check = new Int8Array(26), ans = []
for (let i = 0; i < B.length; i++, check.fill()) {
let word = B[i]
for (let j = 0; j < word.length; j++)
check[word.charCodeAt(j) - 97]++
for (let j = 0; j < 26; j++) {
let diff = check[j] - Bfreq[j]
if (diff > 0) cmax += diff, Bfreq[j] += diff
if (cmax > 10) return []
}
}
for (let i = 0; i < A.length; i++, check.fill()) {
let word = A[i], j
if (word.length < cmax) continue
for (j = 0; j < word.length; j++)
check[word.charCodeAt(j) - 97]++
for (j = 0; j < 26; j++)
if (check[j] < Bfreq[j]) break
if (j === 26) ans.push(word)
}
return ans
};
Python Code:
(Jump to: Problem Description || Solution Idea)
class Solution:
def wordSubsets(self, A: List[str], B: List[str]) -> List[str]:
Bfreq, ans, cmax = {}, [], 0
for word in B:
for char in word:
count = word.count(char)
if char in Bfreq:
diff = count - Bfreq[char]
if diff > 0:
Bfreq[char] = count
cmax += diff
else:
Bfreq[char] = count
cmax += count
if cmax > 10: return ans
print(Bfreq)
for word in A:
if len(word) < cmax: continue
for char in Bfreq:
if word.count(char) < Bfreq[char]: break
else: ans.append(word)
return ans
Python Code w/ Counter:
(Jump to: Problem Description || Solution Idea)
class Solution:
def wordSubsets(self, A: List[str], B: List[str]) -> List[str]:
Bfreq = Counter()
for word in B: Bfreq |= Counter(word)
if sum(Bfreq.values()) > 10: return []
return [word for word in A if not Bfreq - Counter(word)]
Java Code:
(Jump to: Problem Description || Solution Idea)
class Solution {
public List<String> wordSubsets(String[] A, String[] B) {
int[] Bfreq = new int[26], check = new int[26];
int cmax = 0;
List<String> ans = new ArrayList<>();
for (int i = 0; i < B.length; i++, Arrays.fill(check, 0)) {
for (char c : B[i].toCharArray())
check[c - 'a']++;
for (int j = 0; j < 26; j++) {
int diff = check[j] - Bfreq[j];
if (diff > 0) {
cmax += diff;
Bfreq[j] += diff;
}
}
if (cmax > 10) return ans;
}
for (int i = 0; i < A.length; i++, Arrays.fill(check, 0)) {
int j;
for (char c : A[i].toCharArray())
check[c - 'a']++;
for (j = 0; j < 26; j++)
if (check[j] < Bfreq[j]) break;
if (j == 26) ans.add(A[i]);
}
return ans;
}
}
C++ Code:
(Jump to: Problem Description || Solution Idea)
class Solution {
public:
vector<string> wordSubsets(vector<string>& A, vector<string>& B) {
int Bfreq[26] = {0}, check[26] = {0};
int cmax = 0;
vector<string> ans;
for (string word : B) {
for (char c : word) check[c - 'a']++;
for (int j = 0; j < 26; j++) {
int diff = check[j] - Bfreq[j];
if (diff > 0) cmax += diff, Bfreq[j] += diff;
}
if (cmax > 10) return ans;
fill(check, check+26, 0);
}
for (string word : A) {
int j;
for (char c : word) check[c - 'a']++;
for (j = 0; j < 26; j++)
if (check[j] < Bfreq[j]) break;
if (j == 26) ans.push_back(word);
fill(check, check+26, 0);
}
return ans;
}
};
Top comments (3)
Unfortunately, that doesn't account for multiple occurrences of the same letter.
The instructions state:
The code you posted would find "wrr" as a subset of "world", because the set will only compare a single "r" instead of requiring 2 for a match.
It's still a remarkable piece of code, however! One of the things that impresses (and intimidates) me the most about Python is its ability to allow for truly concise code.
This is very helpful, thank you for sharing. I sometimes have difficulties with writing the perfect code without any bugs, because I don’t pay attention on writing all the words correctly or even a set of words. And when I check, I kind of skip the content, but rather pay attention to the structure. This is definitely something I need to work on. But yeah, your information is really helpful in this case. I try to improve my English vocabulary skills on a daily basis and hopefully I will get better over time. I actually find a nice website lettersolver.com/4-letter-words-en... that helps me with that. It generates a wide range of words adjusted to your search query.
word game always remain interesting game for the last two decays. It help you to make your mind more stronger and healthy.