DEV Community

Cover image for LeetCode Meditations: Group Anagrams
Eda
Eda

Posted on • Originally published at rivea0.github.io

LeetCode Meditations: Group Anagrams

Let's start with the description for Group Anagrams:

Given an array of strings strs, group the anagrams together. You can return the answer in any order.

An Anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.

For example:

groupAnagrams(['eat', 'tea', 'tan', 'ate', 'nat', 'bat']);
// -> [ ['bat'], ['nat', 'tan'], ['ate', 'eat', 'tea'] ]

groupAnagrams(['']);
// -> [ [''] ]

groupAnagrams(['a']);
// -> [ ['a'] ]
Enter fullscreen mode Exit fullscreen mode

And, as one of the constraints says, each of the strings will consist of lowercase English letters.


One thing to remember from the previous Valid Anagram problem is that we can easily check if two strings are anagrams of each other by comparing their sorted versions.

So, we can use a hash table to store the sorted words. In that case, all words that are anagrams of each other will be grouped together in an array, and share the same key:

function groupAnagrams(strs: string[]): string[][] {
  let words: { [word: string]: string[] } = {};

  for (let s of strs) {
    let sortedWord = [...s].sort().join('');
    (sortedWord in words) ? words[sortedWord].push(s) : words[sortedWord] = [s];
  }

  return Object.values(words);
};
Enter fullscreen mode Exit fullscreen mode

Time and space complexity

Since we're using the sorting operation, time complexity will be O(n log n)O(n \ log \ n) as it is the best we can do with sorting. But we're doing the sorting operation for each element in strs, so the loop itself has an O(n)O(n) time complexity.
To not confuse ourselves, we'll use another variable, mm , to denote the length of strs, that is, the number of times we'll iterate for each element. Overall, the time complexity will be O(mn log n)O(m \cdot n \ log \ n) .

We can say that space complexity is O(mn)O(m \cdot n) where mm is the length of strs and nn is the length of the longest string, because in the worst case where all strings are anagrams of each other, the value array can contain mm strings, and the key's length will be nn , so, words will grow proportionally.

Using Python

To reflect the ternary operation in the TypeScript version:

(sortedWord in words) ? words[sortedWord].push(s) : words[sortedWord] = [s];

 
We could write it in Python like:

words[sorted_word].append(s) if sorted_word in words else words[sorted_word] = [s]

 
But since it's a bit clunky, we can use setdefault() where we're setting the default value of words[sorted_word] to [].

It might look like this in Python:

class Solution:
    def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
        words = {}

        for s in strs:
            sorted_word = ''.join(sorted(s))
            words.setdefault(sorted_word, []).append(s)

        return words.values()
Enter fullscreen mode Exit fullscreen mode

Now, after taking a deep breath, we can look at NeetCode's solution.


And, voilà, a more efficient solution exists:

from collections import defaultdict

class Solution:
    def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
        res = defaultdict(list)

        for s in strs:
            count = [0] * 26 # a ... z

            for c in s:
                count[ord(c) - ord('a')] += 1

            res[tuple(count)].append(s)

        return res.values()
Enter fullscreen mode Exit fullscreen mode

Here, the constraint we mentioned in the beginning gives some perspective to this solution:
For each string, we can count the number of characters from 'a' to 'z', because the strings will be just lowercase English letters.

We can still use a hash table, and map each of the 26 letters to an index, and increase the value at that index every time we see that letter. The keys will be these arrays of length 26, and the values will be the arrays of strings themselves.

For example, if we have these strings:

['eat', 'tea', 'tan']
Enter fullscreen mode Exit fullscreen mode

Then, res will look like this:

{
    (1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0): ['tan'],
    (1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0): ['eat', 'tea']
}
Enter fullscreen mode Exit fullscreen mode
Note
count is converted into a tuple because lists cannot be keys as they are mutable. You can read more on this.

Also note that in the code, we use the ASCII numbers of the characters to get their index, for example, the count of 'a' will be at the 0th index, so it's basic offset arithmetic.

For instance, the ASCII number of 'z' is 122, and 'a' is 97, when you get the difference, it will be 25, meaning that the 'z' will be at the end of the array, that is, the 25th index.


After taking another deep breath, let's try converting it into TypeScript:

function groupAnagrams(strs: string[]): string[][] {
  let result: { [count: string]: string[] } = {};

  for (let s of strs) {
    let count = new Array(26).fill(0);

    for (let c of s) {
      count[c.charCodeAt(0) - 'a'.charCodeAt(0)]++;
    }

    const key = count.toString();

    !(key in result) ? result[key] = [s] : result[key].push(s);
  }

  return Object.values(result);
};
Enter fullscreen mode Exit fullscreen mode
Note
While we couldn't use lists as keys in the Python version, we'll just convert the count array into string with toString() in this TypeScript version and use it as key.

Time and space complexity

The time complexity will be O(mn)O(m \cdot n) where mm is the total number of strings and the nn is the length of a string.

For the space complexity, the dominant item will be the res variable (the count array won't matter much because it won't grow with the input size, it is constant, or O(1)O(1) ).
In the case where each key is unique, the space complexity will be O(mn)O(m \cdot n) where mm is the total number of strings, and nn is the length of the longest string.


And, that's the end of Group Arrays. The next one will be Top K Frequent Elements, until then, happy coding.

Top comments (0)