04-1 : Group Anagrams

Arrays / Hashing

Overview

I walk you through how to solve Leetcode problem 49. Group Anagrams. There are a couple of approaches to solving this problem. I will highlight both approaches as well as the time and space complexity trade-offs between each approach.

Approach 1: Sort + HashMap

def groupAnagrams(strs):
    anagramGroups = {}
    for word in strs:
        sortedWord = ''.join(sorted(word))
        if sortedWord in anagramGroups:
            anagramGroups[sortedWord].append(word)
        else:
            anagramGroups[sortedWord] = [word]
    return anagramGroups.values()

In this first approach, we rely on sorting the words. The premise of this solution is that if we sort each word, all the anagrams of a particular word will follow the same ordering when sorted. We can use a HashMap to map the sorted version of an anagram (as the key) to the list of all words that are anagrams of the particular key.

As we go through each word, we store the sorted version of that word on the variable sortedWord. Then, we check to see if the sorted version aka sortedWord is already in our HashMap, anagramGroups. If the sortedWord is already a key, then we are going to append the current word to the value in anagramGroups at the key of the sortedWord. If sortedWord is not in the anagramGroups HashMap, then we will add it as a new key and associate it with the value of the current word passed in a list literal, []. Finally, after iterating through all words, we can return a list of our anagramGroups HashMap's values, using the .values() method.

Time Complexity: O(M * N Log N)

Where M is the number of words in or the length of our input array and where the "N" preceding "Log N" is the average word length of the words in strs. The N Log N here is a result of the sorted() method being called on each word. This solution will always be optimum as log(n) will always be smaller than 26 in all of the cases

Space Complexity: O(N)

This is because if every word was distinct, we would end up creating a HashMap with N distinct words in the worst case.