Find Anagrams in a List / by Zach Mayer

The first interview question in the new blog series is pretty simple. Given an array of strings, find which strings are anagrams of each other and group them together. The result will be a two-dimensional array.

Special Rules:

  1. Respect case. 'Abc' is different from 'abc.'
  2. Remove duplicates. Return values should all be unique.

So, given the array:

['xxy', 'cab', 'bac', 'cab']

We should return:

[['xxy'], ['cab', 'bac']]

Right off the bat, there are two things that should call out to us while we build the findAnagrams function.

First, we want unique values from the input array. That means we can convert the input array to a Set, saving a loop and check on the final product.

Second, since we respect case and we know we're looking for anagrams, we can compare strings that are sorted by their characters without any extra normalization. Since we're sorting the strings and looking for matches from those sorts, the most obvious data structure to use is a Map.

Basically, we'll create a set from our input array, then iterate over the set, sorting each entry's characters and adding the sorted strings as keys to map (where no such key exists) and the original strings as values in the map. The map will then look something like...

{'xxy': ['xxy'], 'abc': ['cab', 'bac']}

...and we can just convert the map's values to an array.

Easy-peasy, here's the code :)

Now my Big-O is a little rusty, but assuming our sort function is O-Nlog(N), our findAnagrams function should be O-M(NlogN) where M is the length of our input array and N is the length of the longest string in that array. That should be about as fast as this algorithm can get. Space complexity is O-3M in the worst case because we create and append to new instances of Set and Map.  That could probably be reduced a bit at the expense of speed and clarity using a complex reduce function, but I'm happy with this solution.