Inspired by this tweet from the always awesome Fermat's Library.
Challenge
In the language of your choice, write a script that will determine if two strings are an anagram for each other. You can use the approach outlined in the tweet, or — for bonus points — come up with your own solution 🤓.
Via Wikipedia:
An anagram is word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.
For our challenge, we'll use the strict definition — each letter must be used exactly once.
Expected Behavior:
isAnagram(stressed, desserts) // TRUE
isAnagram(happy, sad) // FALSE
Have fun!
Oldest comments (38)
Paging @heikodudzus , I feel like you always enjoy these challenges!
Hey, thank you very much! I'm glad you sent me a notice. I will look for some spare time! :)
In Python, I've always used the builtin Counter collection!
The common way, written in Scala:
I'd think sorting the characters of the string and comparing them would be a fine solution.
In Ruby:
We might want to refactor into two functions
The
?
in Ruby is just another character, but it's commonly used in naming comparison methods like this.Actually, the
join
step is unnecessary given that the arrays will match or not at that point, but it still seems like an ergonomic step if we are thinking in words. 🙂If we wanted the comparison to work for
n
words, we might take an approach like this:This would check if the "unique" array of sorted words is a length of one, indicating they are all the same.
In true Ruby style, we could monkey patch the
Array
class, like such:So then we could call:
Looking up methods on the
Array
class, I see that I can refactor the above to:This may be less clear to non-Rubyists, but that method is there for a reason for Ruby folks.
Following your train of thought was fun
Very cool
i'm learning ruby currently and the refactor steps are very clear and easy to follow.
At then end only one line method!
i would have written this in 20 lines probably and in a couple of hours :)
Cool stuff.
Hi Ben,
seems to have a small typo in your code (first part).
After "==" should be second_word, no?
While this happens to work for
stressed
anddesserts
this only checks for words which are reverse copies of each other, not anagrams.add
anddad
are anagrams (same letters) but this check does not catch it:Full blown Java implementation of the Prime Product method mentioned in the tweet (Gist for full viewing):
Not super practical maybe, but using binary XOR is a sneaky way of checking for exact pairs of things (or finding an odd one out!). :) No sorting, and it works with punctuation!
Wouldn't "hello" and "hellooo" be a false positive here? I like the feel of this approach though.
Oh, good point. This is probably better suited to a “find the one unique letter” type problem. I like the solution that uses ‘uniq’ the best so far, I think.
In javascript you can do something like this:
Scheme R5RS using srfi-13:
First we define a LUT of the first 26 primes.
Next we define our function:
To break it down, our function takes two string arguments a and b, and applies the internal function 'mult' to both.
Taking a closer look at 'mult':
It's a lambda expression which takes a string 'l', downcases it, converts it to a list of chars, and maps an anonymous lambda function across that list, then multiplies the resulting list of integers.
Taking a closer look at the anonymous lambda:
It quite simply finds the prime number corresponding to the code-point of each character in the list, minus 97 (the code-point for 'a', we can do this because of the earlier downcasing), a number which corresponds to each letter's ordinal value in the alphabet if a corresponds with 0.
My quick PHP one
Alternatively,
Edit: Added sort-based solution for Elm.
I think the primes thing was actually harder than what I did. Here is my 15-line Elm solution where you could also reuse the breakdown of how many of which letters were used. I could have turned it into 1 long line with the 3 import statements, but I won't do that to you. :)
Don't take my word for it.
Here is a full program with tests that you can run over at elm-lang.org/try. Paste the code below in the left window and hit the Compile button.
You don't need sort or multiply or any fancy stuff
Check if the wordcount array of the two strings.
Computation
Does not require to sort, which is nlogn. This only requires a single pass to create wordcount array, thus n.
You can also optimize on aborting upon first index mismatch with this solution (You could also do this with the prime hashing, by aborting on first division yielding a decimal number).
Memory
The 26th prime is 97. Assume it's 100, 6 'z's would be 1006 = 1012. It would take 39 bits, or 5 bytes to store (1012 /log2). This is much less efficient to store a hash. If you want to be stricter, you could store only 1 wordcount, and have the second string count down instead of up.
Feature extension
If I want to accept numbers, the 36th prime is 367 (See Memory). With the array, it's just allocating another 10 index. It is also possible to add this feature without redeploying code, by replacing a specifically allocated array with a map of char to count
Best answer here. The prime trick is nice but as you said it gets way too expensive very fast.
This is good. I adapted it to Haskell, recording letter frequencies in Maps/dictionaries. The Map data structure is already an instance of the Eq typeclass, so it was a one-liner to add the equality check:
Always happy to provide Clojure answers ;)
And of course I snooped the algorithm from all the other answers here.
I stick with mappings to prime numbers. Haskell, without considerations of performance:
lookup and charToPrime return a Maybe value, without catching the eye. Umlauts or digits result in a Nothing value for the character and in a product of 1 for the word. The handling of those cases could be improved.