DEV Community

Cover image for Write a script to identify an anagram
Peter Kim Frank
Peter Kim Frank

Posted on

Write a script to identify an anagram

Inspired by this tweet from the always awesome Fermat's Library.

Challenge

In the language of your choice, write a script that will determine if two strings are an anagram for each other. You can use the approach outlined in the tweet, or — for bonus points — come up with your own solution 🤓.

Via Wikipedia:

An anagram is word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.

For our challenge, we'll use the strict definition — each letter must be used exactly once.

Expected Behavior:

isAnagram(stressed, desserts) // TRUE
isAnagram(happy, sad) // FALSE
Enter fullscreen mode Exit fullscreen mode

Have fun!

Oldest comments (38)

Collapse
 
peter profile image
Peter Kim Frank

Paging @heikodudzus , I feel like you always enjoy these challenges!

Collapse
 
heikodudzus profile image
Heiko Dudzus

Hey, thank you very much! I'm glad you sent me a notice. I will look for some spare time! :)

Collapse
 
aspittel profile image
Ali Spittel

In Python, I've always used the builtin Counter collection!

from collections import Counter

def isAnagram(word, check_word):
    return Counter(word) == Counter(check_word)
Collapse
 
evanoman profile image
Evan Oman • Edited

The common way, written in Scala:

scala> def isAnagram(s1: String, s2: String): Boolean = s1.sorted.equalsIgnoreCase(s2.sorted)
isAnagram: (s1: String, s2: String)Boolean                                                   

scala> isAnagram("stressed", "desserts")                                                     
res1: Boolean = true                                                                         

scala> isAnagram("sad", "happy")                                                             
res2: Boolean = false
Collapse
 
ben profile image
Ben Halpern • Edited

I'd think sorting the characters of the string and comparing them would be a fine solution.

In Ruby:

def is_anagram?(first_word, second_word)
    first_word.chars.sort.join == first_word.chars.sort.join
end

We might want to refactor into two functions

def is_anagram?(first_word, second_word)
    sort_alphabetically(first_word) == sort_alphabetically(second_word)
end

def sort_alphabetically(string)
    string.chars.sort.join
end

The ? in Ruby is just another character, but it's commonly used in naming comparison methods like this.

Collapse
 
ben profile image
Ben Halpern

Actually, the join step is unnecessary given that the arrays will match or not at that point, but it still seems like an ergonomic step if we are thinking in words. 🙂

Collapse
 
ben profile image
Ben Halpern • Edited

If we wanted the comparison to work for n words, we might take an approach like this:

    def are_anagrams?(words)
        words.map { |w| sort_alphabetically(w) }.uniq.size == 1
    end

This would check if the "unique" array of sorted words is a length of one, indicating they are all the same.

Thread Thread
 
ben profile image
Ben Halpern • Edited

In true Ruby style, we could monkey patch the Array class, like such:

class Array
  def are_anagrams?
    self.map { |w| sort_alphabetically(w) }.uniq.size == 1
  end
end

So then we could call:

["pots", "post", "stop"].are_anagrams? # true
["pots", "post", "ghost"].are_anagrams? # false
Thread Thread
 
ben profile image
Ben Halpern • Edited

Looking up methods on the Array class, I see that I can refactor the above to:

def are_anagrams?
    self.map { |w| sort_alphabetically(w) }.uniq.one?
end

This may be less clear to non-Rubyists, but that method is there for a reason for Ruby folks.

Thread Thread
 
mtbsickrider profile image
Enrique Jose Padilla

Following your train of thought was fun

Thread Thread
 
xelaflash profile image
xelaflash

Very cool
i'm learning ruby currently and the refactor steps are very clear and easy to follow.
At then end only one line method!
i would have written this in 20 lines probably and in a couple of hours :)

Cool stuff.

Collapse
 
xelaflash profile image
xelaflash

Hi Ben,

seems to have a small typo in your code (first part).
After "==" should be second_word, no?

Collapse
 
ayumukasuga profile image
Andrey Kostakov
def isAnagram(a, b):
    return a == b[::-1]
Collapse
 
evanoman profile image
Evan Oman

While this happens to work for stressed and desserts this only checks for words which are reverse copies of each other, not anagrams.

add and dad are anagrams (same letters) but this check does not catch it:

>>> def isAnagram(a, b):   
...     return a == b[::-1]
...                        
>>> isAnagram("dad", "add")
False                      
Collapse
 
evanoman profile image
Evan Oman • Edited

Full blown Java implementation of the Prime Product method mentioned in the tweet (Gist for full viewing):

package testing;

import java.math.BigInteger;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class IsAnagramDemo
{
    /* Get the primes we need */
    private static final List<BigInteger> alphaPrimes = primes()
            .limit(26)
            .collect(Collectors.toList());

    /* Int encoding of lowercase a in unicode */
    private static final int A_INT_ENCODING = 97;

    /* Int encoding of lowercase z in unicode */
    private static final int Z_INT_ENCODING = 122;

    /*
        Main method to demonstrate success
     */
    public static void main(String[] args) throws Exception
    {
        assert isAnagram("stressed", "desserts");

        assert !isAnagram("happy", "sad");
    }

    /**
     * Determines if two strings are anagrams using the prime product method
     *
     * @param s1 First input string
     * @param s2 Second input string
     * @return Boolean indicating anagram-ness
     * @throws IllegalArgumentException Thrown when at least one of the input strings is invalid
     */
    public static boolean isAnagram(String s1, String s2) throws IllegalArgumentException
    {
        return primeProduct(s1).equals(primeProduct(s2));
    }

    /**
     * Converts a string to the corresponding product of prime values
     *
     * @param s Input string
     * @return Big integer product of primes
     * @throws IllegalArgumentException Thrown when strong contains chars outside of A-z
     */
    private static BigInteger primeProduct(String s) throws IllegalArgumentException
    {
        /* Convert to lowercase char array */
        char[] chars = s.toLowerCase().toCharArray();

        BigInteger product = BigInteger.ONE;

        for (char c : chars)
        {
            /* Cast char to int */
            int cInt = (int)c;

            /* If the char is out of bounds we must throw an exception */
            if (cInt < A_INT_ENCODING || cInt > Z_INT_ENCODING)
            {
                throw new IllegalArgumentException("Character \"" + c + "\" not valid");
            }
            /* Otherwise we can do the prime lookup */
            else
            {
                /* Prime value corresponding to this char */
                BigInteger primeVal = alphaPrimes.get(cInt % A_INT_ENCODING);

                /* Add this prime to our product */
                product = product.multiply(primeVal);
            }
        }

        /* Return the product */
        return product;
    }

    /**
     * @return Infinite stream of primes
     */
    private static Stream<BigInteger> primes()
    {
        return Stream.iterate(BigInteger.valueOf(2L), n -> n.add(BigInteger
                .ONE)).filter(n -> n.isProbablePrime(10));
    }
}
Collapse
 
rpalo profile image
Ryan Palo

Not super practical maybe, but using binary XOR is a sneaky way of checking for exact pairs of things (or finding an odd one out!). :) No sorting, and it works with punctuation!

# Detects whether two words are anagrams or not
def is_anagram?(word1, word2)
  (word1 + word2).chars.reduce(0) { |acc, curr| acc ^ curr.ord } == 0
end

puts "stressed is an anagram of desserts: #{is_anagram?('stressed', 'desserts')}"
# => true
puts "happy is an anagram of sad: #{is_anagram?('happy', 'sad')}"
# => false
puts "banana? is an anagram of b?naana: #{is_anagram?('banana?', 'b?naana')}"
# => true
Collapse
 
ben profile image
Ben Halpern

Wouldn't "hello" and "hellooo" be a false positive here? I like the feel of this approach though.

Collapse
 
rpalo profile image
Ryan Palo

Oh, good point. This is probably better suited to a “find the one unique letter” type problem. I like the solution that uses ‘uniq’ the best so far, I think.

Collapse
 
danielcamargo profile image
Daniel Camargo

In javascript you can do something like this:

function isAnagram(s1, s2){
    return s1.split('').sort().toString() === s2.split('').sort().toString();
}

isAnagram('stressed', 'desserts');
// true
isAnagram('add', 'dad');
// true
isAnagram('happy', 'sad');
// false

Collapse
 
mkgiles profile image
Conor James Giles • Edited

Scheme R5RS using srfi-13:
First we define a LUT of the first 26 primes.

(define primes '(2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 101))

Next we define our function:

(define (isAnagram a b) (let ((mult (lambda (l) (apply * (map (lambda (x) (list-ref primes (- (char->integer x) 97))) (string->list (string-downcase l))))))) (= (mult a) (mult b))))

To break it down, our function takes two string arguments a and b, and applies the internal function 'mult' to both.
Taking a closer look at 'mult':

(mult (lambda (l) (apply * (map (lambda (x) (list-ref primes (- (char->integer x) 97))) (string->list (string-downcase l))))))

It's a lambda expression which takes a string 'l', downcases it, converts it to a list of chars, and maps an anonymous lambda function across that list, then multiplies the resulting list of integers.
Taking a closer look at the anonymous lambda:

lambda (x) (list-ref primes (- (char->integer x) 97))

It quite simply finds the prime number corresponding to the code-point of each character in the list, minus 97 (the code-point for 'a', we can do this because of the earlier downcasing), a number which corresponds to each letter's ordinal value in the alphabet if a corresponds with 0.

Collapse
 
ripsup profile image
Richard Orelup • Edited

My quick PHP one

<?php

function isAnagram($word1, $word2) {
  $word1Array = str_split($word1);
  $word2Array = str_split($word2);
  sort($word1Array);
  sort($word2Array);

  return $word1Array === $word2Array;
}

var_dump (isAnagram("test","etst")); //returns bool(true)
var_dump (isAnagram("test","ftst")); //returns bool(false)

?>


Collapse
 
nektro profile image
Meghan (she/her) • Edited
function isAnagram(a,b) {
    const vals = [a,b].map(x => x.split('').sort());
    return vals[0].every((v,i) => v === vals[1][i]);
}

Alternatively,

function isAnagram(a,b) {
    return new Set([a,b].map(x => x.split('').sort().join(''))).size === 1;
}
Collapse
 
kspeakman profile image
Kasey Speakman • Edited

Edit: Added sort-based solution for Elm.

import String
import List

sortLetters word =
  List.sort (String.toList word)

isAnagram word1 word2 =
  sortLetters word1 == sortLetters word2

I think the primes thing was actually harder than what I did. Here is my 15-line Elm solution where you could also reuse the breakdown of how many of which letters were used. I could have turned it into 1 long line with the 3 import statements, but I won't do that to you. :)

Read this code from the bottom up. I got used to writing it like this in F# because of its single-pass compiler. This is not a requirement of Elm, though.

import Dict exposing (Dict)
import String
import Char


updateCount countM =
  case countM of
    -- no previous count of letter, set to 1
    Nothing ->
      Just 1

    -- saw same letter again, increment
    Just i ->
      Just (i + 1)

updateAnalysis char dict =
  Dict.update (Char.toLower char) updateCount dict

analyzeWord word =
  String.foldr updateAnalysis Dict.empty word

isAnagram word1 word2 =
  -- built-in equality superpowers
  analyzeWord word1 == analyzeWord word2

Don't take my word for it.

Here is a full program with tests that you can run over at elm-lang.org/try. Paste the code below in the left window and hit the Compile button.

I included type annotations for clarity, but they are not necessary.

import Html exposing (Html, text, div, span, u)
import Html.Attributes exposing (style)
import Dict exposing (Dict)
import String
import Char


-- anagram functions

type alias WordAnalysis = Dict Char Int

updateCount : Maybe Int -> Maybe Int
updateCount countM =
  case countM of
    -- no previous count of letter, set to 1
    Nothing ->
      Just 1

    -- saw same letter again, increment
    Just i ->
      Just (i + 1)

updateAnalysis : Char -> WordAnalysis -> WordAnalysis
updateAnalysis char dict =
  Dict.update (Char.toLower char) updateCount dict

analyzeWord : String -> WordAnalysis
analyzeWord word =
  String.foldr updateAnalysis Dict.empty word

isAnagram : String -> String -> Bool
isAnagram word1 word2 =
  analyzeWord word1 == analyzeWord word2


-- test functions

anagramTests : List (String, String)
anagramTests =
  [ ("stressed", "desserts")
  , ("stressed", "deserts")
  , ("marching", "charming")
  , ("happy", "sad")
  ]

renderTest : (String, String) -> Html msg
renderTest (word1, word2) =
  div []
    [ u [] [ text word1 ]
    , text " and "
    , u [] [ text word2 ]
    , if isAnagram word1 word2 then
        span
          [ style [ ( "color", "green" ), ("font-weight", "bold") ] ]
          [ text " are anagrams" ]
      else
        span [] [ text " are not anagrams" ]
    ]

main =
  div [] ( List.map renderTest anagramTests )
Collapse
 
smartyalgo profile image
Michael Ho • Edited

You don't need sort or multiply or any fancy stuff

int[26] wordcount; //array containing encounter count of each character.

for each char:
    charnum <- ConvertCharToNum(char)
    wordcount[charnum]++

return wordcount

Check if the wordcount array of the two strings.

Computation
Does not require to sort, which is nlogn. This only requires a single pass to create wordcount array, thus n.
You can also optimize on aborting upon first index mismatch with this solution (You could also do this with the prime hashing, by aborting on first division yielding a decimal number).

Memory
The 26th prime is 97. Assume it's 100, 6 'z's would be 1006 = 1012. It would take 39 bits, or 5 bytes to store (1012 /log2). This is much less efficient to store a hash. If you want to be stricter, you could store only 1 wordcount, and have the second string count down instead of up.

Feature extension
If I want to accept numbers, the 36th prime is 367 (See Memory). With the array, it's just allocating another 10 index. It is also possible to add this feature without redeploying code, by replacing a specifically allocated array with a map of char to count

Collapse
 
misterysin profile image
MTD

Best answer here. The prime trick is nice but as you said it gets way too expensive very fast.

Collapse
 
heikodudzus profile image
Heiko Dudzus • Edited

This is good. I adapted it to Haskell, recording letter frequencies in Maps/dictionaries. The Map data structure is already an instance of the Eq typeclass, so it was a one-liner to add the equality check:

import qualified Data.Map.Strict as M

-- incrementor for Map entries (used by Map's alter function)

inc Nothing = Just 1
inc (Just x) = Just (x+1)

-- produces a Map containing frequencies of elements in a given list (or Foldable)
-- e.g. a Map of frequencies of characters in a given word

frequencies :: (Foldable t, Ord k, Num a) => t k -> M.Map k a
frequencies = foldr (M.alter inc) M.empty

-- equality is already defined for Maps :)

isAnagram word1 word2 = frequencies word1 == frequencies word2
Collapse
 
joshavg profile image
Josha von Gizycki

Always happy to provide Clojure answers ;)

(defn anagram? [w1 w2]
  (= (sort w1) (sort w2)))

And of course I snooped the algorithm from all the other answers here.

Collapse
 
heikodudzus profile image
Heiko Dudzus • Edited

I stick with mappings to prime numbers. Haskell, without considerations of performance:

import Data.Char(toLower)
import Data.List(lookup)

primes = sieve [2..]
  where sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p /= 0]

charToPrime c = lookup c $ zip ['a'..'z'] primes

eval word = product <$> sequence (map (charToPrime . toLower) word)

isAnagram word1 word2 = eval word1 == eval word2

lookup and charToPrime return a Maybe value, without catching the eye. Umlauts or digits result in a Nothing value for the character and in a product of 1 for the word. The handling of those cases could be improved.