DEV Community

Je We
Je We

Posted on

Character Frequency

Alt Text

Hello everyone! Pictured above is how I looked when I went to the grocery store today. While we're all bored let's go through a toy problem! This one is called Character Frequency, and you might be able to guess from the name that we're going to count characters. What's more fun than counting? At summer camp we had an activity called Rock Counting which was always in high demand.

Specifically, Character Frequency is going to be a function that takes a string and spits out an array of tuples, one for each character, indicating character frequencies, e.g. ['a', 5]. These tuples should be arranged from highest frequency to lowest, and characters of equal frequency should be sorted alphabetically.

Let's dive right in. I chose to do this problem recursively because I love recursion to a fault. So, knowing I wanted to work recursively and that I'd be building an array, I started with this base case:

  if(!string) return []; 

If we think ahead, we know that we want our results sorted alphabetically after they're sorted by frequency. We don't know the frequency, but it's easy enough to sort the string alphabetically. We can use a neat little trick, splitting the string into an array so we can sort it alphabetically, and then rejoining it back into a string:

  string = string.split('').sort().join('');

Now, since I'm working recursively, I know I'm only going to care about one character at a time, so how about setting up a counter for that character?

  let frequency = 0;

Alright, let's count this sucker by iterating through the string and incrementing frequency where appropriate. And by "this sucker" I mean the first character of the string AKA string[0]:

  for(let char of string) {
    if(char === string[0]) frequency++;
  }

OK, we've got our full count, so let's store our tuple for use later:

  const tuple = [string[0], frequency];

It's safe to say it would be a waste of to count any character's frequency twice, so let's go ahead and remove all instances of the current character from the string. We can do this using the string replace method, and a RegExp for the character with a global flag (to replace all occurrences).

  string = string.replace(new RegExp(string[0], 'g'), '');

This is the way we'll whittle down the string for our recursive calls rather than the familiar pattern of string.slice(1). Now we just want to return ALL the tuples in one big happy collection, so we return the one we just built with all the rest that we accept on faith we'll get from our recursive call:

  return [tuple].concat(characterFrequency(string))

That's not quite the whole story, though. We want our tuples to be sorted by frequency. We know that the frequency of each character can be found at its tuple's 1st index, so we can use this to sort our return:

  return [tuple].concat(characterFrequency(string))
  .sort((tuple1, tuple2) => tuple1[1] > tuple2[1] ? -1 : 1);

This looks kind of confusing. I always found the array sort method a bit strange. The idea is that the callback takes two elements from the array to compare, and then based on whether the callback returns a negative or positive number, the first element or the second element gets sorted first, respectively. So, what the code above says is: if a character's frequency is greater than the next character's frequency, return -1 (sort it first), otherwise return 1 (sort it after).

Now this is going to work, take my word for it. However, I had an idea while writing this post for a little refactor. You might notice that when we're counting a character's frequency we waste a little time counting through the whole string. Since we've alphabetized the string, we know that as soon as a character doesn't equal the character we're counting, that there are no more occurrences of that character in the string. For example, if we alphabetize 'feed', it becomes 'eedf'. When we're iterating through 'eedf', once we hit 'd' we know we've counted all the e's. So what if we count character frequency using a while loop that runs until the next character ISN'T the current character? Like so:

  let i = 1;
  let frequency = 1;
  while(string[i] === string[0]) {
    frequency++;
    i++;
  }

Now, we can use whatever value i ends up at to determine where to start the slice for the recursive call:

  return [[string[0], frequency]].concat(characterFrequency(string.slice(i)))
  .sort((tuple1, tuple2) => tuple1[1] > tuple2[1] ? -1 : 1);

Alright, that might save us marginal amounts of time! Thanks for reading!

Top comments (0)