DEV Community

TK
TK

Posted on • Originally published at leandrotk.github.io

Algorithms Problem Solving: Jewels and Stones

This post is part of the Algorithms Problem Solving series. And it was originally published at TK's blog.

Problem description

This is the Jewels and Stones problem. The description looks like this:

You're given strings J representing the types of stones that are jewels, and S representing the stones you have. Each character in S is a type of stone you have. You want to know how many of the stones you have are also jewels.

The letters in J are guaranteed distinct, and all characters in J and S are letters. Letters are case sensitive, so "a" is considered a different type of stone from "A".

Examples

# input: J = "aA" | S = "aAAbbbb"
# output: 3

# input: J = "z", S = "ZZ"
# output: 0
Enter fullscreen mode Exit fullscreen mode

Solution

At first, I tried to understand some corner cases. For example, what should I return if the J or the S is an empty string?

As my first solution, the brute force solution, I needed to look through the string. So I didn't need to care about the empty strings. For empty strings, it doesn't loop, it just returns the default counter, in this case, 0.

I just needed to loop through the J and for each character in the J string, I need to compare to each character of S. If they match, I increment the counter.

After looping through each character, just return the final counter.

def num_jewels_in_stones(J, S):
    jewels = 0

    for j in J:
        for s in S:
            if s == j:
                jewels += 1

    return jewels

print(num_jewels_in_stones("aA", "aAAbbbb")) # 3
print(num_jewels_in_stones("z", "ZZ")) # 0
Enter fullscreen mode Exit fullscreen mode

This is a O(N^2) solution in terms of time complexity. Or more precisaly: O(len(J) * len(S)). For the space complexity, it is O(1) as we just store the value in a counter. If len(J) or len(S) increase, the used space keeps constant.

Just to iterate this solution, we can make it O(N) in terms of time complexity by using a hash table to store all characters as key and the counter as value.

def num_jewels_in_stones_opt(J, S):
    chars_counter = {}
    counter = 0

    for char in J:
        if char in chars_counter:
            chars_counter[char] += 1
        else:
            chars_counter[char] = 1

    for char in S:
        if char in chars_counter:
            counter += chars_counter[char]

    return counter

print(num_jewels_in_stones_opt("aA", "aAAbbbb")) # 3
print(num_jewels_in_stones_opt("z", "ZZ")) # 0
Enter fullscreen mode Exit fullscreen mode

So, for each J's character. Verify if it is already in the chars_counter. If it is, just increment it. If not, initialize the counter.

Then, for each S's character, verify if this character is in the chars_counter. If it is, get it and add to the counter variable. If not, do nothing.

After this iteration, we need the final value of the counter. Just return it.

As we said before, it runs in O(N). Better than the first solution. But, for the worst-case scenario, the space complexity is O(N), worse than the first approach.


For more stories and posts about my journey learning & mastering software engineering, take a look at what I've been writing and documenting.

My Twitter & Github. ☺

Resources

Top comments (6)

Collapse
 
musale profile image
Musale Martin

Nice article. I wonder what the impact of using the collections.Counter package has on the second solution? So instead of initializing the chars_counter to a dict, you initialize it to the collections.Counter in-built for J's characters.

Collapse
 
arcticspacefox profile image
ArcticSpaceFox

Cool series, but I wanna solve the problem before reading your solution. Hope to see more coming but no stress 👍

Collapse
 
teekay profile image
TK • Edited

Thanks! I have 10 more posts to publish. I just need to organize them and I'll definitely publish them here :)

Collapse
 
arcticspacefox profile image
ArcticSpaceFox

Hmm weird, during testing of both algorithms they take the same amount of time, do you see why that is? I test them on 10 Million random chars of aAbB

Collapse
 
teekay profile image
TK

I think it is because you are running with short length strings (aAbB). The first solution will be very slow for the worst-case scenario, where the string has a long length.

Imagine this example: strings with 10,000 chars.

The first solution has a nested for. The combination of each character would be 10,000 * 10,000 = 100,000,000.

The second solution has two for, but not nested. It would be 10,000 + 10,000 = 20,000 for the worst-case scenario.

The curve increases differently for each solution. Did it make sense to you?

Just for curiosity: how are you testing the amount of time spent by each algorithm?

Collapse
 
arcticspacefox profile image
ArcticSpaceFox

I use timeit and measure time before and after. Thank you I will try to modify my file which is tested on