Mage AI

Posted on Nov 30, 2021

Solving Interview Problems with Deep Learning

#tensorflow #machinelearning #programminginterviews #technicalinterview

![Image description](https://dev-to-uploads.s3.amazonaws.com/uploads/articles/qksna9yv4dtg1hw0dzz9.jpg)_Photo credit: Godzilla vs. Kong_

It’s been a while since I’ve practiced programming interview questions, and I worry that my skills are lacking. It’s always important to work through some every now and then to stay sharp so here we go:

Let’s start with Fizz Buzz:

Write a program that prints the numbers from 1 to 100. But for multiples of 3print “Fizz” instead of the number and for the multiples of 5 print “Buzz”. For numbers which are multiples of both 3 and 5 print “FizzBuzz”.

This solution was actually inspired by Joel Grus. We can represent numbers with binary encoding:

import tensorflow as tf
import numpy as np
INPUT_SIZE = 10 # We can encode up to 2^15 numbers with binary encoding.
HIDDEN_SIZE = 100 # Hidden layer of size 100
OUTPUT_SIZE = 4 # One hot encoding for the 4 possible outputs: "Fizz" "Buzz" "FizzBuzz", ""
NUM_EPOCHS = 10000
def binary_encode(number):
    data = np.zeros(INPUT_SIZE)
    for bitshift in range(INPUT_SIZE):
        data[bitshift] = number >> bitshift & 1 # Get the bit at position bitshift
    return data
def fizz_buzz_encode(i):
    if   i % 15 == 0: return np.array([0, 0, 0, 1])
    elif i % 5  == 0: return np.array([0, 0, 1, 0])
    elif i % 3  == 0: return np.array([0, 1, 0, 0])
    else:             return np.array([1, 0, 0, 0])
def fizz_buzz_decode(encoding):
    max_idx = np.argmax(encoding)
    if max_idx == 0:
        return ""
    if max_idx == 1:
        return "fizz"
    if max_idx == 2:
        return "buzz"
    if max_idx == 3:
        return "fizzbuzz"

Now, we can generate some training data (we will generate training data from 101 to 1024) since our actual problem will solve fizzbuzz for 1–100.

X_train = np.array([binary_encode(i) for i in range(101, 2 ** INPUT_SIZE)])
y_train = np.array([fizz_buzz_encode(i) for i in range(101, 2 ** INPUT_SIZE)])

And now, let’s define our multilayer perceptron model that will actually perform the bulk of the work. First, we define the inputs to the model:

X = tf.placeholder('float', [None, INPUT_SIZE])
Y = tf.placeholder('float', [None, OUTPUT_SIZE])

Next, let’s define the weights and biases that we will learn via backpropagation:

w1 = tf.get_variable("w1", [INPUT_SIZE, HIDDEN_SIZE], initializer=tf.random_normal_initializer())
b1 = tf.get_variable("b1", [HIDDEN_SIZE])
w2 = tf.get_variable("w2", [HIDDEN_SIZE, OUTPUT_SIZE], initializer=tf.random_normal_initializer())
b2 = tf.get_variable("b2", [OUTPUT_SIZE])

And now, let’s feed our input through the model:

z1 = tf.add(tf.matmul(X, w1), b1)
a1 = tf.nn.relu(z1)
z2 = tf.add(tf.matmul(a1, w2), b2)

Then, let’s compute the loss and tell tensorflow to minimize that loss:

cost = tf.nn.softmax_cross_entropy_with_logits(logits=z2, labels=Y)
optimizer = tf.train.GradientDescentOptimizer(0.001).minimize(cost)

Now, let’s actually feed our real training data into that model:

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for i in range(NUM_EPOCHS):
        c, o = sess.run([cost, optimizer], feed_dict={X: X_train, Y: y_train})
        print(np.sum(c))
    X_test = np.array([binary_encode(i) for i in range(1, 100)])
    y_test = np.array([fizz_buzz_encode(i) for i in range(1, 100)])
    pred = sess.run(z2, feed_dict={X:X_test, Y:y_test})

And print out the results:

def real_fizz_buzz(i):
    txt = ""
    if i % 3 == 0:
        txt += "fizz"
    if i % 5 == 0:
        txt += "buzz"
    return txt

for i in range(len(X_test)):
    text = fizz_buzz_decode(pred[i])
    true_text = real_fizz_buzz(i + 1)
    print("{}: {} ({})".format(i + 1, text, "✔" if text == true_text else "x"))

Results:

1:  (✔)
2:  (✔)
3:  (x)
...
97:  (✔)
98:  (✔)
99: fizz (✔)
Total Accuracy: 89.89%

For all that work, we achieved an embarrassingly low 90% accuracy. I was going to try to fix this but then quickly lost interest. Let’s move onto something a bit more interesting:

Write a program that counts the number of unique characters in a string.

This sounds like a problem that can be solved with an LSTM. Let’s generate some data.

import tensorflow as tf
import random
import numpy as np
CHARS = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
STRING_LENGTH = 12
num_examples = 10000
# Args:
#   n: Number of examples to generate.
# Returns:
#   strings_v: numpy array of the form (n, STRING_LENGTH, len(CHARS)). One hot encoding of sequences of text
#   strings: Array of actual generated random text:
#   uniques_v: numpy array of the form (n, len(CHARS)). One hot encoding of number of unique characters
#   uniques: numpy array of length n, number of unique characters for each sequence.
def generate_data(n=num_examples):
    chars_to_idx = { c: i for i, c in enumerate(CHARS)}

    strings_v = np.zeros([n, STRING_LENGTH, len(CHARS)])
    strings = [''] * n
    uniques = np.zeros(n)
    uniques_v = np.zeros([n, len(CHARS)])
    for x in range(n):
        for y in range(STRING_LENGTH):
            random.shuffle(CHARS)
            char = CHARS[0]

            strings_v[x][y][chars_to_idx[char]] = 1
            strings[x] += char

        uniques[x] = len(set(strings[x]))
        uniques_v[x][len(set(strings[x])) - 1] = 1

    return strings_v, strings, uniques_v, uniques

Next let’s create our LSTM model:

HIDDEN_LAYERS = 64
X = tf.placeholder("float", [None, STRING_LENGTH, len(CHARS)])
y = tf.placeholder("float", [None, len(CHARS)])
X_seq = tf.unstack(X, STRING_LENGTH, 1)
lstm_cell = tf.contrib.rnn.BasicLSTMCell(HIDDEN_LAYERS)
#sequence of 12 chars to output of 7
outputs, states = tf.contrib.rnn.static_rnn(lstm_cell, X_seq, dtype=tf.float32)
final_output = outputs[-1]
weights = tf.get_variable("weights", [HIDDEN_LAYERS, len(CHARS)], initializer=tf.random_normal_initializer())
biases = tf.get_variable("biases", [len(CHARS)], initializer=tf.random_normal_initializer())
prediction = tf.add(tf.matmul(final_output, weights), biases)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=y))
optimizer = tf.train.AdamOptimizer(1e-2)
train_op = optimizer.minimize(cost)

Now, let’s go ahead and run our model:

init = tf.global_variables_initializer()
costs = []
EPOCHS = 300
with tf.Session() as sess:
    sess.run(init)
    for i in range(EPOCHS):
        _, c, _ = sess.run([train_op, cost, prediction], feed_dict={X:X_train, y: y_train})
        costs.append(c)
        if i % 10 == 0:
            print('cost: {}, epoch: {}'.format(c, i))

            X_test, X_test_strings, y_test, y_test_strings = generate_data(100)
            p = sess.run(prediction, feed_dict={X: X_test, y: y_test})
            prediction_idxs = np.argmax(p, axis=1)
            prediction_vals = prediction_idxs + 1
correct = 0.0
            for i in range(len(y_test_strings)):
                string = X_test_strings[i]
                actual_val = y_test_strings[i]
                predicted_val = prediction_vals[i]
                # Print the first 5 examples
                if i < 5:
                    print('string: {}, pred: {}, actual: {}'.format(string, predicted_val, actual_val))

                if predicted_val == actual_val:
                    correct += 1
            print("{}% accuracy\n\n".format(correct * 100 / len(y_test_strings)))

cost: 0.0496655665338, epoch: 280
string: eeaccbdeagac, pred: 6, actual: 6.0
string: gefaddbbcfac, pred: 7, actual: 7.0
string: acedcbgdcagf, pred: 7, actual: 7.0
string: aadebcdacefg, pred: 7, actual: 7.0
string: abeeaebcbbag, pred: 5, actual: 5.0
100% accuracycost: 0.0413268692791, epoch: 290
string: ebbfbfaebede, pred: 5, actual: 5.0
string: fgaccdbabedg, pred: 7, actual: 7.0
string: dgdcbaefcdad, pred: 7, actual: 7.0
string: ffagdfedccad, pred: 6, actual: 6.0
string: gfggfgbebcdb, pred: 6, actual: 6.0
99% accuracy

Hopefully the interviewer won’t be disappointed that we cannot solve this problem for any string that’s greater than MAX_LENGTH, but overall, it achieved a 99% accuracy. I didn’t deal with variable length inputs in this case, we could’ve easily done that by adding padding.

Now, instead of just printing out the number of unique characters, print out the actual unique characters in the order they appear.

Ex: “acabdb” => “acbd”

Oof. Thats a much tougher problem. In this case, the value we need to return is a sequence of characters instead of a single character or number so we will need some form of sequence to sequence model in order to learn this relationship.

import numpy as np
import tensorflow as tf
from tensorflow.contrib import rnn
import random
#"abc" => "abc"
#"aabbac" => "abc"
#"abacd" => "abcd"
MAX_LENGTH = 6 # Max length of 6 
chars = ["a", "b", "c", "d", "e", "f"]
all_chars = chars + [' '] # Space for padding
NUM_EXAMPLES = 50000
# Args:
#   n: number of examples to generate
# Returns:
#   strings: list of strings that may contain duplicates
#   solutions: strings without duplicates
#   strings_v: One hot encoding of strings with duplicates (without padding)
#   solutions_v: One hot encoding of solutions (with padding)
def generate_data(n=NUM_EXAMPLES):
    all_chars_to_idx = { c:i for i, c in enumerate(all_chars) }
    strings_v = np.zeros((NUM_EXAMPLES, MAX_LENGTH, len(all_chars)))
    solutions_v = np.zeros((NUM_EXAMPLES, MAX_LENGTH, len(all_chars)))

    strings = [''] * NUM_EXAMPLES
    solutions = [''] * NUM_EXAMPLES

    for i in range(NUM_EXAMPLES):
        for l in range(MAX_LENGTH):
            char = random.choice(chars) # only sample from valid characters
            strings[i] += char
            if char not in solutions[i]:
                solutions[i] += char

        # Pad solutions strings
        num_missing = MAX_LENGTH - len(solutions[i])
        solutions[i] += ' ' * num_missing

    for x in range(len(strings)):
        for y in range(MAX_LENGTH):
            string_char = strings[x][y]
            strings_v[x][y][all_chars_to_idx[string_char]] = 1

            solution_char = solutions[x][y]
            solutions_v[x][y][all_chars_to_idx[solution_char]] = 1

    return strings, solutions, strings_v, solutions_v

Again, we can one hot encode the sequences, but this time, we will add some padding since the length of the output string is variable. Instead of trying to return a variable length sequence, we will just return a sequence that is equal to the length of the input, and pad the output with spaces.

For example, “abdab” would map to a string of the same length, but with spaces as padding: “abd “.

Let’s create a test and training split.

strings, solutions, strings_v, solutions_v = generate_data()
split_at = len(strings) - (len(strings) // 10)
strings_train = strings[:split_at]
solutions_train = solutions[:split_at]
X_train = strings_v[:split_at]
y_train = solutions_v[:split_at]
strings_test = strings[split_at:]
solutions_test = solutions[split_at:]
X_test = strings_v[split_at:]
y_test = solutions_v[split_at:]

Now we can set up our model. Let’s start with the encoder:

encoded_input = tf.placeholder(tf.float32, shape=(None, MAX_LENGTH, len(all_chars)))
decoded_input = tf.placeholder(tf.float32, shape=(None, MAX_LENGTH, len(all_chars)))
with tf.name_scope("basic_rnn_seq2seq") as scope:
    encoded_sequence = tf.unstack(encoded_input, MAX_LENGTH, 1)
    encoder_cell = rnn.BasicLSTMCell(128, forget_bias=1.0)
    encoded_outputs, states = rnn.static_rnn(encoder_cell, encoded_sequence, dtype=tf.float32)

And now, the decoder:

with tf.name_scope("lstm_decoder") as scope:
    decoded_sequence = tf.unstack(decoded_input, MAX_LENGTH, 1)
    decoder_cell = rnn.BasicLSTMCell(128, reuse=True)
    decoded_outputs, _ = rnn.static_rnn(decoder_cell, decoded_sequence, initial_state=states, dtype=tf.float32)

Now, we can compute the predictions by multiplying the decoder’s hidden layer output with a fully connected layer:

with tf.name_scope("fully_connected") as scope:
    weights = tf.get_variable('weights', (128, len(all_chars)), initializer=tf.random_normal_initializer())
    biases = tf.get_variable('biases', (len(all_chars)), initializer=tf.random_normal_initializer())

    predictions = []
    encoded_sequence = tf.unstack(decoded_input, MAX_LENGTH, 1)
for output in decoded_outputs:
        prediction = tf.add(tf.matmul(output, weights), biases)
        predictions.append(prediction)
concatenated_outputs = tf.stack(predictions, 0)
    concatenated_outputs = tf.transpose(concatenated_outputs, perm=[1, 0, 2])
    concatenated_inputs = tf.concat(decoded_input, 0)

Now, we can compute the loss:

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=concatenated_outputs, labels=concatenated_inputs))
# FC Layer
optimizer = tf.train.AdamOptimizer(1e-2)
train_op = optimizer.minimize(cost)

And now let’s run our model:

def decode_guess(one_hot):
    return ''.join([all_chars[m] for m in np.argmax(one_hot, axis=1)])
init = tf.global_variables_initializer()
costs = []
with tf.Session() as sess:
    sess.run(init)
    for i in range(30):
        e, r, c, t, c_out, c_in = sess.run([encoded_outputs, predictions, cost, train_op, concatenated_outputs, concatenated_inputs], feed_dict={encoded_input: X_train, decoded_input: y_train})
        costs.append(c)
        if i % 5 == 0:
            print('training cost: {}, epoch: {}'.format(c, i))

            results = sess.run(predictions, feed_dict={encoded_input: X_test, decoded_input: y_test})
            guesses = np.array(results).transpose(1, 0, 2)
            for i in range(5):
                string = strings_test[i]
                solution = solutions_test[i]
                guess_decoded = decode_guess(guesses[i])
                print("{}: {} - {}".format(string, solution, guess_decoded))

            correct = 0.0
            for i in range(len(strings_test)):
                string = strings_test[i]
                solution = solutions_test[i]
                guess_decoded = decode_guess(guesses[i])
                if i < 5:
                    print("input: {}, solution: {}, prediction: {}".format(string, solution, guess_decoded))
if solution == guess_decoded:
                    correct += 1

            print("{} % accuracy".format(correct / len(strings_test) * 100))

And here are the results:

training cost: 2.07600975037, epoch: 0
cebbdf: cebdf  -       
dffcbc: dfcb   -       
aadeab: adeb   -       
faceec: face   -       
bfaeec: bfaec  -       
0.0 % accuracy...training cost: 0.219934388995, epoch: 25
cebbdf: cebdf  - cebdf 
dffcbc: dfcb   - dfcb  
aadeab: adeb   - adeb  
faceec: face   - face  
bfaeec: bfaec  - bfaec 
100.0 % accuracy

It looks like by 25 epochs, our model has a fairly good understanding of how to solve this problem.