
Choon-Siang Lai

Posted on • Originally published at cslai.coolsilon.com

Learning a new language for more than 2 months (feat. Exercism)

Usually I take about a week to learn a new language so I can start doing some real work with it. After all, a programming language (at least the high-level and dynamic ones) is just assignment, calculation, branching, looping and reuse (and in certain cases, concurrency/parallelism; I am not going to dive deep into defining the difference here). Well, that was true until I started learning Rust, partly for my own leisure.

I still don't feel comfortable writing a complete Rust program.

Though I really like the language.

So in my current day job, some of my colleagues are writing mainly Golang code. I tried learning it for a while, but coming from Python, I hated it so much (and I don't even love Python that much to begin with, mainly due to the limitations of its lambda). Yes, it gets the job done; however, it just feels very ... raw, and overly simplified. Being a mostly functional person, the limitation on how I can express an idea is frustrating (as if the statically typed part is not bad enough).

This is not a post that objectively explains why Golang sucks; it is just not something I would code in. I ended up continuing to write Python because it makes more sense for the work I do.

Throughout the years, the team discussed Rust on and off. I finally gave it a more serious look early this year; however, it was only around June that I started to put in time and effort to go through the documentation. Meanwhile, for practice purposes, I found Exercism, which is good enough as a 4Clojure equivalent for other languages.

Then Exercism went through a redesign, not too long after I started putting effort into solving the puzzles.

It came back with a mentor system that I still don't fully appreciate (though I signed up to be a mentor myself).

Learning Rust is both fun and frustrating. As someone who programs mostly in high-level, dynamically typed languages, I technically do not really need it, so "safe and fast" is not a pitch I find particularly appealing. High-level languages are not really known for being fast, though they are mostly safe anyway.

So I continued solving more programming puzzles/katas/exercises, and started to hit walls. The thing I like about Exercism post-redesign is the separation between core and optional exercises. The core exercises are there for learners to pick up concepts one at a time, and they are a good enough mix of maths and programming problems (no need to be solving complicated mathematics problems WHILE learning a new language here).

Next I will list some notable exercises with notes.

Pythagorean triplet

I am not a mathematician, so to me the question screams loop, loop and more looping. The task is to find three numbers a, b and c where a² + b² = c² and a + b + c = 1000. So in my early iterations I submitted this

pub fn find() -> Option<u32> {
    let mut result: u32 = 0;

    for a in 0u32.. {
        if a < 998 {
            for b in (a + 1u32).. {
                if a + b < 999 {
                    for c in (b + 1u32).. {
                        if a.pow(2) + b.pow(2) == c.pow(2) && a + b + c == 1000 {
                            result = a * b * c;
                            break;
                        } else if a + b + c > 1000 {
                            break;
                        }
                    }
                } else {
                    break;
                }
            }
        } else {
            break;
        }
    }

    Some(result)
}

Yea, it is rather fugly to look at. Then it became the following version because my mentor at the time told me that it could be done without all the excessive branching. (I obviously didn't care much about efficiency)

pub fn find() -> Option<u32> {
    let mut result: Option<u32> = None;

    for a in 0u32..998 {
        for b in (a + 1u32)..(999 - a) {
            for c in (b + 1u32)..(1001 - a - b) {
                if a.pow(2) + b.pow(2) == c.pow(2) && a + b + c == 1000 {
                    result = Some(a * b * c);
                    break;
                }
            }
        }
    }

    result
}

Much cleaner, but I was told I didn't need to loop for the third number since I already know what it is once I have the first two.

pub fn find() -> Option<u32> {
    let mut result: Option<u32> = None;

    for a in 0u32..998 {
        for b in (a + 1u32)..(999 - a) {
            let c = 1000 - a - b;

            if a.pow(2) + b.pow(2) == c.pow(2) {
                result = Some(a * b * c);
                break;
            }
        }
    }

    result
}

Good enough; not the most efficient solution, but I got a green light to move on. I still don't like it personally, as it doesn't look functional enough, so after a few months I re-attempted it.

pub fn find() -> Option<u32> {
    (1u32..998)
        .filter_map(|a| {
            (a + 1..999 - a)
                .filter_map(|b| {
                    let c = 1000 - a - b;

                    match a.pow(2) + b.pow(2) == c.pow(2) {
                        true => Some(a * b * c),
                        false => None,
                    }
                })
                .next()
        })
        .next()
}

Slightly cleaner, though probably not as easy to interpret (depending on who you ask), and it definitely could be done better. However, I do appreciate the help from my mentor.
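If I were to take yet another pass at it, one possible direction (a sketch on my part, not something reviewed by my mentor) is to flatten the two loops into a single iterator chain and let find_map stop at the first match:

pub fn find() -> Option<u32> {
    // generate all (a, b) pairs, derive c from the sum constraint,
    // and stop at the first pair satisfying the Pythagorean condition
    (1u32..998)
        .flat_map(|a| (a + 1..999 - a).map(move |b| (a, b)))
        .find_map(|(a, b)| {
            let c = 1000 - a - b;
            if a.pow(2) + b.pow(2) == c.pow(2) {
                Some(a * b * c)
            } else {
                None
            }
        })
}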

Saddle point

In short, the objective is to find an entry which is greater than all of its row-mates, but the smallest among its column-mates (cellmates, pun not intended). As usual, my first attempt was clumsy, and it somehow started a discussion with my mentor on code readability.

pub fn find_saddle_points(input: &[Vec<u64>]) -> Vec<(usize, usize)> {
    input
        .iter()
        .enumerate()
        .map(|(row_idx, row)| match find_max(row) {
            Some(max) => row.iter()
                .enumerate()
                .filter(|(_idx, &value)| value == max)
                .map(|(col_idx, &value)| {
                    let col = input.iter().map(|crow| crow[col_idx]).collect::<Vec<u64>>();
                    let min = find_min(&col);

                    if value == min {
                        Some((row_idx, col_idx))
                    } else {
                        None
                    }
                })
                .filter(|x| x.is_some())
                .map(|x| x.unwrap())
                .collect::<Vec<(usize, usize)>>(),
            _ => vec![],
        })
        .fold(vec![], |mut result: Vec<(usize, usize)>, incoming| {
            result.append(&mut incoming.clone());
            result
        })
}

fn find_min(input: &Vec<u64>) -> u64 {
    input.iter().fold(
        input[0],
        |result, &incoming| {
            if result <= incoming {
                result
            } else {
                incoming
            }
        },
    )
}

fn find_max(input: &Vec<u64>) -> Option<u64> {
    match input.len() {
        0 => None,
        _ => input.iter().fold(Some(input[0]), |result, &incoming| {
            if result.unwrap() >= incoming {
                result
            } else {
                Some(incoming)
            }
        }),
    }
}

Not the most efficient solution, and the cloning is probably unnecessary (I was still unfamiliar with the borrow checker), but the code somehow works. Then I was told about flat_map, which I had totally overlooked in the documentation. After some iterations, the code became

use std::cmp::{max, min};
use std::u64::MAX;

pub fn find_saddle_points(input: &[Vec<u64>]) -> Vec<(usize, usize)> {
    input
        .iter()
        .enumerate()
        .flat_map(|(row_idx, row)| match find_max(&row) {
            Some(max) => row.iter()
                .enumerate()
                .filter(|(_idx, &value)| value == max)
                .flat_map(|(col_idx, &value)| {
                    match find_min(&input.iter().map(|crow| crow[col_idx]).collect::<Vec<u64>>()) {
                        min if value == min => vec![(row_idx, col_idx)],
                        _ => vec![],
                    }
                })
                .collect::<Vec<(usize, usize)>>(),
            _ => vec![],
        })
        .collect::<Vec<(usize, usize)>>()
}

fn find_min(input: &Vec<u64>) -> u64 {
    input.iter().cloned().fold(MAX, min)
}

fn find_max(input: &Vec<u64>) -> Option<u64> {
    match input.len() {
        0 => None,
        _ => Some(input.iter().cloned().fold(0, max)),
    }
}

Not sure if it would look different if I were to redo it today (like what I did with the previous exercise). However, I learned a lot throughout the peer review process.
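One thing I would probably change if I did redo it: the hand-rolled find_min and find_max can likely be replaced with the iterator's built-in min and max (a quick sketch, not verified against the exercise tests; note that find_min now returns an Option too, so the caller would need adjusting):

fn find_min(input: &[u64]) -> Option<u64> {
    // Iterator::min already returns None for an empty iterator
    input.iter().cloned().min()
}

fn find_max(input: &[u64]) -> Option<u64> {
    input.iter().cloned().max()
}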

Then the exercises get harder.

Luhn

Then there's this checksum validator. Starting from the right, double every digit at an odd index, subtract 9 from it if it exceeds 9, then sum up the resulting digits as well as the even-indexed digits. The sum has to be divisible by 10.

pub fn is_valid(code: &str) -> bool {
    let _code = code.chars().filter(|x| !x.is_whitespace());

    match _code.clone().count() {
        _invalid if _invalid <= 1 => false,
        _ => {
            let result = _code
                .map(|x| x.to_digit(10))
                .rev()
                .enumerate()
                .map(|(idx, x)| match x {
                    Some(n) => Some(match (idx % 2 == 1, n * 2) {
                        (true, result) if result > 9 => result - 9,
                        (true, result) => result,
                        _ => n,
                    }),
                    None => None,
                })
                .fold(Some(0), |current, incoming| {
                    current.and_then(|x| match incoming {
                        Some(n) => Some(x + n),
                        None => None,
                    })
                });

            match result {
                Some(n) if n % 10 == 0 => true,
                _ => false,
            }
        }
    }
}

This is one of the earlier submissions, and I was somehow still struggling with ownership here. Then I started getting intrigued by the power of the Option enum, and ended up with this

pub fn is_valid(code: &str) -> bool {
    let _code = code
        .chars()
        .filter(|x| !x.is_whitespace())
        .collect::<Vec<char>>();

    match _code.len() {
        _invalid if _invalid <= 1 => false,
        _ => _code
            .iter()
            .map(|x| x.to_digit(10))
            .rev()
            .enumerate()
            .map(|(idx, x)| {
                x.and_then(move |n| {
                    Some(match (idx % 2 == 1, n * 2) {
                        (true, result) if result > 9 => result - 9,
                        (true, result) => result,
                        _ => n,
                    })
                })
            }).fold(Some(0), |current, incoming| {
                current.and_then(|x| match incoming {
                    Some(n) => Some(x + n),
                    None => None,
                })
            }).map_or(false, |n| n % 10 == 0),
    }
}
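A side note I only picked up later (not part of the reviewed submission): Option implements Sum, so the fold over Options at the end can collapse into a single sum call, roughly like this:

fn checked_sum(digits: &[Option<u32>]) -> Option<u32> {
    // summing an iterator of Option<u32> yields None as soon as any element is None
    digits.iter().cloned().sum()
}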

Atbash Cipher

I like breaking things, though I hate the frustration when things don't work as expected. This time I tried passing a closure into a function as a parameter. Of course this is not necessarily the best solution, but getting to this state took me quite some time due to the fight with the borrow checker.

pub fn encode(plain: &str) -> String {
    convert(
        plain,
        Some(&|idx, x| format!("{}{}", x, if (idx + 1) % 5 == 0 { " " } else { "" })),
    )
}

/// "Decipher" with the Atbash cipher.
pub fn decode(cipher: &str) -> String {
    convert(cipher, None::<&Fn(usize, char) -> String>)
}

fn transform(x: char) -> char {
    match x {
        x if x.is_alphabetic() => (219u8 - x as u8) as char,
        x => x,
    }
}

fn preprocess(text: &str) -> Vec<char> {
    text.to_ascii_lowercase()
        .chars()
        .filter(|x| x.is_alphanumeric() && x.is_ascii())
        .collect()
}

fn convert(text: &str, decorator: Option<&Fn(usize, char) -> String>) -> String {
    (match decorator {
        Some(_decorator) => preprocess(text)
            .into_iter()
            .enumerate()
            .map(|(idx, x)| _decorator(idx, transform(x)))
            .collect::<String>(),
        None => preprocess(text)
            .into_iter()
            .map(|x| transform(x))
            .collect::<String>(),
    }).trim()
        .to_string()
}

Sometimes having someone experienced as a mentor is a good thing: he immediately pointed out some parts that could make the code simpler, yielding this

/// "Encipher" with the Atbash cipher.
pub fn encode(plain: &str) -> String {
    convert(
        plain,
        Some(&|idx, x| format!("{}{}", if idx != 0 && idx % 5 == 0 { " " } else { "" }, x)),
    )
}

/// "Decipher" with the Atbash cipher.
pub fn decode(cipher: &str) -> String {
    convert(cipher, None::<&Fn(usize, char) -> String>)
}

fn transform(x: char) -> char {
    match x {
        x if x.is_alphabetic() => (219u8 - x as u8) as char,
        x => x,
    }
}

fn convert(text: &str, decorator: Option<&Fn(usize, char) -> String>) -> String {
    let preprocess = || {
        text.chars()
            .flat_map(char::to_lowercase)
            .filter(|x| x.is_alphanumeric() && x.is_ascii())
    };

    (match decorator {
        Some(_decorator) => preprocess()
            .enumerate()
            .map(|(idx, x)| _decorator(idx, transform(x)))
            .collect::<String>(),
        None => preprocess().map(|x| transform(x)).collect::<String>(),
    })
}

The idea is more or less the same as in the earlier iteration shown above, just slightly cleaner.
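As a quick sanity check, this is roughly what the two functions above are expected to produce, if I remember the usual Atbash test cases correctly:

fn main() {
    // letters map a<->z, b<->y, ..., and the ciphertext is grouped in blocks of five
    assert_eq!(encode("test"), "gvhg");
    assert_eq!(decode("gvhg"), "test");
    assert_eq!(encode("Testing, 1 2 3, testing."), "gvhgr mt123 gvhgr mt");
}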

Bracket push

In short, this exercise is to check whether the brackets are balanced. Nothing too major here, but my mentor told me that the code Rust generates for recursion is apparently good enough. There seems to be some TCO in place, though I don't really care; I usually swap recursive code back to a normal loop only when necessary (i.e. after I blow up the stack).

pub struct Brackets(Vec<char>);

impl<'a> From<&'a str> for Brackets {
    fn from(input: &str) -> Self {
        Brackets(
            input
                .chars()
                .filter(|x| vec!['{', '}', '[', ']', '(', ')'].contains(x))
                .collect::<Vec<char>>(),
        )
    }
}

impl Brackets {
    pub fn are_balanced(&self) -> bool {
        Self::balance_test(true, &[], &self.0[..])
    }

    fn balance_test(result: bool, stack: &[char], brackets: &[char]) -> bool {
        match (stack.first(), brackets.first()) {
            (Some(o), Some(c)) if [('{', '}'), ('[', ']'), ('(', ')')].contains(&(*o, *c)) => {
                Self::balance_test(result, &stack[1..], &brackets[1..])
            }
            (_, Some(n)) if ['}', ')', ']'].contains(n) => false,
            (_, Some(n)) => Self::balance_test(
                result,
                &vec![vec![*n], stack.to_vec()].concat().as_slice(),
                &brackets[1..],
            ),
            (_, None) => result && stack.len() == 0,
        }
    }
}
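For reference, since I mentioned swapping recursion back to a loop only when necessary, a plain iterative version with a Vec as the stack would look roughly like this (a sketch, not the submitted solution):

pub fn are_balanced(input: &str) -> bool {
    let mut stack: Vec<char> = Vec::new();

    for c in input.chars() {
        match c {
            '(' | '[' | '{' => stack.push(c),
            ')' => if stack.pop() != Some('(') { return false; },
            ']' => if stack.pop() != Some('[') { return false; },
            '}' => if stack.pop() != Some('{') { return false; },
            _ => {} // everything else is ignored
        }
    }

    // balanced only if every opening bracket has been closed
    stack.is_empty()
}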

However, the notable part of this exercise is that I was told about the rpds crate. I suppose it is an awesome crate for functional people because it doesn't do mutation in place, which had been giving me a lot of headaches. Knowing that Rust mutates stuff in a safe way is one thing, but wanting something done differently, to completely eliminate mutation in place, is another.

Unfortunately the use of that crate seems to slightly impact performance.

Sublist

Again, a rather clumsy submission in the beginning.

#[derive(Debug, PartialEq)]
pub enum Comparison {
    Equal,
    Unequal,
    Sublist,
    Superlist,
}

pub fn sublist<T: PartialEq>(alpha: &[T], beta: &[T]) -> Comparison {
    match (alpha.len(), beta.len()) {
        (a, b) if a < b => is_sublist(&alpha, &beta),
        (a, b) if a > b => is_superlist(&alpha, &beta),
        _ => is_equal(&alpha, &beta),
    }
}

fn is_sublist<T: PartialEq>(alpha: &[T], beta: &[T]) -> Comparison {
    if alpha.len() == 0 {
        return Comparison::Sublist;
    }

    let _result = (0..(1 + beta.len() - alpha.len()))
        .any(|idx_start| alpha == &beta[idx_start..(idx_start + alpha.len())]);

    match _result {
        true => Comparison::Sublist,
        false => Comparison::Unequal,
    }
}

fn is_superlist<T: PartialEq>(alpha: &[T], beta: &[T]) -> Comparison {
    match is_sublist(&beta, &alpha) {
        Comparison::Sublist => Comparison::Superlist,
        _result => _result,
    }
}

fn is_equal<T: PartialEq>(alpha: &[T], beta: &[T]) -> Comparison {
    match alpha == beta {
        true => Comparison::Equal,
        false => Comparison::Unequal,
    }
}

One of the problems is that I abuse match expressions everywhere, even where they really should be written as if statements. Through the review conversation I was also told about the windows method, so I could rely on it instead of calculating the sublist indices myself.

#[derive(Debug, PartialEq)]
pub enum Comparison {
    Equal,
    Unequal,
    Sublist,
    Superlist,
}

pub fn sublist<T: PartialEq>(alpha: &[T], beta: &[T]) -> Comparison {
    if is_sublist(&alpha, &beta) {
        Comparison::Sublist
    } else if is_sublist(&beta, &alpha) {
        Comparison::Superlist
    } else if alpha == beta {
        Comparison::Equal
    } else {
        Comparison::Unequal
    }
}

fn is_sublist<T: PartialEq>(alpha: &[T], beta: &[T]) -> bool {
    alpha.len() < beta.len()
        && (alpha.len() == 0 || beta.windows(alpha.len()).any(|window| window == alpha))
}

There seem to be a lot of quality-of-life methods in Rust (which makes it a fun language to learn) that I always overlook. While the documentation takes some time to get used to, methods like windows and flat_map are not things I would think to look for in the documentation (which is why I usually ended up writing my own version).
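For anyone else who keeps missing them like I did, the two behave roughly like this:

fn main() {
    let v = [1, 2, 3, 4];

    // windows(2) yields the overlapping slices [1, 2], [2, 3], [3, 4]
    assert!(v.windows(2).all(|w| w[0] < w[1]));

    // flat_map flattens the iterators produced by the closure into a single one
    let expanded: Vec<i32> = v.iter().flat_map(|&x| vec![x, x * 10]).collect();
    assert_eq!(expanded, vec![1, 10, 2, 20, 3, 30, 4, 40]);
}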

At this stage, I am starting to get used to the borrow checker, and sometimes get some intuition on how to fix an error just by reading the message. The word "intuition" is deliberately chosen here, because it sometimes becomes a reflex whenever an error is shown, without thinking about what causes it.

Poker

This is one of the more hardcore exercises, and I don't think my solution is good. My original plan was to represent the score with one large number; however, that became too large, so I used a vector to represent scores instead.

/// Given a list of poker hands, return a list of those hands which win.
///
/// Note the type signature: this function should return _the same_ reference to
/// the winning hand(s) as were passed in, not reconstructed strings which happen to be equal.
use std::cmp::Ordering;

#[derive(Debug, Eq)]
struct Hand {
    spade: Vec<usize>,
    heart: Vec<usize>,
    club: Vec<usize>,
    diamond: Vec<usize>,
}

impl Hand {
    fn new(hand: &str) -> Hand {
        hand.split(" ").fold(
            Hand {
                spade: vec![0; 14],
                heart: vec![0; 14],
                club: vec![0; 14],
                diamond: vec![0; 14],
            },
            |current, incoming| {
                let index = Self::get_index(
                    &incoming
                        .chars()
                        .take(incoming.len() - 1)
                        .collect::<String>(),
                );

                match incoming.chars().last().unwrap() {
                    'S' => Hand {
                        spade: Self::suit_increment(&current.spade, index),
                        heart: current.heart,
                        club: current.club,
                        diamond: current.diamond,
                    },
                    'H' => Hand {
                        heart: Self::suit_increment(&current.heart, index),
                        spade: current.spade,
                        club: current.club,
                        diamond: current.diamond,
                    },
                    'C' => Hand {
                        club: Self::suit_increment(&current.club, index),
                        spade: current.spade,
                        heart: current.heart,
                        diamond: current.diamond,
                    },
                    _ => Hand {
                        diamond: Self::suit_increment(&current.diamond, index),
                        spade: current.spade,
                        heart: current.heart,
                        club: current.club,
                    },
                }
            },
        )
    }

    fn suit_increment(original: &Vec<usize>, index: usize) -> Vec<usize> {
        original
            .clone()
            .iter()
            .enumerate()
            .map(|(idx, &x)| {
                if (index == 0 && idx == 13) || index == idx {
                    x + 1
                } else {
                    x
                }
            })
            .collect::<Vec<usize>>()
    }

    fn flatten(&self) -> Vec<usize> {
        vec![0; 14]
            .iter()
            .enumerate()
            .map(|(idx, _)| {
                &self.spade[idx] + &self.heart[idx] + &self.club[idx] + &self.diamond[idx]
            })
            .collect()
    }

    //
    // Score vector
    // [type] [# of threes/fours] [# of pair/largest] [#High A] [#B] .. [#2] [#Low A]
    // only exception is straight flush A1234, then set [#High A] to 0
    //
    fn score(&self) -> Vec<usize> {
        let flattened = self.flatten();

        (0..17)
            .into_iter()
            .map(|x| {
                if x == 0 {
                    self.type_bit()
                } else if x == 1 {
                    if self.has_threes() && self.has_pairs(1) {
                        flattened.iter().rposition(|&y| y == 3).unwrap()
                    } else if self.is_four_of_a_kind() {
                        flattened.iter().rposition(|&y| y == 4).unwrap()
                    } else {
                        0
                    }
                } else if x == 2 {
                    if self.has_threes() && self.has_pairs(1) {
                        flattened.iter().rposition(|&y| y == 2).unwrap()
                    } else if self.is_four_of_a_kind() {
                        flattened.iter().rposition(|&y| y == 1).unwrap()
                    } else {
                        0
                    }
                } else if x == 3 && flattened.iter().take(5).all(|&x| x > 0) {
                    0
                } else {
                    flattened[flattened.len() + 2 - x]
                }
            })
            .collect()
    }

    fn type_bit(&self) -> usize {
        if self.is_flush() && self.is_straight() {
            8
        } else if self.is_four_of_a_kind() {
            7
        } else if self.has_threes() && self.has_pairs(1) {
            6
        } else if self.is_flush() {
            5
        } else if self.is_straight() {
            4
        } else if self.has_threes() {
            3
        } else if self.has_pairs(2) {
            2
        } else if self.has_pairs(1) {
            1
        } else {
            0
        }
    }

    fn has_threes(&self) -> bool {
        self.has_repeat(3) > 0
    }

    fn has_pairs(&self, count: usize) -> bool {
        self.has_repeat(2) == count
    }

    fn has_repeat(&self, repeat: usize) -> usize {
        self.flatten()
            .iter()
            .take(13)
            .filter(|&x| *x == repeat)
            .count()
    }

    fn is_straight(&self) -> bool {
        let flattened = self.flatten();

        (0..flattened.len() - 4).any(|x| flattened.iter().skip(x).take(5).all(|&x| x > 0))
    }

    fn is_flush(&self) -> bool {
        [&self.spade, &self.heart, &self.club, &self.diamond]
            .iter()
            .any(|&x| x.into_iter().sum::<usize>() == 5)
    }

    fn is_four_of_a_kind(&self) -> bool {
        self.has_repeat(4) > 0
    }

    fn get_index(value: &str) -> usize {
        let cards = [
            "A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K",
        ];

        cards.iter().position(|&x| x == value).unwrap()
    }
}

impl PartialEq for Hand {
    fn eq(&self, other: &Hand) -> bool {
        self.score() == other.score()
    }
}

impl PartialOrd for Hand {
    fn partial_cmp(&self, other: &Hand) -> Option<Ordering> {
        self.score().partial_cmp(&other.score())
    }
}

impl Ord for Hand {
    fn cmp(&self, other: &Hand) -> Ordering {
        self.score().as_slice().cmp(other.score().as_slice())
    }
}

pub fn winning_hands<'a>(hands: &[&'a str]) -> Option<Vec<&'a str>> {
    let shands = hands.iter().map(|x| Hand::new(x)).collect::<Vec<Hand>>();

    if hands.len() == 1 {
        Some(hands.to_vec())
    } else {
        let scores = shands
            .iter()
            .map(|x| x.score())
            .collect::<Vec<Vec<usize>>>();
        let max = &scores.iter().max().unwrap();

        Some(
            scores
                .iter()
                .enumerate()
                .filter(|(_, x)| x == max)
                .map(|(idx, _)| hands[idx])
                .collect::<Vec<&str>>(),
        )
    }
}

As shown in the score method, I am using a vector with 17 elements to represent the score of a given hand. In a way it works like a number, just a not-so-efficient version when it comes to comparing. I don't quite like the complexity of this puzzle (too complex for the purpose of learning a language), so I didn't spend too much time polishing the solution.
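The comparison itself comes for free because Vec ordering in Rust is lexicographic, much like comparing a number digit by digit:

fn main() {
    // the first differing element decides the ordering
    assert!(vec![8, 0, 0] > vec![7, 9, 9]);
    assert!(vec![1, 2, 14] > vec![1, 2, 3]);
}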

Anagram

In a lot of ways, I see this problem as being the same as indexing documents, just that in this case we are indexing words. In a lot of document indexing algorithms, words are assumed to be i.i.d., i.e. every word is somewhat independent. Therefore, two documents with similar wording but in a different order might end up being indexed similarly.

So in this solution, I break the word into a vector of Unicode characters (in string form, because Rust doesn't do magic Unicode handling). Just like two documents with the same words in a different order, an anagram is just two words with the same characters in a different order. This may not be the most efficient algorithm for solving the exercise, but it is the most intuitive solution to me.

use std::collections::HashSet;

pub fn anagrams_for<'a>(word: &str, possible_anagrams: &[&'a str]) -> HashSet<&'a str> {
    let words = word_vec(&word, &possible_anagrams);
    let dictionary = char_dictionary(&words);

    let encoded = words
        .iter()
        .map(|&x| word_encode(&break_string(&x), &dictionary))
        .collect::<Vec<Vec<usize>>>();

    (1..words.len())
        .filter(|&x| encoded[x as usize] == encoded[0])
        .map(|x| possible_anagrams[x - 1])
        .fold(HashSet::new(), |mut current, incoming| {
            current.insert(&incoming);
            current
        })
}

fn word_encode(broken_word: &Vec<String>, dictionary: &Vec<String>) -> Vec<usize> {
    dictionary
        .iter()
        .map(|x| {
            broken_word
                .iter()
                .filter(|y| y.to_lowercase() == *x)
                .count()
        }).collect()
}

fn word_vec<'a>(word: &'a str, possible_anagrams: &[&'a str]) -> Vec<&'a str> {
    possible_anagrams
        .into_iter()
        .filter(|x| x.to_lowercase() != word.to_lowercase())
        .fold(vec![&word], |mut current, incoming| {
            current.push(incoming);
            current
        })
}

fn char_dictionary(words: &Vec<&str>) -> Vec<String> {
    words
        .into_iter()
        .flat_map(|&x| break_string(x))
        .fold(Vec::new(), |mut current, incoming| {
            if !current.contains(&incoming.to_lowercase()) {
                current.push(incoming.to_lowercase());
            }

            current
        })
}

fn break_string(word: &str) -> Vec<String> {
    let idx: Vec<(usize, char)> = format!("{}\0", word).char_indices().collect();
    let stack: Vec<(usize, char)> = Vec::new();
    let result: Vec<String> = Vec::new();

    word.char_indices()
        .zip(idx[1..idx.len()].iter())
        .enumerate()
        .fold(
            (stack, result),
            |(mut stack, mut result), (j, (start, end))| {
                if end.0 - start.0 == 1 {
                    // we don't know yet, keep it in the stack
                    if let Some(last) = stack.pop() {
                        // pop the previous
                        result.push(last.1.to_string());
                    }

                    if j == word.char_indices().count() - 1 {
                        result.push(start.1.to_string());
                    } else {
                        stack.push(start);
                    }
                } else {
                    // so this is a fat char
                    match stack.pop() {
                        Some(last) => {
                            // merge it with the stack
                            result.push(format!("{}{}", last.1, start.1));
                        }
                        None => {
                            // stack is empty, so just u
                            result.push(start.1.to_string());
                        }
                    }
                }

                (stack, result)
            },
        ).1
}
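For comparison, the more common shortcut (not what I did above) is to canonicalise each word by sorting its lowercased characters; a minimal sketch:

fn canonical(word: &str) -> Vec<char> {
    // two words are anagrams iff their sorted, lowercased characters match
    let mut chars: Vec<char> = word.to_lowercase().chars().collect();
    chars.sort_unstable();
    chars
}

pub fn is_anagram(word: &str, candidate: &str) -> bool {
    // a word is not considered an anagram of itself
    word.to_lowercase() != candidate.to_lowercase() && canonical(word) == canonical(candidate)
}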

Minesweeper

I have a tendency to abuse language constructs for purposes they are not designed for. Previously it was pattern matching; this time it is the Option enum.

pub fn annotate(minefield: &[&str]) -> Vec<String> {
    let minefield: Vec<Vec<u8>> = minefield.iter().map(|&x| Vec::from(x)).collect();

    minefield
        .iter()
        .cloned()
        .enumerate()
        .map(|(row_idx, row)| {
            row.iter()
                .enumerate()
                .map(|(col_idx, &cell)| {
                    Some(cell)
                        .filter(|&x| x == b' ')
                        .map(|_| match count(&minefield, row_idx, col_idx) {
                            0 => String::from(" "),
                            n => n.to_string(),
                        }).unwrap_or((cell as char).to_string())
                }).collect::<String>()
        }).collect()
}

fn count(minefield: &Vec<Vec<u8>>, row_idx: usize, col_idx: usize) -> usize {
    neighbours(&minefield, row_idx, col_idx)
        .iter()
        .filter(|(row_idx, col_idx)| minefield[*row_idx][*col_idx] == b'*')
        .count()
}

fn neighbours(minefield: &Vec<Vec<u8>>, row_idx: usize, col_idx: usize) -> Vec<(usize, usize)> {
    (row_idx.saturating_sub(1)..=max(row_idx, minefield.len()))
        .flat_map(|r_idx| {
            (col_idx.saturating_sub(1)..=max(col_idx, minefield.first().map_or(0, |x| x.len())))
                .map(move |c_idx| (r_idx, c_idx))
        }).filter(|neighbour| *neighbour != (row_idx, col_idx))
        .collect()
}

fn max(idx: usize, length: usize) -> usize {
    (idx + 1).min(length - 1)
}

While I am relatively more comfortable with the borrow checker now compared to when I first started, the same cannot be said about dealing with lifetimes. To this day I still find them to be the most complicated part, to the point that I cannot really explain them to other people. Being mostly a programmer in higher-level languages where garbage collection is in place, I am very spoilt, never having had to even think about them.

I didn't even care about the stack and the heap when I first started, as they are abstracted away in the languages I used.

pub fn annotate(minefield: &[&str]) -> Vec<String> {
    minefield
        .iter()
        .enumerate()
        .map(|(row_idx, row)| {
            row.chars()
                .enumerate()
                .map(
                    |(col_idx, cell)| match (cell, count(&minefield, row_idx, col_idx)) {
                        (' ', n) if n != 0 => n.to_string(),
                        _ => cell.to_string(),
                    },
                ).collect::<String>()
        }).collect()
}

fn count(minefield: &[&str], row_idx: usize, col_idx: usize) -> usize {
    neighbours(&minefield, row_idx, col_idx)
        .filter(|(row_idx, col_idx)| {
            minefield[*row_idx]
                .chars()
                .nth(*col_idx)
                .map_or(false, |x| x == '*')
        }).count()
}

fn neighbours<'a>(
    minefield: &'a [&str],
    row_idx: usize,
    col_idx: usize,
) -> impl Iterator<Item = (usize, usize)> + 'a {
    (row_idx.saturating_sub(1)..=max(row_idx, minefield.len()))
        .flat_map(move |r_idx| {
            (col_idx.saturating_sub(1)..=max(col_idx, minefield.first().map_or(0, |x| x.len())))
                .map(move |c_idx| (r_idx, c_idx))
        }).filter(move |neighbour| *neighbour != (row_idx, col_idx))
}

fn max(idx: usize, length: usize) -> usize {
    (idx + 1).min(length - 1)
}

Parallel letter frequency

Finally, some sort of concurrency/parallelism code. However, that means lifetime management is going to bite really hard if you are not careful (and not using crates that improve quality of life).

use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

pub fn frequency(input: &[&str], worker_count: usize) -> HashMap<char, usize> {
    if input.len() == 0 {
        return HashMap::new();
    }

    let (tx, rx) = mpsc::channel();

    input
        .chunks((input.len() as f64 / worker_count as f64).ceil() as usize)
        .for_each(move |chunk| {
            let ttx = mpsc::Sender::clone(&tx);
            let chunk = chunk.iter().map(|&x| String::from(x)).collect::<Vec<_>>();

            thread::spawn(move || {
                chunk.iter().for_each(|sentence| {
                    ttx.send(
                        sentence
                            .to_lowercase()
                            .chars()
                            .filter(|x| x.is_alphabetic())
                            .fold(HashMap::new(), |mut current, incoming| {
                                *current.entry(incoming).or_insert(0) += 1;
                                current
                            }),
                    ).unwrap()
                });
            });
        });

    rx.iter().fold(HashMap::new(), |result, incoming| {
        incoming
            .into_iter()
            .fold(result, |mut current, (key, count)| {
                *current.entry(key).or_insert(0) += count;
                current
            })
    })
}

Strangely, I don't really see much performance improvement on my computer, perhaps because Docker or other stuff is constantly running and taking a lot of CPU time. However, it seems to work at my mentor's end, so I guess this is it: my first piece of concurrency code in Rust. The way it works, at least for this example, reminds me of Go channels and a little bit of goroutines. I am not too interested in digging further to figure out how it really works; I am picking up this language for fun (:

My mentor told me that crossbeam and rayon seem to be able to fix my struggle with the lifetimes, so I can avoid the line that does practically nothing but clone the chunk (line 16 of my submission).
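I have not actually tried it, but with rayon (assuming the crate is added as a dependency) the whole thing could presumably collapse into a parallel map-reduce, something along these lines:

use rayon::prelude::*;
use std::collections::HashMap;

// worker_count is ignored here because rayon manages its own thread pool
pub fn frequency(input: &[&str], _worker_count: usize) -> HashMap<char, usize> {
    input
        .par_iter()
        .map(|sentence| {
            // count the letters of one piece of input
            sentence
                .to_lowercase()
                .chars()
                .filter(|x| x.is_alphabetic())
                .fold(HashMap::new(), |mut current, incoming| {
                    *current.entry(incoming).or_insert(0) += 1;
                    current
                })
        })
        .reduce(HashMap::new, |mut result, incoming| {
            // merge the per-piece counts into one map
            for (key, count) in incoming {
                *result.entry(key).or_insert(0) += count;
            }
            result
        })
}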

Async/await will also be introduced in the future, which I suppose is similar to asyncio/coroutines in Python 3. That would probably be more similar to how goroutines work.

Forth Interpreter

About halfway through this, I switched to the 2018 edition, as the release is coming soon; the fact that I do not have any legacy code that needs porting also helps. This exercise apparently does not compile at all under the 2015 edition. So this is the final version of the interpreter. I did an awful lot of iterations on this, and even rewrote the whole thing twice just to clean up the ownership and mutation.

use std::collections::HashMap;
use std::rc::Rc;

pub type Value = i32;
pub type ForthResult = Result<(), Error>;

#[derive(Debug, PartialEq)]
pub enum Error {
    DivisionByZero,
    StackUnderflow,
    UnknownWord,
    InvalidWord,
}

#[derive(Clone)]
pub enum Token {
    Value(Value),
    BinaryOperator(Rc<dyn Fn(Value, Value) -> Result<Vec<Value>, Error>>),
    UnaryOperator(Rc<dyn Fn(Value) -> Result<Vec<Value>, Error>>),
}

pub struct Forth {
    symbols: HashMap<String, Vec<Token>>,
    stack: Vec<Value>,
}

impl Forth {
    pub fn new() -> Forth {
        Forth {
            symbols: Self::symbols_init(),
            stack: vec![],
        }
    }

    pub fn stack(&self) -> Vec<Value> {
        self.stack.to_vec()
    }

    pub fn eval(&mut self, input: &str) -> ForthResult {
        for incoming in self.tokenize(vec![], Self::input_split(input).as_slice())? {
            match incoming {
                Token::Value(value) => {
                    self.stack.push(value);
                }
                Token::BinaryOperator(operator) => {
                    let y = self.stack.pop().ok_or(Error::StackUnderflow)?;
                    let x = self.stack.pop().ok_or(Error::StackUnderflow)?;

                    self.stack.extend(operator(x, y)?);
                }
                Token::UnaryOperator(operator) => {
                    let x = self.stack.pop().ok_or(Error::StackUnderflow)?;

                    self.stack.extend(operator(x)?);
                }
            }
        }
        Ok(())
    }

    fn symbols_init() -> HashMap<String, Vec<Token>> {
        let mut result = HashMap::new();

        result.insert(
            "+".to_string(),
            vec![Token::BinaryOperator(Rc::new(|x, y| Ok(vec![x + y])))],
        );
        result.insert(
            "-".to_string(),
            vec![Token::BinaryOperator(Rc::new(|x, y| Ok(vec![x - y])))],
        );
        result.insert(
            "*".to_string(),
            vec![Token::BinaryOperator(Rc::new(|x, y| Ok(vec![x * y])))],
        );
        result.insert(
            "/".to_string(),
            vec![Token::BinaryOperator(Rc::new(|x, y| match (x, y) {
                (_, 0) => Err(Error::DivisionByZero),
                (m, n) => Ok(vec![m / n]),
            }))],
        );
        result.insert(
            "dup".to_string(),
            vec![Token::UnaryOperator(Rc::new(|x| Ok(vec![x, x])))],
        );
        result.insert(
            "drop".to_string(),
            vec![Token::UnaryOperator(Rc::new(|_| Ok(vec![])))],
        );
        result.insert(
            "swap".to_string(),
            vec![Token::BinaryOperator(Rc::new(|x, y| Ok(vec![y, x])))],
        );
        result.insert(
            "over".to_string(),
            vec![Token::BinaryOperator(Rc::new(|x, y| Ok(vec![x, y, x])))],
        );

        result
    }

    fn tokenize(&mut self, mut result: Vec<Token>, input: &[String]) -> Result<Vec<Token>, Error> {
        match input.first() {
            None => Ok(result),
            Some(value) => match (i32::from_str_radix(value, 10), self.symbols.get(value)) {
                (Ok(value), _) => {
                    result.push(Token::Value(value));
                    self.tokenize(result, &input[1..])
                }
                (_, Some(tokens)) => {
                    result.extend_from_slice(tokens.as_slice());
                    self.tokenize(result, &input[1..])
                }
                _ => {
                    let slice = self.tokenize_word(None, None, &input)?;
                    self.tokenize(result, slice)
                }
            },
        }
    }

    fn tokenize_word<'a>(
        &mut self,
        name: Option<String>,
        word: Option<Vec<Token>>,
        input: &'a [String],
    ) -> Result<&'a [String], Error> {
        match (input.first(), word.is_none()) {
            (Some(t), true) if t == ":" => self.tokenize_word(None, Some(vec![]), &input[1..]),
            (Some(t), _) if t == ";" => {
                self.symbols.insert(name.unwrap(), word.unwrap());
                Ok(&input[1..])
            }
            (Some(n), false) if name.is_none() => match i32::from_str_radix(n, 10) {
                Ok(_) => Err(Error::InvalidWord),
                Err(_) => self.tokenize_word(Some(n.to_string()), word, &input[1..]),
            },
            (Some(value), _) => match (i32::from_str_radix(value, 10), self.symbols.get(value)) {
                (Ok(value), _) => self.tokenize_word(
                    name,
                    word.map(|mut w| {
                        w.push(Token::Value(value));
                        w
                    }),
                    &input[1..],
                ),
                (_, Some(tokens)) => self.tokenize_word(
                    name,
                    word.map(|mut w| {
                        w.extend_from_slice(tokens);
                        w
                    }),
                    &input[1..],
                ),
                _ => Err(Error::UnknownWord),
            },
            _ => Err(Error::InvalidWord),
        }
    }

    fn input_split(input: &str) -> Vec<String> {
        input
            .split_terminator(|x| "  \u{0000}\u{0001}\r\n\t".contains(x))
            .map(|x| x.to_lowercase())
            .collect()
    }
}

I am still not completely satisfied with this 12th revision, but I suppose it is the best I can deliver for now. The idea of this interpreter is to read whatever is thrown at it and first break it into tokens. These tokens are then converted into something concrete: a function if it is a word, or an actual number. All the built-in words/operators are stored in a hashmap as some sort of dictionary whose entries can be redefined (as required by the unit tests).

After the tokens are converted into Rust objects, the code proceeds to evaluate them from left to right. In this revision, whenever an error is encountered it is returned immediately.
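To make that concrete, this is roughly how the public interface above gets exercised (my own example, mirroring the kind of calls the exercise's unit tests make):

fn main() -> ForthResult {
    let mut forth = Forth::new();

    // numbers are pushed onto the stack, words operate on it
    forth.eval("1 2 3 +")?;
    assert_eq!(forth.stack(), vec![1, 5]);

    // user-defined words go through tokenize_word and end up in the symbol table
    forth.eval(": square dup * ;")?;
    forth.eval("4 square")?;
    assert_eq!(forth.stack(), vec![1, 5, 16]);

    Ok(())
}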

That's it. I think in total I put in at least 3 months, instead of a week, to complete the whole set of core exercises, and I still don't feel comfortable being productive with the language. I am not sure if I want to pursue the remaining exercises after going through these, though. A big thanks to all the mentors who helped me throughout the whole learning journey, and hopefully more people sign up as mentors for the Rust track in the future, as the backlog is getting huge these days.
