Tim McNamara

Posted on Mar 31, 2023

Two trie implementations in Rust (one's super fast)

#rust #datastructures #trie

A trie is a data structure that enables very fast lookups, while taking up quite a small amount of space. Here's an implementation in Rust:

use std::collections::HashMap;

#[derive(Default, Debug)]
struct TrieNode {
    is_end_of_word: bool,
    children: HashMap<char, TrieNode>,
}

#[derive(Default, Debug)]
pub struct Trie {
    root: TrieNode,
}

impl Trie {
    pub fn new() -> Self {
        Trie {
            root: TrieNode::default(),
        }
    }

    pub fn insert(&mut self, word: &str) {
        let mut current_node = &mut self.root;

        for c in word.chars() {
            current_node = current_node.children.entry(c).or_default();
        }
        current_node.is_end_of_word = true;
    }

    pub fn contains(&self, word: &str) -> bool {
        let mut current_node = &self.root;

        for c in word.chars() {
            match current_node.children.get(&c) {
                Some(node) => current_node = node,
                None => return false,
            }
        }

        current_node.is_end_of_word
    }
}

fn main() {
    let mut trie = Trie::new();
    trie.insert("hello");
    trie.insert("hi");
    trie.insert("hey");
    trie.insert("world");

    println!("{trie:#?}");

    println!("hiiii? {}", trie.contains("hiiii"));
}

You can gain some extra speed by replacing the standard library's hash map implementation with fxhash. To do so, replace the first few lines with this:

use std::collections::HashMap;
use fxhash::FxBuildHasher;

type FxHashMap<K, V> = HashMap<K, V, FxBuildHasher>;

#[derive(Default)]
struct TrieNode {
    is_end_of_word: bool,
    children: FxHashMap<char, TrieNode>,
}

What's happening?

The standard library's std::collections::HashMap uses a hash function -- a function which takes an input and morphs it into an output of a fixed size -- that's designed to be cryptographically secure and general purpose at the expense of speed. Because we know that our keys will be the char type, which takes up 4 bytes, we can use a hash function that's more suited for short keys. (By the way, in case you were wondering, it probably still makes sense to use a hash function for char values, even though we could just use the bits' values directly as they're all a fixed sized. Because human language tends to use a very narrow spectrum of bit patterns within a char, and hash tables like keys to be well spread out, the hash function can spread the bits out for what is hopefully a net performance win.)

Top comments (6)

Leonora Tindall • Mar 31 '23

Nice example, Tim! I personally love tries, possibly because a trie was the first "cool" datastructure I got to apply to a real job.

I tried this myself, with a different main() function that includes the words and cracklib-small word files from my Ubuntu install. These are ~1MB and ~500KB respectively, and while words contains only English words, cracklib-small contains many non-English tokens like zlzfrh and omits many real words.

fn main() {
    let words = include_str!("/usr/share/dict/words").split_whitespace();
    let cracklib = include_str!("/usr/share/dict/cracklib-small").split_whitespace();
    let mut trie = Trie::new();
    for word in words {
        trie.insert(word);
    }
    let (mut hits, mut misses) = (0,0);
    for word in cracklib {
        if trie.contains(word) { hits += 1 } else { misses += 1 };
    }
    println!("Hits {}, Misses {}", hits, misses);
}

This program uses your Trie implementation to check how many of the cracklib-small tokens are in the English words file. (It's about 75%.)

I ran this through hyperfine, a really excellent Rust benchmarking program, and on my T480, the default HashMap implementation takes about 92.7 ms to insert and check all 1.5MB of data, while the FxHashMap implementation only takes about 70.5 ms.

Neat stuff, thank you for sharing!

Vadym Serdiuk • Sep 7 '24

Good day, @noracodes. I wanted to say that I am not sure about "fairness" of performance measures that you have provided. The reason is that include_str!() macro is being executed in compile time. Assuming that - we can not say that performance or throughput of the application provided as example are measured correctly. I am sure that current Trie implementation is great, but your example in general is not representative.
Just wanted to be sure that other people understand that correctly.

Aiden Nigh • Mar 8

Given that he is benchmarking the Trie implementations and not how fast the strings are being loaded from memory your comment makes no sense. If anything, him embedding the str data at compile time would make it a more robust test.

Frank Pagliughi • May 13 '24 • Edited

I've been looking to get a better handle on iterators, and as an exercise am trying to figure out how to write a recursive iterator for this Trie to return all of the words in the collection.

So far, the only success I've had is to cache the word as a string in the terminal node, thus replacing the bool is_end_of_word with an optional string, value: Some(String). The you can easily create a recursive chained iterator for the node, TrieNode::iter() with the option and the hash map like:

Box::new(self.value
    .iter()
    .chain(self.children.values().map(|node| node.iter()).flatten()))

But this explodes the size of the collection.

Any ideas for implementing an iterator that builds the strings dynamically and returns each when hitting the terminating node?

Zeke Fast • Oct 11

@fpagliughi If you wanted to implement unordered iterator you can use depth-first traversal of the tree nodes accumulating characters for the current word on the stack (most likely Vec) and yielding the item on iterator each time you find is_end_of_word. Iterator::next() triggers tree traversal and when you go up on the tree (i.e. further from root) you push characters on stack and when you go down (i.e. closer to root) you pop characters off the stack.

If you plan to implement something which do more than just issuing words (or items) for iteration you probably would like to use Visitor pattern to be able to change the way how you traverse nodes and collect words. With Visitor, for example, you can implement different strategies, i.e. make sure that items yielded alphabetically ordered or yield shortest first or vise versa longest first. You can do all of that without actually changing the tree structure and just adding visitors algorithms as extension points.

Surely, if you do it for real project and not as a brain exercise or for education purposes I would suggest to look into existing production implementations and don't use plain Trie (Prefix Tree), but take a look at Radix Tree or even ART (Adaptive Radix Tree). They usually do better job at reducing tree depth making data structure more CPU cache frendlier and use vectorization (CPU instructions) for node checks. Definitely depends on what you gonna store in the Trie and amount of that matter.

Vadym Serdiuk • Sep 8 '24

You start from strings, I mean to use the implementation of Trie that was presented - it has method insert(_,word: &str). That means that for filling that "tree" - you should feed some data (strings) through the insert calls. That means that between something and insert calls you can gather the data for another place - that's the easiest way to get all the words 8).