DEV Community

Timevolt
Timevolt

Posted on

Autocomplete Like Neo: Building a Trie from Scratch

The Quest Begins (The "Why")

Ever typed a few letters into a search bar and watched the suggestions pop up instantly? I remember the first time I tried to implement that feature for a side‑project and ended up looping through a list of 100 k words on every keystroke. My laptop sounded like it was about to launch into orbit, and the UI froze for a full second each time I typed. Honestly, it felt like I was trying to defeat Agent Smith with a rubber chicken.

The problem wasn’t my code—it was the data structure. Scanning a flat list is O(N × L) for each query (N words, L average length), which quickly becomes untenable. I needed a way to share work between words that start with the same prefix, something that would let me jump straight to the relevant candidates without re‑checking the whole dictionary every time. That’s when the Trie (pronounced “try”) entered my radar, and it felt like discovering the hidden door in the Matrix that leads straight to the source.

The Revelation (The Insight)

A Trie is simply a tree where each node represents a character, and a path from the root to a node spells out a prefix. The magic lies in shared prefixes: words like “cat”, “car”, and “cart” all reuse the same “c‑a” branch. Instead of storing each word independently, we store each character only once per path.

Why does this give us O(L) lookup? Because to find all words that start with a prefix, we just walk down the Trie following the characters of that prefix. Once we reach the node representing the prefix, every descendant leaf is a valid completion. No extra scanning, no repeated character comparisons—just a direct walk whose length equals the length of the prefix.

Think of it like navigating a subway system: you don’t re‑check every station each time you want to know which lines go through a particular stop; you follow the map from the entrance to the stop, then explore the outward tracks from there.

Wielding the Power (Code & Examples)

Let’s build a minimal Trie for autocomplete in Python. We’ll store a boolean is_word to mark terminal nodes and a dictionary children for the next characters.

class TrieNode:
    def __init__(self):
        self.children = {}          # char -> TrieNode
        self.is_word = False

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word: str):
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_word = True

    def _dfs(self, node: TrieNode, prefix: str, results: list):
        if node.is_word:
            results.append(prefix)
        for ch, child in node.children.items():
            self._dfs(child, prefix + ch, results)

    def suggestions(self, prefix: str) -> list:
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return []               # prefix not present
            node = node.children[ch]
        results = []
        self._dfs(node, prefix, results)
        return results
Enter fullscreen mode Exit fullscreen mode

Before vs. After

Before (naïve list scan)

def naive_suggestions(words, prefix):
    return [w for w in words if w.startswith(prefix)]
Enter fullscreen mode Exit fullscreen mode

If words has 200 k entries and the average word length is 8, each call does roughly 1.6 M character comparisons.

After (Trie)

Insertion costs O(total characters) once—say 1.6 M ops for the whole dictionary. Each query then costs O(L) where L is the length of the prefix (usually < 10). For a 5‑letter prefix, we do ~5 node hops, a speed‑up of over 300× in practice.

Common Traps

  1. Forgetting to mark terminals – If you omit is_word, you’ll never know when a path actually spells a complete word, leading to missing suggestions.
  2. Using a list for children instead of a dict – Linear search through children turns each step into O(σ) where σ is alphabet size, eroding the O(L) guarantee. Keep it a hash map for constant‑time look‑ups.

Interview‑Style Problems

  1. “Design an autocomplete system” – You’re given a list of sentences and their frequencies. Return the top 3 hot sentences for a given prefix. The Trie stores each sentence; each node keeps a max‑heap (or sorted list) of the top 3 frequencies seen in its subtree, letting you answer queries in O(L) time.
  2. “Longest word where every prefix is also a word” – Insert all words into the Trie, then DFS only through nodes where is_word is True. The deepest node you can reach gives the answer. This runs in O(N × L) for building and O(N × L) for the search—still linear in total input size.

Why This New Power Matters

With a Trie in your toolbox, you can turn any “search‑as‑you‑type” feature from a sluggish brute‑force scan into a snappy, instantaneous experience. Suddenly, building a search bar, a code‑completion engine, or even a spell‑checker feels less like wrestling a dragon and more like casting a precise spell.

You’ve also gained a mental model for problems that hinge on shared prefixes—think IP routing, DNA sequence matching, or game‑state caching. The Trie teaches you to look for redundancy and eliminate it, a skill that pays dividends far beyond autocomplete.

Your Next Quest

Grab a dictionary of your favorite movie quotes, insert them into a Trie, and build a tiny autocomplete that suggests the next line as you type. Try extending it to rank suggestions by popularity or to support fuzzy matches (think Levenshtein automata).

What’s the first feature you’ll add to your own Trie-powered autocomplete? Drop your ideas in the comments—I can’t wait to see what you build! 🚀

Top comments (0)