Your AI agent doesn't have a dictionary.
When a user asks your agent what a word means, the LLM generates a plausible-sounding definition. It might be right. It might be subtly wrong. It might be different every time. There's no ground truth underneath.
For most applications, this is fine. For educational robots teaching children, language learning platforms, or any system where vocabulary accuracy matters — it's not.
The Problem
LLMs are probabilistic. Every response is a new generation. Ask the same model the same question twice and you may get different answers. For creative writing, this is a feature. For a dictionary, it's a bug.
Educational applications need:
- Consistent definitions that don't change between requests
- Age-appropriate explanations (a 5-year-old needs different language than an adult)
- Accurate translations with proper native script (not romanized approximations)
- Pronunciation audio (robots need to speak words correctly)
- Etymology (teachers want to explain where words come from)
No LLM provides this reliably out of the box.
Word Orb: The Infrastructure Layer
We built Word Orb as the vocabulary infrastructure for our own product — thedailylesson.com, which serves 62,000 lesson segments in 11 languages and 3 age groups.
One API call returns a structured word object:
```json
{
  "word": "courage",
  "def": "The mental or moral strength to face fear, danger, or difficulty",
  "ipa": "/ˈkɜːrɪdʒ/",
  "pos": "noun",
  "etymology": "From Old French corage, from Latin cor (heart)",
  "tones": {
    "child": "Being brave even when you feel scared inside.",
    "teen": "The strength to face fear, pain, or difficulty head-on.",
    "adult": "The mental or moral strength to persevere through fear or adversity.",
    "elder": "A lifelong companion — the quiet resolve that carries us through uncertainty."
  },
  "langs": {
    "es": { "word": "coraje", "phonetic": "ko-RA-he" },
    "zh": { "word": "勇气", "phonetic": "yǒng qì" },
    "ar": { "word": "شجاعة", "phonetic": "sha-JAA-ah" }
  }
}
```
Every response is deterministic. Same word, same result, every time. The definitions are backed by etymology, not generated on the fly.
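As a sketch of how a client might consume that object — the `/word/{word}` path, Bearer auth, and validation field list below are my assumptions based on the example above, not confirmed API documentation:

```javascript
// Top-level fields every word object in the example above carries.
const REQUIRED_FIELDS = ['word', 'def', 'ipa', 'pos', 'tones', 'langs'];

// True when a response matches the documented word-object shape.
function isWordObject(obj) {
  return Boolean(obj) && REQUIRED_FIELDS.every((f) => f in obj);
}

// Hypothetical fetch wrapper; the endpoint path and auth header are
// illustrative guesses, not the documented API surface.
async function fetchWord(word, apiKey) {
  const res = await fetch(
    `https://word-orb-api.nicoletterankin.workers.dev/word/${encodeURIComponent(word)}`,
    { headers: { Authorization: `Bearer ${apiKey}` } }
  );
  if (!res.ok) throw new Error(`Word Orb request failed: ${res.status}`);
  const data = await res.json();
  if (!isWordObject(data)) throw new Error('Unexpected response shape');
  return data;
}
```

The shape check matters because downstream consumers (robots, TTS pipelines) read specific fields like `tones.child` and fail loudly if they're missing.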
The Architecture
Word Orb runs on Cloudflare's edge stack:
- Workers — API compute at 300+ edge cities (~3ms response)
- D1 — SQLite at the edge for 162,250 words and 846,000 translation rows
- R2 — Object storage for 240,000 pronunciation audio files
- KV — Edge caching for hot words
Why D1 over Postgres? No connection pooling to manage. SQLite is co-located with the Worker, so a word lookup is a single query against a local database, not a network round-trip to a remote instance. The difference shows up in tail latencies: our p99 is under 50ms globally.
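That single-query lookup can be sketched like this; the table and column names (`words`, `word`, `data`) are illustrative assumptions, and `db` mirrors D1's `prepare().bind().first()` interface:

```javascript
// Single-query word lookup against an edge-local SQLite database.
// Table/column names are illustrative, not Word Orb's actual schema.
async function lookupWord(db, word) {
  // One local query: no connection pool, no cross-region round-trip.
  return db
    .prepare('SELECT data FROM words WHERE word = ?')
    .bind(word.toLowerCase())
    .first(); // resolves to null when the word is not in the database
}
```

Inside a Worker, `db` would be the D1 binding (e.g. `env.DB`) configured in `wrangler.toml`.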
Why R2 for audio? 240,000 pronunciation files can't live in a database. R2 gives us S3-compatible object storage with no egress fees. Files are served through Cloudflare's CDN automatically.
Cache-Miss Generation
When someone requests a word that isn't in the database, Word Orb generates it in real time — definition, etymology, translations, pronunciation data, and age-appropriate tones. This takes 2-4 seconds on the first request. The result gets written to D1 and cached permanently. Every subsequent request hits the edge database at 3ms.
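The flow above is a read-through cache. A minimal sketch, with `cache`, `db`, and `generateWord` as stand-ins for KV, D1, and the real-time generation step — all names here are illustrative, not Word Orb's internals:

```javascript
// Read-through cache: edge cache first, then database, then
// generate-and-persist so the slow path only ever runs once per word.
async function getWord(word, { cache, db, generateWord }) {
  // 1. Hot path: edge cache (fast when warm).
  const cached = await cache.get(word);
  if (cached) return JSON.parse(cached);

  // 2. Edge database: permanent store of previously generated words.
  let entry = await db.get(word);

  // 3. Cache miss: generate once (the slow 2-4s path), persist forever.
  if (!entry) {
    entry = await generateWord(word);
    await db.put(word, entry);
  }

  await cache.put(word, JSON.stringify(entry));
  return entry;
}
```

Because step 3 writes back to both stores, every later request for the same word stays on the fast path — which is exactly how the dictionary grows organically.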
This means the dictionary grows organically. Start with 162K words, and it expands as users request new terms.
Why Audio Matters
Educational robots and voice assistants need to pronounce words correctly. TTS can approximate this, but it often gets pronunciation wrong for unfamiliar words, loanwords, and proper nouns.
Word Orb stores pronunciation audio files in R2, organized by word:
```
/audio/pronunciation/{word}.mp3
/audio/kelly_says/{word}.mp3
```
The pronunciation path gives you clean dictionary pronunciation. The kelly_says path gives you the word used in an educational sentence.
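A client might build those URLs like this; `AUDIO_BASE` is a hypothetical CDN host (the real serving domain isn't given in this post), while the two path styles come straight from the layout above:

```javascript
// Build an audio URL from the R2 path layout described above.
// AUDIO_BASE is a placeholder assumption, not the real host.
const AUDIO_BASE = 'https://cdn.example.com';

function audioUrl(word, style = 'pronunciation') {
  // 'pronunciation' -> clean dictionary pronunciation
  // 'kelly_says'    -> the word used in an educational sentence
  if (style !== 'pronunciation' && style !== 'kelly_says') {
    throw new Error(`Unknown audio style: ${style}`);
  }
  return `${AUDIO_BASE}/audio/${style}/${encodeURIComponent(word.toLowerCase())}.mp3`;
}
```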
Age-Appropriate Tones
Every word has four tone variants. This is the feature educational systems need most.
A robot explaining "mortality" to a 5-year-old:
"It means that every living thing has a beginning and an ending, like how flowers bloom and then go to sleep."
The same robot explaining "mortality" to an adult:
"The condition of being subject to death; the state of human finitude."
This is pedagogically informed content structuring. The child explanation uses concrete metaphors. The adult explanation uses precise terminology. The elder explanation acknowledges lived experience.
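One way a client might select among the four variants — the age breakpoints here are my illustrative assumption, not Word Orb's own mapping:

```javascript
// Map a listener's age to one of the four tone variants.
// Breakpoints (13, 20, 65) are illustrative assumptions.
function toneForAge(age) {
  if (age < 13) return 'child';
  if (age < 20) return 'teen';
  if (age < 65) return 'adult';
  return 'elder';
}

// Prefer the age-matched tone; fall back to the base definition.
function explain(wordObj, age) {
  const tones = wordObj.tones || {};
  return tones[toneForAge(age)] || wordObj.def;
}
```

The fallback matters for robustness: if a tone variant is ever missing, the robot still has something accurate to say.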
MCP Integration
Word Orb is an MCP server. One line in your config:
{"mcpServers":{"word-orb":{"url":"https://word-orb-api.nicoletterankin.workers.dev/mcp"}}}
Works with Claude Desktop, ChatGPT, Cursor, Gemini, and any MCP-compatible client.
npm Package
```shell
npm install @lotd/word-orb
```

```javascript
const WordOrb = require('@lotd/word-orb');

const orb = new WordOrb('wo_YOUR_KEY');      // your API key
const word = await orb.word('courage');      // one call, one structured word object
console.log(word.tones.child);               // "Being brave even when you feel scared inside."
```
Who This Is For
- Educational robot companies — pronunciation audio, age-appropriate language, multilingual support
- Language learning platforms — 47-language translations with native script and phonetics
- Teaching AI assistants — 12 pedagogical archetypes, verified definitions
- Any AI agent that talks to humans — deterministic vocabulary, not hallucinated
Try It
Free tier: 50 calls/day, no credit card. Instant API key.
- API: word-orb-api.nicoletterankin.workers.dev/pricing
- npm: @lotd/word-orb
- GitHub: nicoletterankin/word-orb
Word Orb is built by Lesson of the Day PBC and powers the vocabulary layer for 62,000 lesson segments across 11 languages and 3 age groups.