
Tomasz Lipinski


Beyond 'Correct Horse Battery Staple': Passphrases in Inflected Languages

A developer's journey through grammatical cases, gender agreements, and the unexpected complexity of "simple" memorable passwords

TL;DR: I built a passphrase generator for an inflected language. You can try it at niezgadniesz.pl (the name means "you won't guess it" in Polish—a fitting promise for a passphrase generator). Read on to learn why this was way harder than expected.

The Promise of Passphrases

You've probably seen the famous XKCD comic: correct horse battery staple is both stronger and more memorable than Tr0ub4dor&3. The math checks out. Four words drawn at random from a 7,776-word Diceware list give you about 51.7 bits of entropy (4 × log2 7776), enough to resist serious attacks while remaining human-friendly.
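
As a quick check of that figure, here is a minimal Python sketch. The function name `passphrase_entropy_bits` is illustrative, and it assumes every word is drawn uniformly and independently from the list.

```python
import math


def passphrase_entropy_bits(list_size: int, word_count: int) -> float:
    """Entropy in bits of word_count words drawn uniformly and independently."""
    return word_count * math.log2(list_size)


bits = passphrase_entropy_bits(7776, 4)
print(f"{bits:.1f} bits")  # 51.7 bits
```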

So I decided to build a passphrase generator for Polish speakers. How hard could it be?

Very hard, as it turns out. And the reasons why reveal something important about password security that most English-centric discussions completely miss.

The English Privilege in Password Security

English is what linguists call an analytic language. Words don't change much based on their grammatical role. "Horse" is "horse" whether it's the subject, object, or part of a prepositional phrase.

Polish is a synthetic language with a fusional morphology. This means:

  • 7 grammatical cases (nominative, genitive, dative, accusative, instrumental, locative, vocative)
  • 3 grammatical genders (masculine, feminine, neuter—with masculine further split into personal, animate, and inanimate)
  • Adjectives must agree with nouns in case, gender, and number
  • Verbs conjugate based on person, number, gender, and aspect

Let me show you what this means in practice.

A Simple Example That Isn't Simple

Take the English phrase: "correct horse"

Now let's translate it to Polish. "Correct" → "poprawny", "horse" → "koń"

But wait. What case should we use?

Case          "Correct horse" in Polish
Nominative    poprawny koń
Genitive      poprawnego konia
Dative        poprawnemu koniowi
Accusative    poprawnego konia
Instrumental  poprawnym koniem
Locative      poprawnym koniu
Vocative      poprawny koniu!

That's 7 case slots for the same two-word phrase, with 6 distinct surface forms (genitive and accusative coincide for an animate masculine noun like "koń"). And I haven't even touched the plural yet (another 7 slots).

The Entropy Problem

Here's where it gets interesting from a security perspective.

If your generator outputs words only in their dictionary form (nominative singular), native speakers will instantly recognize the disconnection:

"zielony pies szybki drzewo"
(green dog fast tree)

A Polish speaker sees the problem at once. Each word stands alone in its dictionary form, with no case agreement or natural flow: the masculine adjective "szybki" does not even agree with the neuter noun "drzewo" next to it.

Does this matter for security? Actually, no. The strength of a passphrase comes from the randomness of word selection, not from how natural it sounds. An attacker running through all possible four-word combinations doesn't care whether your phrase is grammatically correct—they're trying every combination regardless.

Does it matter for memorability? Potentially, yes. Our brains are pattern-matching machines that evolved to process language. Some research suggests we remember grammatically coherent phrases more easily because they form a single cognitive unit rather than four separate items. The phrase "the green dog runs quickly" might stick better than "green dog fast tree" simply because it engages our language processing in a more natural way.

However, this is where imageability becomes the key factor. Even a phrase with no grammatical structure becomes memorable if each word evokes a vivid mental picture. "Giraffe keyboard ocean sock" is not a sentence in any language, but you can probably visualize it right now: a giraffe typing on a keyboard floating in the ocean, wearing a sock. That mental image is your memory anchor, not the grammar.

The Implementation Nightmare

Let's say you want to generate grammatically plausible Polish phrases anyway. Here's what you need:

1. Part-of-Speech Tagging

You can't just throw words together. You need to know:

  • Nouns: "dom" (house), "kawa" (coffee)
  • Adjectives: "duży" (big), "gorąca" (hot)
  • Verbs: "biegnie" (runs), "śpi" (sleeps)
  • Participles: "biegnący" (running), "śpiący" (sleeping)

2. Gender Assignment

Every noun has an inherent gender that adjectives must match:

masculine: duży dom (big house)
feminine:  duża kawa (big coffee)  
neuter:    duże dziecko (big child)

Use the wrong ending and it's instantly recognizable as machine-generated.
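
A minimal sketch of gender agreement, using only the three pairs shown above. The names `NOUN_GENDER`, `DUZY_NOMINATIVE`, and `agree` are hypothetical, and a real generator would load this data from a lexicon rather than hard-code it.

```python
# Hypothetical mini-lexicon: each noun carries its inherent gender.
NOUN_GENDER = {"dom": "m", "kawa": "f", "dziecko": "n"}

# Nominative singular forms of the adjective "duży" (big), keyed by gender.
DUZY_NOMINATIVE = {"m": "duży", "f": "duża", "n": "duże"}


def agree(adjective_forms: dict[str, str], noun: str) -> str:
    """Pick the adjective form that matches the noun's gender."""
    gender = NOUN_GENDER[noun]
    return f"{adjective_forms[gender]} {noun}"


for noun in NOUN_GENDER:
    print(agree(DUZY_NOMINATIVE, noun))
```

The point of the design is that the noun drives the choice: the adjective is stored as a table of forms, and the noun's gender is the key into it.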

3. Case Declension Tables

For each adjective-noun pair, you need full declension tables. A single adjective like "duży" (big) fills 21 case-gender slots in the singular alone (7 cases × 3 genders), although many slots share a surface form:

Case   Masculine      Feminine   Neuter
Nom.   duży           duża       duże
Gen.   dużego         dużej      dużego
Dat.   dużemu         dużej      dużemu
Acc.   dużego/duży*   dużą       duże
Ins.   dużym          dużą       dużym
Loc.   dużym          dużej      dużym
Voc.   duży           duża       duże

*Accusative masculine depends on whether the noun is animate or inanimate. Yes, really.
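
One way to hold this table in code is a nested mapping with a special case for the masculine accusative. This is a sketch under assumptions: the key names (`m_anim`, `m_inan`) and the function `decline_duzy` are made up for illustration.

```python
# Singular declension of "duży", transcribed from the table above.
DUZY_SINGULAR = {
    "nom": {"m": "duży", "f": "duża", "n": "duże"},
    "gen": {"m": "dużego", "f": "dużej", "n": "dużego"},
    "dat": {"m": "dużemu", "f": "dużej", "n": "dużemu"},
    "acc": {"m_anim": "dużego", "m_inan": "duży", "f": "dużą", "n": "duże"},
    "ins": {"m": "dużym", "f": "dużą", "n": "dużym"},
    "loc": {"m": "dużym", "f": "dużej", "n": "dużym"},
    "voc": {"m": "duży", "f": "duża", "n": "duże"},
}


def decline_duzy(case: str, gender: str, animate: bool = False) -> str:
    """Return the singular form of 'duży' for a case and gender."""
    forms = DUZY_SINGULAR[case]
    if case == "acc" and gender == "m":
        # Masculine accusative follows the genitive for animate nouns
        # and the nominative for inanimate ones.
        return forms["m_anim" if animate else "m_inan"]
    return forms[gender]


distinct_forms = {form for row in DUZY_SINGULAR.values() for form in row.values()}
print(len(distinct_forms))  # 8
```

For example, `decline_duzy("acc", "m", animate=True)` gives "dużego" (as in "dużego psa"), while the inanimate default gives "duży" (as in "duży dom"). The 21 slots collapse to only 8 distinct spellings, which is why the case of a form often cannot be read off the word alone.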

4. Verb Conjugation

If you include verbs, they must agree with subjects in person and number—and in past tense, also in gender:

On biegł (He was running) - masculine
Ona biegła (She was running) - feminine  
Ono biegło (It was running) - neuter

Two Approaches (And Their Trade-offs)

After much experimentation, I identified two viable strategies:

Approach 1: Dictionary Forms Only

Just use nominative singular forms and accept that phrases will sound like disconnected word lists. This is the simplest implementation and maintains clear entropy calculations.

"żyrafa klawisz morze skarpeta"
(giraffe key sea sock)

Pros: Easy to implement, entropy is straightforward to calculate

Cons: Phrases might be slightly harder to remember for some users; feels unnatural to native speakers
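
A minimal sketch of this approach in Python, not the code behind any particular generator. The eight-word list is a placeholder (a real list would hold thousands of nominative-singular words), and the function names are illustrative. It uses the standard library's `secrets` module so that word selection comes from a cryptographically secure source.

```python
import math
import secrets

# Placeholder list of dictionary-form words taken from examples in this post.
WORDS = ["żyrafa", "klawisz", "morze", "skarpeta", "dom", "kawa", "pies", "drzewo"]


def generate_passphrase(words: list[str], count: int = 4) -> str:
    """Join `count` words chosen uniformly at random, with replacement."""
    return " ".join(secrets.choice(words) for _ in range(count))


def entropy_bits(words: list[str], count: int = 4) -> float:
    """Entropy of the scheme; duplicates in the list would not add entropy."""
    return count * math.log2(len(set(words)))


print(generate_passphrase(WORDS))
print(entropy_bits(WORDS))  # 12.0 for this 8-word placeholder list
```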

Approach 2: Template-Based Generation

Create grammatical templates that define sentence structure, then build word lists that match each slot. Words must agree in gender, case, and number within the template.

Template: [Adjective-NOM] [Noun-NOM] [Verb-3SG] [Adverb]
Example:  "Zielony        pies       biega      szybko"
          (Green dog runs quickly)

This requires separate word lists for each grammatical category:

adjectives_masc = ["zielony", "mały", "stary", ...]  # 500 words
nouns_masc = ["pies", "dom", "stół", ...]            # 800 words  
verbs_3sg = ["biega", "skacze", "śpi", ...]          # 300 words
adverbs = ["szybko", "cicho", "wysoko", ...]         # 200 words

# Entropy = log2(500) + log2(800) + log2(300) + log2(200)
#         = 8.97 + 9.64 + 8.23 + 7.64
#         = 34.48 bits per phrase

Pros: Natural-sounding output that native speakers find easier to remember; grammatically correct

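Cons: Requires extensive linguistic work to build categorized word lists; entropy per word is typically lower than with a single large list (the example above yields about 34.5 bits, versus about 51.7 bits for four words from a 7,776-word list); you must account for structural constraints in your security calculations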
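
The template idea can be sketched as one word list per slot, with one secure random choice per slot. The tiny lists and the names `SLOTS`, `generate`, and `entropy_bits` below are illustrative assumptions; only the list sizes in the final check come from the example above.

```python
import math
import secrets

# One list per slot of [Adjective-NOM] [Noun-NOM] [Verb-3SG] [Adverb].
# All adjectives and nouns are masculine, so any pairing agrees.
SLOTS = [
    ["zielony", "mały", "stary", "wysoki"],
    ["pies", "słoń", "dom"],
    ["biega", "skacze", "śpi"],
    ["szybko", "cicho", "wysoko"],
]


def generate(slots: list[list[str]]) -> str:
    """Fill each slot with an independent uniform choice."""
    words = [secrets.choice(slot) for slot in slots]
    return " ".join(words).capitalize()


def entropy_bits(slot_sizes: list[int]) -> float:
    """Entropy of one template: the sum of log2 of each slot's list size."""
    return sum(math.log2(size) for size in slot_sizes)


print(generate(SLOTS))
print(f"{entropy_bits([500, 800, 300, 200]):.2f}")  # 34.48
```

Agreement is enforced by construction here: the generator never inflects anything at run time, it only picks from lists that were pre-filtered to fit the slot.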

What I Actually Built

I went with Approach 2: Template-Based Generation. The implementation required building separate word lists categorized by part of speech and grammatical gender, then defining templates that ensure proper agreement.

The linguistic foundation came from leveraging a comprehensive Polish morphological dictionary—an open-source resource containing over 3 million word forms with full grammatical annotations. Instead of manually categorizing words, I could programmatically extract nouns by gender, filter verbs by conjugation pattern, and ensure every word form came with its complete grammatical metadata.
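
To give a feel for that extraction step, here is a rough sketch. The three-column, tab-separated layout (form, lemma, tags) and the tag strings below are assumptions for illustration, not the format of the dictionary the generator actually uses, and `nominative_nouns_by_gender` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical sample: form <TAB> lemma <TAB> pos:number:case:gender
SAMPLE = """\
pies\tpies\tsubst:sg:nom:m
psa\tpies\tsubst:sg:gen:m
kawa\tkawa\tsubst:sg:nom:f
dziecko\tdziecko\tsubst:sg:nom:n
duży\tduży\tadj:sg:nom:m
"""


def nominative_nouns_by_gender(text: str) -> dict[str, list[str]]:
    """Group nominative-singular noun forms by grammatical gender."""
    by_gender: dict[str, list[str]] = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        form, _lemma, tags = line.split("\t")
        pos, number, case, gender = tags.split(":")
        if pos == "subst" and number == "sg" and case == "nom":
            by_gender[gender].append(form)
    return dict(by_gender)


print(nominative_nouns_by_gender(SAMPLE))
```

A real morphological dictionary has a much richer tagset (animacy, aspect, degree, and so on), so the filter would need more fields than the four used here.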

Fun fact: my generator currently uses 197 grammatical categories, and that's fewer than 10% of all possible categories in Polish. The language really is that complex.

However, the implementation still isn't perfect. A morphological dictionary tells you how words inflect, but not which words naturally combine. It knows that "wysoki" (tall) is a masculine adjective and "słoń" (elephant) is a masculine noun, but it doesn't know that elephants can't really be "tall" in the same way humans can.

The result: most generated phrases are grammatically correct, but occasionally you get something that sounds slightly off to a native speaker. Building truly natural-sounding phrases would require additional semantic data: information about which adjectives typically modify which nouns, and which verbs make sense with which subjects. That's a much deeper rabbit hole.

Even so, most of the output reads like a real Polish sentence:

"Wysoki słoń biega cicho"
(Tall elephant runs quietly)

The key insight: when phrases sound like actual Polish sentences, users engage their natural language memory rather than treating each word as a separate item to memorize. The grammatical structure becomes a scaffold that holds the words together.

Lessons for Developers Building Multilingual Security Tools

1. Language Complexity Varies Wildly

What works for English won't work for Polish, Finnish, Hungarian, Arabic, or Japanese. Each language family brings unique challenges.

2. "Simple" Features Hide Linguistic Depth

"Just translate the word list" is never just translation. You're dealing with morphology, syntax, phonology, and cultural associations.

3. Entropy Calculations Must Account for Structure

If your generation method uses templates or grammatical patterns, your effective entropy depends on both word selection AND structural choices. Be explicit about this in your security calculations.
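
As a sketch of what "being explicit" can look like when a generator has several templates: the functions below assume the template is picked uniformly, the words within it are picked uniformly, and no phrase can be produced by two different templates. Both function names are illustrative.

```python
import math


def shannon_bits(phrases_per_template: list[int]) -> float:
    """Average-case entropy: log2(templates) plus the mean per-template bits."""
    count = len(phrases_per_template)
    mean_bits = sum(math.log2(n) for n in phrases_per_template) / count
    return math.log2(count) + mean_bits


def min_entropy_bits(phrases_per_template: list[int]) -> float:
    """Worst-case bits, set by the template with the fewest possible phrases."""
    return math.log2(len(phrases_per_template)) + math.log2(min(phrases_per_template))


sizes = [2**30, 2**20]  # two templates with very different phrase counts
print(shannon_bits(sizes))      # 26.0
print(min_entropy_bits(sizes))  # 21.0
```

The gap between the two numbers is the reason to report the conservative one: a phrase from the smaller template is far more likely than the average figure suggests, and an attacker would try those phrases first.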

4. User Testing is Non-Negotiable

Native speakers will immediately spot what feels wrong. Test with real users before assuming your solution works.

5. Invest in Linguistic Infrastructure

Building proper grammatical templates and categorized word lists takes significant upfront effort, but it pays off in user experience. Native speakers notice—and appreciate—when generated content respects their language's structure.

The Broader Point

Password security discussions are dominated by English-language perspectives. But the majority of internet users aren't native English speakers, and the tools we build should work for everyone.

Understanding how different languages handle word formation isn't just academically interesting—it's practically essential for building secure, usable authentication systems worldwide.

If you're building security tools, take the time to understand the linguistic landscape of your users. The "obvious" solution in your language might be completely wrong in another.


Have you built multilingual security tools? I'd love to hear about the linguistic challenges you've encountered. Drop a comment below!
