Eray Gündoğmuş

Scaling i18n Beyond Lazy Loading: What Framework Comparisons Miss About Real-World Localization

I've read a lot of "how we scaled our i18n" posts. They almost all follow the same arc: team evaluates frameworks, picks i18next or FormatJS, implements lazy loading per namespace, ships it, writes the blog post. Bundle size goes down, page load improves, everyone's happy.

I know because we wrote that exact post internally 18 months ago. Then we spent the next 12 months dealing with everything the framework comparison didn't cover.

This post is about those 12 months. Not the framework choice — the infrastructure, workflow, and operational problems that show up at scale but never make it into the comparison table.

The framework comparison trap

Most i18n framework evaluations look something like this:

| Criteria           | i18next       | FormatJS   | Polyglot |
|--------------------|---------------|------------|----------|
| Bundle size (core) | ~40KB         | ~32KB      | ~3KB     |
| Lazy loading       | Plugin-based  | Built-in   | Manual   |
| ICU support        | Plugin        | Native     | None     |
| React integration  | react-i18next | react-intl | Manual   |
| TypeScript         | Good          | Good       | Weak     |
| Community          | Large         | Large      | Small    |

This is accurate and useful for picking a library. It is almost entirely useless for predicting what your localization workflow will look like at 500+ keys, 6+ languages, and a team of 8+ developers shipping weekly.

Here's what the comparison table doesn't tell you:

  1. How will you detect unused translation keys after a refactor?
  2. What happens when two feature branches modify the same locale file?
  3. How long does it take to push a one-word translation fix to production?
  4. How will your translators get context about what they're translating?
  5. Who maintains the mapping between translation keys and the codebase?

These are the problems that actually consume engineering time. The framework is a pipe fitting. The plumbing is everything else.

Problem 1: The namespace illusion

Namespace splitting is the first optimization every team implements. Instead of one massive translation.json, you split into auth.json, dashboard.json, settings.json, etc.

// i18next namespace config
i18n.init({
  ns: ['common', 'auth', 'dashboard', 'settings', 'billing'],
  defaultNS: 'common',
  backend: {
    loadPath: '/locales/{{lng}}/{{ns}}.json',
  },
});

With lazy loading, each namespace loads on demand:

// Only loads 'dashboard' namespace when this route mounts
const DashboardPage = () => {
  const { t } = useTranslation('dashboard');
  return <h1>{t('title')}</h1>;
};

Bundle problem: solved. Performance: improved. Blog post: written.

But here's what happens next.

The namespace decision tax

Developer writes a new error component for the billing page. Where does billing.errors.cardDeclined go? In billing.json or errors.json? If you have both namespaces, now you have a naming convention meeting. If you decide it goes in billing.json, what happens when the same error also shows on the settings page where users update their payment method?

We tracked this over 4 sprints. Developers spent an average of 7 minutes per new key deciding which namespace it belonged to. With 15-20 new keys per sprint, that's 2+ hours of namespace deliberation per sprint across the team. And that's before the reviewer disagrees and asks for a rename.

The cross-namespace import problem

Components that span multiple features need multiple namespaces:

// A notification component that shows messages from different domains
const NotificationItem = ({ notification }) => {
  const { t: tBilling } = useTranslation('billing');
  const { t: tAuth } = useTranslation('auth');
  const { t: tDashboard } = useTranslation('dashboard');

  const getMessage = () => {
    switch (notification.domain) {
      case 'billing': return tBilling(`notifications.${notification.key}`);
      case 'auth': return tAuth(`notifications.${notification.key}`);
      case 'dashboard': return tDashboard(`notifications.${notification.key}`);
    }
  };

  return <p>{getMessage()}</p>;
};

This defeats the purpose of lazy loading. You're now loading 3 namespaces for a single component. Worse, you've created implicit coupling between namespaces and feature boundaries that breaks every time someone moves a component.

What actually works

Flat keys with convention-based prefixes, no namespace boundaries:

// No namespaces. One flat map. Convention handles organization.
const t = useTranslations('billing');
t('errors.cardDeclined');     // → "billing.errors.cardDeclined"
t('notifications.overdue');   // → "billing.notifications.overdue"

The namespace is a logical prefix, not a file boundary. No files to split. No lazy loading configuration per namespace. The entire locale payload is edge-cached on a CDN and delivered in a single request.
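How big is that single payload in practice? A quick way to estimate for your own key count — this sketch generates synthetic keys and copy, so the exact sizes it prints are illustrative, not measurements:

```typescript
import { gzipSync } from "node:zlib";

// Build a synthetic 1,000-key flat locale payload. Key shapes and copy
// length are invented; point this at your real locale data instead.
const messages: Record<string, string> = {};
for (let i = 0; i < 1000; i++) {
  messages[`billing.section${i % 20}.key${i}`] =
    `Example translation string number ${i} with a few words of copy`;
}

const raw = Buffer.from(JSON.stringify(messages));
const gzipped = gzipSync(raw);

// Gzip compresses the repetitive key prefixes very well
console.log(`raw: ${(raw.length / 1024).toFixed(1)} KB, gzipped: ${(gzipped.length / 1024).toFixed(1)} KB`);
```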

"But won't that be slow?" At 1,000 keys, a gzipped JSON payload is ~18KB. That's smaller than most hero images. At 5,000 keys, it's ~70KB. A single CDN request at ~2ms from the nearest PoP beats the waterfall of 8 namespace requests at ~40ms each, every time.

Problem 2: The merge conflict multiplier

If your translations live as JSON files in the repo, every feature branch that adds a key modifies the same file. With a team of 6 developers, you're touching en.json in every sprint, often in every PR.

Git's diff algorithm treats JSON as text. A single reordered key produces a multi-line diff. Two developers adding keys in the same file — even in different sections — produces a merge conflict because JSON doesn't have a natural merge strategy.

// Developer A adds key at line 847
"billing.newPlan.title": "Choose your plan",

// Developer B adds key at line 848
"billing.upgrade.cta": "Upgrade now",

// Git: CONFLICT

Namespace splitting reduces conflict frequency but doesn't eliminate it. With 5 namespaces and 6 developers, you still get 1-2 conflicts per sprint. Each conflict requires someone to manually resolve a JSON file, re-validate the JSON syntax, and re-run the translation CI check.
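If you must keep files in the repo for a while, one stopgap is merging locale JSON at the key level instead of the line level, so two added keys never conflict. A minimal three-way merge sketch for flat key-to-string maps (illustrative only; deletion-vs-edit cases are simplified):

```typescript
// Key-level three-way merge: a conflict only occurs when both branches
// change the SAME key to different values.
type Flat = Record<string, string>;

function merge3(base: Flat, ours: Flat, theirs: Flat): { merged: Flat; conflicts: string[] } {
  const merged: Flat = { ...ours };
  const conflicts: string[] = [];
  for (const [key, theirVal] of Object.entries(theirs)) {
    const ourVal = ours[key];
    if (ourVal === undefined || ourVal === base[key]) {
      merged[key] = theirVal; // they added or changed it; we didn't touch it
    } else if (ourVal !== theirVal && theirVal !== base[key]) {
      conflicts.push(key); // both sides changed the same key differently
    }
  }
  for (const key of Object.keys(base)) {
    // they deleted a key we left untouched → accept the deletion
    if (!(key in theirs) && ours[key] === base[key]) delete merged[key];
  }
  return { merged, conflicts };
}

// The two additions from the example above merge cleanly:
const result = merge3(
  { "billing.title": "Billing" },
  { "billing.title": "Billing", "billing.newPlan.title": "Choose your plan" },
  { "billing.title": "Billing", "billing.upgrade.cta": "Upgrade now" }
);
console.log(result.conflicts.length); // → 0
```

Wiring this up as a git merge driver is possible, but it is exactly the kind of plumbing the CDN-first approach removes entirely.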

What actually works

Remove JSON files from the repo entirely.

// Translations fetched from CDN at request time
import { getMessages } from '@better-i18n/use-intl/server';

export default async function Layout({ children, params: { locale } }) {
  const messages = await getMessages({
    project: 'my-app',
    locale,
  });

  return (
    <IntlProvider locale={locale} messages={messages}>
      {children}
    </IntlProvider>
  );
}

No locales/ directory in the repo. No JSON files to conflict. Keys are managed on a platform and served via CDN. If you want a Git record, the platform creates archive PRs — but they're documentation, not source of truth.

The Git PR flow inverts:

Before:  Code → JSON files → TMS → JSON files → Merge conflicts
After:   Code → Platform → CDN (live) + Git PR (archive)

Problem 3: The ghost key problem

After 6 months of active development, we had 2,400 translation keys. After running a dead code analysis, 340 of them (14%) weren't referenced anywhere in the codebase.

These ghost keys accumulated over multiple refactors. When you rename a component from <OnboardingWizard> to <SetupFlow>, the code changes but nobody remembers to delete onboarding.wizard.step1.title from the JSON file. Why? Because deleting a key feels risky. What if it's used somewhere you didn't check? What if another component imports it dynamically?

So the keys stay. And your translators translate all of them. At $0.08/word across 6 languages, those 340 ghost keys cost us roughly $1,100 in unnecessary translation work over 6 months. That's before you count the translator time wasted on context questions about strings that don't correspond to any visible UI.

Why regex-based detection fails

The obvious fix is to scan the codebase for key references. Most teams write a script:

# Find keys in JSON that aren't referenced in source
grep -r "onboarding.wizard" src/ --include="*.tsx"

This misses:

  • Namespaced keys: t('step1.title') inside a component that called useTranslations('onboarding.wizard') — the full key is onboarding.wizard.step1.title, but the source code only has step1.title.
  • Dynamic keys: t(`error.${code}`) — you can't statically resolve the full key.
  • Passed references: <Child t={t} /> — the key usage is in the child, but the namespace binding is in the parent.

What actually works: AST scanning

Parse the actual syntax tree, not the text:

$ npx @better-i18n/cli scan

Scanning src/ for translation keys...

Found 2,412 keys in project
  ├── 2,072 matched (used in code)
  ├── 340 unused (safe to remove)
  └── 18 dynamic (manual review needed)

Dynamic keys (cannot statically resolve):
  src/components/ErrorBanner.tsx:14  →  t(`error.${code}`)
  src/lib/notifications.ts:28       →  t(`notification.${type}.title`)
  ...

The AST parser:

  1. Finds all useTranslations('namespace') calls and tracks the bound variable
  2. Follows the variable through the component scope
  3. When it sees t('key'), resolves the full qualified key: namespace.key
  4. Cross-references against the platform's key inventory
  5. Flags dynamic keys (template literals, computed values) for manual review

This runs in CI. Every PR gets a check: "This PR removes 3 key references. The following keys will become unused: [list]." No ghost keys accumulate.
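The resolution steps above can be sketched with the TypeScript compiler API. This toy version handles a single file and one namespace binding, without the cross-scope variable tracking a real scanner needs (the actual CLI's internals aren't shown in this post):

```typescript
import * as ts from "typescript";

// Toy AST scanner: resolve t('key') calls against the nearest
// useTranslations('ns') binding, and flag non-literal arguments
// (template literals, computed values) for manual review.
function extractKeys(source: string): { resolved: string[]; dynamic: string[] } {
  const sf = ts.createSourceFile("file.tsx", source, ts.ScriptTarget.Latest, true, ts.ScriptKind.TSX);
  const resolved: string[] = [];
  const dynamic: string[] = [];
  let namespace = "";

  const visit = (node: ts.Node): void => {
    if (ts.isCallExpression(node) && ts.isIdentifier(node.expression)) {
      const arg = node.arguments[0];
      if (node.expression.text === "useTranslations" && arg && ts.isStringLiteral(arg)) {
        namespace = arg.text; // remember the namespace this t() is bound to
      } else if (node.expression.text === "t" && arg) {
        if (ts.isStringLiteral(arg)) {
          resolved.push(namespace ? `${namespace}.${arg.text}` : arg.text);
        } else {
          dynamic.push(arg.getText(sf)); // template literal → manual review
        }
      }
    }
    ts.forEachChild(node, visit);
  };
  visit(sf);
  return { resolved, dynamic };
}

const sample = `
  const t = useTranslations('billing');
  t('errors.cardDeclined');
  t(\`error.\${code}\`);
`;
console.log(extractKeys(sample));
```

Even this toy catches what the grep misses: the fully qualified key billing.errors.cardDeclined and the dynamic template literal.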

Problem 4: The 47-minute typo fix

Your German translator finds that "Weiter" (next) should be "Fortfahren" (continue) on the checkout page. One word.

With file-based i18n, pushing that fix requires:

  1. Open translation file or TMS
  2. Edit the string
  3. If TMS: wait for sync → PR bot → merge → build → deploy
  4. If file-based: edit JSON → commit → push → build → deploy

Either way, a one-word change triggers a full deployment pipeline. CI runs, tests execute, containers build, CDN cache purges. Minimum 5 minutes. Average 15-45 minutes depending on your pipeline.

Now multiply this by 3-5 translation fixes per week. Your deployment pipeline is spending 15-25% of its runs on copy changes that don't touch a single line of application code.

What actually works: CDN-decoupled translations

Translation fix flow:
  1. Edit string in dashboard
  2. Click "Publish"
  3. CDN cache invalidates globally (~2 seconds)
  4. Next user request serves the corrected translation

Time: under 30 seconds. No build. No deploy. No PR.

For the architecture-curious:

Browser → Next.js Server Component
  → getMessages({ project, locale })
    → CDN Edge (cache HIT: ~2ms / MISS: ~40ms)
      → Platform API → Database

The response is a flat JSON object, edge-cached with stale-while-revalidate. Even on cache miss, the server doesn't block — it serves stale content and revalidates in the background.
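The serve-stale-then-revalidate behavior fits in a few lines. This in-memory version is illustrative only — in production the CDN's Cache-Control: stale-while-revalidate directive does this work, and error handling is omitted:

```typescript
// Serve the cached value immediately; refresh in the background once stale.
function makeSWR<T>(fetcher: () => Promise<T>, maxAgeMs: number): () => Promise<T> {
  let cached: { value: T; fetchedAt: number } | undefined;
  let inflight: Promise<void> | undefined;

  return async () => {
    const now = Date.now();
    if (cached && now - cached.fetchedAt < maxAgeMs) return cached.value; // fresh hit
    if (cached) {
      // Stale: kick off one background revalidation, answer immediately.
      if (!inflight) {
        inflight = fetcher().then((value) => {
          cached = { value, fetchedAt: Date.now() };
          inflight = undefined;
        });
      }
      return cached.value;
    }
    // Cold cache: the only path that blocks on the network.
    const value = await fetcher();
    cached = { value, fetchedAt: now };
    return value;
  };
}

// Hypothetical fetcher standing in for the CDN request
const getCachedMessages = makeSWR(async () => ({ "billing.title": "Billing" }), 60_000);
```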

For SSR/RSC, messages are fetched once in the layout and serialized into the RSC payload. No additional CDN request from the client. Hydration uses the pre-fetched messages.

// Server: one CDN fetch per request
const messages = await getMessages({ project: 'app', locale });

// Client: zero CDN fetches, messages arrive via RSC payload
<IntlProvider locale={locale} messages={messages}>
  {children}
</IntlProvider>

Problem 5: The context black hole

Translators see a key and a source string:

Key: checkout.summary.total_label
Value: "Total"

They don't see the UI. "Total" in English is unambiguous. In German, it's three different words depending on context:

  • Table column header → "Gesamt"
  • Button text → "Gesamtbetrag"
  • Invoice line item → "Summe"

Your translator picks one. It's wrong. You get a bug report from a German user. A developer screenshots the UI, explains the context in Slack, the translator fixes it, and the fix goes through the deployment pipeline.

This cycle repeats 5-10 times per sprint. Each cycle involves a developer context-switch (15 min), a Slack thread (30 min round-trip), and a deployment (15-45 min). That's 1-2 hours per mistranslation.

What reduces this

  1. AI translation with glossary enforcement. Define domain-specific terms once:
| Term (EN)    | Term (DE)       | Context                    |
|--------------|-----------------|----------------------------|
| Total        | Gesamt          | Summary labels, headers    |
| Workspace    | Arbeitsbereich  | Not "Arbeitsplatz"         |
| Dashboard    | Dashboard       | Keep English (UX standard) |
| Provider     | Anbieter        | Service provider context   |

When AI translates a new key, it checks the glossary first. "Total" always becomes "Gesamt" in summary contexts. Consistency that's mechanically impossible when human translators work independently across 800+ keys.
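Glossary enforcement doesn't have to be elaborate to be useful. A minimal post-translation check (hypothetical data shapes — a real check would handle German inflection and per-context terms, not exact substrings):

```typescript
// Hypothetical glossary shape; the platform's real data model isn't
// shown in the post.
const glossary = [
  { en: "Total", de: "Gesamt" },
  { en: "Workspace", de: "Arbeitsbereich" },
];

// Flag translations where a known source term appears but its approved
// target rendering does not.
function glossaryViolations(source: string, translation: string): string[] {
  return glossary
    .filter((term) => source.includes(term.en) && !translation.includes(term.de))
    .map((term) => `"${term.en}" should be rendered as "${term.de}"`);
}

console.log(glossaryViolations("Total due", "Summe fällig"));
// → [ '"Total" should be rendered as "Gesamt"' ]
```

Run at translation time, a check like this turns terminology drift from a bug report into a blocked save.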

  2. Screenshot context on keys. Attach a visual reference to each key so translators see the component, not just the string.

  3. Review workflow with diff. Translators see what changed, not the entire file. When a developer changes "Total" to "Subtotal", the translator sees the before/after in context.

Problem 6: The type safety gap

This one is specific to TypeScript projects. Your translation keys are strings. Your IDE doesn't know which keys exist. You type t('header.buttn_text') with a typo, and you find out when a user reports a blank button.

// No error at compile time. No error at build time. Error at runtime.
const title = t('dashbord.title'); // typo: should be 'dashboard.title'

i18next ships its own TypeScript definitions and can type-check keys via CustomTypeOptions module augmentation, if you configure it correctly. FormatJS has similar support. But both require you to generate type definitions from your JSON files and keep them in sync. That's another CI step, another file to maintain, another thing that can go stale.

What actually works

Generate types from the source of truth (the platform), not from local files:

$ npx @better-i18n/cli generate-types

Generated src/i18n.d.ts with 2,072 typed keys

// IDE autocomplete for every key. Red squiggly on typos.
const t = useTranslations('dashboard');
t('title');        // ✅ autocomplete
t('titl');         // ❌ TypeScript error: 'titl' not in DashboardKeys
t('stats.users');  // ✅ autocomplete with nested keys

The type file is generated from the platform's key inventory, which is the same data the CDN serves. One source of truth, zero drift.
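The mechanism behind that red squiggly is ordinary string literal unions. A self-contained sketch of the idea (the generated file's exact shape is an assumption; DashboardKeys here is hand-written for illustration):

```typescript
// Keys as a string literal union: a typo becomes a compile-time error
// instead of a blank button in production.
type DashboardKeys = "title" | "stats.users";

const dashboardMessages: Record<DashboardKeys, string> = {
  title: "Dashboard",
  "stats.users": "Active users",
};

// t() only accepts keys that exist in the table it was built from
function makeT<K extends string>(table: Record<K, string>): (key: K) => string {
  return (key) => table[key];
}

const t = makeT(dashboardMessages);
console.log(t("title")); // compiles and prints "Dashboard"
// t("titl");  // ❌ TypeScript error: '"titl"' is not assignable to DashboardKeys
```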

The real scaling checklist

Framework comparisons evaluate the 10% of i18n work that involves writing t() calls. Here's what you should actually evaluate when your project passes 500 keys:

| Question | File-based answer | CDN-first answer |
|----------|-------------------|------------------|
| How do you detect unused keys? | Manual audit or regex script | AST scanner in CI |
| How do you fix a translation typo? | Full deploy (15-45 min) | CDN publish (30 sec) |
| How do you avoid merge conflicts? | Namespace splitting (reduces, doesn't eliminate) | No locale files in repo |
| How do translators get context? | Screenshot in Slack | Attached to key, AI-assisted |
| How do you ensure key type safety? | Generated types from local JSON (drift risk) | Generated types from platform (single source) |
| How long is your translation pipeline? | Dev → TMS → PR → Merge → Build → Deploy | Dev → Platform → CDN |
| What's your cost per language added? | Linear: new files, new conflicts, new translations | Sublinear: glossary-enforced AI + human review |

The migration is not the hard part

If you're currently running i18next or FormatJS with file-based translations, the migration to a CDN-first model is straightforward:

  1. Export your current JSON files (you already have them)
  2. Import into a translation platform via CLI
  3. Replace i18next.init({ backend: { loadPath } }) with getMessages({ project, locale })
  4. Remove the locales/ directory from your repo
  5. Run the AST scanner to verify key coverage

Steps 1-4 took our team about 4 hours. Step 5 found 340 unused keys in 30 seconds.

The framework you use for t() calls barely changes. The infrastructure around it changes completely.

What I'd recommend today

  1. Pick any modern i18n library. i18next, FormatJS, or next-intl — they all handle the t() function well. This is the least important decision you'll make.

  2. Don't split into namespace files. Use logical key prefixes (billing.errors.cardDeclined) without file boundaries. Let the CDN serve everything in one cached response.

  3. Set up AST scanning from day one. The longer you wait, the more ghost keys accumulate. Run it in CI so unused keys are caught in the PR that removes them.

  4. Decouple translations from deployments immediately. A translation fix should never trigger a build. This alone will save your team hours per week.

  5. Build a glossary before you translate. A glossary of 40 domain-specific terms produces more consistent translations than a 10-page style guide nobody reads.

  6. Measure developer time on i18n. Track it for one sprint. Count the merge conflicts, the Slack threads with translators, the namespace debates, the deployment time on copy-only changes. The number will be higher than you expect.

Our team uses Better i18n for CDN-delivered translations, AST key scanning, and AI translation with glossary enforcement. If you're evaluating your i18n infrastructure, the free tier is enough to test whether the CDN-first model works for your workflow.
