DEV Community

Sanusi Hassan
Translating an entire multilingual site shouldn't mean re-prompting an LLM for every file

A dark code editor showing a file tree where content-eng.ts branches into language files content-ara.ts, content-fra.ts, content-spa.ts and others, next to a code snippet containing a username placeholder.

If you have ever built a multilingual site, you have probably hit this wall. You build the whole thing in one language first, usually English, and only then do you turn to translation. And that second half of the job is far more tedious than it has any right to be.

This post is about a specific, annoying corner of localization work: translating structured content files and code, and why the usual "just paste it into an LLM" approach quietly wastes a lot of your time.

The setup most of us end up with

A common pattern for a multilingual site is a content directory with one file per language. Something like:

src/content/
  content-eng.ts
  content-ara.ts
  content-fra.ts
  content-spa.ts
  content-hin.ts
  content-zho.ts

You write the whole site against the default language file, content-eng.ts, get everything working, and then the translation phase begins. Each of those other files has to end up as a faithful translation of the English one, with the exact same structure: same keys, same nesting, same TypeScript types, same interpolation placeholders. Only the human-readable string values should change.
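To make "same structure" concrete, here is a minimal sketch of what two of those files might contain. The `SiteContent` type, the key names, and the `keyPaths` helper are all hypothetical, invented for illustration; a shared type lets the compiler catch a missing key, and the helper gives a runtime comparison for files that come back from a translator.

```typescript
// Hypothetical shape of a content file; the key names are illustrative only.
type SiteContent = {
  nav: { home: string; pricing: string };
  welcomeMessage: string; // carries the {username} placeholder
};

// content-eng.ts: the source of truth.
const contentEng: SiteContent = {
  nav: { home: "Home", pricing: "Pricing" },
  welcomeMessage: "Welcome back, {username}",
};

// content-spa.ts: same keys, same nesting, same placeholder; only values differ.
const contentSpa: SiteContent = {
  nav: { home: "Inicio", pricing: "Precios" },
  welcomeMessage: "Bienvenido de nuevo, {username}",
};

// Collect dotted key paths so two locale objects can be compared at runtime.
function keyPaths(value: unknown, prefix = ""): string[] {
  if (typeof value !== "object" || value === null) return [prefix];
  const record = value as Record<string, unknown>;
  const paths: string[] = [];
  for (const key of Object.keys(record)) {
    paths.push(...keyPaths(record[key], prefix ? `${prefix}.${key}` : key));
  }
  return paths;
}

// keyPaths(contentEng) → ["nav.home", "nav.pricing", "welcomeMessage"]
```

Comparing `keyPaths(contentEng)` with `keyPaths(contentSpa)` is a cheap way to confirm that a translated file did not gain, lose, or rename a key.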

On paper this is simple. In practice it is death by a thousand repetitions.

Why the "just use an LLM" answer gets old fast

LLMs are genuinely good at this kind of translation. They understand that welcomeMessage: "Welcome back, {username}" should become welcomeMessage: "Bienvenido de nuevo, {username}" and that the placeholder must stay untouched. They preserve structure, they handle context, they get the tone right.
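Even when the model gets this right, it is worth checking mechanically. The sketch below compares the placeholders in a source string with those in its translation. It assumes the `{name}` brace syntax from the example above; the function names are hypothetical, and other i18n formats would need a different pattern.

```typescript
// Find {name}-style tokens. The brace syntax is an assumption based on the
// {username} example; adjust the pattern for other interpolation formats.
function extractPlaceholders(text: string): string[] {
  const matches = text.match(/\{[A-Za-z_][A-Za-z0-9_]*\}/g);
  return (matches ?? []).slice().sort();
}

// True when both strings contain exactly the same placeholders.
function samePlaceholders(source: string, translated: string): boolean {
  const a = extractPlaceholders(source);
  const b = extractPlaceholders(translated);
  return a.length === b.length && a.every((token, i) => token === b[i]);
}

// A faithful translation keeps the token untouched:
// samePlaceholders("Welcome back, {username}", "Bienvenido de nuevo, {username}") → true
```

A translation that turned `{username}` into `{nombre de usuario}` would fail this check, which is the kind of error that otherwise only shows up at runtime.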

The problem is not capability. The problem is the workflow around it.

Every time you open a new chat to translate a file, you are re-establishing the same context from scratch:

  • "Keep all the keys and structure identical."
  • "Do not translate the placeholders like {username} or {count}."
  • "Leave URLs, code identifiers, and technical terms alone."
  • "Return valid TypeScript, not markdown, not commentary."
  • "Translate into Arabic." (then French, then Spanish, then...)

You paste the file, you wait, you copy the result back, you check it compiles, you move to the next language, and you do the entire dance again. For one site with six languages, that is the same prompt repeated five times. For multiple sites, or every time you update the source content, it multiplies. New tab, same prompts, same copy-paste, same verification. It is the definition of redundant work, and "the AI is doing it" does not make the process any less repetitive.
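One way to stop retyping those rules is to write them down once and generate the prompt. This is only a sketch: `TRANSLATION_RULES` and `buildTranslationPrompt` are hypothetical names, and the exact wording of a prompt that works well will depend on the model you use.

```typescript
// The rules from the list above, stated once and reused for every file.
const TRANSLATION_RULES: readonly string[] = [
  "Keep all the keys and structure identical.",
  "Do not translate the placeholders like {username} or {count}.",
  "Leave URLs, code identifiers, and technical terms alone.",
  "Return valid TypeScript, not markdown, not commentary.",
];

// Hypothetical helper: builds the full prompt for one target language.
function buildTranslationPrompt(source: string, targetLanguage: string): string {
  return [
    ...TRANSLATION_RULES,
    `Translate into ${targetLanguage}.`,
    "", // blank line between the instructions and the file contents
    source,
  ].join("\n");
}
```

Calling `buildTranslationPrompt(fileContents, "Arabic")` and then `buildTranslationPrompt(fileContents, "French")` gives identical instructions with only the language line changed, so the rules cannot drift between chats.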

What you actually want

Step back and the requirement is clear. You want to:

  1. Hand over the source file once.
  2. Specify the target languages once.
  3. Get back one correctly-structured, translated file per language.
  4. Have placeholders, keys, and code left exactly as they were.

In other words, you want the LLM's translation ability without re-typing the instructions for every file and every language.
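As a rough sketch, that requirement is a single function: one source, a list of languages, one output per language. The `TranslateFn` type and `translateToAll` are hypothetical; the translator is passed in as a parameter because this sketch does not assume any particular LLM API or service.

```typescript
// Hypothetical signature for whatever performs the translation (an LLM API
// call, a CLI, a hosted service). It is injected so the sketch stays neutral.
type TranslateFn = (source: string, targetLanguage: string) => Promise<string>;

// Translate one source file into every target language in a single call.
async function translateToAll(
  source: string,
  targetLanguages: readonly string[],
  translate: TranslateFn
): Promise<Map<string, string>> {
  const pairs = await Promise.all(
    targetLanguages.map(async (language) => {
      const translated = await translate(source, language);
      return [language, translated] as const;
    })
  );
  // Map of language code -> translated file contents.
  return new Map(pairs);
}
```

The caller hands over the source and the language list once, then writes each entry of the returned map to its own file, for example `content-fra.ts` for the `fra` entry.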

Treating code and content files as translatable documents

This is the angle that made me look at the problem differently. A content-eng.ts file is, in a sense, just a document to be translated, with strict rules about what may change and what may not. The same is true for a lot of code: comments, user-facing strings, and documentation should be translated, while syntax, identifiers, and structure must be preserved exactly.

LLM-based translation is well suited to this because it works from a large context window rather than chopping the input into isolated sentences. It can see the whole file at once, understand that a {username} token is a placeholder and not a word to translate, and keep the surrounding structure intact.

This is the part of DocTranslating that is genuinely useful for developers. Alongside the usual document formats, its Gemini engine handles code files such as .ts, .js, and .py, translating the human-readable parts while leaving the code itself alone. You give it the file and the target languages, and you get back translated files without re-stating the rules each time. It also supports translating one file into multiple target languages in a single pass, which is exactly the "one source, many outputs" shape this problem has.

It will not replace a careful review of locale files for a production app, and it should not. But for the bulk grunt-work of getting from one language to many, it removes the part that makes the job feel like punishment: the repetition.

A few honest caveats

A couple of things worth being straight about, because localization has sharp edges.

LLM translation works with per-file context, not whole-project context. If a term needs to be translated consistently across many files, you still need to enforce that yourself, for example by giving the same terminology instructions with every file or by keeping a glossary. The model does not magically remember decisions from a file it translated an hour ago.
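A glossary can be as simple as a lookup table plus a check run over each translated file. The sketch below is deliberately crude: the `Glossary` shape, the sample entries, and `glossaryViolations` are hypothetical, and plain substring matching will miss inflected forms, so treat it as a first filter before human review.

```typescript
// Hypothetical glossary: source term -> required translation per language code.
type Glossary = Record<string, Record<string, string>>;

// Illustrative entries only; a real glossary comes from your own product terms.
const glossary: Glossary = {
  Dashboard: { fra: "Tableau de bord", spa: "Panel" },
};

// Report source terms whose required translation is missing from the output.
// Matching is case-insensitive substring search, which is an approximation.
function glossaryViolations(
  source: string,
  translated: string,
  language: string,
  terms: Glossary
): string[] {
  const sourceLower = source.toLowerCase();
  const translatedLower = translated.toLowerCase();
  return Object.keys(terms).filter((term) => {
    const required = terms[term][language];
    if (required === undefined) return false; // no rule for this language
    return (
      sourceLower.includes(term.toLowerCase()) &&
      !translatedLower.includes(required.toLowerCase())
    );
  });
}
```

Running the same check against every file gives the cross-file consistency that the model cannot provide on its own.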

And for right-to-left languages like Arabic, translation of the strings is the easy part; making sure your UI actually renders RTL correctly is a separate front-end concern that no translation tool solves for you.
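The smallest piece of that front-end work is telling the page which direction to use. The sketch below maps a language code to a direction, assuming the three-letter codes from the file names above; the `RTL_LANGUAGES` set and `textDirection` are hypothetical, and the set is illustrative rather than exhaustive.

```typescript
// Three-letter codes in the style of the content file names.
// Illustrative, not exhaustive: Arabic, Hebrew, Persian, Urdu.
const RTL_LANGUAGES: ReadonlySet<string> = new Set(["ara", "heb", "fas", "urd"]);

// Return the value to use for the HTML dir attribute.
function textDirection(language: string): "rtl" | "ltr" {
  return RTL_LANGUAGES.has(language) ? "rtl" : "ltr";
}

// In a browser this could be applied to the root element, for example:
//   document.documentElement.dir = textDirection("ara");
```

Setting the direction is only the starting point. Layout, icons, and any CSS that hard-codes left and right still need their own review.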

The takeaway

The interesting shift here is conceptual. Once you start treating your content files and code as documents to be translated under strict structural rules, the redundant per-file, per-language prompting disappears. The LLM was never the bottleneck. The workflow around it was.

If your localization process currently looks like opening a fresh tab for every file and re-typing the same instructions, that is the part worth fixing first.
