Judy Su

Posted on Apr 3

How to Translate JSON, XLIFF, .po, .strings, and i18n Files Without Breaking Your Format

#ai #softwaredevelopment #tooling #tutorial

Posted under: Localization, Developer Tools, AI Translation

If you've ever shipped a multilingual app, you already know the pain.

You export your en.json file, paste it into Google Translate or DeepL, and get back... a blob of plain text. No keys. No structure. Just translated strings floating in a void, completely detached from the format your app expects. Now you have to manually reconstruct the file, re-match every key, and pray nothing got reordered.

Or you go the enterprise route — a translation management system (TMS) with a $300/month minimum, a three-week onboarding process, and a dedicated "integration consultant." For a one-time product update or a small startup going global, that's overkill by a mile.

There's a gap here, and it's a frustrating one. This article covers the practical options for translating structured files — JSON, XLIFF, Markdown, CSV, .po, and more — and why preserving format is the part that actually matters.

Why "just translate the text" doesn't work for i18n files

When you're internationalizing an app, your strings live inside structured files. A React app might use a JSON file like this:

{
  "onboarding": {
    "welcome": "Welcome back, {{name}}!",
    "cta": "Continue to dashboard",
    "skip": "Skip for now"
  },
  "errors": {
    "network": "Connection failed. Please try again.",
    "auth": "Your session has expired."
  }
}

Translating this file has some non-obvious requirements:

Keys must be preserved exactly. Your app references onboarding.welcome — if the translator renames it or flattens the structure, your app breaks.
Placeholders must survive untouched. {{name}} is code, not text. A translator who doesn't know this will turn it into {{nombre}} or drop it entirely.
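Nesting must be maintained. Nested JSON is not the same as flat JSON. If the file loses its hierarchy, a lookup like onboarding.welcome no longer resolves, and the app shows raw keys, falls back to another language, or errors out at runtime.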
The file must be valid after translation. Malformed JSON (missing commas, unescaped quotes) causes hard failures.
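These four requirements can be checked mechanically after any translation pass. The following is a minimal sketch, not a complete validator: it assumes placeholders use the {{name}} style shown above, and the function names (leaf_strings, check_translation) are made up for this example.

```python
import json
import re

# Assumption: placeholders look like {{name}}, as in the example file above.
PLACEHOLDER = re.compile(r"\{\{\s*\w+\s*\}\}")

def leaf_strings(node, path=()):
    """Yield (path, string) for every string value in nested dicts and lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from leaf_strings(value, path + (key,))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from leaf_strings(value, path + (index,))
    elif isinstance(node, str):
        yield path, node

def check_translation(source_text, translated_text):
    """Return a sorted list of problems; an empty list means every check passed."""
    try:
        translated = json.loads(translated_text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    source = dict(leaf_strings(json.loads(source_text)))
    target = dict(leaf_strings(translated))
    problems = []
    for path in source.keys() - target.keys():
        problems.append("missing key: " + ".".join(map(str, path)))
    for path in target.keys() - source.keys():
        problems.append("unexpected key: " + ".".join(map(str, path)))
    for path in source.keys() & target.keys():
        before = sorted(PLACEHOLDER.findall(source[path]))
        after = sorted(PLACEHOLDER.findall(target[path]))
        if before != after:
            problems.append("placeholder mismatch: " + ".".join(map(str, path)))
    return sorted(problems)
```

Because key paths include every level of nesting, a flattened or renamed key shows up as one missing path plus one unexpected path.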

The same logic applies to XLIFF files, which are used by Xcode, Phrase, Crowdin, Smartling, and Articulate Rise — among many others. XLIFF is XML-based, which means misplaced tags or broken attributes silently corrupt your translation file. .po files for Gettext, Android XML, .arb files for Flutter, .strings for iOS — they all carry structure that generic translation tools destroy.

The options, honestly evaluated

Option 1: Manual translation via a freelancer or agency

Best for: legal documents, marketing copy, anything where nuance is critical

Worst for: structured files with lots of strings, fast turnaround, variable volume

Freelancers and agencies work in CAT tools designed for human translators, not for structured file formats. The file gets handed off, worked on in a translation environment, and handed back — usually in the same format. But the cost scales with word count and language pair, turnaround is days to weeks, and coordinating across 5+ languages requires real project management.

Option 2: Enterprise TMS (Phrase, Crowdin, Smartling, Lokalise)

Best for: large teams with ongoing localization pipelines

Worst for: small teams, one-time projects, startups without a localization budget

These platforms are excellent and battle-tested. They support nearly every common file format, have GitHub integrations, built-in machine translation, glossaries, and QA checks. They also start at $150–300/month before you've translated a single string, require weeks of setup, and assume you have a dedicated localization program manager.

If you're a solo developer shipping a side project in 5 languages, this is not your tool.

Option 3: Google Translate / DeepL with copy-paste

Best for: casual phrases, quick checks, understanding foreign content

Worst for: anything structured

You already know this doesn't work. You paste in your JSON, the structure disappears, and you spend the next hour reformatting. These tools are built for human-readable text, not machine-readable files.

Option 4: Writing a script with the OpenAI or Claude API

Best for: developers comfortable with Python/Node, teams with engineering bandwidth

Worst for: non-technical users, teams that need it now

This is the DIY approach: write a parser, extract strings, send them to an LLM in batches, reassemble the output, validate the file. Done well, this works great. But it takes time to build, maintain, and debug — especially when you're handling edge cases like placeholder preservation, nested structures, and RTL language support.
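As a rough illustration of that DIY loop, here is a sketch of the parse, batch, and reassemble steps for a JSON file whose top level is an object. The translate_batch argument is a hypothetical callable you would write yourself around whichever LLM API you use; no real API is called here, and placeholder protection and retries are left out.

```python
import copy
import json

def collect_paths(node, path=()):
    """Return the path to every string value in nested dicts and lists."""
    if isinstance(node, dict):
        return [p for k, v in node.items() for p in collect_paths(v, path + (k,))]
    if isinstance(node, list):
        return [p for i, v in enumerate(node) for p in collect_paths(v, path + (i,))]
    return [path] if isinstance(node, str) else []

def get_at(node, path):
    """Follow a path of keys and indexes down to a value."""
    for step in path:
        node = node[step]
    return node

def set_at(node, path, value):
    """Overwrite the value at a path, leaving everything else untouched."""
    for step in path[:-1]:
        node = node[step]
    node[path[-1]] = value

def translate_json(source_text, translate_batch, batch_size=50):
    """Translate only string values; keys, nesting, and non-strings stay as-is.

    translate_batch is a hypothetical function: list of strings in,
    list of translated strings out, in the same order.
    """
    data = json.loads(source_text)
    result = copy.deepcopy(data)
    paths = collect_paths(data)
    for start in range(0, len(paths), batch_size):
        chunk = paths[start:start + batch_size]
        originals = [get_at(data, p) for p in chunk]
        translated = translate_batch(originals)
        if len(translated) != len(originals):
            raise ValueError("translator returned a different number of strings")
        for path, text in zip(chunk, translated):
            set_at(result, path, text)
    return json.dumps(result, ensure_ascii=False, indent=2)
```

The model never sees keys or structure, only a list of strings, so the only thing it can get wrong is the text itself. The length check catches the common failure where a batch comes back with merged or dropped items.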

What actually solves the format problem

The key insight is that the hard part of i18n file translation isn't the translation itself — modern AI models (Claude, GPT-4o, Gemini) do that very well. The hard part is round-tripping the file format: extracting translatable strings, sending them through the model, and reassembling the result into an identical structure with only the text values changed.

A tool that does this correctly will:

Parse the source file to extract only the text strings (not keys, not placeholders, not tags)
Batch strings efficiently to minimize API costs
Preserve all non-translatable content (interpolation variables, HTML tags, format codes)
Validate the output file before returning it
Return a file in exactly the same format as the input
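One common way to handle the "preserve non-translatable content" step is to swap each placeholder or tag for a neutral token before translation and restore it afterwards. The sketch below shows that idea only; the regex covers a few assumed placeholder styles, the [[0]] token format is an arbitrary choice for this example, and nothing here describes how any particular product does it.

```python
import re

# Assumed styles: {{name}}, {name}, printf-style %s / %d / %1$s / %@, simple HTML tags.
PROTECTED = re.compile(r"\{\{.*?\}\}|\{\w+\}|%(?:\d+\$)?[sd@]|</?\w+[^>]*>")

def mask(text):
    """Replace each protected span with a numbered token; return (masked, spans)."""
    spans = []

    def stash(match):
        spans.append(match.group(0))
        return f"[[{len(spans) - 1}]]"

    return PROTECTED.sub(stash, text), spans

def unmask(text, spans):
    """Restore the original spans; fail loudly if a token was lost or duplicated."""
    for index, span in enumerate(spans):
        token = f"[[{index}]]"
        if text.count(token) != 1:
            raise ValueError(f"token {token} missing or duplicated in: {text!r}")
        text = text.replace(token, span)
    return text
```

Numbered tokens let the translated sentence reorder its placeholders, which matters for languages with different word order, while the count check in unmask turns a silently dropped variable into an explicit error.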

Summon Translator is built around this exact problem. You upload a file — JSON, XLIFF, Markdown, CSV, .po, .strings, Android XML, .arb, .properties, or a scanned PDF — pick your target languages and AI model, and download translated files in the same format you uploaded. No reformatting. No copy-pasting. No manual reconstruction.

The model choice matters: you can pick Claude Sonnet for high-quality nuanced translation, GPT-4o Mini for cost efficiency, Gemini 2.0 Flash for speed, or DeepSeek for budget-conscious bulk runs. The platform shows you the exact cost before anything runs, so there are no surprise bills.

For a mobile app update with 500 strings across 5 languages, the total cost is around $27. For a 12-document hospital consent form package across 4 languages, around $89. Pay-per-use, no subscription required.

A practical walkthrough: translating a React i18n JSON file

Here's what the flow looks like for a typical developer use case.

Starting point: You have en.json with your English strings and need Spanish, French, German, and Japanese versions.

Step 1: Upload en.json to Summon Translator.

Step 2: Select target languages: Spanish (ES), French (FR), German (DE), Japanese (JA).

Step 3: Choose your AI model. For product UI strings, Claude Sonnet or GPT-4o gives the most natural-sounding output. If you're translating a large file and cost is a priority, GPT-4o Mini or Gemini Flash work well.

Step 4: Review the cost preview. For 500 strings across 4 languages, you're looking at roughly $5–8 depending on model.

Step 5: Download. You get back es.json, fr.json, de.json, ja.json — all in the same structure as your input, with all keys intact, all placeholders preserved, all nesting maintained.

Total time: under 5 minutes. No engineering work beyond the upload.

XLIFF-specific considerations

XLIFF (XML Localization Interchange File Format) deserves a separate mention because it's the standard for most professional localization workflows — and it's where format-preservation matters most.

An XLIFF 1.2 file contains <source> and <target> elements, wrapped in <trans-unit> tags with attributes like id, resname, and approved (XLIFF 2.0 uses <unit> and <segment> elements instead). A correct XLIFF translation populates the <target> elements while leaving everything else untouched. An incorrect one might:

Translate the id attribute (breaking the file)
Modify <ph> (placeholder) tags
Drop <mrk> (marker) elements
Change the XML declaration or encoding

Tools like Xcode, Phrase, and Articulate Rise 360 are strict about this. If the XLIFF isn't well-formed or if structural elements have been altered, the import will fail — sometimes silently.

Summon Translator handles XLIFF by parsing the XML structure and operating only on the text content of <source> elements, leaving all tags, attributes, and structural elements intact in the output. For eLearning content built in Articulate Rise, this is particularly relevant — the exported XLIFF from Rise has complex tag nesting that breaks most generic approaches.
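To make the general idea concrete, here is a minimal sketch using Python's standard library. It is not Summon Translator's implementation. It assumes XLIFF 1.2, takes a hypothetical translate callable, and deliberately skips any unit whose <source> contains inline tags such as <ph> or <mrk>, since handling those correctly needs more care than fits here.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"  # XLIFF 1.2 namespace

def fill_targets(xliff_text, translate):
    """Populate <target> for each plain-text <trans-unit>; leave the rest alone.

    translate is a hypothetical function: one source string in, one
    translated string out.
    """
    ET.register_namespace("", XLIFF_NS)  # keep the default namespace unprefixed on output
    root = ET.fromstring(xliff_text)
    ns = {"x": XLIFF_NS}
    for unit in root.findall(".//x:trans-unit", ns):
        source = unit.find("x:source", ns)
        if source is None or len(source) > 0 or not source.text:
            continue  # inline tags or empty source: skipped in this sketch
        target = unit.find("x:target", ns)
        if target is None:
            target = ET.Element(f"{{{XLIFF_NS}}}target")
            # <target> belongs directly after <source>
            unit.insert(list(unit).index(source) + 1, target)
        target.text = translate(source.text)
    # Note: this serialization omits the XML declaration, and ElementTree
    # discards comments while parsing, so re-add or preserve those separately.
    return ET.tostring(root, encoding="unicode")
```

The ids, resname values, and <source> text are never written to, so the only difference an importing tool should see is the new <target> content.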

When to use which approach

Scenario	Recommended approach
One-time file, small word count	Summon Translator (free trial)
Ongoing pipeline, large team	TMS (Phrase, Crowdin, Lokalise)
Marketing copy, legal documents	Human translator + CAT tool
Developer with time to build	Custom LLM script
Quick check, no structure needed	DeepL / Google Translate

Final thought

The localization tooling market has a huge gap between "paste text into DeepL" and "implement a full TMS." Most developers and small teams live in that gap — they need structured file translation that preserves format, costs per-use rather than per-month, and takes minutes rather than days.

If you're in that gap, Summon Translator offers 1,000 free words to try it. No credit card is required; use code 1TIME at checkout.

Upload your JSON. Get your JSON back. Translated.

Tags: i18n, localization, JSON translation, XLIFF, file translation, AI translation, developer tools, internationalization

DEV Community