If you've been building features with Claude Code or Cursor, you know the feeling. You're in flow, the agent is writing components, wiring up routes, adding tests. Then you hit the translation strings. Which German word should we use for "workspace"? Is it "Arbeitsbereich", or did we decide to keep it as "Workspace"? Who should review this?
And even when you get the source strings right, the translations themselves are often a separate step. Someone coordinates them, someone else reviews them, the feature sits in a PR waiting. With AI coding agents speeding up development, this gap only gets wider.
I've been working in this space for a while, and the thing that made the biggest difference was simple: take the translation knowledge your team already holds collectively, structure it, and give your coding agent access to it.
The problem isn't translating, it's consistency and brand voice
AI can translate. That's not the hard part anymore. The hard part is getting translations that sound like your product, consistently, when multiple people and agents are all contributing to the same codebase.
Think about what happens without a shared glossary and style guide:
- Developer A's agent translates "Workspace" as "Arbeitsbereich", Developer B's agent keeps it as "Workspace"
- One PR uses formal German ("Sie"), another uses informal ("du")
- A contractor pastes in ChatGPT translations with a completely different tone
- Your native speaker on the team corrects the same term for the third time this month
The app ends up feeling like it was translated by five different people, because it was. Your brand voice gets lost the moment it crosses a language boundary. And every new language multiplies the problem.
This is why a glossary and style guide matter more than the translation engine. They're the single source of truth that keeps your brand voice consistent regardless of who or what writes the code.
Start with a glossary and style guide
Before you automate anything, sit down with your team and make some decisions. These are the things that no one, not your coding agent, not a new team member, not a translation tool, can easily figure out on their own:
- Which terms should never be translated? Product names, feature names, technical terms your users know in English. "Workspace", "Dashboard", "API" might all stay as-is in German.
- Which terms have a specific translation? Maybe "Save" is always "Speichern", not "Sichern". Maybe "team member" becomes "Teammitglied", not "Mitarbeiter". These are the terms your native speakers have opinions about.
- What's the tone? Formal or informal? In German that's the difference between "Sie" and "du", in French between "vous" and "tu". This needs to be consistent across your whole app.
- Any language-specific decisions? Scandinavian languages often sound better with a natural spoken register rather than formal written style. Japanese needs the right politeness level. These are things worth writing down once.
This is a conversation with your PM, your content person, the native speakers on your team. It doesn't take long, but the decisions need to be explicit, not just in someone's head. People usually have opinions and examples ready once you ask.
Once you have this in place, document it. Agents or tools that use it will produce translations that actually sound like your product. Without it, even the best AI will just guess differently every time.
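One way to make those decisions machine-readable is a glossary file checked into the repo, with a small script that flags violations. The structure and field names below are my own illustration, not any particular tool's format:

```python
# glossary_check.py -- illustrative glossary shape plus a consistency check.
# The field names here are made up for this example.

GLOSSARY = {
    "do_not_translate": ["Workspace", "Dashboard", "API"],
    "enforced": {  # source term -> required German translation
        "Save": "Speichern",
        "team member": "Teammitglied",
    },
    "tone": {"de": "informal (du)", "fr": "informal (tu)"},
}

def check_translation(source: str, translated: str) -> list[str]:
    """Return glossary violations for a single source/translation pair."""
    problems = []
    for term in GLOSSARY["do_not_translate"]:
        if term.lower() in source.lower() and term not in translated:
            problems.append(f'"{term}" must stay untranslated')
    for term, required in GLOSSARY["enforced"].items():
        if term.lower() in source.lower() and required not in translated:
            problems.append(f'"{term}" must be translated as "{required}"')
    return problems

# Flags both the translated "Workspace" and the missing "Speichern":
print(check_translation("Save your Workspace", "Sichern Sie Ihren Arbeitsbereich"))
```

Even a file this small is enough to turn "someone's head" into something an agent or a script can enforce.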
A nice thing this opens up is automation. The actual translations into target languages can then happen in CI. For example, a GitHub Action picks up new strings on the PR, translates them with glossary enforcement, and commits them back. By the time someone reviews, all languages are there. Think of it like linting or tests, but for translations.
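The "lint" half of that pipeline is simple enough to sketch. Assuming flat JSON locale files in a `locales/` directory with `en.json` as the source of truth (all assumptions for this example), a CI step could fail the build whenever a target language is missing keys:

```python
# ci_translation_check.py -- a sketch of the "lint for translations" idea.
# The locales/ layout and flat-JSON assumption are mine, not a given tool's.
import json
from pathlib import Path

def missing_keys(source: dict, target: dict) -> set[str]:
    """Keys that exist in the source locale but not in a target locale."""
    return set(source) - set(target)

def main(locale_dir: str = "locales", source_lang: str = "en") -> int:
    source = json.loads((Path(locale_dir) / f"{source_lang}.json").read_text())
    exit_code = 0
    for path in sorted(Path(locale_dir).glob("*.json")):
        if path.stem == source_lang:
            continue
        missing = missing_keys(source, json.loads(path.read_text()))
        if missing:
            exit_code = 1  # non-zero exit fails the CI job
            print(f"{path.name}: missing {sorted(missing)}")
    return exit_code
```

A real pipeline would translate the missing keys and commit them back instead of just failing, but the detection step looks roughly like this either way.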
You could wire this up yourself
And for a small project with a few languages, you should. Call an LLM API, write a validation script, keep a glossary file in your repo. There are open source tools like i18n-ai-translate and attranslate that let you bring your own API key and run translations locally. Good starting points.
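If you go the DIY route, the key move is feeding your glossary into the prompt instead of hoping the model guesses right. A minimal sketch, where the prompt wording and glossary shape are my own and the actual LLM call is left to whichever API you use:

```python
# Build a translation prompt that carries the team's glossary decisions.
# The glossary structure here is a made-up example, not a standard format.

def build_prompt(text: str, target_lang: str, glossary: dict) -> str:
    """Assemble a translation prompt from source text plus glossary rules."""
    keep = ", ".join(glossary.get("do_not_translate", []))
    enforced = "; ".join(
        f'"{src}" -> "{dst}"' for src, dst in glossary.get("enforced", {}).items()
    )
    tone = glossary.get("tone", {}).get(target_lang, "neutral")
    return (
        f"Translate the following UI string into {target_lang}.\n"
        f"Keep these terms untranslated: {keep}.\n"
        f"Use these fixed translations: {enforced}.\n"
        f"Tone: {tone}.\n"
        f"String: {text}"
    )

prompt = build_prompt(
    "Save your Workspace",
    "de",
    {
        "do_not_translate": ["Workspace"],
        "enforced": {"Save": "Speichern"},
        "tone": {"de": "informal (du)"},
    },
)
# `prompt` now carries every decision the team wrote down;
# send it to your LLM of choice.
```

The translation quality then depends far more on what goes into that prompt than on which model you call.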
But the translation itself is only one piece. The harder parts come later:
- Consistency over time: How do you make sure translations stay consistent as your glossary evolves and new people join the team?
- Quality validation: Who catches broken placeholders, wrong formality levels, glossary violations, or tone drift?
- Compound learning: When someone on your team corrects a translation, does that correction inform future translations?
- Non-technical review: Can your PM or content person review and edit translations without opening a JSON file?
- Keeping it running: You built it, now you maintain it. Every edge case, every new language, every LLM API change is on your team. That's time spent on translation infrastructure instead of your product.
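Of those, placeholder breakage is the easiest to automate. A minimal check, assuming ICU-style `{name}` placeholders:

```python
# Verify a translation preserves the source string's placeholders.
import re

PLACEHOLDER = re.compile(r"\{[^{}]+\}")

def placeholders_match(source: str, translated: str) -> bool:
    """True when the translation contains exactly the source's placeholders."""
    return sorted(PLACEHOLDER.findall(source)) == sorted(PLACEHOLDER.findall(translated))

print(placeholders_match("Hello, {name}!", "Hallo, {name}!"))  # True
print(placeholders_match("Hello, {name}!", "Hallo, {nane}!"))  # False: placeholder name mangled
```

Formality and tone drift are much harder to catch this way, which is where the review and feedback loop below comes in.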
What I'm building to solve this
It's not worth rebuilding this infrastructure for every project, so I built Localhero.ai: a translation service for real product teams, not translators. The core idea is that everything compounds: every translation you ship, every correction someone makes, every glossary term you add feeds into making the next translation better.
One cool part is how it connects to coding agents. There's an agent skill that loads your glossary, tone settings, and naming conventions dynamically every time your agent works on translation-related code. Install it with npx skills add localheroai/agent-skill and it works with Claude Code, Cursor, and any agent that reads skills.sh skill files. If your PM adds a new term to the glossary or adjusts the style guide, every developer's agent picks that up on the next task. No syncing, no Slack messages.
On the CI side, a GitHub Action ensures everything is in sync. Every PR that touches locale files gets translated automatically with glossary enforcement, translation memory, and language-specific rules for things like German formality or Scandinavian natural register. Just write the source language and let CI handle the localization.
There's a lot more going on under the hood than just calling an LLM. Quality checks catch broken placeholders, glossary violations, and tone drift before anything lands in your PR. Every PR gets a dedicated review page where anyone on the team can edit translations inline or apply suggested fixes. And when someone corrects a translation, that decision feeds back into translation memory, so the system learns from your team's choices over time. It's like working with a translator that actually remembers what you decided last month.
It works with JSON (React, Next.js, Vue), YAML (Rails), and PO files (Django, Python). The agent skill and CLI are open source, and there's a free plan if you want to try it on a real project.
The compounding part
The thing I like most about this setup is that it gets better on its own. Every translation you ship, every correction your team makes, every glossary term you add feeds forward. Six months in, the system knows your voice better than any new hire would. That's the part you can't get from a script and an API call.
If you're shipping in multiple languages and it still feels like a separate workstream, start with the glossary. Get your team's decisions documented, give that context to your agent, and see what happens. And if you want help setting it up, reach out, happy to help.