Mario García for Let's Talk! Open Source

Posted on Feb 27

App Localization with Python and Argos Translate

#python #tutorial

Last year, I worked on localizing a platform from Spanish to English. The strings were stored in JSON files within a directory called es, and the goal was to generate the same files translated into English and save them in a directory named en. Here's an article I wrote on how I optimized the process by generating the initial localization using Python and DeepL.

However, DeepL is not Open Source and its usage is limited depending on the plan you choose. So, what’s a good Open Source alternative? Argos Translate—and in this article, I’ll show you how to use it for your localization projects.

Argos Translate is an Open Source tool that uses OpenNMT for translations and works offline. They also offer LibreTranslate, an API built on top of Argos Translate that doesn't require creating an account.

It can be used as a Python library, a command-line tool, or a GUI application. For this workflow, it’s recommended to use the Python library along with the API provided by LibreTranslate.

Install Dependencies

Before importing Argos Translate in your Python script, install it by running the following command:

pip install argostranslate

Make sure that compatible versions of the following dependencies are installed:

urllib3
charset-normalizer
chardet

To install the versions that worked for this setup, run:

pip install --upgrade --force-reinstall urllib3==1.26.19 charset-normalizer==2.1.1 chardet==4.0.0

If you want to try LibreTranslate, install it by executing:

pip install libretranslate

Download Language Model

Language models can be installed via the Python library, downloaded using the command-line tool, or obtained manually from the Argos Translate Package Index.

To install a model with the command-line tool, run:

argospm install translate-es_en

Here, es is the source language, and en is the target language.

If you use argospm, downloading the model from Python is optional, but here's the script to do it:

import argostranslate.package

from_code = "es"
to_code = "en"

argostranslate.package.update_package_index()
available_packages = argostranslate.package.get_available_packages()
package_to_install = next(
    filter(
        lambda x: x.from_code == from_code and x.to_code == to_code, available_packages
    )
)
argostranslate.package.install_from_path(package_to_install.download())

First, import the required library: argostranslate.package.

Set source and target language:

from_code = "es"
to_code = "en"

Update the package index and get a list of the available packages:

argostranslate.package.update_package_index()
available_packages = argostranslate.package.get_available_packages()

Find the package to install by filtering the available packages for the language pair:

package_to_install = next(
    filter(
        lambda x: x.from_code == from_code and x.to_code == to_code, available_packages
    )
)

And finally, install the model:

argostranslate.package.install_from_path(package_to_install.download())
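
One thing to keep in mind: next() raises StopIteration when no package matches the language pair. Here's a minimal sketch of a safer lookup. find_package is a hypothetical helper, not part of Argos Translate, and the SimpleNamespace objects only stand in for real packages so the example runs without downloading anything.

```python
from types import SimpleNamespace


def find_package(available_packages, from_code, to_code):
    """Return the first package for the language pair, or None if there is none."""
    return next(
        (
            pkg
            for pkg in available_packages
            if pkg.from_code == from_code and pkg.to_code == to_code
        ),
        None,  # default value, so next() does not raise StopIteration
    )


# Hypothetical stand-ins exposing the same from_code / to_code attributes
# that the filter in the script above relies on.
packages = [
    SimpleNamespace(from_code="en", to_code="es"),
    SimpleNamespace(from_code="es", to_code="en"),
]

match = find_package(packages, "es", "en")
missing = find_package(packages, "es", "xx")
```

With the real package list, you would pass available_packages and check for None before calling download().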

Using the Command-line Tool

The command-line tool works for translating text directly, but it can't translate JSON files. To translate a string, use it this way:

argos-translate --from es --to en "¡Hola mundo!"

You'll get the following output: Hey, World!

Using the Python Library

Suppose you have the following content in a JSON file:

{
  "tipo-perfil": {
    "label": "Tipo de perfil",
    "description": "Tipo de perfil",
    "tooltip": "Tipo de perfil",
    "validations": {
        "required": "El campo Tipo de perfil es requerido",
        "minMessage": "El número de caracteres debe ser de al menos {min}",
        "maxMessage": "El número de caracteres debe ser máximo de {max}",
        "regexMessage": "Formato de Tipo de perfil inválido"
    }
  }
}

Since the language model is already installed, you can translate the strings in the JSON file with the following Python script:

import json
import argostranslate.translate

installed_languages = argostranslate.translate.get_installed_languages()
spanish = next(filter(lambda l: l.code == "es", installed_languages))
english = next(filter(lambda l: l.code == "en", installed_languages))
translation = spanish.get_translation(english)

def translate_json(obj, translator):
    if isinstance(obj, dict):
        return {k: translate_json(v, translator) for k, v in obj.items()}
    elif isinstance(obj, list):
        return [translate_json(i, translator) for i in obj]
    elif isinstance(obj, str):
        return translator.translate(obj)
    else:
        return obj

with open("input.json", "r", encoding="utf-8") as f:
    data = json.load(f)

translated_data = translate_json(data, translation)

with open("translated.json", "w", encoding="utf-8") as f:
    json.dump(translated_data, f, indent=2, ensure_ascii=False)

Import the required libraries: json and argostranslate.translate.

Load the installed languages and get the Spanish-to-English translation object:

installed_languages = argostranslate.translate.get_installed_languages()
spanish = next(filter(lambda l: l.code == "es", installed_languages))
english = next(filter(lambda l: l.code == "en", installed_languages))
translation = spanish.get_translation(english)

Open the JSON file and read the content:

with open("input.json", "r", encoding="utf-8") as f:
    data = json.load(f)

Call the function that translates the content of the JSON file:

translated_data = translate_json(data, translation)

And save the result in a new file:

with open("translated.json", "w", encoding="utf-8") as f:
    json.dump(translated_data, f, indent=2, ensure_ascii=False)

The recursive translate_json function takes an object that can be a dictionary, list, or string. Strings are translated directly, while dictionaries and lists are processed recursively to translate all nested strings, ensuring the entire JSON is translated correctly regardless of depth.
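
To see the recursion at work without installing a model, here's a small sketch that runs the same function with a stand-in translator. UppercaseTranslator is hypothetical: it only mimics the translate(text) method of the real translation object, so uppercase text stands in for translated text.

```python
class UppercaseTranslator:
    """Hypothetical stand-in with the same translate(text) method as the real object."""

    def translate(self, text):
        return text.upper()


def translate_json(obj, translator):
    # Same logic as the function above: recurse into dicts and lists,
    # translate strings, and return any other value unchanged.
    if isinstance(obj, dict):
        return {k: translate_json(v, translator) for k, v in obj.items()}
    elif isinstance(obj, list):
        return [translate_json(i, translator) for i in obj]
    elif isinstance(obj, str):
        return translator.translate(obj)
    else:
        return obj


sample = {
    "label": "Tipo de perfil",
    "options": ["uno", "dos"],
    "validations": {"required": "requerido", "min": 3, "enabled": True, "hint": None},
}

result = translate_json(sample, UppercaseTranslator())
```

Notice that keys are never translated, and numbers, booleans, and null values pass through untouched. Only string values at any depth reach the translator.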

If you have multiple JSON files in a folder and subfolders, you can extend the script to process all of them automatically.

Add the os module to the imports.

import os

Set input and output folders:

input_folder = "es"
output_folder = "en"
os.makedirs(output_folder, exist_ok=True)

Then, use the following code to recursively translate all JSON files while preserving the folder structure:

for root, dirs, files in os.walk(input_folder):
    for filename in files:
        if filename.endswith(".json"):
            input_path = os.path.join(root, filename)
            # Keep the same subfolder structure in the output folder
            relative_path = os.path.relpath(input_path, input_folder)
            output_path = os.path.join(output_folder, relative_path)
            os.makedirs(os.path.dirname(output_path), exist_ok=True)

            with open(input_path, "r", encoding="utf-8") as f:
                data = json.load(f)
            translated_data = translate_json(data, translation)
            with open(output_path, "w", encoding="utf-8") as f:
                json.dump(translated_data, f, indent=2, ensure_ascii=False)

The code above replaces this single-file block:

with open("input.json", "r", encoding="utf-8") as f:
    data = json.load(f)

translated_data = translate_json(data, translation)

with open("translated.json", "w", encoding="utf-8") as f:
    json.dump(translated_data, f, indent=2, ensure_ascii=False)
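
If you prefer pathlib over os.walk, the same idea can be wrapped in a reusable function. This is only a sketch: translate_tree is a hypothetical helper, and it takes any callable that turns parsed JSON data into translated data, so the demo below uses an uppercase stand-in instead of a real model.

```python
import json
import tempfile
from pathlib import Path


def translate_tree(input_folder, output_folder, translate_obj):
    """Translate every .json file under input_folder into output_folder.

    translate_obj is any callable that takes parsed JSON data and returns
    the translated data. Returns the list of files written.
    """
    input_root = Path(input_folder)
    output_root = Path(output_folder)
    written = []
    for input_path in sorted(input_root.rglob("*.json")):
        # Mirror the subfolder structure of the input in the output folder
        output_path = output_root / input_path.relative_to(input_root)
        output_path.parent.mkdir(parents=True, exist_ok=True)
        data = json.loads(input_path.read_text(encoding="utf-8"))
        output_path.write_text(
            json.dumps(translate_obj(data), indent=2, ensure_ascii=False),
            encoding="utf-8",
        )
        written.append(output_path)
    return written


# Small demo in a temporary directory, with uppercase standing in for translation.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "es"
    (source / "forms").mkdir(parents=True)
    (source / "forms" / "perfil.json").write_text('{"label": "hola"}', encoding="utf-8")
    written = translate_tree(
        source, Path(tmp) / "en", lambda d: {k: v.upper() for k, v in d.items()}
    )
    relative_outputs = [p.relative_to(tmp).as_posix() for p in written]
    demo_result = json.loads(written[0].read_text(encoding="utf-8"))
```

With the local model from the earlier script, the call could look like translate_tree("es", "en", lambda data: translate_json(data, translation)).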

Using the LibreTranslate API

You can also translate your JSON files using the LibreTranslate API instead of using local models. The workflow is almost identical to the previous script, with only a few changes.

Remove argostranslate.translate from imports and add requests.

import json
import requests

Create the translate_text function, set the source and target languages, and send a request to the API.

def translate_text(text):
    source = "es"
    target = "en"
    # Send the text to the API as a JSON body
    response = requests.post(
        "https://translate.argosopentech.com/translate",
        json={"q": text, "source": source, "target": target},
        headers={"Content-Type": "application/json"}
    )
    # Stop on HTTP errors instead of failing later with a KeyError
    response.raise_for_status()
    return response.json()["translatedText"]
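
A request can also fail for reasons that have nothing to do with your code, like rate limits. Here's a minimal sketch of reading the response body more defensively. extract_translation is a hypothetical helper, and it assumes that a failed response carries an "error" key in its JSON body, so check that against the instance you use.

```python
def extract_translation(payload):
    """Return the translated text from a decoded JSON response body.

    Assumes a successful body has a "translatedText" key (as used above)
    and a failed one has an "error" key; the latter is an assumption.
    """
    if "error" in payload:
        raise RuntimeError(f"Translation failed: {payload['error']}")
    return payload["translatedText"]


# Hypothetical response bodies, for illustration only.
ok_text = extract_translation({"translatedText": "Profile type"})

try:
    extract_translation({"error": "Too many requests"})
except RuntimeError as exc:
    error_message = str(exc)
```

Inside translate_text, you would call it as extract_translation(response.json()).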

The translate_json function will now call translate_text, which uses the API, instead of the local translator.

def translate_json(obj):
    if isinstance(obj, dict):
        return {k: translate_json(v) for k, v in obj.items()}
    elif isinstance(obj, list):
        return [translate_json(i) for i in obj]
    elif isinstance(obj, str):
        # Each string is sent to the API through translate_text
        return translate_text(obj)
    else:
        return obj
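
Since every string now costs one HTTP request, repeated strings (like "Tipo de perfil" in the sample file) are worth caching. Here's a minimal sketch with functools.lru_cache from the standard library. slow_translate is a hypothetical stand-in for translate_text that records its calls instead of contacting the API.

```python
from functools import lru_cache

calls = []


def slow_translate(text):
    """Hypothetical stand-in for translate_text; records each call."""
    calls.append(text)
    return text.upper()


@lru_cache(maxsize=None)
def cached_translate(text):
    # Only the first call for a given string reaches slow_translate;
    # later calls with the same string are answered from the cache.
    return slow_translate(text)


strings = ["Tipo de perfil", "Tipo de perfil", "Tipo de perfil", "Formato inválido"]
translated = [cached_translate(s) for s in strings]
```

Four strings go in, but only two calls are made. In the real script, the same decorator could be applied to translate_text directly.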

If you're translating a single file, open it, load the data, call the translate_json function, and save the result in another file:

with open("input.json", "r", encoding="utf-8") as f:
    data = json.load(f)

translated_data = translate_json(data)

with open("translated.json", "w", encoding="utf-8") as f:
    json.dump(translated_data, f, indent=2, ensure_ascii=False)

If you want to translate multiple JSON files recursively, use the following code:

for root, dirs, files in os.walk(input_folder):
    for filename in files:
        if filename.endswith(".json"):
            input_path = os.path.join(root, filename)
            relative_path = os.path.relpath(input_path, input_folder)
            output_path = os.path.join(output_folder, relative_path)
            os.makedirs(os.path.dirname(output_path), exist_ok=True)

            with open(input_path, "r", encoding="utf-8") as f:
                data = json.load(f)

            translated_data = translate_json(data)

            with open(output_path, "w", encoding="utf-8") as f:
                json.dump(translated_data, f, indent=2, ensure_ascii=False)

This loop relies on the os module and on the input and output folders, so add the following before it:

import os

input_folder = "es"
output_folder = "en"
os.makedirs(output_folder, exist_ok=True)

Conclusion

In this article, you learned how to use Argos Translate and LibreTranslate to simplify translating JSON-based applications.
