Building a Language Translation Model with an LLM

#learnai #oxlo #ai

We are building a translation utility that turns an LLM into a domain-aware document translator. It preserves formatting, respects a custom glossary, and runs on Oxlo.ai's flat per-request pricing, so sending a full page costs the same as a single sentence. If you localize docs, apps, or support content, this is a pipeline you can ship today.

What you'll need

Python 3.10 or newer.
The OpenAI SDK. Install it with pip install openai.
An Oxlo.ai API key from https://portal.oxlo.ai. The free tier gives you 60 requests per day and a 7-day full-access trial, which is plenty for testing.

Step 1: Set Up the Oxlo.ai Client

I always verify the connection before I write business logic. This snippet points the OpenAI SDK at Oxlo.ai and makes a quick call with qwen-3-32b. I chose that model because Oxlo.ai lists it as strong for multilingual reasoning and agent workflows, but you can swap in llama-3.3-70b or kimi-k2.6 without changing any other code.

from openai import OpenAI

client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key="YOUR_OXLO_API_KEY")

MODEL = "qwen-3-32b"

response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "user", "content": "Translate 'The server is down' to French."}
    ]
)
print(response.choices[0].message.content)

Step 2: Define the Translation System Prompt

The system prompt is the closest thing to training in this project: there is no fine-tuning, only instructions. I keep it rigid so the model does not add conversational fluff around the translation.

SYSTEM_PROMPT = """You are a professional translation engine. Your task is to translate the user's text exactly into the requested target language.

Rules:
1. Preserve all markdown formatting, code blocks, URLs, and HTML tags.
2. Do not add explanations, preambles, or apologies.
3. Maintain the original tone (formal, casual, technical, etc.).
4. If a term is ambiguous, choose the most common technical or conversational usage unless context suggests otherwise.
5. Output only the translated text."""

print("System prompt ready.")

Step 3: Build the Core Translate Function

Next I wrap the API call into a clean function. I set temperature to 0.3 because translation rewards consistency more than creativity.

def translate(text: str, source_lang: str, target_lang: str, model: str = MODEL) -> str:
    user_message = (
        f"Translate the following text from {source_lang} to {target_lang}.\n\n"
        f"{text}"
    )

    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message},
        ],
        temperature=0.3,
    )
    return response.choices[0].message.content.strip()

# Sanity check
sample = "The quick brown fox jumps over the lazy dog."
print(translate(sample, "English", "German"))
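The system prompt asks the model to preserve URLs and code blocks, but nothing in the function confirms that it did. A minimal sketch of a post-translation check follows. The name `formatting_issues` and both regular expressions are my own assumptions, not part of Oxlo.ai or the OpenAI SDK, and the check only covers URLs and fenced code blocks.

```python
import re

FENCE = "`" * 3  # the markdown code-fence marker
URL_RE = re.compile(r"https?://[^\s)>\]]+")
FENCE_RE = re.compile(re.escape(FENCE) + r".*?" + re.escape(FENCE), re.DOTALL)


def formatting_issues(source: str, translated: str) -> list[str]:
    """Return a list of problems; an empty list means both checks passed."""
    issues = []
    # Every URL in the source should appear unchanged in the translation.
    for match in URL_RE.findall(source):
        url = match.rstrip(".,;:!?")  # drop sentence punctuation glued to the URL
        if url not in translated:
            issues.append(f"missing URL: {url}")
    # Fenced code blocks should be copied through character for character.
    for block in FENCE_RE.findall(source):
        if block not in translated:
            issues.append("code block changed or dropped")
    return issues
```

Calling `formatting_issues(sample, translate(sample, "English", "German"))` after each request gives you a cheap signal for when to retry or flag a chunk for human review.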

Step 4: Add a Glossary Layer for Domain Terms

Product and technical docs have terms that should stay consistent. I add an optional glossary dictionary that injects locked-in translations directly into the prompt.

def translate_with_glossary(
    text: str,
    source_lang: str,
    target_lang: str,
    glossary: dict[str, str] | None = None,
    model: str = MODEL
) -> str:
    glossary_section = ""
    if glossary:
        glossary_section = "\nUse the following glossary for domain terms:\n"
        for term, translation in glossary.items():
            glossary_section += f"- {term} -> {translation}\n"

    user_message = (
        f"Translate the following text from {source_lang} to {target_lang}."
        f"{glossary_section}\n\n"
        f"{text}"
    )

    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message},
        ],
        temperature=0.3,
    )
    return response.choices[0].message.content.strip()

# Test with a software glossary
gloss = {"API key": "Clave API", "endpoint": "punto final"}
print(translate_with_glossary(
    "Store your API key securely and point your client to the correct endpoint.",
    "English",
    "Spanish",
    glossary=gloss
))
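A glossary in the prompt is a request, not a guarantee, so it can help to verify the output. The sketch below reports glossary terms that occur in the source while their locked translation is missing from the result. The helper name `glossary_violations` is hypothetical, and the case-insensitive substring match is a simplifying assumption: inflected or pluralized forms in the target language can trigger false alarms.

```python
def glossary_violations(
    source: str, translated: str, glossary: dict[str, str]
) -> list[str]:
    """List source terms whose locked translation is absent from the output."""
    src = source.casefold()
    out = translated.casefold()
    return [
        term
        for term, locked in glossary.items()
        # Only terms that actually occur in the source can be violated.
        if term.casefold() in src and locked.casefold() not in out
    ]
```

An empty list means every glossary term that was used in the source came back with its locked translation.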

Step 5: Batch Process a Markdown File

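Because Oxlo.ai charges per request rather than per token, a long paragraph costs the same as a short one, so I do not need to trim the input to save money. This function splits a markdown file on blank lines, translates each paragraph in its own request, and reassembles the result. Two caveats: every paragraph counts as one request against your quota, and a fenced code block that contains a blank line is split into separate chunks.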

def translate_markdown_file(filepath: str, source_lang: str, target_lang: str) -> str:
    with open(filepath, "r", encoding="utf-8") as f:
        paragraphs = f.read().split("\n\n")

    translated_paragraphs = []
    for para in paragraphs:
        if not para.strip():
            translated_paragraphs.append("")
            continue
        translated = translate(para, source_lang, target_lang)
        translated_paragraphs.append(translated)

    return "\n\n".join(translated_paragraphs)

# Usage:
# result = translate_markdown_file("docs/readme.md", "English", "Japanese")
# with open("docs/readme_ja.md", "w", encoding="utf-8") as f:
#     f.write(result)
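With per-request pricing and a daily request quota, fewer and larger requests are usually the better trade. Here is a rough sketch of a chunker that keeps fenced code blocks in one piece and packs consecutive blocks up to a character budget. The names `split_blocks` and `pack_blocks` are hypothetical, and the 6000-character default for `max_chars` is an arbitrary assumption, not a documented model limit.

```python
FENCE = "`" * 3  # the markdown code-fence marker


def split_blocks(markdown: str) -> list[str]:
    """Split on blank lines, but keep each fenced code block intact."""
    blocks, current, in_fence = [], [], False
    for line in markdown.split("\n"):
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # toggle on opening and closing fences
        if line.strip() == "" and not in_fence:
            if current:
                blocks.append("\n".join(current))
                current = []
        else:
            current.append(line)
    if current:
        blocks.append("\n".join(current))
    return blocks


def pack_blocks(blocks: list[str], max_chars: int = 6000) -> list[str]:
    """Greedily merge consecutive blocks into chunks of at most max_chars.

    A single block longer than the budget becomes its own chunk.
    """
    chunks, current = [], ""
    for block in blocks:
        candidate = f"{current}\n\n{block}" if current else block
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = block
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

To use it, translate each item of `pack_blocks(split_blocks(text))` with `translate` and join the results with `"\n\n"`. Runs of several blank lines collapse to one in the process.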

Run It

Here is a standalone script that combines Steps 1 through 3. Save it as translator.py, set the OXLO_API_KEY environment variable, and run python translator.py.

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.oxlo.ai/v1",
    api_key=os.environ["OXLO_API_KEY"]
)

MODEL = "qwen-3-32b"

SYSTEM_PROMPT = """You are a professional translation engine. Your task is to translate the user's text exactly into the requested target language.

Rules:
1. Preserve all markdown formatting, code blocks, URLs, and HTML tags.
2. Do not add explanations, preambles, or apologies.
3. Maintain the original tone (formal, casual, technical, etc.).
4. If a term is ambiguous, choose the most common technical or conversational usage unless context suggests otherwise.
5. Output only the translated text."""

def translate(text: str, source_lang: str, target_lang: str, model: str = MODEL) -> str:
    user_message = (
        f"Translate the following text from {source_lang} to {target_lang}.\n\n"
        f"{text}"
    )
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message},
        ],
        temperature=0.3,
    )
    return response.choices[0].message.content.strip()

if __name__ == "__main__":
    text = (
        "Oxlo.ai offers flat per-request pricing for LLM inference. "
        "Unlike token-based providers, your cost stays the same even for long-context workloads."
    )
    print("Spanish:")
    print(translate(text, "English", "Spanish"))
    print("\nJapanese:")
    print(translate(text, "English", "Japanese"))

Example output:

Spanish:
Oxlo.ai ofrece precios fijos por solicitud para la inferencia de LLM. A diferencia de los proveedores basados en tokens, su costo se mantiene igual incluso para cargas de trabajo de contexto largo.

Japanese:
Oxlo.aiは、LLM推論のためのフラットなリクエスト単位の料金体系を提供しています。トークンベースのプロバイダーとは異なり、長いコンテキストのワークロードでもコストは変わりません。

Wrap-Up and Next Steps

From here, you can add a bilingual glossary file in JSON and load it at runtime, or wire the translator into a GitHub Action that auto-translates your documentation on every push. If you need higher throughput or dedicated capacity, Oxlo.ai's Enterprise plan offers custom pricing and dedicated GPUs. You can compare plans at https://oxlo.ai/pricing.
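As a starting point for the JSON glossary idea, here is a minimal loader sketch. It assumes a flat JSON object that maps each source term to its translation. The function name `load_glossary` and the file name in the usage note are hypothetical.

```python
import json
from pathlib import Path


def load_glossary(path: str) -> dict[str, str]:
    """Load a flat {"source term": "target term"} mapping from a JSON file."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError("glossary JSON must be an object mapping terms to translations")
    # Reject nested or non-string values early, before they reach the prompt.
    bad = [key for key, value in data.items() if not isinstance(value, str)]
    if bad:
        raise ValueError(f"non-string translations for: {bad}")
    return data
```

The result can be passed straight to the Step 4 function, for example `translate_with_glossary(text, "English", "Spanish", glossary=load_glossary("glossary.es.json"))`.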