German Yamil
How I Translated a Technical Ebook from Spanish to English with Semantic QA in Python

When I decided to publish a technical ebook in English after writing it in Spanish, my first instinct was to paste chapters into a translation interface and call it done. That lasted about ten minutes, until I noticed that every code block came back garbled, with translated variable names and broken syntax. Here is the pipeline I actually ended up building.

The Core Problem: Prose and Code Are Not the Same

A translation API should never see your code blocks. Variable names, function signatures, and string literals are not natural language — sending them through a translation model at best returns them unchanged, and at worst invents plausible-sounding nonsense.

The solution is a fence detector that splits each chapter into typed segments before anything touches an API.

import re
from dataclasses import dataclass
from typing import List

@dataclass
class Segment:
    content: str
    is_code: bool

def split_segments(text: str) -> List[Segment]:
    """
    Split markdown text into alternating prose and fenced code segments.
    Fence patterns: ```

lang ...

 ``` or ~~~ ... ~~~
    """
    pattern = re.compile(
        r'(```

[\w]*\n[\s\S]*?

```|~~~[\w]*\n[\s\S]*?~~~)',
        re.MULTILINE
    )
    segments: List[Segment] = []
    last_end = 0

    for match in pattern.finditer(text):
        # prose before this code block
        prose = text[last_end:match.start()]
        if prose.strip():
            segments.append(Segment(content=prose, is_code=False))
        # the code block itself
        segments.append(Segment(content=match.group(), is_code=True))
        last_end = match.end()

    # any trailing prose
    remainder = text[last_end:]
    if remainder.strip():
        segments.append(Segment(content=remainder, is_code=False))

    return segments

This gives you a clean list of segments where is_code=True segments pass through untouched.
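As a quick sanity check, here is the splitter's behavior on a toy chapter. The snippet below inlines the same fence regex and loop so it runs on its own (the chapter text is made up for illustration):

```python
import re

# Same fence pattern as split_segments above, applied to a toy chapter.
pattern = re.compile(r'(```[\w]*\n[\s\S]*?```|~~~[\w]*\n[\s\S]*?~~~)')

chapter = "Intro prose.\n```python\nx = 1\n```\nClosing prose.\n"

parts, last = [], 0
for m in pattern.finditer(chapter):
    # prose before this code block
    if chapter[last:m.start()].strip():
        parts.append(("prose", chapter[last:m.start()]))
    # the code block itself, verbatim
    parts.append(("code", m.group()))
    last = m.end()
# any trailing prose
if chapter[last:].strip():
    parts.append(("prose", chapter[last:]))

print([kind for kind, _ in parts])  # ['prose', 'code', 'prose']
```

The code segment comes back byte-for-byte identical to the input, which is exactly the property the rest of the pipeline relies on.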

Calling the Translation API

The actual translation call is straightforward. I am showing a generic pattern here — swap in whatever provider you use:

import httpx
import os

TRANSLATE_URL = "https://api.your-provider.com/translate"
API_KEY = os.environ["TRANSLATION_API_KEY"]

def translate_text(text: str, source_lang: str = "es", target_lang: str = "en") -> str:
    """
    Send a single prose segment to the translation API.
    Returns the translated string.
    """
    response = httpx.post(
        TRANSLATE_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "text": text,
            "source": source_lang,
            "target": target_lang,
        },
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["translated_text"]

For chapters longer than a few thousand words, chunk the prose segments further to stay within token limits. I found 1,500-word chunks safe for most providers.
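A minimal chunker sketch for that step, packing paragraphs greedily up to the word budget. `chunk_words` is a hypothetical helper (not part of the pipeline code above), and the 1,500-word default is the figure mentioned in the text:

```python
def chunk_words(text: str, max_words: int = 1500) -> list[str]:
    """Greedily pack paragraphs into chunks of at most max_words words.

    Caveat: a single paragraph longer than max_words is kept whole;
    split it upstream if your provider enforces a hard limit.
    """
    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for para in text.split("\n\n"):
        words = len(para.split())
        if count + words > max_words and current:
            # current chunk is full; flush it and start a new one
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Splitting on paragraph boundaries rather than at an exact word count keeps each chunk coherent, which noticeably improves translation quality for context-sensitive prose.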

QA: Two Modes Depending on What You Have Available

After translation, you need to verify the output is not garbage. I implemented two modes: a semantic check using embeddings if OPENAI_API_KEY is present, and a word-count ratio fallback if it is not.

import os
import math

def qa_segment(original: str, translated: str) -> dict:
    """
    Returns {"mode": str, "score": float, "passed": bool}.
    Mode is "cosine" when OPENAI_API_KEY is set, "word_ratio" otherwise.
    """
    if os.getenv("OPENAI_API_KEY"):
        return _cosine_qa(original, translated)
    return _word_ratio_qa(original, translated)


def _cosine_qa(original: str, translated: str) -> dict:
    """
    Embed both strings and compute cosine similarity.
    Requires: pip install openai
    Threshold: 0.75 (cross-lingual embeddings stay high for equivalent content).
    """
    from openai import OpenAI
    client = OpenAI()

    def embed(text: str):
        resp = client.embeddings.create(
            model="text-embedding-3-small",
            input=text[:8000],  # rough character cap to stay under the token limit
        )
        return resp.data[0].embedding

    vec_a = embed(original)
    vec_b = embed(translated)

    dot = sum(a * b for a, b in zip(vec_a, vec_b))
    mag_a = math.sqrt(sum(a ** 2 for a in vec_a))
    mag_b = math.sqrt(sum(b ** 2 for b in vec_b))
    similarity = dot / (mag_a * mag_b)

    return {"mode": "cosine", "score": round(similarity, 4), "passed": similarity >= 0.75}


def _word_ratio_qa(original: str, translated: str) -> dict:
    """
    Fallback: compare word counts. Spanish prose usually runs a bit longer
    than its English translation, so a good ES→EN ratio lands near or just
    below 1. Flag anything outside 0.6–1.5x.
    """
    orig_words = len(original.split())
    trans_words = len(translated.split())
    ratio = trans_words / orig_words if orig_words else 0
    passed = 0.6 <= ratio <= 1.5
    return {"mode": "word_ratio", "score": round(ratio, 4), "passed": passed}

Running in cosine mode gives you semantic confidence. Running in word-ratio mode gives you a cheap structural sanity check. Both are useful; knowing which mode you are in matters for interpreting the output.
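Since the two modes score on different scales, it helps to never average them together when reading a report. This `summarize_qa` helper is a sketch I am adding for illustration (not part of the original pipeline):

```python
def summarize_qa(results: list[dict]) -> dict:
    """Group per-segment QA dicts ({"mode", "score", "passed"}) by mode
    and compute a pass rate for each, so cosine similarities and
    word-count ratios are never mixed in one number."""
    by_mode: dict[str, list[bool]] = {}
    for r in results:
        by_mode.setdefault(r["mode"], []).append(r["passed"])
    return {
        mode: {"segments": len(flags), "pass_rate": sum(flags) / len(flags)}
        for mode, flags in by_mode.items()
    }

results = [
    {"mode": "word_ratio", "score": 1.1, "passed": True},
    {"mode": "word_ratio", "score": 0.4, "passed": False},
    {"mode": "cosine", "score": 0.82, "passed": True},
]
print(summarize_qa(results))
# {'word_ratio': {'segments': 2, 'pass_rate': 0.5}, 'cosine': {'segments': 1, 'pass_rate': 1.0}}
```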

Reassembling the Document

After translation and QA, stitch the segments back together in order:

def translate_chapter(text: str) -> str:
    segments = split_segments(text)
    output_parts = []
    failures = []

    for i, seg in enumerate(segments):
        if seg.is_code:
            output_parts.append(seg.content)
            continue

        translated = translate_text(seg.content)
        qa_result = qa_segment(seg.content, translated)

        if not qa_result["passed"]:
            failures.append({
                "segment_index": i,
                "mode": qa_result["mode"],
                "score": qa_result["score"],
                "original_preview": seg.content[:120],
            })

        output_parts.append(translated)

    if failures:
        print(f"[WARN] {len(failures)} segment(s) failed QA:")
        for f in failures:
            print(f"  [{f['mode']}] segment {f['segment_index']} "
                  f"score={f['score']}: {f['original_preview']!r}")

    return "\n".join(output_parts)

Failed segments get logged with enough context to review them manually. The book still assembles — you decide whether to re-translate or accept.
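If you do decide to re-translate, a retry pass can reuse the QA scores to keep the best attempt. `retry_failed` below is a hypothetical extension, written to take the translate and QA functions as parameters so it composes with the code above:

```python
def retry_failed(originals, translated, failures, translate_fn, qa_fn, max_retries=2):
    """For each failed segment, re-translate up to max_retries times and
    keep the highest-scoring candidate. Mutates and returns `translated`.

    originals / translated are parallel lists indexed by segment_index,
    matching the failure records emitted by translate_chapter.
    """
    for f in failures:
        i = f["segment_index"]
        best, best_score = translated[i], f["score"]
        for _ in range(max_retries):
            candidate = translate_fn(originals[i])
            result = qa_fn(originals[i], candidate)
            if result["score"] > best_score:
                best, best_score = candidate, result["score"]
            if result["passed"]:
                break  # good enough; stop burning API calls
        translated[i] = best
    return translated
```

Keeping the best-scoring attempt rather than the last one means a retry can never make a segment worse than what you already had.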

What This Solved

Running this against a 60,000-word technical ebook produced output where every code block came through intact and 94% of prose segments passed QA on the first pass. The remaining 6% were mostly segments with heavy domain jargon that the word-ratio check flagged; a single re-read was enough to confirm they were fine.

The pipeline adds maybe 90 minutes of compute time for a full book, runs unattended, and gives you a concrete QA report rather than a vague "it looks okay."


If you want to see how this fits into a full ebook production pipeline — from raw manuscript through translation, formatting, and distribution — I documented the entire workflow in a guide available at https://germy5.gumroad.com/l/xhxkzz for $19.99, with a 30-day refund if it does not deliver value.
