Diogo Heleno

Posted on May 16 • Originally published at m21global.com

Building Translation Workflows: Implementation Guide for Software Teams

#i18n #webdev #productivity #tutorial

Choosing between human translation and machine translation post-editing (MTPE) is just the first step. The real challenge for development teams is implementing workflows that can handle both approaches efficiently, especially when you're dealing with UI strings, documentation, and marketing content that each require different quality levels.

This guide covers the technical implementation side of translation workflows, from API integrations to automation pipelines that can scale with your release cycle.

Setting Up Content Classification Automation

Before any translation happens, you need to classify your content automatically. Manual content sorting doesn't scale when you're pushing updates weekly.

Here's a basic content classifier that routes different content types to appropriate translation workflows:

import re
from enum import Enum

class TranslationType(Enum):
    HUMAN = "human"
    MTPE = "mtpe"
    SKIP = "skip"

class ContentClassifier:
    def __init__(self):
        self.ui_patterns = [
            r'^(btn|button)_',
            r'_error$',
            r'_alert$',
            r'^modal_'
        ]

        self.legal_patterns = [
            r'terms',
            r'privacy',
            r'license',
            r'gdpr'
        ]

        self.docs_patterns = [
            r'faq_',
            r'help_',
            r'changelog',
            r'release_notes'
        ]

    def classify(self, key: str, content: str, context: dict) -> TranslationType:
        # Critical UI elements always get human translation
        if any(re.search(pattern, key, re.IGNORECASE) for pattern in self.ui_patterns):
            return TranslationType.HUMAN

        # Legal content needs human review
        if any(re.search(pattern, key, re.IGNORECASE) for pattern in self.legal_patterns):
            return TranslationType.HUMAN

        # Long, frequently updated documentation can use MTPE
        if any(re.search(pattern, key, re.IGNORECASE) for pattern in self.docs_patterns):
            if len(content) > 500 and context.get('update_frequency') == 'high':
                return TranslationType.MTPE
            return TranslationType.HUMAN

        # Default to human for safety
        return TranslationType.HUMAN
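
To apply a classifier like this to a whole locale file, the strings need to be grouped by the workflow they were routed to. The following is a minimal sketch, not part of the original classifier: `split_by_workflow` and `demo_classify` are hypothetical names, and the stand-in classifier only mimics the routing rules so the example runs on its own.

```python
from typing import Callable, Dict


def split_by_workflow(
    strings: Dict[str, str],
    classify: Callable[[str, str, dict], str],
    context: dict,
) -> Dict[str, Dict[str, str]]:
    """Group flat locale strings by the workflow name the classifier returns."""
    buckets: Dict[str, Dict[str, str]] = {}
    for key, text in strings.items():
        workflow = classify(key, text, context)
        buckets.setdefault(workflow, {})[key] = text
    return buckets


# Stand-in classifier for the demo (an assumption). In practice this would
# wrap something like ContentClassifier().classify(key, text, context).value.
def demo_classify(key: str, text: str, context: dict) -> str:
    if key.startswith("faq_") and context.get("update_frequency") == "high":
        return "mtpe"
    return "human"


strings = {
    "btn_save": "Save",
    "faq_billing": "How do I update my card?",
}
buckets = split_by_workflow(strings, demo_classify, {"update_frequency": "high"})
```

Each bucket can then be written to its own file and submitted to the matching workflow.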

Integration Patterns for Translation APIs

Most translation management systems (TMS) provide REST APIs, but the integration patterns vary significantly. Here's a provider-agnostic interface, with Phrase as one concrete implementation:

import json
import requests
from abc import ABC, abstractmethod

class TranslationProvider(ABC):
    @abstractmethod
    def submit_job(self, content: dict, target_languages: list, workflow_type: str) -> str:
        pass

    @abstractmethod
    def get_status(self, job_id: str) -> dict:
        pass

    @abstractmethod
    def download_results(self, job_id: str) -> dict:
        pass

class PhraseProvider(TranslationProvider):
    def __init__(self, api_token: str, project_id: str):
        self.api_token = api_token
        self.project_id = project_id
        self.base_url = "https://api.phrase.com/v2"

    def submit_job(self, content: dict, target_languages: list, workflow_type: str) -> str:
        headers = {"Authorization": f"token {self.api_token}"}

        # Upload source content
        upload_data = {
            "file_format": "json",
            "locale_id": "en",  # source locale
            "tags": f"workflow:{workflow_type}"
        }

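        response = requests.post(
            f"{self.base_url}/projects/{self.project_id}/uploads",
            headers=headers,
            data=upload_data,
            files={"file": ("content.json", json.dumps(content))},
            timeout=30
        )
        # Fail loudly on HTTP errors instead of raising a KeyError on "id" below
        response.raise_for_status()

        return response.json()["id"]

    # The remaining abstract methods are left unimplemented in this example.
    # They must be defined for the class to be instantiable.
    def get_status(self, job_id: str) -> dict:
        raise NotImplementedError("get_status is not implemented in this example")

    def download_results(self, job_id: str) -> dict:
        raise NotImplementedError("download_results is not implemented in this example")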
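
Because translation jobs can take hours or days, callers usually poll `get_status` rather than block on it. The sketch below shows one way to do that with exponential backoff. It assumes a status dict with a `state` field whose terminal values are `"completed"` and `"failed"`. That shape is an illustration, not something any particular TMS guarantees, and `wait_for_job` and `FakeProvider` are hypothetical names.

```python
import time
from typing import Callable, Protocol


class StatusSource(Protocol):
    def get_status(self, job_id: str) -> dict: ...


def wait_for_job(
    provider: StatusSource,
    job_id: str,
    timeout_s: float = 3600.0,
    initial_delay_s: float = 5.0,
    max_delay_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reports a terminal state or the timeout elapses."""
    waited = 0.0
    delay = initial_delay_s
    while True:
        status = provider.get_status(job_id)
        # Assumed status shape: {"state": "pending" | "completed" | "failed"}
        if status.get("state") in ("completed", "failed"):
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"Job {job_id} not finished after {waited:.0f}s")
        sleep(delay)
        waited += delay
        # Double the wait each round, capped at max_delay_s
        delay = min(delay * 2, max_delay_s)


class FakeProvider:
    """In-memory stand-in that reports completion on the third poll."""

    def __init__(self) -> None:
        self.calls = 0

    def get_status(self, job_id: str) -> dict:
        self.calls += 1
        return {"state": "completed" if self.calls >= 3 else "pending"}


fake = FakeProvider()
delays = []  # record requested sleeps instead of actually sleeping
result = wait_for_job(fake, "job-123", sleep=delays.append)
```

Injecting `sleep` keeps the function testable without real waiting. In a CI context, a webhook from the provider may be a better fit than polling if one is available.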

Building Quality Gates

You need automated quality checks before translated content hits production. This is especially important for MTPE workflows, where error rates tend to be higher than with fully human translation.

import re
from typing import List, Dict

class QualityChecker:
    def __init__(self):
        self.critical_checks = [
            self._check_placeholder_consistency,
            self._check_html_tags,
            self._check_length_limits
        ]

        self.warning_checks = [
            self._check_terminology,
            self._check_tone_markers
        ]

    def validate_translation(self, source: str, target: str, context: dict) -> Dict:
        issues = {
            "critical": [],
            "warnings": []
        }

        # Run critical checks first
        for check in self.critical_checks:
            result = check(source, target, context)
            if not result["passed"]:
                issues["critical"].append(result["message"])

        # Only run warning checks if critical ones pass
        if not issues["critical"]:
            for check in self.warning_checks:
                result = check(source, target, context)
                if not result["passed"]:
                    issues["warnings"].append(result["message"])

        return {
            "passed": len(issues["critical"]) == 0,
            "issues": issues
        }

    def _check_placeholder_consistency(self, source: str, target: str, context: dict) -> Dict:
        source_placeholders = set(re.findall(r'\{\{[^}]+\}\}', source))
        target_placeholders = set(re.findall(r'\{\{[^}]+\}\}', target))

        # Report placeholders that are missing from the target as well as unexpected extras
        return {
            "passed": source_placeholders == target_placeholders,
            "message": f"Placeholder mismatch: {source_placeholders ^ target_placeholders}"
        }

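    def _check_length_limits(self, source: str, target: str, context: dict) -> Dict:
        if context.get("content_type") == "ui":
            max_length = context.get("max_length", len(source) * 1.5)
            return {
                "passed": len(target) <= max_length,
                "message": f"Target too long: {len(target)}/{max_length}"
            }
        return {"passed": True, "message": ""}

    # The checks below are referenced in __init__ and must exist for the class to load.
    # They are minimal placeholder implementations; adapt them to your project.
    def _check_html_tags(self, source: str, target: str, context: dict) -> Dict:
        # Compare opening and closing tag names, ignoring attributes and order
        tag_pattern = r'<(/?[a-zA-Z][a-zA-Z0-9]*)'
        source_tags = sorted(t.lower() for t in re.findall(tag_pattern, source))
        target_tags = sorted(t.lower() for t in re.findall(tag_pattern, target))
        return {
            "passed": source_tags == target_tags,
            "message": f"HTML tag mismatch: {source_tags} vs {target_tags}"
        }

    def _check_terminology(self, source: str, target: str, context: dict) -> Dict:
        # Assumed convention: context["glossary"] maps source terms to required target terms
        glossary = context.get("glossary", {})
        missing = [
            term for term, required in glossary.items()
            if term.lower() in source.lower() and required.lower() not in target.lower()
        ]
        return {"passed": not missing, "message": f"Glossary terms not applied: {missing}"}

    def _check_tone_markers(self, source: str, target: str, context: dict) -> Dict:
        # Tone rules are project-specific, so this placeholder always passes
        return {"passed": True, "message": ""}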
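
A quality gate usually runs over a whole locale file rather than one string at a time. As a rough sketch of that idea, the function below applies the same `{{placeholder}}` comparison across two flat dictionaries and also flags keys missing from the target. `find_placeholder_failures` is a hypothetical helper, and the sample strings are made up for illustration.

```python
import re
from typing import Dict, List

PLACEHOLDER = re.compile(r"\{\{[^}]+\}\}")


def find_placeholder_failures(source: Dict[str, str], target: Dict[str, str]) -> List[str]:
    """Return keys that are missing from target or whose {{placeholders}} differ."""
    failures = []
    for key, source_text in source.items():
        target_text = target.get(key)
        if target_text is None:
            # An untranslated key is treated as a failure
            failures.append(key)
            continue
        if set(PLACEHOLDER.findall(source_text)) != set(PLACEHOLDER.findall(target_text)):
            failures.append(key)
    return failures


source = {"greeting": "Hello, {{name}}!", "count": "{{n}} items", "bye": "Goodbye"}
target = {"greeting": "Hola, {{nombre}}!", "count": "{{n}} elementos"}
failures = find_placeholder_failures(source, target)
```

Here `greeting` fails because the placeholder name was translated, and `bye` fails because it is missing. In a pipeline, a non-empty result would typically end the step with a non-zero exit code so the translated file is not merged.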

CI/CD Integration

Translation workflows should integrate with your existing CI/CD pipeline. Here's a GitHub Actions workflow that handles both human and MTPE content:

name: Translation Workflow

on:
  push:
    paths:
      - 'src/locales/en/**'
      - 'docs/**'

jobs:
  extract-and-classify:
    runs-on: ubuntu-latest
    # The tracking-issue step needs write access to issues
    permissions:
      contents: read
      issues: write
    steps:
      - uses: actions/checkout@v3

      - name: Extract translatable content
        run: |
          python scripts/extract_content.py --format json --output translations/source.json

      - name: Classify content
        run: |
          python scripts/classify_content.py \
            --input translations/source.json \
            --output-human translations/human.json \
            --output-mtpe translations/mtpe.json

      - name: Submit to translation service
        env:
          TRANSLATION_API_KEY: ${{ secrets.TRANSLATION_API_KEY }}
        run: |
          python scripts/submit_translations.py \
            --human translations/human.json \
            --mtpe translations/mtpe.json \
            --languages "es,fr,de,pt"

      - name: Create tracking issue
        uses: actions/github-script@v6
        with:
          script: |
            await github.rest.issues.create({
              owner: context.repo.owner,
              repo: context.repo.repo,
              title: `Translation job for ${context.sha.substring(0, 7)}`,
              body: `Tracking translation progress for commit ${context.sha}`
            });
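
The workflow calls `scripts/extract_content.py`, which isn't shown here. One plausible core for such a script, assuming the locale files are nested JSON, is a function that flattens them into key/string pairs. The name `flatten_strings` and the `_` separator are assumptions. The separator was picked so that flattened keys such as `modal_title` line up with the key patterns used by the classifier.

```python
from typing import Any, Dict


def flatten_strings(node: Any, prefix: str = "", sep: str = "_") -> Dict[str, str]:
    """Flatten nested dicts into {"a_b_c": "text"}; non-string leaves are skipped."""
    flat: Dict[str, str] = {}
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}{sep}{key}" if prefix else str(key)
            flat.update(flatten_strings(value, path, sep))
    elif isinstance(node, str):
        flat[prefix] = node
    # Numbers, booleans, lists, and None are not translatable here and are dropped
    return flat


nested = {
    "modal": {"title": "Delete file?", "confirm": "Delete"},
    "version": 3,
}
flat = flatten_strings(nested)
```

A real extractor would also need to handle plural forms and arrays, which vary by i18n library.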

Monitoring Translation Quality

You need metrics to track whether your workflow choices are working. This means monitoring both speed and quality across different content types:

from typing import Dict, List

class TranslationMetrics:
    def __init__(self, db_connection):
        self.db = db_connection

    def track_job_completion(self, job_id: str, workflow_type: str, 
                           languages: List[str], completion_time: int):
        self.db.execute(
            """
            INSERT INTO translation_jobs 
            (job_id, workflow_type, languages, completion_time, created_at)
            VALUES (?, ?, ?, ?, datetime('now'))
            """,
            (job_id, workflow_type, ','.join(languages), completion_time)
        )

    def track_quality_score(self, job_id: str, language: str, 
                          quality_score: float, error_count: int):
        self.db.execute(
            """
            INSERT INTO quality_metrics 
            (job_id, language, quality_score, error_count, measured_at)
            VALUES (?, ?, ?, ?, datetime('now'))
            """,
            (job_id, language, quality_score, error_count)
        )

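    def get_workflow_performance(self, workflow_type: str, days: int = 30) -> Dict:
        # Bind the time window as a parameter instead of formatting it into the SQL.
        # Note: the averages are taken over joined rows, so a job with several
        # quality rows (one per language) weighs more in avg_completion.
        cursor = self.db.execute(
            """
            SELECT 
                AVG(t.completion_time) as avg_completion,
                AVG(q.quality_score) as avg_quality,
                COUNT(DISTINCT t.job_id) as job_count
            FROM translation_jobs t
            LEFT JOIN quality_metrics q ON t.job_id = q.job_id
            WHERE t.workflow_type = ? 
            AND t.created_at > datetime('now', ?)
            """,
            (workflow_type, f"-{int(days)} days")
        )

        # Build the dict from column names so this works with plain tuple rows too
        columns = [col[0] for col in cursor.description]
        return dict(zip(columns, cursor.fetchone()))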
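
The queries above assume two tables that the article never defines. The sketch below shows one schema that would satisfy them, exercised against an in-memory SQLite database. The column types are guesses inferred from the method signatures, and `completion_time` is stored as an integer with no particular unit implied.

```python
import sqlite3

# Assumed schema: column names come from the INSERT statements, types are guesses
SCHEMA = """
CREATE TABLE IF NOT EXISTS translation_jobs (
    job_id TEXT PRIMARY KEY,
    workflow_type TEXT NOT NULL,
    languages TEXT NOT NULL,
    completion_time INTEGER NOT NULL,
    created_at TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS quality_metrics (
    job_id TEXT NOT NULL REFERENCES translation_jobs(job_id),
    language TEXT NOT NULL,
    quality_score REAL NOT NULL,
    error_count INTEGER NOT NULL,
    measured_at TEXT NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows become addressable by column name
conn.executescript(SCHEMA)

conn.execute(
    "INSERT INTO translation_jobs VALUES (?, ?, ?, ?, datetime('now'))",
    ("job-1", "mtpe", "es,fr", 120),
)
conn.executemany(
    "INSERT INTO quality_metrics VALUES (?, ?, ?, ?, datetime('now'))",
    [("job-1", "es", 1.0, 2), ("job-1", "fr", 0.5, 5)],
)
conn.commit()

row = conn.execute(
    """
    SELECT COUNT(DISTINCT t.job_id) AS job_count,
           AVG(q.quality_score) AS avg_quality
    FROM translation_jobs t
    LEFT JOIN quality_metrics q ON t.job_id = q.job_id
    WHERE t.workflow_type = ?
    """,
    ("mtpe",),
).fetchone()
summary = dict(row)
```

Counting distinct job IDs matters here because the join produces one row per job and language.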

Making the Right Trade-offs

The technical implementation is straightforward, but the workflow decisions require balancing speed, cost, and quality. Based on the original analysis of human translation vs post-editing, here are the technical considerations that matter most:

For rapid iteration cycles: Build separate pipelines for UI strings (human translation) and documentation (MTPE). UI translation can be batched and done less frequently, while docs can be automated.

For quality monitoring: Implement A/B testing for non-critical content to measure the actual impact of translation quality on user behavior.

For cost control: Use content similarity detection to avoid retranslating unchanged content. This matters most for documentation that gets frequent updates.
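
As a starting point for the cost-control idea, the simplest form of change detection is an exact match on a content hash, sketched below. This is deliberately not fuzzy similarity matching: it only catches strings whose text is identical after whitespace normalization. `fingerprint` and `changed_keys` are hypothetical helpers, and storing the previous hashes between runs is left out.

```python
import hashlib
from typing import Dict


def fingerprint(text: str) -> str:
    """Hash whitespace-normalized text so reflowed content counts as unchanged."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def changed_keys(current: Dict[str, str], previous_hashes: Dict[str, str]) -> Dict[str, str]:
    """Return only entries that are new or whose fingerprint differs."""
    return {
        key: text
        for key, text in current.items()
        if previous_hashes.get(key) != fingerprint(text)
    }


# Hashes recorded the last time translations were submitted
previous_hashes = {
    "help_intro": fingerprint("Welcome to the docs."),
    "help_setup": fingerprint("Run the installer."),
}
current = {
    "help_intro": "Welcome  to the\ndocs.",    # only whitespace changed
    "help_setup": "Run the new installer.",    # text changed
    "help_faq": "See the FAQ.",                # new key
}
to_translate = changed_keys(current, previous_hashes)
```

Only `help_setup` and `help_faq` would be sent for translation. A translation memory with fuzzy matching goes further by reusing near-identical segments, which most TMS platforms handle on their side.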

The goal is building systems that can scale your translation decisions, not just your translation volume. When your workflow can automatically route content to the right translation approach based on risk and exposure, you can focus on the product instead of managing translation logistics.

Next Steps

Start with content classification automation. Most translation bottlenecks come from manual decision-making about what needs human review. Once you can classify content automatically, the rest of the workflow becomes much more manageable.

Then focus on quality gates that match your actual risk tolerance. Perfect translations aren't always necessary, but you need to know when they are.
