Roberto B.


How to translate content programmatically using AI and TransformersPHP

In this article, I'll show you how to translate content programmatically with PHP using the TransformersPHP library.
Translating text is essential for reaching a global audience and ensuring your content is accessible to speakers of different languages.

Step 1: Set up the project

To get started, please make sure you have the TransformersPHP library installed. You can install it via Composer by running:

composer require codewithkyrian/transformers

During the installation, you have to answer a question:

Do you trust "codewithkyrian/transformers-libsloader" to execute code and wish to enable it now? (writes "allow-plugins" to composer.json) [y,n,d,?]

You'll need to answer yes to enable the Composer plugin to download all shared libraries necessary for TransformersPHP.

Once installed, require the autoload file to load all necessary classes and dependencies:

<?php
require "./vendor/autoload.php";

Step 2: Import the necessary classes

Next, you’ll need to import the relevant classes and functions that handle translation:

use Codewithkyrian\Transformers\Transformers;
use function Codewithkyrian\Transformers\Pipelines\pipeline;
  • Transformers: This class manages the setup and configuration for translation models.
  • pipeline: This function initializes your specific translation pipeline.

Step 3: Initialize the Transformers class

Before translating content, you must configure the Transformers class:

Transformers::setup()->setCacheDir("./models")->apply();
  • setCacheDir(): This method defines the directory for caching models, which speeds up the process by avoiding repeated downloads.
  • apply(): Finalizes the setup and applies the configuration.
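
A relative path such as "./models" is normally resolved against the directory the script is launched from, so the cache can end up in different places. Here is a minimal sketch, assuming you want the cache to live next to the script. Note that resolveCacheDir() is a hypothetical helper written for this example and is not part of TransformersPHP:

```php
<?php
// Hypothetical helper (not part of TransformersPHP): builds an absolute
// cache path under a base directory and creates it if it is missing.
function resolveCacheDir(string $baseDir, string $name = "models"): string
{
    $dir = rtrim($baseDir, "/\\") . DIRECTORY_SEPARATOR . $name;
    if (!is_dir($dir) && !mkdir($dir, 0775, true) && !is_dir($dir)) {
        throw new RuntimeException("Cannot create cache directory: " . $dir);
    }
    return $dir;
}

$cacheDir = resolveCacheDir(__DIR__);

// Same setup calls as in this step, guarded so the sketch also runs
// in a project where the library is not installed yet.
if (class_exists(\Codewithkyrian\Transformers\Transformers::class)) {
    \Codewithkyrian\Transformers\Transformers::setup()
        ->setCacheDir($cacheDir)
        ->apply();
}
```

The only change compared with the one-liner above is the argument passed to setCacheDir().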

Step 4: Set up the translation pipeline

The next step is to create a pipeline for translation using a pre-trained model:

$translationPipeline = pipeline("translation", 'Xenova/nllb-200-distilled-600M');
  • pipeline("translation", 'Xenova/nllb-200-distilled-600M'): This function sets up a translation pipeline using the specified model, Xenova/nllb-200-distilled-600M, a distilled 600M-parameter variant of NLLB-200 that covers about 200 languages.

The model used for translations in this example is available on Hugging Face: https://huggingface.co/Xenova/nllb-200-distilled-600M

Step 5: Provide content for translation

Define the sentences you want to translate:

$inputs = [
    "The quality of tools in the PHP ecosystem has greatly improved in recent years",
    "Some developers don't like PHP as a programming language",
    "I appreciate Laravel as a development tool",
    "Laravel is a framework that improves my productivity",
    "Using an outdated version of Laravel is not a good practice",
    "I love Laravel",
];


This array contains English sentences that will be translated into Italian.

Step 6: Translate the content

Loop through each sentence and translate it:

foreach ($inputs as $input) {
    $output = $translationPipeline(
        $input,
        maxNewTokens: 256,
        tgtLang: 'ita_Latn'
    );
    echo "🇬🇧 " . $input . PHP_EOL;
    echo "🇮🇹 " . trim($output[0]["translation_text"]) . PHP_EOL;
    echo PHP_EOL;
}

  • $translationPipeline($input, maxNewTokens: 256, tgtLang: 'ita_Latn'): This call translates each English sentence into Italian. maxNewTokens caps the number of tokens the model may generate for the translation, and tgtLang sets the target language to Italian (ita_Latn).
  • trim($output[0]["translation_text"]): Cleans up the translated text by removing any leading or trailing whitespace.
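
If you need the translations as data rather than printed output, the loop can be wrapped in a function. The following is only a sketch: translateAll() and the stand-in translator are hypothetical names introduced here, and the sketch assumes the pipeline returns the same output shape as shown above. Because the pipeline object is invoked like a function, it can be passed wherever the sketch expects a callable.

```php
<?php
// Hypothetical helper: applies a translator callable to every sentence and
// returns an array of original sentence => trimmed translation.
function translateAll(callable $translator, array $inputs, string $tgtLang): array
{
    $translations = [];
    foreach ($inputs as $input) {
        $output = $translator($input, maxNewTokens: 256, tgtLang: $tgtLang);
        $translations[$input] = trim($output[0]["translation_text"]);
    }
    return $translations;
}

// Stand-in translator used only to exercise the helper without downloading
// a model; it mimics the output shape and tags the text with the target code.
$fakeTranslator = fn(string $text, int $maxNewTokens, string $tgtLang): array
    => [["translation_text" => " [" . $tgtLang . "] " . $text . " "]];

$result = translateAll($fakeTranslator, ["I love Laravel"], "ita_Latn");
print_r($result);
```

With the real pipeline, the call would be translateAll($translationPipeline, $inputs, 'ita_Latn').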

The model supports many languages. To set the target language with the tgtLang parameter, you must use a FLORES-200 language code. The full list is available here: https://github.com/facebookresearch/flores/blob/main/flores200/README.md#languages-in-flores-200
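
FLORES-200 codes combine a three-letter language code with a script name, for example ita_Latn for Italian in Latin script. If your application already works with two-letter codes, a small lookup table can help. This is a sketch with a handful of entries: FLORES_CODES and floresCode() are hypothetical names, and the list is illustrative rather than complete.

```php
<?php
// A few FLORES-200 codes keyed by two-letter language code.
// Illustrative only: extend it from the FLORES-200 README as needed.
const FLORES_CODES = [
    "en" => "eng_Latn",
    "it" => "ita_Latn",
    "fr" => "fra_Latn",
    "de" => "deu_Latn",
    "es" => "spa_Latn",
];

// Hypothetical helper: returns the FLORES-200 code or fails loudly,
// so a typo does not silently reach the pipeline.
function floresCode(string $iso): string
{
    $key = strtolower($iso);
    if (!array_key_exists($key, FLORES_CODES)) {
        throw new InvalidArgumentException("No FLORES-200 code mapped for: " . $iso);
    }
    return FLORES_CODES[$key];
}

echo floresCode("it") . PHP_EOL; // prints ita_Latn
```

The returned value can then be passed as tgtLang in the translation call.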

The first time you run the script, the pipeline() function downloads all the model files into the directory models/Xenova/nllb-200-distilled-600M. Be patient: the model is large, more than 800 MB.
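
To warn users before that long first run, you can check whether the model directory already contains files. This is a minimal sketch that assumes the cache layout described above (cache directory followed by the model id). isModelCached() is a hypothetical helper and does not verify that the download is complete.

```php
<?php
// Hypothetical helper: reports whether a model directory exists in the
// cache and contains at least one entry.
function isModelCached(string $cacheDir, string $modelId): bool
{
    $modelDir = rtrim($cacheDir, "/\\") . "/" . $modelId;
    if (!is_dir($modelDir)) {
        return false;
    }
    // scandir() always lists "." and "..", so more than two entries
    // means the directory is not empty.
    $entries = scandir($modelDir);
    return $entries !== false && count($entries) > 2;
}

if (!isModelCached("./models", "Xenova/nllb-200-distilled-600M")) {
    echo "Model not cached yet: the first run will download it." . PHP_EOL;
}
```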

Performing translations programmatically with PHP

Conclusion

With TransformersPHP, translating content programmatically is a streamlined process. By setting up the environment, initializing the necessary classes, and defining a translation pipeline, you can easily convert text from one language to another. This is particularly useful for creating multilingual websites, applications, or content, allowing you to reach a broader audience effectively.


Top comments (2)

Vinicius Dias

Thanks for this article! Very clear.
Since I know nothing about AI, how can I find the models I could use for each task?
For example, how could I have found this model that you used for translation?

Roberto B.

Hi @cviniciussdias, thank you for the feedback.
Here you can find the tasks implemented in TransformersPHP:
github.com/CodeWithKyrian/transfor...

Typically, I start with the default model defined in TransformersPHP.
Then I explore and try additional models by Xenova that are compatible with Transformers.js. For example:
https://huggingface.co/models?library=transformers.js&sort=trending&search=Xenova

Not every model works, so you have to search for one that is compatible with TransformersPHP.