Build an AI Audio Translator in Python on Telnyx Inference

#ai #stt #tts #telnyx

A lot of AI apps are starting to mix voice, language models, and generated audio.

I built a small Python example that shows that full loop:

take an audio file
transcribe it
translate the transcript with an LLM
generate translated speech

Repo: https://github.com/team-telnyx/telnyx-code-examples/tree/main/ai-content-translator-python

What it does

The app exposes a Flask API for translating spoken content.

You send it an audio file and a target language. It returns:

the original transcript
the translated text
generated translated audio

So instead of translating text only, the example shows a practical speech-to-speech workflow.

Why this pattern is useful

This kind of flow can be useful for apps that need multilingual voice experiences, like:

customer support tools
education apps
internal enablement content
voice agents
media localization
accessibility workflows
product tutorials in multiple languages

The important part is that each step stays understandable. Speech-to-text, translation, and text-to-speech are separate pieces, so you can debug or replace one part without rewriting the whole app.

How the example works

The app uses Telnyx APIs for the voice and AI parts of the workflow.

At a high level:

Upload source audio
Transcribe the audio
Send the transcript to an LLM for translation
Generate speech from the translated text
Return text plus audio output

That gives you a clean starting point for building your own multilingual AI workflow.
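
The five steps above can be sketched as one function that takes each stage as a callable. This is a minimal illustration, not the repo's code: `translate_audio`, `TranslationResult`, and the three stage parameters are hypothetical names, and the stub stages stand in for the real Telnyx speech-to-text, LLM, and text-to-speech calls so the block runs offline.

```python
import base64
from dataclasses import dataclass
from typing import Callable


@dataclass
class TranslationResult:
    transcript: str        # text recognized from the source audio
    translated_text: str   # transcript rendered in the target language
    audio: bytes           # synthesized speech for the translated text

    def to_json_dict(self) -> dict:
        # Raw bytes are not JSON-serializable, so the audio is base64-encoded.
        return {
            "transcript": self.transcript,
            "translated_text": self.translated_text,
            "audio_base64": base64.b64encode(self.audio).decode("ascii"),
        }


def translate_audio(
    audio: bytes,
    target_language: str,
    transcribe: Callable[[bytes], str],
    translate: Callable[[str, str], str],
    synthesize: Callable[[str], bytes],
) -> TranslationResult:
    """Run speech-to-text, translation, and text-to-speech in sequence."""
    transcript = transcribe(audio)
    translated = translate(transcript, target_language)
    speech = synthesize(translated)
    return TranslationResult(transcript, translated, speech)


# Stub stages (assumptions for illustration only; a real app would call
# its speech-to-text, LLM, and text-to-speech services here).
def fake_transcribe(audio: bytes) -> str:
    return "hello world"


def fake_translate(text: str, target_language: str) -> str:
    return f"[{target_language}] {text}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


result = translate_audio(
    b"\x00\x01", "es", fake_transcribe, fake_translate, fake_synthesize
)
print(result.to_json_dict())
```

Because every stage is passed in, any one of them can be swapped for a different provider or a test double without touching the other two.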

Try it

Clone the repo:

git clone https://github.com/team-telnyx/telnyx-code-examples.git
cd telnyx-code-examples/ai-content-translator-python

Install dependencies, copy the example environment file, and start the app:

pip install -r requirements.txt
cp .env.example .env
python app.py

Then call the translation endpoint with an audio file and target language. Check the README for the exact request shape:
https://github.com/team-telnyx/telnyx-code-examples/tree/main/ai-content-translator-python
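
As a rough sketch of what such a call might look like from Python, here is a multipart upload built with only the standard library. Everything specific in it is a placeholder rather than something taken from the repo: the URL `http://localhost:5000/translate`, the form field names `file` and `target_language`, and the filename `speech.wav` are all assumptions, so treat the README as the authority on the real request shape.

```python
import urllib.request
import uuid


def build_multipart(fields: dict, file_field: str, filename: str, file_bytes: bytes):
    """Encode text fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{file_field}"; '
            f'filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_request(url: str, audio_bytes: bytes, target_language: str):
    """Prepare (but do not send) a POST carrying the audio and target language."""
    # Field names here are hypothetical; check the README for the real ones.
    body, content_type = build_multipart(
        {"target_language": target_language}, "file", "speech.wav", audio_bytes
    )
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )


# Placeholder URL and placeholder bytes (not a valid audio file).
req = build_request("http://localhost:5000/translate", b"RIFF....", "es")
print(req.get_method(), req.full_url, len(req.data), "bytes")

# With the app running, the request could be sent like this:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read())
```

The request is only constructed here, not sent, so the block runs without the server. In a real script you would read the audio from disk and uncomment the `urlopen` call.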

Why I like this example

It is a useful pattern for anyone building AI apps where the interface is not just text. Text-only LLM demos are helpful, but a lot of real user experiences involve audio: people speaking, systems responding, and content moving across languages.

This example keeps the workflow small enough to understand, while still showing how speech-to-text, LLM translation, and text-to-speech can fit together in one app.

The Telnyx code examples repo is also structured to be agent-readable, so coding agents can inspect the examples, understand the API patterns, and help you extend them into fuller applications.

Resources:
Code example
Telnyx Developer Docs