Read the original article: Practical Guide: How to Implement Text-to-Speech (TTS) in HarmonyOS
Preface
Hello everyone, I am Ruocheng. This series aims to help developers quickly implement commonly used features in HarmonyOS apps, providing ready-to-use code examples.
In this article, we will focus on how to implement text-to-speech (TTS) capability in HarmonyOS using the textToSpeech API, enabling your app to convert written text into spoken audio.
Effect Preview
As shown below, clicking the Read Text button in the demo reads the text on the page aloud:

Read Text Function Encapsulation
To improve reusability and maintainability, create a new file named textToSpeech.ets in the utils folder to encapsulate text-to-speech-related functions:

Core Utility Class
Below is the complete TextToSpeechManager utility class. It follows the singleton pattern and can be directly copied and used:
import { textToSpeech } from '@kit.CoreSpeechKit';
import { BusinessError } from '@kit.BasicServicesKit';
/**
* Text-to-speech
* Use: Core Speech Kit
* Supports converting up to 10,000 characters of mixed Chinese/English text (Simplified Chinese, Traditional Chinese, numbers, English) into speech with selectable voice styles.
* Use cases
* Enables devices such as phones and tablets to perform text-to-speech conversion even offline, supporting playback features (like screen reading) for visually impaired users or situations where reading is inconvenient.
*/
export class TextToSpeechManager{
private static instance: TextToSpeechManager;
private constructor() {}
public static getInstance(): TextToSpeechManager {
if (!TextToSpeechManager.instance) {
TextToSpeechManager.instance = new TextToSpeechManager();
}
return TextToSpeechManager.instance;
}
// Holds the TextToSpeechEngine instance once it has been created
private ttsEngine: textToSpeech.TextToSpeechEngine|null = null;
// Extra playback parameters passed to speak
private extraParam: Record<string, Object>|null = null;
// SpeakParams object for the current request
private speakParams: textToSpeech.SpeakParams|null = null;
// SpeakListener object holding the speak callbacks
private speakListener: textToSpeech.SpeakListener|null = null;
/**
* Create a TextToSpeechEngine instance using the createEngine API.
* createEngine supports two call forms; this article demonstrates the callback form.
* For the other form, see the API documentation: https://developer.huawei.com/consumer/cn/doc/harmonyos-references/hms-ai-texttospeech
*/
createEngine(){
// Set engine creation parameters
let extraParam: Record<string, Object> = {"style": 'interaction-broadcast', "locate": 'CN', "name": 'EngineName'};
let initParamsInfo: textToSpeech.CreateEngineParams = {
language: 'zh-CN',
person: 0,
online: 1, // 1 selects offline mode, matching the offline use case described in the class comment
extraParams: extraParam
};
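// Call the createEngine method; the result arrives asynchronously in the callback
textToSpeech.createEngine(initParamsInfo, (err: BusinessError, textToSpeechEngine: textToSpeech.TextToSpeechEngine) => {
if (!err) {
console.info('Succeeded in creating engine');
// Receive the created engine instance
this.ttsEngine = textToSpeechEngine;
// If initParam() already ran while the engine was still null, register its listener now
if (this.speakListener) {
textToSpeechEngine.setListener(this.speakListener);
}
} else {
console.error(`Failed to create engine. Code: ${err.code}, message: ${err.message}.`);
}
});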
}
/**
* Build the SpeakListener callbacks and register them on the engine.
* The SpeakParams object and the text to be synthesized are supplied later, in speak().
*/
initParam(){
// Set the speak callback information
this.speakListener = {
// Callback for when playback starts
onStart(requestId: string, response: textToSpeech.StartResponse) {
console.info(`onStart, requestId: ${requestId} response: ${JSON.stringify(response)}`);
},
// Callback for when synthesis and playback are completed
onComplete(requestId: string, response: textToSpeech.CompleteResponse) {
console.info(`onComplete, requestId: ${requestId} response: ${JSON.stringify(response)}`);
},
// Callback for when playback stops
onStop(requestId: string, response: textToSpeech.StopResponse) {
console.info(`onStop, requestId: ${requestId} response: ${JSON.stringify(response)}`);
},
// Callback for returning audio data
onData(requestId: string, audio: ArrayBuffer, response: textToSpeech.SynthesisResponse) {
// JSON.stringify on an ArrayBuffer yields "{}", so log the byte length instead
console.info(`onData, requestId: ${requestId} sequence: ${JSON.stringify(response)} audio bytes: ${audio.byteLength}`);
},
// Error callback
onError(requestId: string, errorCode: number, errorMessage: string) {
console.error(`onError, requestId: ${requestId} errorCode: ${errorCode} errorMessage: ${errorMessage}`);
}
};
// Register the listener if the engine already exists; it is also kept in this.speakListener
this.ttsEngine?.setListener(this.speakListener);
}
/**
* Synthesize the given text and play it back.
* Adjust the playback policy by modifying speakParams.
*/
speak(text:string){
// Set playback parameters
this.extraParam = {"queueMode": 0, "speed": 1, "volume": 0.1, "pitch": 1, "languageContext": 'zh-CN',
"audioType": "pcm", "soundChannel": 3, "playType": 1 };
this.speakParams = {
// A requestId can be used only once within the same engine instance; do not reuse it.
requestId: new Date().getTime().toString(),
extraParams: this.extraParam
};
// Call the playback method; nothing happens if the engine has not been created yet
this.ttsEngine?.speak(text, this.speakParams);
}
/**
* Stop playback
* Call the stop API to terminate speech synthesis and playback.
*/
stop(){
// Only stop if the text-to-speech service is currently busy
if(this.ttsEngine?.isBusy()){
this.ttsEngine?.stop();
}
}
}
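The speak() method above derives requestId from the current timestamp, so two calls within the same millisecond would produce the same value. One way to avoid that is to combine the timestamp with a counter. The following is a minimal sketch in plain TypeScript; createRequestIdGenerator is a hypothetical helper, not part of Core Speech Kit, and it may need small adjustments for ArkTS's stricter rules.

```typescript
// Hypothetical helper (not part of Core Speech Kit): returns a function that
// yields a different requestId on every call, even within the same millisecond.
function createRequestIdGenerator(prefix: string = 'tts'): () => string {
  let counter = 0;
  return (): string => {
    counter += 1;
    // The counter keeps ids distinct within one run; the timestamp makes
    // collisions across separate runs unlikely.
    return `${prefix}-${Date.now()}-${counter}`;
  };
}

const nextRequestId = createRequestIdGenerator();
const first = nextRequestId();
const second = nextRequestId();
console.log(first !== second); // prints true
```

In the utility class, the generated value could replace new Date().getTime().toString() when building speakParams.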
Practical Implementation Steps
Step 1: Import the utility class.
In the page where you need the text-to-speech feature, import the encapsulated class:
import {TextToSpeechManager} from "../utils/textToSpeech"
Step 2: Initialize the TTS engine.
Initialize the engine in the page's aboutToAppear lifecycle method. When the page opens, this creates the TextToSpeechEngine instance and sets up the listener:
private textToSpeechManger = TextToSpeechManager.getInstance();
aboutToAppear(): void {
this.textToSpeechManger.createEngine();
this.textToSpeechManger.initParam();
}
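Because createEngine reports its result through a callback, the engine may not exist yet when the user taps the button shortly after the page opens, and a speak() call made at that moment is silently skipped. One possible way to handle this is to queue actions until the engine arrives. The sketch below is plain TypeScript under stated assumptions: ReadyGate and FakeEngine are hypothetical names used only for illustration, and a real app would use textToSpeech.TextToSpeechEngine in place of the stand-in.

```typescript
// Hypothetical helper: holds actions until a target (e.g. the TTS engine) is available.
class ReadyGate<T> {
  private target: T | null = null;
  private pending: Array<(target: T) => void> = [];

  // Run the action now if the target exists; otherwise queue it.
  run(action: (target: T) => void): void {
    if (this.target !== null) {
      action(this.target);
    } else {
      this.pending.push(action);
    }
  }

  // Call this from the createEngine callback once the engine has been created.
  resolve(target: T): void {
    this.target = target;
    const queued = this.pending;
    this.pending = [];
    for (const action of queued) {
      action(target);
    }
  }
}

// Stand-in for the real engine type, so the sketch runs without the kit.
interface FakeEngine {
  speak(text: string): void;
}

const spoken: string[] = [];
const gate = new ReadyGate<FakeEngine>();
gate.run((engine) => engine.speak('queued before the engine exists'));
gate.resolve({ speak: (text: string) => { spoken.push(text); } });
gate.run((engine) => engine.speak('spoken immediately'));
console.log(spoken.length); // prints 2
```

Whether queuing is the right policy depends on the app; dropping stale requests or disabling the button until the engine is ready are equally reasonable choices.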
Step 3: Implement the text-to-speech feature.
Next, write the business function for text-to-speech. When the user clicks the button, pass the text to be read aloud to this function:
// Text-to-Speech
textToSpeech(txt:string){
this.textToSpeechManger.speak(txt);
}
Feature Highlights
- Multi-language support: Supports mixed Chinese-English text playback
- Offline capability: Works without a network connection
- Text limit: Supports up to 10,000 characters per request
- Voice options: Multiple voice styles available
- Adjustable parameters: Customize speed, volume, and pitch
- Status callbacks: Provides complete playback status monitoring
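If speed, volume, and pitch come from user settings, it can help to clamp them before placing them in extraParams. The sketch below is plain TypeScript; buildVoiceParams and the numeric ranges are assumptions made for illustration, so confirm the actual limits in the Core Speech Kit API reference before relying on them.

```typescript
// Hypothetical helper types, not part of Core Speech Kit.
interface NumericRange {
  min: number;
  max: number;
}

interface VoiceSettings {
  speed: number;
  volume: number;
  pitch: number;
}

// Assumed valid ranges; verify them against the official API reference.
const ASSUMED_RANGES: Record<'speed' | 'volume' | 'pitch', NumericRange> = {
  speed: { min: 0.5, max: 2 },
  volume: { min: 0, max: 2 },
  pitch: { min: 0.5, max: 2 },
};

// Limit a value to the inclusive range [min, max].
function clamp(value: number, range: NumericRange): number {
  return Math.min(range.max, Math.max(range.min, value));
}

// Build the numeric part of extraParams from user settings.
function buildVoiceParams(settings: VoiceSettings): Record<string, number> {
  return {
    speed: clamp(settings.speed, ASSUMED_RANGES.speed),
    volume: clamp(settings.volume, ASSUMED_RANGES.volume),
    pitch: clamp(settings.pitch, ASSUMED_RANGES.pitch),
  };
}

const voiceParams = buildVoiceParams({ speed: 3, volume: 0.1, pitch: 1 });
console.log(JSON.stringify(voiceParams)); // prints {"speed":2,"volume":0.1,"pitch":1}
```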
Usage Recommendations
- Resource management: Call stop() when the page is destroyed to end any ongoing playback. Note that stop() only halts playback; the engine also provides a shutdown() API for releasing the engine, which this utility class does not wrap
- Error handling: Handle failures based on the error code and message passed to the onError callback
- User experience: Split long texts into segments to avoid long waits before playback starts
- Permission configuration: Ensure the app has the required audio permissions
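The recommendation to split long texts can be sketched as a small helper that cuts at sentence-ending punctuation (Chinese and English) and packs sentences into chunks of at most maxLen characters. This is a minimal sketch in plain TypeScript: splitForSpeech and the default of 200 characters are illustrative assumptions, not part of Core Speech Kit, and the simple punctuation rule can also cut inside decimals such as 3.14 when they fall on a chunk boundary.

```typescript
// Hypothetical helper: split text into chunks no longer than maxLen characters,
// preferring to cut after sentence-ending punctuation.
function splitForSpeech(text: string, maxLen: number = 200): string[] {
  if (maxLen <= 0) {
    throw new RangeError('maxLen must be positive');
  }
  // Each match is a sentence with its trailing punctuation, or a run of
  // punctuation on its own, so joining all matches reproduces the input.
  const sentences = text.match(/[^。！？.!?\n]+[。！？.!?\n]*|[。！？.!?\n]+/g) ?? [];
  const chunks: string[] = [];
  let current = '';
  for (const sentence of sentences) {
    if (current.length + sentence.length <= maxLen) {
      current += sentence;
      continue;
    }
    if (current.length > 0) {
      chunks.push(current);
      current = '';
    }
    // A single sentence longer than maxLen is cut at fixed positions.
    let rest = sentence;
    while (rest.length > maxLen) {
      chunks.push(rest.slice(0, maxLen));
      rest = rest.slice(maxLen);
    }
    current = rest;
  }
  if (current.length > 0) {
    chunks.push(current);
  }
  return chunks;
}

const segments = splitForSpeech('Hello world. 你好，世界！This is a test.', 15);
console.log(segments.length); // prints 3
```

Each segment could then be passed to speak() in turn, for example from the onComplete callback of the previous segment.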
Summary
Through this article, we learned how to implement a text-to-speech feature in HarmonyOS apps:
- Utility class encapsulation: The TextToSpeechManager class is encapsulated using the singleton pattern for global access
- Engine initialization: Create the TTS engine instance with the createEngine() method
- Parameter configuration: Register the playback listener with the initParam() method; playback parameters are set inside speak()
- Text-to-speech implementation: Implement text-to-speech using the speak() method
- Resource control: Manage playback state using the stop() method
This text-to-speech solution is highly encapsulated and easy to integrate, allowing quick adoption in your HarmonyOS projects. Whether for news reading, learning assistance, or accessibility features, it can deliver an excellent user experience. That concludes this guide. Now go and try it out.