Integration of Text to Speech feature of Huawei ML Kit in Book Reading Android app (Kotlin) - Part 4

Introduction
In this article, we will learn how to integrate the Text to Speech feature of Huawei ML Kit into a Book Reading app. Text to Speech (TTS) converts text into human-like speech in real time. The service uses deep neural networks to process the text and produce natural-sounding speech, and rich timbres are supported to enhance the result. TTS is widely used in broadcasting, news, voice navigation, and audio reading. For example, TTS can convert a large amount of text into speech and highlight the content as it is played, freeing users' eyes and making reading more engaging. In navigation, TTS records a voice segment based on navigation data and synthesizes it into navigation voice, making guidance more personalized.

Precautions

  1. The text in a single request can contain a maximum of 500 characters and is encoded using UTF-8 (see the chunking sketch after this list).
  2. Currently, TTS in French, Spanish, German, Italian, Russian, Thai, Malay, and Polish is deployed only in China, Asia, Africa, Latin America, and Europe.
  3. TTS depends on on-cloud APIs. During commissioning and usage, ensure that the device can access the Internet.
  4. Default specifications of the real-time output audio data are as follows: MP3 mono, 16-bit depth, and 16 kHz audio sampling rate.
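
Because of the 500-character limit in item 1, longer book passages have to be split before they are sent to the engine. Below is a minimal Kotlin sketch of one way to do this; the `chunkForTts` helper and `MAX_TTS_CHARS` constant are illustrative names of my own, not part of the ML Kit API.

```kotlin
// Illustrative helper, not part of the ML Kit API: splits long text into
// pieces that respect the 500-character-per-request limit, breaking on
// whitespace where possible so words are not cut in half.
const val MAX_TTS_CHARS = 500

fun chunkForTts(text: String, maxLen: Int = MAX_TTS_CHARS): List<String> {
    val chunks = mutableListOf<String>()
    var start = 0
    while (start < text.length) {
        var end = minOf(start + maxLen, text.length)
        if (end < text.length) {
            // Back up to the last whitespace inside the window, if any.
            val lastSpace = text.lastIndexOf(' ', end - 1)
            if (lastSpace > start) end = lastSpace
        }
        chunks.add(text.substring(start, end).trim())
        start = end
    }
    return chunks.filter { it.isNotEmpty() }
}
```

Each resulting chunk can then be queued with `mlTtsEngine.speak(chunk, MLTtsEngine.QUEUE_APPEND)` so the pieces play back-to-back.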

Requirements

  1. Any operating system (macOS, Linux, or Windows).
  2. A Huawei phone with HMS Core 4.0.0.300 or later.
  3. A laptop or desktop with Android Studio, JDK 1.8, SDK Platform 26, and Gradle 4.6 or later installed.
  4. Minimum API level 24 is required.
  5. Devices running EMUI 9.0.0 or later are required.

How to integrate HMS Dependencies

  • First, register as a Huawei developer and complete identity verification on the Huawei Developers website; refer to Register a Huawei ID.
  • Create a project in Android Studio; refer to Creating an Android Studio Project.
  • Generate a SHA-256 certificate fingerprint: in the upper-right corner of the Android project view, click Gradle, choose Project Name > Tasks > android, and then click signingReport (alternatively, run the signingReport Gradle task from the command line).

Note: Project Name depends on the name you gave your project.


  • Enter the SHA-256 certificate fingerprint and click the **Save** button.


  • Click the Manage APIs tab and enable ML Kit.


  • Add the below Maven URL in the build.gradle(Project) file under the repositories of buildscript and allprojects, and the classpath under the dependencies of buildscript; refer to Add Configuration.

```groovy
maven { url 'http://developer.huawei.com/repo/' }
classpath 'com.huawei.agconnect:agcp:1.6.0.300'
```

  • Add the below plugin and dependencies in the build.gradle(Module) file.

```groovy
apply plugin: 'com.huawei.agconnect'

dataBinding {
    enabled = true
}

// Huawei AGC
implementation 'com.huawei.agconnect:agconnect-core:1.6.0.300'
// ML Kit - Text to Speech
implementation 'com.huawei.hms:ml-computer-voice-tts:3.3.0.305'
// Data Binding
implementation 'androidx.databinding:databinding-runtime:7.1.1'
```

  • Now sync the Gradle files.
  • Add the required permissions to the AndroidManifest.xml file.

```xml
<uses-permission android:name="android.permission.ACCESS_NETWORK_STATE" />
<uses-permission android:name="android.permission.INTERNET" />
```
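
Since TTS depends on on-cloud APIs (precaution 3 above), it is worth checking connectivity before starting the engine rather than failing silently. Below is a minimal sketch using the standard `ConnectivityManager`; the `isNetworkAvailable` helper is an illustrative name of my own, not part of HMS.

```kotlin
import android.content.Context
import android.net.ConnectivityManager
import android.net.NetworkCapabilities

// Illustrative helper: returns true when the device has an internet-capable
// network. Requires the ACCESS_NETWORK_STATE permission declared above.
fun isNetworkAvailable(context: Context): Boolean {
    val cm = context.getSystemService(Context.CONNECTIVITY_SERVICE) as ConnectivityManager
    val network = cm.activeNetwork ?: return false
    val capabilities = cm.getNetworkCapabilities(network) ?: return false
    return capabilities.hasCapability(NetworkCapabilities.NET_CAPABILITY_INTERNET)
}
```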

Let us move to development
I have created a project in Android Studio with an Empty Activity. Let us start coding.

In ListActivity.kt, we handle the button click:
```kotlin
class ListActivity : AppCompatActivity() {

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_list)

        // Open the text-to-speech screen when the button is tapped.
        btn_voice.setOnClickListener {
            val intent = Intent(this@ListActivity, TranslateActivity::class.java)
            startActivity(intent)
        }
    }
}
```
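
Note that `btn_voice` resolves through Kotlin Android Extensions (synthetic view properties), which requires the `kotlin-android-extensions` plugin and is now deprecated. If your project does not use synthetics, the equivalent with `findViewById` looks like this (a sketch of the same click handling, assuming the same view IDs):

```kotlin
// Inside onCreate(), after setContentView(R.layout.activity_list):
val btnVoice = findViewById<Button>(R.id.btn_voice)
btnVoice.setOnClickListener {
    startActivity(Intent(this@ListActivity, TranslateActivity::class.java))
}
```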

In TranslateActivity.kt, we find the business logic for the text-to-speech conversion:
```kotlin
class TranslateActivity : AppCompatActivity() {

    private lateinit var binding: ActivityTranslateBinding
    private lateinit var ttsViewModel: TtsViewModel
    private var sourceText: String = ""
    private lateinit var mlTtsEngine: MLTtsEngine
    private lateinit var mlConfigs: MLTtsConfig
    private val TAG: String = TranslateActivity::class.java.simpleName

    private var callback: MLTtsCallback = object : MLTtsCallback {
        override fun onError(taskId: String, err: MLTtsError) {
            // Processing logic for a TTS failure.
            Log.e(TAG, "onError: $err")
        }
        override fun onWarn(taskId: String, warn: MLTtsWarn) {
            // Alarm handling without affecting the service logic.
        }
        override fun onRangeStart(taskId: String, start: Int, end: Int) {
            // Called when playback of a text range starts.
            Log.d(TAG, start.toString())
            img_view.setImageResource(R.drawable.on)
        }
        override fun onAudioAvailable(p0: String?, p1: MLTtsAudioFragment?, p2: Int, p3: android.util.Pair<Int, Int>?, p4: Bundle?) {
            // Audio stream callback; not used in this app.
        }
        override fun onEvent(taskId: String, eventName: Int, bundle: Bundle?) {
            if (eventName == MLTtsConstants.EVENT_PLAY_STOP) {
                Toast.makeText(applicationContext, "Service Stopped", Toast.LENGTH_LONG).show()
            }
            img_view.setImageResource(R.drawable.off)
        }
    }

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        binding = DataBindingUtil.setContentView(this, R.layout.activity_translate)
        binding.lifecycleOwner = this
        ttsViewModel = ViewModelProvider(this).get(TtsViewModel::class.java)
        binding.ttsViewModel = ttsViewModel
        setApiKey()
        supportActionBar?.title = "Text to Speech Conversion"
        ttsViewModel.ttsService.observe(this, Observer {
            startTtsService()
        })
        ttsViewModel.textData.observe(this, Observer {
            sourceText = it
        })
    }

    private fun startTtsService() {
        mlConfigs = MLTtsConfig()
            .setLanguage(MLTtsConstants.TTS_EN_US)
            .setPerson(MLTtsConstants.TTS_SPEAKER_FEMALE_EN)
            .setSpeed(1.0f)
            .setVolume(1.0f)
        mlTtsEngine = MLTtsEngine(mlConfigs)
        mlTtsEngine.setTtsCallback(callback)
        // ID to use for the Audio Visualizer.
        val id = mlTtsEngine.speak(sourceText, MLTtsEngine.QUEUE_APPEND)
        Log.i(TAG, id)
    }

    private fun setApiKey() {
        MLApplication.getInstance().apiKey = "DAEDAOB+zyB7ajg1LGcp8F65qxZduDjQ1E6tVovUp4lU/PywqhT4g+bxBCtStYAa33V9tUQrKvUp89m+0Gi/fPwfNN6WCJxcVLA+WA=="
    }

    override fun onPause() {
        super.onPause()
        // Guard against the engine never being created (Speak was not tapped).
        if (::mlTtsEngine.isInitialized) mlTtsEngine.stop()
    }

    override fun onDestroy() {
        super.onDestroy()
        if (::mlTtsEngine.isInitialized) mlTtsEngine.shutdown()
    }
}
```
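
The activity above and the activity_translate.xml layout shown later both bind to a `TtsViewModel` that this part of the series does not list. Here is a minimal sketch of what it could look like, inferred from the bindings actually used (`callTtsService()`, `noDataChangedText`, and the `ttsService`/`textData` LiveData); treat it as an assumption matching the observed usage, not the project's exact class:

```kotlin
import androidx.lifecycle.MutableLiveData
import androidx.lifecycle.ViewModel

// Sketch of a ViewModel matching the bindings used in this article;
// field and method names are inferred from the activity and layout.
class TtsViewModel : ViewModel() {

    // Fired when the Speak button is clicked; TranslateActivity observes
    // this and starts the TTS engine.
    val ttsService = MutableLiveData<Unit>()

    // Holds the latest text typed into the EditText.
    val textData = MutableLiveData<String>()

    // Bound to android:onClick on the Speak button in activity_translate.xml.
    fun callTtsService() {
        ttsService.value = Unit
    }

    // Bound to android:onTextChanged on the EditText; the four-parameter
    // signature matches data binding's TextViewBindingAdapter.OnTextChanged.
    fun noDataChangedText(s: CharSequence, start: Int, before: Int, count: Int) {
        textData.value = s.toString()
    }
}
```

Also note that `MLTtsEngine.QUEUE_APPEND` queues new text behind whatever is already playing, while `MLTtsEngine.QUEUE_FLUSH` would interrupt the current playback instead.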

In activity_list.xml, we create the UI screen:
```xml
<?xml version="1.0" encoding="utf-8"?>
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    android:orientation="vertical"
    android:paddingTop="10dp"
    android:paddingBottom="10dp"
    tools:context=".ListActivity">

    <Button
        android:id="@+id/btn_voice"
        android:layout_width="310dp"
        android:layout_height="wrap_content"
        android:layout_marginTop="50dp"
        android:textAlignment="center"
        android:layout_gravity="center_horizontal"
        android:textSize="20sp"
        android:textColor="@color/black"
        android:padding="8dp"
        android:textAllCaps="false"
        android:text="Text to Voice" />

</LinearLayout>
```

In activity_translate.xml, we create the UI screen:
```xml
<?xml version="1.0" encoding="utf-8"?>
<layout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    xmlns:tools="http://schemas.android.com/tools">

    <data>
        <variable
            name="ttsViewModel"
            type="com.example.huaweibookreaderapp1.TtsViewModel" />
    </data>

    <androidx.constraintlayout.widget.ConstraintLayout
        android:layout_width="match_parent"
        android:layout_height="match_parent"
        android:background="@color/white"
        tools:context=".TranslateActivity">

        <Button
            android:id="@+id/btn_click"
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:onClick="@{() -> ttsViewModel.callTtsService()}"
            android:text="@string/speak"
            android:textSize="20sp"
            app:layout_constraintBottom_toBottomOf="parent"
            app:layout_constraintEnd_toEndOf="parent"
            app:layout_constraintHorizontal_bias="0.498"
            app:layout_constraintStart_toStartOf="parent"
            app:layout_constraintTop_toTopOf="parent"
            app:layout_constraintVertical_bias="0.43" />

        <EditText
            android:id="@+id/edt_text"
            android:layout_width="409dp"
            android:layout_height="wrap_content"
            android:layout_marginBottom="36dp"
            android:ems="10"
            android:textSize="20sp"
            android:hint="@string/enter_text_here"
            android:inputType="textPersonName"
            android:onTextChanged="@{ttsViewModel.noDataChangedText}"
            android:paddingStart="70dp"
            app:layout_constraintBottom_toTopOf="@+id/btn_click"
            app:layout_constraintEnd_toEndOf="parent"
            app:layout_constraintHorizontal_bias="1.0"
            app:layout_constraintStart_toStartOf="parent"
            android:autofillHints="@string/enter_text_here" />

        <ImageView
            android:id="@+id/img_view"
            android:layout_width="100dp"
            android:layout_height="100dp"
            android:layout_marginTop="7dp"
            app:layout_constraintBottom_toTopOf="@+id/edt_text"
            app:layout_constraintEnd_toEndOf="parent"
            app:layout_constraintHorizontal_bias="0.498"
            app:layout_constraintStart_toStartOf="parent"
            app:layout_constraintTop_toTopOf="parent"
            app:layout_constraintVertical_bias="0.8"
            app:srcCompat="@drawable/off"
            android:contentDescription="@string/speaker" />
    </androidx.constraintlayout.widget.ConstraintLayout>
</layout>
```

Demo

(Screenshots of the running app.)

Tips and Tricks

  1. Make sure you are already registered as a Huawei developer.
  2. Set minSdkVersion to 24 or later; otherwise you will get an AndroidManifest merge issue.
  3. Make sure you have added the agconnect-services.json file to the app folder.
  4. Make sure you have added the SHA-256 fingerprint without fail.
  5. Make sure all the dependencies are added properly.

Conclusion
In this article, we have learned how to integrate the Text to Speech feature of Huawei ML Kit into a Book Reading app. Text to Speech (TTS) converts text into human-like speech in real time. The service uses deep neural networks to process the text and produce natural-sounding speech, and rich timbres are supported to enhance the result.

I hope you have found this article helpful. If so, please like and comment.

Reference
ML Kit – Text to Speech
ML Kit – Training Video
