DEV Community

Suji Matts

Configure Google Cloud Speech-to-Text API

Overview

The Speech-to-Text API lets you integrate Google's speech recognition into your applications: you send audio to the service and receive a text transcription in response.

What we'll cover
In this lab, you will learn how to:

  • Create an API key
  • Create a Speech-to-Text API request
  • Call the Speech-to-Text API

Step 1: Create an API Key

In the Google Cloud Console, navigate to Navigation menu > APIs & services > Credentials.
Click on Create credentials and select API key.
Copy the generated key and click Close.

Save API Key as Environment Variable

Connect to your VM instance via SSH.
In the command line, set the environment variable, replacing <YOUR_API_KEY> with the key you copied:

export API_KEY=<YOUR_API_KEY>
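Because the later curl calls read the key from the environment, it can help to confirm the variable is set in the current shell before continuing. The following is a minimal sketch; the function name check_api_key is a hypothetical helper, not part of the lab.

```shell
# Hypothetical helper (not part of the original lab): fail early with a
# clear message if API_KEY is empty or unset in the current shell.
check_api_key() {
  if [ -z "${API_KEY:-}" ]; then
    echo "API_KEY is not set; run: export API_KEY=<YOUR_API_KEY>" >&2
    return 1
  fi
  # Print only the length so the key itself never lands in the terminal log.
  echo "API_KEY is set (${#API_KEY} characters)"
}
```

Running check_api_key after the export should report the key length. If you open a new SSH session, the variable has to be exported again.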


Step 2: Create Your Speech-to-Text API Request

Create a new file named request.json:

touch request.json


Open the file in a text editor and add the following JSON configuration, which specifies the audio encoding, the language, and the audio file's Cloud Storage URI:

{
  "config": {
    "encoding": "FLAC",
    "languageCode": "en-US"
  },
  "audio": {
    "uri": "gs://cloud-samples-tests/speech/brooklyn.flac"
  }
}
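As an alternative to creating the file and editing it by hand, the same JSON can be written in one step with a shell heredoc. This is a sketch that assumes you run it in the directory where request.json should live; it overwrites any existing file of that name.

```shell
# Write the request body shown above to request.json in one step.
# The quoted 'EOF' delimiter stops the shell from expanding anything
# inside the heredoc, so the JSON is written exactly as typed.
cat > request.json <<'EOF'
{
  "config": {
    "encoding": "FLAC",
    "languageCode": "en-US"
  },
  "audio": {
    "uri": "gs://cloud-samples-tests/speech/brooklyn.flac"
  }
}
EOF
```

Running cat request.json afterwards lets you confirm the contents before calling the API.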

Step 3: Call the Speech-to-Text API

Send the request body and your API key to the Speech-to-Text API with the following curl command:

curl -s -X POST -H "Content-Type: application/json" --data-binary @request.json "https://speech.googleapis.com/v1/speech:recognize?key=${API_KEY}"
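The command above puts the key in the URL, where it can end up in shell history and logs. Google Cloud APIs also accept an API key in the X-Goog-Api-Key request header, so one possible variation is to wrap the call in a small function that sends the key that way. The function name recognize is a hypothetical helper for illustration, and the sketch assumes API_KEY is already exported.

```shell
# Hypothetical wrapper (not part of the original lab): same request as
# above, but the key travels in a header instead of the query string.
recognize() {
  # First argument is the request file; defaults to request.json.
  request_file="${1:-request.json}"
  curl -s -X POST \
    -H "Content-Type: application/json" \
    -H "X-Goog-Api-Key: ${API_KEY}" \
    --data-binary "@${request_file}" \
    "https://speech.googleapis.com/v1/speech:recognize"
}
```

Calling recognize with no arguments sends request.json. Calling recognize other.json sends a different request file.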

The response is a JSON object. Its results array holds one or more alternatives, each with a transcript and a confidence score between 0 and 1.

Save Response to a File

To keep the response, run the same command and redirect its output to result.json:

curl -s -X POST -H "Content-Type: application/json" --data-binary @request.json "https://speech.googleapis.com/v1/speech:recognize?key=${API_KEY}" > result.json
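Once the response is saved, the transcript can be pulled out of the file with standard tools. The sketch below relies on the API returning pretty-printed JSON with the transcript field on a line of its own. The helper name extract_transcript and the file sample_result.json are hypothetical, and the sample is a hand-written illustration of the response shape (its confidence value is made up), not captured output.

```shell
# Hypothetical helper: print the value of every "transcript" field in a
# pretty-printed response file (one field per line is assumed).
extract_transcript() {
  sed -n 's/.*"transcript": *"\([^"]*\)".*/\1/p' "$1"
}

# Hand-written sample mirroring the response structure, used only so the
# helper can be tried without calling the API.
cat > sample_result.json <<'EOF'
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "how old is the Brooklyn Bridge",
          "confidence": 0.98
        }
      ]
    }
  ]
}
EOF

extract_transcript sample_result.json
```

On a real response, run extract_transcript result.json instead. For anything beyond a quick check, a proper JSON parser is a more robust choice than sed.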

Conclusion

Congratulations! You have successfully used the Speech-to-Text API to transcribe an audio file. This hands-on lab demonstrated how to create an API key, construct a request, and call the Speech-to-Text service.

Read More: https://codelabs.developers.google.com/codelabs/cloud-speech-text-python3#0
