
Revathi Joshi for AWS Community Builders



AWS service - AWS Transcribe

In this article, I am going to show you how to use Amazon Transcribe, an automatic speech recognition service, to create a text transcript of a pre-recorded speech file in English after uploading it to an S3 bucket using the AWS Management Console.

Amazon Transcribe

  • It is an easy-to-use tool that converts audio data, either a media file uploaded to an Amazon S3 bucket or a media stream, into text.

  • It can transcribe public speeches, business meetings, customer calls, broadcast TV, on-demand videos, and class lectures, and it can perform medical transcription in real time.

  • You can use Amazon Transcribe as a standalone service or to add speech-to-text capabilities to any application.

  • You can transcribe audio in any of the languages on the Amazon Transcribe supported languages list.

  • There are two types of transcription jobs:

  • Batch transcription jobs - Media files stored in an Amazon S3 bucket

  • Streaming transcription jobs - Media streams in real time.

Please visit my GitHub repository for S3 articles on various topics, which I update on a regular basis.

Let’s get started!


Objectives:

1. Create an S3 bucket

2. Upload an audio file into the S3 bucket

3. Create transcription job

4. Review transcription results


Pre-requisites:

  • An AWS user account with admin access, not a root account.
  • An IAM role with the AmazonS3FullAccess policy attached.

Resources Used:

Amazon Transcribe

IAM Access Policy

S3 Bucket

Steps to implement this project:

1. Create an S3 bucket

On the Amazon S3 console / Create bucket / Under General configuration /

Bucket name: - oprah-audio

AWS Region: - US East (N. Virginia) us-east-1

  • Take all defaults and Create bucket
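If you would rather script this step than click through the console, the sketch below shows roughly what the same bucket creation could look like in Python. It assumes the boto3 SDK is installed and AWS credentials are configured; the helper names `build_create_bucket_kwargs` and `create_bucket` are hypothetical and not part of this walkthrough. Only the bucket name and Region come from the step above.

```python
def build_create_bucket_kwargs(bucket_name, region="us-east-1"):
    """Return keyword arguments for the S3 CreateBucket call.

    us-east-1 is the default Region and must not be passed as a
    LocationConstraint; any other Region needs one.
    """
    kwargs = {"Bucket": bucket_name}
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def create_bucket(bucket_name, region="us-east-1"):
    """Create the bucket (assumes boto3 is installed and credentials are set)."""
    import boto3  # imported lazily so the helper above works without the SDK

    s3 = boto3.client("s3", region_name=region)
    return s3.create_bucket(**build_create_bucket_kwargs(bucket_name, region))
```

Calling `create_bucket("oprah-audio")` would mirror the console defaults used here. Bucket names are globally unique, so you may need to pick a different name if this one is already taken.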


2. Upload an audio file into the S3 bucket

  • On the Buckets home page, click on your bucket’s name to navigate to the bucket / Select Upload / Add files / Upload the oprah-audio.mp3 file



  • Select the oprah-audio.mp3 file / Under Properties / In Object overview / Copy the S3 URI / Save it for later use
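The upload and the copied location can also be sketched in Python. This is only an illustration, assuming boto3 is installed and that the audio file sits in your working directory; `s3_uri` and `upload_audio` are hypothetical helper names. The `s3://bucket/key` form is the same value that the transcription job takes as its input file location.

```python
def s3_uri(bucket_name, key):
    """Build the s3:// location that the transcription job takes as input."""
    return f"s3://{bucket_name}/{key}"


def upload_audio(local_path, bucket_name, key):
    """Upload a local file and return its s3:// location.

    Assumes boto3 is installed and AWS credentials are configured.
    """
    import boto3  # imported lazily so s3_uri() works without the SDK

    s3 = boto3.client("s3")
    s3.upload_file(local_path, bucket_name, key)
    return s3_uri(bucket_name, key)
```

For example, `upload_audio("oprah-audio.mp3", "oprah-audio", "oprah-audio.mp3")` would return the location to save for the next step.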



3. Create transcription job

  • From the top menu bar, select Services, begin typing Transcribe in the search bar, and then select Amazon Transcribe to open the service console.

  • On the Amazon Transcribe Console / Transcription jobs page, click Create job / Under Specify job details / Job settings /

Name: - oprah-audio-transcribe-job

Language: - English, US (en-US)

Input data / Input file location on S3: - s3://oprah-audio/oprah-audio.mp3

Output data location type: take the default - Service-managed S3 bucket.

Subtitle file format: -

  • Amazon Transcribe supports WebVTT (VTT) and SubRip (SRT) file types.

  • In the Subtitle file format field, you can choose either or both file types for output.

  • If you select both types, you get two files that are exported to the same S3 bucket.

  • I am not using either format.


  • On the Configure page / Under Customization / Custom vocabulary

This feature helps Amazon Transcribe recognize words and phrases that are specific to your application. I am not selecting this feature because I am not using any application.

  • Create job
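The job settings chosen above map fairly directly onto the parameters of the Transcribe `StartTranscriptionJob` API. The following is a minimal sketch, assuming boto3 is installed; `build_job_request` and `start_job` are hypothetical helper names. Leaving out `OutputBucketName` corresponds to the default service-managed output bucket, and the optional subtitle formats and custom vocabulary are omitted unless you pass them in.

```python
def build_job_request(job_name, media_uri, language_code="en-US",
                      subtitle_formats=None, vocabulary_name=None):
    """Assemble keyword arguments for start_transcription_job.

    No OutputBucketName is set, so the transcript goes to a
    service-managed bucket, as in the console default.
    """
    request = {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "Media": {"MediaFileUri": media_uri},
    }
    if subtitle_formats:  # e.g. ["vtt"], ["srt"], or both
        request["Subtitles"] = {"Formats": list(subtitle_formats)}
    if vocabulary_name:  # name of an existing custom vocabulary
        request["Settings"] = {"VocabularyName": vocabulary_name}
    return request


def start_job(request, region="us-east-1"):
    """Submit the job (assumes boto3 is installed and credentials are set)."""
    import boto3  # imported lazily so the builder works without the SDK

    transcribe = boto3.client("transcribe", region_name=region)
    return transcribe.start_transcription_job(**request)
```

With the values from this walkthrough, the call would be `start_job(build_job_request("oprah-audio-transcribe-job", "s3://oprah-audio/oprah-audio.mp3"))`.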

4. Review transcription results

  • After the Transcription jobs page shows the job status as Complete / Click on oprah-audio-transcribe-job / Under Transcription preview / Text

  • You can see the transcribed text there.

  • If the transcribed text is long, you have to scroll down in the Transcription panel to view the whole transcription job output.
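Outside the console, the same review step amounts to waiting for the job to finish and then reading the transcript JSON. Here is one possible sketch, assuming a boto3 Transcribe client is passed in as `transcribe`; the function names, polling interval, and retry limit are illustrative choices, not values from this walkthrough.

```python
import json
import time
import urllib.request


def wait_for_job(transcribe, job_name, poll_seconds=10, max_polls=60):
    """Poll until the job is COMPLETED or FAILED and return the job record."""
    for _ in range(max_polls):
        response = transcribe.get_transcription_job(TranscriptionJobName=job_name)
        job = response["TranscriptionJob"]
        if job["TranscriptionJobStatus"] in ("COMPLETED", "FAILED"):
            return job
        time.sleep(poll_seconds)
    raise TimeoutError(f"{job_name} did not finish after {max_polls} polls")


def extract_text(transcript_doc):
    """Join the transcript strings from a parsed transcript JSON document."""
    transcripts = transcript_doc["results"]["transcripts"]
    return " ".join(item["transcript"] for item in transcripts)


def fetch_transcript_text(job):
    """Download the transcript JSON for a completed job and return its text."""
    uri = job["Transcript"]["TranscriptFileUri"]
    with urllib.request.urlopen(uri) as response:
        return extract_text(json.load(response))
```

Taking the client as a parameter keeps the polling logic easy to try out with a stand-in object before pointing it at a real account.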



Cleanup:

  • Delete the audio file - oprah-audio.mp3

  • Delete the S3 bucket

  • Delete the Transcription job
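The three cleanup actions above could be scripted along these lines. This is a sketch that assumes boto3 S3 and Transcribe clients are passed in; `cleanup` is a hypothetical helper name. The object is deleted first because a bucket has to be empty before it can be deleted.

```python
def cleanup(s3, transcribe, bucket_name, key, job_name):
    """Delete the audio object, then the empty bucket, then the job.

    s3 and transcribe are assumed to be boto3 clients, for example
    boto3.client("s3") and boto3.client("transcribe").
    """
    s3.delete_object(Bucket=bucket_name, Key=key)
    s3.delete_bucket(Bucket=bucket_name)
    transcribe.delete_transcription_job(TranscriptionJobName=job_name)
```

With the names used in this article, that would be `cleanup(s3, transcribe, "oprah-audio", "oprah-audio.mp3", "oprah-audio-transcribe-job")`.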

What we have done so far

We have used Amazon Transcribe to create a text transcript of a pre-recorded speech file in English after uploading it to an S3 bucket using the AWS Management Console.
