In this article, I am going to show you how to use Amazon Transcribe, an automatic speech recognition service, to create a text transcript of a pre-recorded speech file in English after uploading it to an S3 bucket using the AWS Management Console.
Amazon Transcribe
Amazon Transcribe is an easy-to-use tool that converts audio data to text. The audio can be either a media file uploaded to an Amazon S3 bucket or a media stream.
It can transcribe public speeches, business meetings, customer calls, broadcast TV, on-demand videos, and class lectures, and it can perform medical transcription in real time.
You can use Amazon Transcribe as a standalone service or to add speech-to-text capabilities to any application.
You can transcribe audio in any of the languages on Amazon Transcribe's supported languages list.
There are two types of transcription jobs:
- Batch transcription jobs - Media files stored in an Amazon S3 bucket
- Streaming transcription jobs - Media streams in real time
Please visit my GitHub repository for S3 articles on various topics, which I update on a regular basis.
Let’s get started!
Objectives:
1. Create an S3 bucket
2. Upload an audio file to the S3 bucket
3. Create transcription job
4. Review transcription results
Pre-requisites:
- An AWS user account with admin access (not the root account).
- An IAM role with the AmazonS3FullAccess policy attached.
Resources Used:
- Amazon S3
- Amazon Transcribe
Steps to implement this project:
1. Create an S3 bucket
On Amazon S3 console / Create bucket / Under General configuration /
Bucket name: - oprah-audio
AWS Region: - US East (N. Virginia) us-east-1
- Accept all the defaults and select Create bucket
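If you would rather script this step than click through the console, the following is a minimal sketch using boto3 (the AWS SDK for Python), which this article does not otherwise use. The function names `build_create_bucket_kwargs` and `create_bucket` are my own, hypothetical helpers. It assumes your AWS credentials are already configured. Keep in mind that bucket names are globally unique, so `oprah-audio` may already be taken and you may need a different name.

```python
def build_create_bucket_kwargs(bucket_name: str, region: str) -> dict:
    """Build the keyword arguments for the S3 create_bucket call."""
    kwargs = {"Bucket": bucket_name}
    # us-east-1 is the default location, so no location constraint is sent for it;
    # every other region needs an explicit CreateBucketConfiguration.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def create_bucket(bucket_name: str, region: str = "us-east-1") -> None:
    """Create the bucket with default settings (hypothetical helper)."""
    import boto3  # imported lazily so the pure helper above works without boto3

    s3 = boto3.client("s3", region_name=region)
    s3.create_bucket(**build_create_bucket_kwargs(bucket_name, region))
```

Calling `create_bucket("oprah-audio")` would then mirror the console step above for the us-east-1 region.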
2. Upload an audio file to the S3 bucket
- On the Buckets home page / Click your bucket's name to open the bucket / Select Upload / Add files / Choose the oprah-audio.mp3 file
Upload
- Select the oprah-audio.mp3 file / Under Properties / In Object overview / Copy the S3 URI / Save it for future use
s3://oprah-audio/oprah-audio.mp3
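The same upload can be sketched in Python with boto3, again as an optional alternative to the console. The helper names `s3_uri` and `upload_audio` are hypothetical; the sketch assumes the audio file is on your local disk and that the bucket already exists. It returns the same `s3://` location that you saved above.

```python
from pathlib import Path
from typing import Optional


def s3_uri(bucket: str, key: str) -> str:
    """Return the s3:// location of an object, as shown in the console."""
    return f"s3://{bucket}/{key}"


def upload_audio(local_path: str, bucket: str, key: Optional[str] = None) -> str:
    """Upload a local audio file and return its s3:// location (hypothetical helper)."""
    import boto3  # imported lazily so s3_uri works without boto3 installed

    # Default the object key to the file name, e.g. oprah-audio.mp3
    key = key or Path(local_path).name
    boto3.client("s3").upload_file(local_path, bucket, key)
    return s3_uri(bucket, key)
```

For example, `upload_audio("oprah-audio.mp3", "oprah-audio")` would upload the file under the key `oprah-audio.mp3`.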
3. Create transcription job
From the top menu bar, select Services, begin typing Transcribe in the search bar, and then select Amazon Transcribe to open the service console.
On the Amazon Transcribe Console / Transcription jobs page, click Create job / Under Specify job details / Job settings /
Name: - oprah-audio-transcribe-job
Language: - English, US (en-US)
Input data / Input file location on S3: - s3://oprah-audio/oprah-audio.mp3
Output data location type: take the default - Service-managed S3 bucket.
Subtitle file format: -
Amazon Transcribe supports the WebVTT (VTT) and SubRip (SRT) subtitle file types.
In the Subtitle file format field, you can choose either or both file types for output.
If you select both types, two subtitle files are exported to the same S3 bucket.
I am not using either format.
Next
- On the Configure page / Under Customization / Custom vocabulary
A custom vocabulary helps Amazon Transcribe recognize words and phrases that are specific to your application. I am skipping this feature because I am not transcribing for a specific application.
- Create job
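The job settings chosen above can also be expressed as a request to the `start_transcription_job` API. This is a minimal sketch with boto3, not part of the console walkthrough; `build_job_request` and `start_job` are hypothetical helper names. Leaving out `OutputBucketName` keeps the default service-managed output bucket, and the optional `subtitle_formats` argument corresponds to the subtitle choice I skipped.

```python
from typing import Optional, Sequence


def build_job_request(
    job_name: str,
    media_uri: str,
    language_code: str = "en-US",
    subtitle_formats: Optional[Sequence[str]] = None,
) -> dict:
    """Build the parameters for start_transcription_job."""
    request = {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "Media": {"MediaFileUri": media_uri},
        # No OutputBucketName: the transcript goes to a service-managed bucket.
    }
    if subtitle_formats:
        # Accepted values are "vtt" and "srt"; pass both to get two files.
        request["Subtitles"] = {"Formats": list(subtitle_formats)}
    return request


def start_job(request: dict, region: str = "us-east-1") -> str:
    """Submit the job and return its initial status (hypothetical helper)."""
    import boto3  # imported lazily so build_job_request works without boto3

    transcribe = boto3.client("transcribe", region_name=region)
    response = transcribe.start_transcription_job(**request)
    return response["TranscriptionJob"]["TranscriptionJobStatus"]
```

With the values from this article, the request would be built as `build_job_request("oprah-audio-transcribe-job", "s3://oprah-audio/oprah-audio.mp3")`.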
4. Review transcription results
After the job's status on the Transcription jobs page shows Complete / Click on oprah-audio-transcribe-job / Under Transcription preview / Text
you can see the transcribed text.
If the transcribed text is long, you have to scroll down in the Transcription panel to view the whole transcription job output.
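If you want the full text outside the console, the result can be fetched programmatically. The sketch below is a minimal example under a few assumptions: it uses boto3, the job writes to the service-managed bucket (so the job's `TranscriptFileUri` is a downloadable link), and the helper names `extract_transcript_text` and `wait_and_fetch` are my own. The transcript file is JSON, with the text stored under `results.transcripts`.

```python
import json
import time
import urllib.request


def extract_transcript_text(transcript_doc: dict) -> str:
    """Pull the plain text out of a parsed Amazon Transcribe result document."""
    transcripts = transcript_doc["results"]["transcripts"]
    return " ".join(entry["transcript"] for entry in transcripts)


def wait_and_fetch(job_name: str, region: str = "us-east-1", poll_seconds: int = 10) -> str:
    """Poll the job until it finishes, then return the transcript text."""
    import boto3  # imported lazily so the parser above works without boto3

    transcribe = boto3.client("transcribe", region_name=region)
    while True:
        job = transcribe.get_transcription_job(TranscriptionJobName=job_name)[
            "TranscriptionJob"
        ]
        status = job["TranscriptionJobStatus"]
        if status == "COMPLETED":
            uri = job["Transcript"]["TranscriptFileUri"]
            with urllib.request.urlopen(uri) as response:
                return extract_transcript_text(json.load(response))
        if status == "FAILED":
            raise RuntimeError(job.get("FailureReason", "transcription failed"))
        time.sleep(poll_seconds)  # still QUEUED or IN_PROGRESS
```

Calling `wait_and_fetch("oprah-audio-transcribe-job")` would return the same text that the Transcription preview shows, without any scrolling.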
Cleanup
Delete the audio file - oprah-audio.mp3
Delete the S3 bucket
Delete the Transcription job
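The three cleanup steps can be scripted as well. This is a minimal sketch, assuming boto3 clients are passed in; `cleanup` is a hypothetical helper name. The order matters: a bucket must be empty before it can be deleted, so the audio file goes first.

```python
def cleanup(s3_client, transcribe_client, bucket: str, key: str, job_name: str) -> None:
    """Remove the resources created in this walkthrough (hypothetical helper)."""
    # 1. Delete the audio file so the bucket is empty.
    s3_client.delete_object(Bucket=bucket, Key=key)
    # 2. Delete the now-empty bucket.
    s3_client.delete_bucket(Bucket=bucket)
    # 3. Delete the transcription job.
    transcribe_client.delete_transcription_job(TranscriptionJobName=job_name)
```

With real clients, for example `boto3.client("s3")` and `boto3.client("transcribe")`, the call would be `cleanup(s3, transcribe, "oprah-audio", "oprah-audio.mp3", "oprah-audio-transcribe-job")`.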
What we have done so far
We have used Amazon Transcribe to create a text transcript of a pre-recorded speech file in English after uploading it to an S3 bucket using the AWS Management Console.