DEV Community

Yongchang He


Tutorial: Play with a Speech-to-Text API using Node.js

Play with an API from Deepgram converting an audio file or audio stream into written text

I wrote this post to record the detailed steps and my notes from learning Node.js.
If you are also interested and want to get your hands dirty, follow the steps below and have fun!

Prerequisites

  • Node.js installed
  • A command-line interface (CLI / terminal)
  • Your favourite code editor or IDE (e.g. VS Code)
  • A Deepgram account

Getting started

First, navigate to the directory where you keep your projects and create a folder (e.g. named sttApp) using this command:

mkdir sttApp

Then open the folder in your favourite IDE (mine is VS Code). At this point the directory is empty.

Next, in the terminal, change into the new sttApp directory:

cd sttApp

Then run the following command to initialize a new application:

npm init

Press Enter several times to accept the default value for each prompt. npm then writes a package.json file and prints its contents in the terminal for confirmation.

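Next, install the Deepgram Node.js SDK using the following command (the code in this post targets the SDK version available when it was written; newer major versions may expose a different interface):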

npm install @deepgram/sdk


If all the previous steps succeeded, the project directory in your IDE should now contain a node_modules folder, package.json, and package-lock.json.

Now, in the project directory (/sttApp), create a file named index.js, paste the following code into it, and save the file:

const { Deepgram } = require('@deepgram/sdk');
const fs = require('fs');

// Your Deepgram API key secret (copied from the Deepgram Dashboard, see below)
const deepgramApiKey = 'YOUR_API_KEY';

// Replace with your file path and audio mimetype
const pathToFile = 'SOME_FILE.wav';
const mimetype = 'audio/wav';

// Initializes the Deepgram SDK
const deepgram = new Deepgram(deepgramApiKey);

console.log('Requesting transcript...');
console.log('Your file may take up to a couple minutes to process.');
console.log('While you wait, did you know that Deepgram accepts over 40 audio file formats? Even MP4s.');
console.log('To learn more about customizing your transcripts check out developers.deepgram.com.');

deepgram.transcription.preRecorded(
  { buffer: fs.readFileSync(pathToFile), mimetype },
  { punctuate: true, language: 'en-US' },
)
.then((transcription) => {
  console.dir(transcription, {depth: null});
})
.catch((err) => {
  console.log(err);
});
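The same request can also be written with async/await. The following is only a sketch: transcribeFile is a hypothetical helper name, and the deepgram client from index.js is passed in as a parameter (an assumption made here so the function can be exercised without a real API key). It calls the same transcription.preRecorded method with the same options as the code above.

```javascript
const fs = require('fs');

// Hypothetical helper: async/await version of the request in index.js.
// `deepgram` is the client created with `new Deepgram(deepgramApiKey)`.
async function transcribeFile(deepgram, pathToFile, mimetype) {
  // Fail early with a clear message if the path is wrong.
  if (!fs.existsSync(pathToFile)) {
    throw new Error(`Audio file not found: ${pathToFile}`);
  }

  // Read the audio file without blocking the event loop.
  const buffer = await fs.promises.readFile(pathToFile);

  // Same call and options as in index.js; returns the transcription promise.
  return deepgram.transcription.preRecorded(
    { buffer, mimetype },
    { punctuate: true, language: 'en-US' }
  );
}
```

In index.js this could be used as transcribeFile(deepgram, pathToFile, mimetype).then(...).catch(...), keeping the existing handlers unchanged.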


The next step is to log in to your Deepgram account, go to your Dashboard, and choose Get a Transcript via API or SDK.

Click Reveal Key and copy your API key secret.

Next, paste your API key secret into line 5 of index.js, replacing YOUR_API_KEY.
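Pasting the key into the file is fine for a quick test, but it is easy to leak it in a screenshot or a commit. As a minimal sketch, the key could instead be read from an environment variable. The helper name loadApiKey and the variable name DEEPGRAM_API_KEY are assumptions made for this example, not something the SDK requires.

```javascript
// Hypothetical helper: read the Deepgram key from an environment variable
// instead of pasting it into index.js. The variable name DEEPGRAM_API_KEY
// is an assumption; any name works as long as it matches your shell setup.
function loadApiKey(env = process.env) {
  const key = env.DEEPGRAM_API_KEY;
  if (typeof key !== 'string' || key.trim() === '') {
    throw new Error('DEEPGRAM_API_KEY is not set');
  }
  return key.trim();
}

// In index.js, line 5 could then become:
//   const deepgramApiKey = loadApiKey();
```

On macOS or Linux the variable can be set for a single run by prefixing the command, for example DEEPGRAM_API_KEY=your-key node index.js.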

Then replace the values on lines 8 and 9 with the path to your audio file and its MIME type.
(Hint: open a new terminal, navigate to the directory where your audio file is located, and run pwd to get the absolute path.)
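If you would rather not type the absolute path and MIME type by hand, both can be derived in code. This is a rough sketch: describeAudioFile and the MIMETYPES table are hypothetical names introduced here, and the table covers only a handful of common extensions, so extend it for whatever format you actually use.

```javascript
const path = require('path');

// Small lookup table for a few common audio extensions (not exhaustive).
const MIMETYPES = {
  '.wav': 'audio/wav',
  '.mp3': 'audio/mpeg',
  '.ogg': 'audio/ogg',
  '.flac': 'audio/flac',
  '.m4a': 'audio/mp4',
};

// Hypothetical helper: returns an absolute path plus a MIME type guessed
// from the file extension, matching the two values index.js expects.
function describeAudioFile(file) {
  const ext = path.extname(file).toLowerCase();
  const mimetype = MIMETYPES[ext];
  if (!mimetype) {
    throw new Error(`Unknown audio extension: ${ext || '(none)'}`);
  }
  // path.resolve turns a relative path into an absolute one,
  // based on the directory the script is run from.
  return { pathToFile: path.resolve(file), mimetype };
}
```

For example, const { pathToFile, mimetype } = describeAudioFile('SOME_FILE.wav'); could stand in for lines 8 and 9.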

Finally, run the application with the following command (make sure you are in /sttApp):

node index.js

You will receive a JSON response that includes the transcript, along with word arrays, timings, and confidence scores.
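The full response is quite verbose, so you may only want the transcript and the word timings. Here is a hedged sketch of pulling them out. The helper name summarizeTranscription is hypothetical, and the results.channels[].alternatives[] layout is an assumption about the response shape, so compare it with the JSON your own run prints. The sample object is made up for illustration.

```javascript
// Hypothetical helper: extract the first transcript and its word timings.
// The nested layout below is an assumption; verify it against your output.
function summarizeTranscription(transcription) {
  const alternative =
    transcription?.results?.channels?.[0]?.alternatives?.[0];
  if (!alternative) {
    throw new Error('No transcript found in response');
  }
  // Format each word with its start/end time and confidence score.
  const words = (alternative.words || []).map(
    (w) => `${w.word} [${w.start}s-${w.end}s, confidence ${w.confidence}]`
  );
  return {
    transcript: alternative.transcript,
    confidence: alternative.confidence,
    words,
  };
}

// Made-up sample shaped like the assumed response, for illustration only.
const sample = {
  results: {
    channels: [
      {
        alternatives: [
          {
            transcript: 'Hello world.',
            confidence: 0.98,
            words: [
              { word: 'hello', start: 0.1, end: 0.4, confidence: 0.99 },
              { word: 'world', start: 0.5, end: 0.9, confidence: 0.97 },
            ],
          },
        ],
      },
    ],
  },
};

console.log(summarizeTranscription(sample).transcript);
```

In index.js, the same helper could be called inside the .then handler in place of console.dir.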

Pretty COOL!

If anything above is still unclear, feel free to leave a message below or refer to my Git repository for the whole project: linkToGit

References

https://console.deepgram.com/project/850abca5-449a-47fa-8c40-6a463e59ad00/mission/transcript-via-api-or-sdk
https://dev.to/devteam/join-us-for-a-new-kind-of-hackathon-on-dev-brought-to-you-by-deepgram-2bjd

Overview of My Submission

A tutorial for beginners learning Node.js, using the speech-to-text (STT) API from Deepgram.

Submission Category:

Analytics Ambassadors

Link to Code on GitHub

linkToGit

Additional Resources / Info

None

Top comments (10)

Amit Wani

Nice one!

Hope you have revoked your API keys :)

Yongchang He

Thank you for the reminder! I have covered the places where the keys were shown (though they will expire within 8 hours under Deepgram's policy).

jzombie

8 hours is plenty of time to run up a bill w/ concurrent users using your keys. Don't let that time policy be your only safeguard.

Yongchang He

Thanks for letting me know this!

Raman Bansal

Best tutorial for intermediate programmers

Yongchang He

Very glad to hear that. Thank you!

Marco Moscatelli

Great tutorial, thank you!

Yongchang He

Great to hear that, thank you!

Senthil

Nice post, thank you.

Yongchang He

You are welcome!