Moustafa Abdelhamid

🚀 Getting Started with Deepgram Nova-3 for Real-Time Speech-to-Text

Introduction

Deepgram’s Nova-3 is the latest evolution in speech-to-text AI, offering real-time multilingual transcription, improved accuracy, and instant vocabulary updates. If you’re working on AI-driven transcription, you’ll want to explore this model.

At Transgate.ai, we took our first look at Nova-3, and the results were promising! (Check out our insights here: Transgate’s article).

In this guide, let’s get started with Nova-3 by setting up a simple transcription pipeline using Deepgram’s API.


Step 1: Set Up Your Deepgram API Key

First, sign up at Deepgram and grab your API key.
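Rather than pasting the key into source files, one option is to keep it in an environment variable and read it at startup. The sketch below assumes a variable named DEEPGRAM_API_KEY and a helper called getDeepgramApiKey; both names are our own choices for illustration, not something Deepgram requires.

```javascript
// Hypothetical helper: read the API key from an environment variable.
// DEEPGRAM_API_KEY is an assumed name chosen for this example.
function getDeepgramApiKey(env = process.env) {
  const key = env.DEEPGRAM_API_KEY;
  if (!key || key.trim() === '') {
    // Fail early with a clear message instead of a 401 from the API later
    throw new Error('Set the DEEPGRAM_API_KEY environment variable first');
  }
  return key.trim();
}

// Usage (after `export DEEPGRAM_API_KEY=...` in your shell):
// const deepgramApiKey = getDeepgramApiKey();
```

Failing early keeps a missing key from showing up later as a confusing authentication error on the WebSocket.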

Then, install the Deepgram SDK along with the ws package, which the script in Step 2 uses for its WebSocket connection:

npm install @deepgram/sdk ws

Step 2: Basic Real-Time Transcription

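Using Node.js, we’ll open a WebSocket connection to Deepgram’s streaming endpoint. To keep the example self-contained, the script streams a local audio file in place of a live microphone feed.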

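import WebSocket from 'ws';
import fs from 'fs';

// Replace with your Deepgram API key
const deepgramApiKey = 'YOUR_DEEPGRAM_API_KEY';
const audioFile = 'sample.wav'; // Path to your audio file

// This script talks to the WebSocket endpoint directly,
// so it does not need to create an SDK client.
const ws = new WebSocket('wss://api.deepgram.com/v1/listen', {
  headers: { Authorization: `Token ${deepgramApiKey}` },
});

ws.on('open', () => {
  console.log('Connected to Deepgram WebSocket');
  const stream = fs.createReadStream(audioFile);
  // Forward each chunk of the file as a binary message
  stream.on('data', (chunk) => ws.send(chunk));
  // Signal end of audio so Deepgram can return the remaining results
  // and close the connection itself; closing right away can drop them.
  stream.on('end', () => ws.send(JSON.stringify({ type: 'CloseStream' })));
  stream.on('error', (err) => {
    console.error('Could not read audio file:', err.message);
    ws.close();
  });
});

ws.on('message', (message) => {
  const data = JSON.parse(message.toString());
  // Not every message carries a transcript (metadata messages have no channel)
  const transcript = data.channel?.alternatives?.[0]?.transcript;
  if (transcript) console.log('Transcript:', transcript);
});

ws.on('error', (err) => console.error('WebSocket error:', err.message));
ws.on('close', () => console.log('Connection closed'));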

What This Script Does:

✅ Connects to Deepgram’s real-time transcription API

✅ Streams an audio file for processing

✅ Logs transcriptions in real time
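Not every message on the socket is a transcript, so it can help to move the parsing into a small function. The sketch below is a minimal example, assuming transcription messages are tagged with type "Results" and carry an is_final flag next to the channel field used above. The helper name extractTranscript is our own.

```javascript
// Sketch: pull a transcript out of one raw WebSocket message.
// Returns null for anything that is not a non-empty transcription result.
function extractTranscript(rawMessage) {
  let data;
  try {
    data = JSON.parse(rawMessage.toString());
  } catch {
    return null; // not JSON, ignore it
  }
  // Assumption: transcription messages are tagged with type "Results"
  if (data.type !== 'Results') return null;
  const text = data.channel?.alternatives?.[0]?.transcript ?? '';
  if (text === '') return null; // silence or an empty interim result
  // Assumption: is_final marks a result that will not be revised again
  return { text, isFinal: data.is_final === true };
}
```

Inside the message handler this could be used as: const result = extractTranscript(message); if (result) console.log(result.text);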


Step 3: Customizing the Transcription

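Nova-3 supports custom vocabulary through keyterm prompting, and you choose the model and language with query parameters on the connection URL. Note that Nova-3 takes one keyterm parameter per term rather than the comma-separated keywords parameter used by older models, and that the Authorization header is still required:

const ws = new WebSocket('wss://api.deepgram.com/v1/listen?model=nova-3&language=en&keyterm=AI&keyterm=transcription', {
  headers: { Authorization: `Token ${deepgramApiKey}` },
});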

This boosts accuracy for domain-specific terms like AI, medical jargon, or industry-specific words.
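As the list of options grows, hand-written query strings get error-prone. Here is a minimal sketch that builds the URL with URLSearchParams instead. The helper name buildListenUrl and its option names are hypothetical, and the default of one keyterm parameter per term reflects our understanding of what Nova-3 expects, so check Deepgram’s documentation for the model you use.

```javascript
// Sketch: build the streaming URL from options instead of concatenating strings.
// buildListenUrl, terms, and termParam are names made up for this example.
function buildListenUrl({ model = 'nova-3', language = 'en', terms = [], termParam = 'keyterm' } = {}) {
  const params = new URLSearchParams({ model, language });
  // One query parameter per term; URLSearchParams handles the encoding
  for (const term of terms) params.append(termParam, term);
  return `wss://api.deepgram.com/v1/listen?${params.toString()}`;
}

const url = buildListenUrl({ terms: ['AI', 'transcription'] });
console.log(url);
```

The resulting string can be passed to new WebSocket(url, { headers }) exactly like the hard-coded URL.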


Final Thoughts

Deepgram’s Nova-3 is fast, multilingual, and highly customizable. It’s a powerful tool for anyone building real-time voice applications.

🚀 Next Steps:

  • Try it with your own audio files 🎙️
  • Experiment with different languages 🌍
  • Fine-tune with custom vocabulary 🔧

Check out Transgate.ai’s first impressions: Read here.

What do you think about Nova-3? Let’s discuss in the comments! 👇
