Introduction
NVIDIA's Parakeet TDT 0.6B v3 is a state-of-the-art automatic speech recognition (ASR) model that delivers exceptional accuracy for English transcription. With 600 million parameters, this model combines the FastConformer architecture with the Token-and-Duration Transducer (TDT) decoder to provide:
- Automatic punctuation and capitalization
- Word-level timestamp predictions
- Processing of audio segments up to 24 minutes in a single pass
- Impressive speed: an RTFx of 3380 on the Hugging Face Open ASR Leaderboard
This guide walks you through setting up Parakeet TDT 0.6B v3 on an AWS EC2 Ubuntu instance, similar to how you would deploy Whisper, but optimized for NVIDIA's cutting-edge ASR technology.
Choosing the Right EC2 Instance
For running Parakeet TDT 0.6B v3, you need an EC2 instance with NVIDIA GPU support. Here are your options:
Recommended Instance Types
g6.2xlarge (Recommended)
- GPU: 1x NVIDIA L4 with 24 GB memory
- vCPUs: 8
- RAM: 32 GiB
- Performance: up to 2x higher deep learning inference performance than g4dn instances
- Cost: ~$0.98/hour (us-east-1)
- Best for: Production workloads with modern GPU architecture
g4dn.xlarge (Budget Option)
- GPU: 1x NVIDIA T4 with 16 GB memory
- vCPUs: 4
- RAM: 16 GiB
- Cost: Lower cost entry point
- Best for: Development and testing
Hardware Requirements:
- Minimum 2GB RAM for model loading
- Supports NVIDIA Volta and newer GPU architectures (Ampere, Hopper, and Blackwell, as well as the Turing T4 and Ada Lovelace L4 found in the instance types above)
- At least 30-40 GB disk space
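Before launching, you may want to confirm that g6.2xlarge is actually offered in your target region. The following is a minimal boto3 sketch for that check; it assumes boto3 is installed and AWS credentials are configured on your local machine, and the region name is a placeholder you should change.
import boto3
# Placeholder region -- change to the region you plan to launch in
ec2 = boto3.client("ec2", region_name="us-east-1")
# Ask EC2 which Availability Zones in this region offer g6.2xlarge
resp = ec2.describe_instance_type_offerings(
    LocationType="availability-zone",
    Filters=[{"Name": "instance-type", "Values": ["g6.2xlarge"]}],
)
zones = sorted(o["Location"] for o in resp["InstanceTypeOfferings"])
print("g6.2xlarge is offered in:", ", ".join(zones) if zones else "no zones in this region")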
Step 1: Launch EC2 Instance on AWS
1.1 Create the EC2 Instance
The g6.2xlarge instance is one of the most cost-effective options for running speech recognition models on AWS. Follow these steps to launch your instance:
Step-by-step launch process:
- Open the AWS Console and navigate to the EC2 Dashboard
- Click the "Launch instance" button in the top section
- Enter an instance name (e.g., "parakeet-asr-instance")
- In the Application and OS Images (Amazon Machine Image) section:
- Search for "Ubuntu"
- Select Ubuntu Server 22.04 LTS (or Ubuntu 24.04 LTS for newer releases)
- Verify the AMI is marked as "Free tier eligible" if applicable
- In the Instance type section:
- Search for or select g6.2xlarge
- This instance provides 1x NVIDIA L4 GPU with 24GB memory, 8 vCPUs, and 32 GiB RAM
- In the Key pair (login) section:
- Select an existing key pair or create a new one
- Important: if you create a new key pair, download and securely save the .pem file; it is required for SSH access
- In the Network settings section:
- Leave default VPC settings
- Allow SSH traffic from your IP address (or 0.0.0.0/0 for development, but restrict in production)
- In the Storage (Root volume) section:
- Expand the storage configuration
- Increase the EBS volume size to 100 GB (the default 8 GB root volume is too small for the NVIDIA driver, CUDA toolkit, conda environment, and model files)
- Keep the volume type as gp3 (General Purpose SSD)
- Leave all other settings at their default values
- Review your configuration and click "Launch instance"
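If you prefer to script the launch rather than click through the console, the same configuration can be expressed with boto3. This is only a sketch: the AMI ID, key pair name, and security group ID are placeholders you must replace with your own values.
import boto3
ec2 = boto3.client("ec2", region_name="us-east-1")
# Placeholders: substitute your own AMI ID, key pair, and security group
response = ec2.run_instances(
    ImageId="ami-xxxxxxxxxxxxxxxxx",          # an Ubuntu Server 22.04 LTS AMI for your region
    InstanceType="g6.2xlarge",
    KeyName="my-parakeet-key",
    SecurityGroupIds=["sg-xxxxxxxxxxxxxxxxx"],
    MinCount=1,
    MaxCount=1,
    BlockDeviceMappings=[{
        "DeviceName": "/dev/sda1",             # root device name for Ubuntu AMIs
        "Ebs": {"VolumeSize": 100, "VolumeType": "gp3"},
    }],
    TagSpecifications=[{
        "ResourceType": "instance",
        "Tags": [{"Key": "Name", "Value": "parakeet-asr-instance"}],
    }],
)
print("Launched:", response["Instances"][0]["InstanceId"])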
1.2 Monitor Instance Launch
After clicking "Launch instance":
- You'll see a confirmation page with your Instance ID
- Click on your instance ID to view the instance details page
- Wait 1-2 minutes for the instance to reach the "Running" state
- Once running, note the Public IPv4 address displayed on the instance page
- The instance is automatically assigned a public IPv4 address (not an Elastic IP, so it changes if you stop and start the instance); this is your connection address
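The wait-and-check step can also be automated with boto3. A small sketch follows; the instance ID is a placeholder for the one returned at launch.
import boto3
ec2 = boto3.client("ec2", region_name="us-east-1")
instance_id = "i-xxxxxxxxxxxxxxxxx"  # placeholder -- the instance ID from the launch step
# Block until EC2 reports the instance as running
ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
# Look up the public IPv4 address to use for SSH
desc = ec2.describe_instances(InstanceIds=[instance_id])
instance = desc["Reservations"][0]["Instances"][0]
print("Public IP:", instance.get("PublicIpAddress", "not assigned yet"))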
1.3 Connect to Your Instance via SSH
Once the instance is running, connect using SSH:
# Use the following command (replace the path and DNS/IP accordingly)
ssh -i /path/to/your-key.pem ubuntu@your-instance-public-dns
Example:
ssh -i ~/Downloads/my-parakeet-key.pem ubuntu@ec2-54-123-45-67.compute-1.amazonaws.com
Or using the public IPv4 address:
ssh -i ~/Downloads/my-parakeet-key.pem ubuntu@54.123.45.67
Expected output on first connection:
The authenticity of host '...' can't be established.
ECDSA key fingerprint is ...
Are you sure you want to continue connecting (yes/no/[fingerprint])?
Type yes and press Enter to add the host to your known hosts.
Success indicator:
ubuntu@ip-xxx-xxx-xxx-xxx:~$
If you see this prompt, your SSH connection is successful! Your EC2 instance is ready for software installation.
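If you later want to script this connection (for example from a provisioning tool), a library such as paramiko can open the same SSH session from Python. A minimal sketch, assuming paramiko is installed and using placeholder host and key paths:
import paramiko
host = "54.123.45.67"                # placeholder -- your instance's public IP
key_path = "/path/to/your-key.pem"   # placeholder -- your .pem file
# Open an SSH session as the default 'ubuntu' user
client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(hostname=host, username="ubuntu", key_filename=key_path)
# Run a simple command to confirm the connection works
stdin, stdout, stderr = client.exec_command("uname -a")
print(stdout.read().decode())
client.close()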
Step 2: Assign IAM Role to EC2 Instance
To allow your EC2 instance to access AWS S3 buckets (for storing audio files and transcription results), you need to assign an IAM role. This is more secure than using hardcoded AWS credentials.
2.1 Create an IAM Role
Create the role:
- Open the AWS IAM Console (https://console.aws.amazon.com/iam/)
- In the left sidebar, click "Roles"
- Click the "Create role" button
- In the "Trusted entity type" section:
- Select "AWS Service"
- In the "Service or use case" section:
- Search for and select "EC2" from the list
- This allows EC2 instances to use this role
- Click "Next" to proceed to permissions
- In the "Permissions policies" section:
- Search for "S3"
- Select "AmazonS3FullAccess"
- Note: For production environments, create a custom policy that restricts access to specific S3 buckets and operations instead of granting full S3 access
- Click "Next" to review
- In the "Role name" field, enter a descriptive name:
- Example: asr-ec2-role or parakeet-s3-access-role
- Optionally add a description: "Role for EC2 instance to access S3 for ASR audio files"
- Click "Create role"
2.2 Attach the Role to Your EC2 Instance
Now attach this role to your running EC2 instance:
- Go back to the EC2 Dashboard
- Click "Instances" in the left sidebar
- Find and click on your instance (the g6.2xlarge instance you just created)
- You'll see the instance details page
- Click the "Actions" button (top-right corner)
- Hover over "Security" in the dropdown menu
- Click "Modify IAM role"
- In the dropdown menu that appears:
- Select the role you just created (e.g., asr-ec2-role)
- Click "Update IAM role"
Verification:
- Refresh the instance details page
- Scroll to the "Details" tab
- Look for the "IAM instance profile" field
- You should see your role name displayed there
Your EC2 instance can now access S3 without requiring explicit AWS credentials!
2.3 Verify IAM Role Access (Optional)
To confirm the role is working correctly, you can test S3 access from your instance:
# Connect to your instance via SSH (if not already connected)
ssh -i your-key.pem ubuntu@your-instance-public-dns
# Test listing S3 buckets (install the AWS CLI first if it is missing: sudo apt install awscli -y)
aws s3 ls
# Example output (your actual buckets will be listed):
# 2025-11-02 12:34:56 my-asr-audio-files
# 2025-11-02 12:35:12 my-transcription-results
If you see your S3 buckets listed, the IAM role is properly configured!
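Since the instance can now reach S3, you can also move audio files and transcription results from Python with boto3 instead of the CLI. A minimal sketch with placeholder bucket and object names:
import boto3
s3 = boto3.client("s3")
bucket = "my-asr-audio-files"  # placeholder bucket name
# Download an audio file to transcribe
s3.download_file(bucket, "input/meeting.wav", "meeting.wav")
# ... run transcription here ...
# Upload the transcription result to another placeholder bucket
s3.upload_file("meeting.txt", "my-transcription-results", "output/meeting.txt")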
Step 3: System Update and Basic Dependencies
Once connected to your EC2 instance via SSH, start with system updates:
# Update package lists
sudo apt update
# Install Python pip and basic tools
sudo apt install python3-pip -y
Step 4: Install NVIDIA Drivers and CUDA Toolkit
4.1 Install NVIDIA Driver
For Ubuntu, install the appropriate NVIDIA driver version:
# Install NVIDIA driver (version 525 or later)
sudo apt install nvidia-driver-525 -y
# Reboot the system to load the driver
sudo reboot
After reboot, reconnect to your instance and verify the driver installation:
nvidia-smi
You should see output showing your GPU (NVIDIA L4 for g6 or T4 for g4dn instances).
4.2 Install CUDA Toolkit
# Install NVIDIA CUDA toolkit
sudo apt install nvidia-cuda-toolkit -y
4.3 Install FFmpeg
FFmpeg is required for audio processing:
sudo apt install ffmpeg -y
Verify installation:
ffmpeg -version
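Parakeet expects 16 kHz mono audio; if your recordings are in other formats or sample rates, converting them with FFmpeg up front is a safe preprocessing step. A small Python sketch (the input file name is a placeholder):
import subprocess
# Convert an arbitrary input file to 16 kHz mono WAV for ASR
subprocess.run(
    ["ffmpeg", "-y", "-i", "meeting.mp3", "-ar", "16000", "-ac", "1", "meeting.wav"],
    check=True,
)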
Step 5: Set Up Python Environment with Miniconda
Using Conda helps manage dependencies and avoid conflicts.
5.1 Download and Install Miniconda
# Download Miniconda installer
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
# Run the installer
bash Miniconda3-latest-Linux-x86_64.sh
Follow the prompts:
- Press ENTER to review the license
- Type yes to accept the license
- Press ENTER to confirm the installation location
- Type yes when asked to initialize Miniconda3
5.2 Initialize Conda
# Initialize conda for bash
/home/ubuntu/miniconda3/bin/conda init
# Reload your shell configuration
source ~/.bashrc
5.3 Configure Conda Channels
# Use strict channel priority and add the conda-forge channel
conda config --set channel_priority strict
conda config --add channels conda-forge
Step 6: Create and Configure Conda Environment
6.1 Create New Environment
# Create conda environment with Python 3.11
conda create -n parakeet_env python=3.11 -y
# Activate the environment
conda activate parakeet_env
6.2 Install GCC Libraries (Important)
Some dependencies require updated GCC libraries:
# Install GCC libraries via conda
conda install -c conda-forge libgcc-ng libstdcxx-ng -y
Step 7: Install PyTorch and NeMo Toolkit
7.1 Install PyTorch
# Install PyTorch with CUDA support
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
7.2 Install Core Dependencies
# Install numpy and other core dependencies
pip install numpy packaging Cython
7.3 Install NeMo Toolkit with ASR Support
# Install NeMo toolkit with ASR (Automatic Speech Recognition) support
pip install "nemo_toolkit[asr]"
This installation includes:
- NeMo core framework
- ASR-specific modules
- FastConformer and TDT decoder components
- All necessary dependencies for Parakeet models
Step 8: Verify Installation
8.1 Check GPU Access
Create a test script to verify PyTorch can access the GPU:
python3 << EOF
import torch
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"CUDA version: {torch.version.cuda}")
if torch.cuda.is_available():
    print(f"GPU device: {torch.cuda.get_device_name(0)}")
    print(f"GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.2f} GB")
EOF
Expected output:
PyTorch version: 2.x.x
CUDA available: True
CUDA version: 11.8
GPU device: NVIDIA L4 (or NVIDIA T4)
GPU memory: 22.35 GB (or 16.00 GB)
8.2 Verify NeMo Installation
python3 << EOF
import nemo
import nemo.collections.asr as nemo_asr
print(f"NeMo version: {nemo.__version__}")
print("NeMo ASR module loaded successfully!")
EOF
Step 9: Load and Test Parakeet TDT 0.6B v3
9.1 Create Inference Script
Create a file called parakeet_inference.py:
import nemo.collections.asr as nemo_asr
import torch
# Check GPU availability
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"Using GPU: {torch.cuda.get_device_name(0)}")
# Load the Parakeet TDT 0.6B v3 model
print("Loading Parakeet TDT 0.6B v3 model...")
asr_model = nemo_asr.models.ASRModel.from_pretrained("nvidia/parakeet-tdt-0.6b-v3")
# Test with a sample audio file
# Download a sample audio file
import os
if not os.path.exists("sample_audio.wav"):
    os.system("wget https://dldata-public.s3.us-east-2.amazonaws.com/2086-149220-0033.wav -O sample_audio.wav")
# Transcribe the audio
print("\nTranscribing audio...")
transcription = asr_model.transcribe(["sample_audio.wav"])
# Print results
print("\nTranscription result:")
print(transcription[0])
9.2 Run the Inference Script
python3 parakeet_inference.py
The first run will:
- Download the model from Hugging Face (~2.5 GB)
- Load it into GPU memory
- Download a sample audio file
- Transcribe the audio
Expected output:
CUDA available: True
Using GPU: NVIDIA L4
Loading Parakeet TDT 0.6B v3 model...
[NeMo I ...] Instantiating model from pre-trained checkpoint
Transcribing audio...
Transcription result:
he tells us that at this festive season of the year with...
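The introduction mentioned word-level timestamps; recent NeMo releases can return them directly from transcribe. The following sketch assumes the model is already loaded as asr_model, and the exact result structure may vary slightly between NeMo versions.
# Request timestamps along with the text
output = asr_model.transcribe(["sample_audio.wav"], timestamps=True)
# The returned hypothesis carries the text plus word/segment timing info
print(output[0].text)
for stamp in output[0].timestamp["word"]:
    print(f"{stamp['start']:.2f}s - {stamp['end']:.2f}s : {stamp['word']}")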
Troubleshooting
Issue 1: SSH Connection Refused
Possible causes and solutions:
- Instance is still starting up (wait 1-2 minutes)
- Security group doesn't allow SSH on port 22
- Incorrect key permissions:
chmod 400 /path/to/your-key.pem
- Wrong username (should be ubuntu for Ubuntu AMIs)
Issue 2: IAM Role Not Working
Solution: Verify role attachment:
# From within the EC2 instance
aws sts get-caller-identity
# Should show the role ARN
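The same check can be done from Python if the AWS CLI is not installed on the instance (a minimal sketch, assuming boto3 is installed via pip install boto3):
import boto3
# On the instance, boto3 picks up credentials from the attached IAM role automatically
identity = boto3.client("sts").get_caller_identity()
print(identity["Arn"])  # should show an assumed-role ARN containing your role name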
Next Step
- We will explore different deployment options once we have confirmed which open-source model we are going to use.
Conclusion
You now have a fully functional Parakeet TDT 0.6B v3 ASR system running on AWS EC2 with S3 integration.
Happy transcribing! 🎙️