DEV Community

RenderIO

Posted on • Originally published at renderio.dev

Extract Audio from Video in n8n

Pull audio from video without touching a terminal

You have video interviews to transcribe. Or podcast episodes recorded as video. Or a music library trapped in MP4 files. You need the audio track extracted.

FFmpeg does this in one command. n8n can trigger that command automatically whenever a new video appears. No manual steps. No terminal. No server.

The problem: n8n can't extract audio natively

n8n doesn't have an audio extraction node. n8n Cloud doesn't provide the Execute Command node, so shell commands are out. Even self-hosted, running FFmpeg inside n8n ties up the worker and risks crashes on large files.

The solution: send the extraction command to RenderIO's API via n8n's HTTP Request node. RenderIO runs FFmpeg in an isolated container. Your n8n instance stays responsive.

Use the RenderIO n8n node

RenderIO has a partner-verified community node on the n8n marketplace. Install from Settings → Community Nodes → search "renderio". It provides a visual interface for FFmpeg commands, including audio extraction.

The node handles authentication and request formatting automatically. The extraction examples below use HTTP Request nodes for full flexibility, but the same FFmpeg commands work with the community node.

Basic extraction: MP4 to MP3

The simplest workflow: video URL in, MP3 URL out.

HTTP Request node configuration:

  • Method: POST
  • URL: https://renderio.dev/api/v1/run-ffmpeg-command
  • Authentication: Header Auth (X-API-KEY)
  • Body:
{
  "ffmpeg_command": "-i {{in_video}} -vn -acodec libmp3lame -q:a 2 {{out_audio}}",
  "input_files": {
    "in_video": "{{ $json.videoUrl }}"
  },
  "output_files": {
    "out_audio": "extracted.mp3"
  }
}

-vn drops the video stream. -q:a 2 selects the LAME variable-bitrate quality level (0 = best, 9 = worst); level 2 is high quality and averages roughly 190 kbps.
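
If you build the request body in a Code node instead of typing it into the HTTP Request node, a minimal sketch could look like this. buildExtractionBody is a hypothetical helper, not part of any RenderIO SDK; the field names simply mirror the body shown above.

```javascript
// Hypothetical helper: builds the same JSON body as the HTTP Request example.
// The quality argument maps to libmp3lame's -q:a scale (0 = best, 9 = worst).
function buildExtractionBody(videoUrl, outputName, quality = 2) {
  if (!Number.isInteger(quality) || quality < 0 || quality > 9) {
    throw new RangeError("-q:a expects an integer from 0 (best) to 9 (worst)");
  }
  return {
    // {{in_video}} and {{out_audio}} stay as literal placeholders for the API to resolve.
    ffmpeg_command: `-i {{in_video}} -vn -acodec libmp3lame -q:a ${quality} {{out_audio}}`,
    input_files: { in_video: videoUrl },
    output_files: { out_audio: outputName },
  };
}

const body = buildExtractionBody("https://example.com/interview1.mp4", "extracted.mp3");
console.log(JSON.stringify(body, null, 2));
```

Validating the quality value up front means a typo fails in n8n rather than as an FFmpeg error after the job has been submitted.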

Poll for completion, then use the output URL.
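
The polling step can be sketched as a generic loop. This article doesn't describe the status endpoint or its response shape, so the status check is injected as a function, and the "SUCCESS" and "FAILED" values are assumptions to adapt to the real response.

```javascript
// Generic polling loop. checkStatus is supplied by the caller because the
// status endpoint and response fields are assumptions, not documented here.
async function pollUntilDone(checkStatus, { intervalMs = 5000, maxAttempts = 60 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await checkStatus();
    if (result.status === "SUCCESS") return result; // assumed status value
    if (result.status === "FAILED") {               // assumed status value
      throw new Error(`Command failed: ${result.error ?? "unknown error"}`);
    }
    // Not finished yet: wait before asking again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Gave up after ${maxAttempts} attempts`);
}

// Usage with a fake status function that succeeds on the third call.
let calls = 0;
const fakeStatus = async () =>
  ++calls < 3
    ? { status: "PROCESSING" }
    : { status: "SUCCESS", url: "https://example.com/extracted.mp3" };

const polled = pollUntilDone(fakeStatus, { intervalMs: 10 });
polled.then((result) => console.log(result.url));
```

In n8n the same logic is usually built from a Wait node, an HTTP Request node, and an IF node that loops back. The attempt cap matters in either form, so a stuck job can't keep the workflow running forever.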

Extraction formats

MP3 (most compatible)

{
  "ffmpeg_command": "-i {{in_video}} -vn -acodec libmp3lame -q:a 2 {{out_audio}}",
  "input_files": { "in_video": "{{ $json.videoUrl }}" },
  "output_files": { "out_audio": "audio.mp3" }
}

Best for: sharing, podcast distribution, general use.

WAV (lossless)

{
  "ffmpeg_command": "-i {{in_video}} -vn -acodec pcm_s16le -ar 44100 {{out_audio}}",
  "input_files": { "in_video": "{{ $json.videoUrl }}" },
  "output_files": { "out_audio": "audio.wav" }
}

Best for: transcription services (they often prefer WAV), audio editing, archival.

AAC (Apple/streaming)

{
  "ffmpeg_command": "-i {{in_video}} -vn -acodec aac -b:a 192k {{out_audio}}",
  "input_files": { "in_video": "{{ $json.videoUrl }}" },
  "output_files": { "out_audio": "audio.m4a" }
}

Best for: Apple devices, streaming platforms, smaller files than MP3 at same quality.

FLAC (lossless compressed)

{
  "ffmpeg_command": "-i {{in_video}} -vn -acodec flac {{out_audio}}",
  "input_files": { "in_video": "{{ $json.videoUrl }}" },
  "output_files": { "out_audio": "audio.flac" }
}

Best for: archival when you want lossless but smaller than WAV (typically 50-60% of WAV size).

OGG/Opus (web)

{
  "ffmpeg_command": "-i {{in_video}} -vn -acodec libopus -b:a 128k {{out_audio}}",
  "input_files": { "in_video": "{{ $json.videoUrl }}" },
  "output_files": { "out_audio": "audio.ogg" }
}

Best for: web applications, voice recordings, VoIP.
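
If one workflow needs to serve several of these formats, the five commands above can be collapsed into a lookup table. This is a sketch for an n8n Code node; PRESETS and extractionBodyFor are illustrative names, and the codec flags are copied from the examples above.

```javascript
// Codec flags and file extensions taken from the five examples above.
const PRESETS = {
  mp3:  { args: "-acodec libmp3lame -q:a 2",   ext: "mp3" },
  wav:  { args: "-acodec pcm_s16le -ar 44100", ext: "wav" },
  aac:  { args: "-acodec aac -b:a 192k",       ext: "m4a" },
  flac: { args: "-acodec flac",                ext: "flac" },
  opus: { args: "-acodec libopus -b:a 128k",   ext: "ogg" },
};

// Hypothetical helper: picks a preset and returns the request body for it.
function extractionBodyFor(format, videoUrl, baseName = "audio") {
  if (!Object.hasOwn(PRESETS, format)) {
    throw new Error(`Unknown format "${format}". Use one of: ${Object.keys(PRESETS).join(", ")}`);
  }
  const { args, ext } = PRESETS[format];
  return {
    ffmpeg_command: `-i {{in_video}} -vn ${args} {{out_audio}}`,
    input_files: { in_video: videoUrl },
    output_files: { out_audio: `${baseName}.${ext}` },
  };
}

const flacBody = extractionBodyFor("flac", "https://example.com/interview1.mp4");
console.log(flacBody.ffmpeg_command);
console.log(flacBody.output_files.out_audio);
```

Keeping the extension next to the codec flags avoids mismatches such as AAC audio written to a file named .mp3.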

Complete workflow: Extract and transcribe

Combine audio extraction with a transcription service:

Google Drive Trigger (new video)
  → HTTP Request: Extract audio (RenderIO)
  → Wait + Poll
  → HTTP Request: Download audio
  → HTTP Request: Send to Whisper API / AssemblyAI / Deepgram
  → Google Sheets: Write transcript
  → Slack: Notify team

Node 1: Google Drive Trigger
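Watches a "Videos" folder for new uploads. RenderIO fetches the video by URL, so the file needs a link that is reachable from outside your Google account (for example, link sharing or a signed download URL).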

Node 2: Extract audio (HTTP Request)

{
  "ffmpeg_command": "-i {{in_video}} -vn -acodec pcm_s16le -ar 16000 -ac 1 {{out_audio}}",
  "input_files": { "in_video": "{{ $json.downloadUrl }}" },
  "output_files": { "out_audio": "for_transcription.wav" }
}

Note: -ar 16000 -ac 1 converts the audio to 16 kHz mono. Speech models such as Whisper work on 16 kHz mono audio internally, so this gives you smaller files and faster uploads, usually with no loss in transcription accuracy.
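
To see how much smaller, here is the arithmetic for uncompressed 16-bit PCM. The sketch assumes a stereo source for the 44.1 kHz case and ignores the small WAV header.

```javascript
// Uncompressed PCM size = sample rate * channels * bytes per sample * seconds.
function pcmBytes(sampleRate, channels, bitsPerSample, seconds) {
  return sampleRate * channels * (bitsPerSample / 8) * seconds;
}

const ONE_HOUR = 3600; // seconds

const stereo44k = pcmBytes(44100, 2, 16, ONE_HOUR); // 635,040,000 bytes
const mono16k = pcmBytes(16000, 1, 16, ONE_HOUR);   // 115,200,000 bytes

console.log(`44.1 kHz stereo: ${(stereo44k / 1e6).toFixed(1)} MB per hour`);
console.log(`16 kHz mono:     ${(mono16k / 1e6).toFixed(1)} MB per hour`);
console.log(`Reduction:       ${(stereo44k / mono16k).toFixed(2)}x`);
```

One hour of audio drops from about 635 MB to about 115 MB, roughly 5.5 times smaller. If your transcription provider caps upload size, check the result against that limit before sending long recordings as WAV.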

Nodes 3-5: Poll and get result

Use the same polling loop as in the basic extraction, then download the finished audio file as binary data.

Node 6: Send to transcription

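The transcription endpoint expects a multipart/form-data upload of the audio file itself, not a URL, so pass the binary data from the download step:

{
  "method": "POST",
  "url": "https://api.openai.com/v1/audio/transcriptions",
  "headers": { "Authorization": "Bearer {{ $credentials.openAiApi.apiKey }}" },
  "contentType": "multipart/form-data",
  "body": {
    "model": "whisper-1",
    "file": "<binary audio from the Download audio step>"
  }
}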
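
OpenAI's transcription endpoint takes the audio as a multipart/form-data file upload, which is why the workflow downloads the audio before this step. Outside n8n, the upload could be sketched like this. It assumes Node.js 18 or later for the global FormData and Blob; buildTranscriptionForm is an illustrative helper, and the placeholder bytes stand in for the downloaded WAV.

```javascript
// Builds the multipart form for the transcription request.
// Requires Node.js 18+ (global FormData and Blob).
function buildTranscriptionForm(audioBytes, filename) {
  const form = new FormData();
  form.append("model", "whisper-1");
  // The third argument sets the filename sent with the file part.
  form.append("file", new Blob([audioBytes], { type: "audio/wav" }), filename);
  return form;
}

// Placeholder bytes standing in for the audio downloaded in the previous node.
const placeholderAudio = new Uint8Array(44);
const form = buildTranscriptionForm(placeholderAudio, "for_transcription.wav");
console.log(form.get("model"), form.get("file").name, form.get("file").size);

// Sending it (left commented out so the sketch runs without an API key):
// const response = await fetch("https://api.openai.com/v1/audio/transcriptions", {
//   method: "POST",
//   headers: { Authorization: `Bearer ${process.env.OPENAI_API_KEY}` },
//   body: form,
// });
```

In the n8n HTTP Request node, the equivalent is a Form-Data body with the file field pointing at the binary property from the download node.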

Batch extraction from a video library

Process an entire folder of videos:

Step 1: Get video list

Use a Code node or fetch from a spreadsheet:

const videos = [
  { url: "https://example.com/interview1.mp4", name: "interview1" },
  { url: "https://example.com/interview2.mp4", name: "interview2" },
  { url: "https://example.com/interview3.mp4", name: "interview3" },
];

return videos.map(v => ({ json: v }));

Step 2: Split in Batches (size: 5)

Step 3: Submit extraction for each

{
  "ffmpeg_command": "-i {{in_video}} -vn -acodec libmp3lame -q:a 2 {{out_audio}}",
  "input_files": { "in_video": "{{ $json.url }}" },
  "output_files": { "out_audio": "{{ $json.name }}.mp3" }
}

Step 4: Poll and collect URLs

Step 5: Write results to spreadsheet

Video        Audio URL                                    Status
interview1   https://media.renderio.dev/interview1.mp3   extracted
interview2   https://media.renderio.dev/interview2.mp3   extracted
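
For reference, here is roughly what Steps 2 and 3 do, written as plain JavaScript. The chunk and toExtractionBody helpers are illustrative; in n8n the Split in Batches node does the chunking for you, and the twelve example URLs are made up.

```javascript
// Splits a list into batches of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("Batch size must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Builds the Step 3 request body for one video.
function toExtractionBody(video) {
  return {
    ffmpeg_command: "-i {{in_video}} -vn -acodec libmp3lame -q:a 2 {{out_audio}}",
    input_files: { in_video: video.url },
    output_files: { out_audio: `${video.name}.mp3` },
  };
}

// Twelve made-up videos, processed five at a time.
const videos = Array.from({ length: 12 }, (_, i) => ({
  url: `https://example.com/interview${i + 1}.mp4`,
  name: `interview${i + 1}`,
}));

const batches = chunk(videos, 5);
console.log(batches.map((batch) => batch.length)); // [ 5, 5, 2 ]
console.log(toExtractionBody(batches[2][1]).output_files.out_audio); // interview12.mp3
```

Batching keeps the number of in-flight jobs bounded, which helps if your plan or the API limits concurrent commands.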

Audio processing after extraction

Once you have the audio, you can process it further:

Normalize volume:

-i {{in_audio}} -af loudnorm=I=-16:TP=-1.5:LRA=11 {{out_audio}}

Trim silence from start/end:

-i {{in_audio}} -af silenceremove=start_periods=1:start_silence=0.5:start_threshold=-50dB,areverse,silenceremove=start_periods=1:start_silence=0.5:start_threshold=-50dB,areverse {{out_audio}}

Convert sample rate:

-i {{in_audio}} -ar 44100 {{out_audio}}

Chain these into your workflow as additional processing steps after extraction.
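
As an alternative to separate steps, several of these operations can go into one command, because FFmpeg accepts a comma-separated filter chain in a single -af option. A sketch of a command builder follows; audioCommand is a hypothetical helper, and the filter string is the one shown above.

```javascript
// Hypothetical helper: assembles one FFmpeg argument string from optional parts.
function audioCommand({ filters = [], sampleRate } = {}) {
  const parts = ["-i {{in_audio}}"];
  // Filters in one -af option run in order, left to right.
  if (filters.length > 0) parts.push(`-af ${filters.join(",")}`);
  if (sampleRate) parts.push(`-ar ${sampleRate}`);
  parts.push("{{out_audio}}");
  return parts.join(" ");
}

const LOUDNORM = "loudnorm=I=-16:TP=-1.5:LRA=11";

const combined = audioCommand({ filters: [LOUDNORM], sampleRate: 44100 });
console.log(combined);
// -i {{in_audio}} -af loudnorm=I=-16:TP=-1.5:LRA=11 -ar 44100 {{out_audio}}
```

One combined command means one API call and one encode instead of two, which matters for lossy outputs like MP3 where every re-encode costs a little quality.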

Error handling

Common extraction failures:

No audio track: Some screen recordings or animations have no audio. With -vn there is then nothing left to write, so FFmpeg fails. Handle this with an IF node that checks the error message for "does not contain any stream".

Corrupted audio: Add -err_detect ignore_err before -i to attempt extraction despite minor corruption.

Very long videos: Extraction is usually fast (typically 10-30 seconds) because FFmpeg only transcodes the audio stream and never decodes the video. Processing time still grows with duration, so give multi-hour files a longer polling timeout.
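
The IF-node check can also live in a Code node that sorts failures into buckets. In this sketch, where the error text sits in the API response is an assumption, so the function takes the message as a plain string. "Invalid data found" is matched because FFmpeg commonly reports unreadable input as "Invalid data found when processing input".

```javascript
// Maps an FFmpeg error message to a coarse failure category.
function classifyFailure(errorMessage) {
  const text = String(errorMessage ?? "").toLowerCase();
  if (text.includes("does not contain any stream")) return "no_audio_track";
  if (text.includes("invalid data found")) return "possibly_corrupted";
  return "unknown";
}

console.log(classifyFailure("Output file #0 does not contain any stream"));
console.log(classifyFailure("Invalid data found when processing input"));
console.log(classifyFailure(undefined));
```

A "possibly_corrupted" result is a reasonable trigger for one retry with -err_detect ignore_err, while "no_audio_track" can skip straight to logging the file as silent.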

Get started

The Starter plan at $9/mo includes 500 commands -- enough to set up and test your audio extraction workflow.
