Short summary: This writeup collects the practical problems I hit while building a lightweight Node.js CLI downloader and the concrete solutions I used. It’s intended as a hands-on guide you can copy into your project: detecting formats, downloading video-only + audio-only, merging with ffmpeg, why merged audio can sound noisy, and how to clean it. At the end I add a short note about a ready tool that implements these ideas.
1. Problem overview: what actually goes wrong (and why you should care)
When you try to download YouTube content at a specific quality you want, you’ll immediately encounter a few recurring issues:
- Choosing quality does not always give the expected result. Many combined (audio+video) formats are only available at low resolutions (e.g., 360p). Higher resolutions are offered as separate streams (video-only + audio-only). If your downloader blindly picks “the best combined format”, you end up with 360p even when 1080p is available as video-only.
- Merging separate streams is required if you want higher resolution video + audio. That means downloading two streams and running an external tool like ffmpeg to combine them.
- Post-merge audio quality can degrade. After merging, some outputs exhibit background noise/hiss or other artifacts. This is surprising because you downloaded the original audio stream, so the merging/processing step must have introduced or exposed the artifacts.
- Cross-platform packaging and dependencies. Bundling or requiring ffmpeg across Windows/macOS/Linux has pitfalls unless you handle binaries carefully.
- Playlists, rate limits and UX. Handling playlists, retries and user feedback (progress bars) requires additional design.
If you or your team plan to build a downloader, solving these correctly makes the difference between a toy and a reliable tool.
2. Strategy & architecture (high level)
- Enumerate formats for a URL (video-only, audio-only, combined).
- Decide: if user requested resolution is available as combined format → download combined; otherwise download video-only + audio-only.
- Download streams concurrently (with retries, backoff, and progress tracking).
- Merge using ffmpeg. Prefer stream copy when possible (-c copy) to avoid re-encoding, but re-encode if you must normalize or filter audio.
- Post-process audio with a light noise-reduction filter if artifacts are present.
- Provide a good UX: interactive mode and flags, progress bars, verbose logs for debugging.
- Bundle or locate ffmpeg reliably (e.g., ffmpeg-static or provide clear instructions).
3. Practical code snippets & notes
3.1 — Detect available formats with ytdl-core (Node.js)
// Example: list formats and pick best video-only + audio-only
const ytdl = require('ytdl-core');

async function listFormats(url) {
  const info = await ytdl.getInfo(url);
  const formats = info.formats;
  // Video-only streams, tallest first
  const videoOnly = formats.filter(f => f.hasVideo && !f.hasAudio)
    .sort((a, b) => (b.height || 0) - (a.height || 0));
  // Audio-only streams, highest bitrate first
  const audioOnly = formats.filter(f => f.hasAudio && !f.hasVideo)
    .sort((a, b) => (b.audioBitrate || 0) - (a.audioBitrate || 0));
  // Combined streams, tallest first. Sort on the numeric height: comparing
  // qualityLabel strings would rank "720p" above "1080p".
  const combined = formats.filter(f => f.hasVideo && f.hasAudio)
    .sort((a, b) => (b.height || 0) - (a.height || 0));
  return { videoOnly, audioOnly, combined };
}
Tip: show these options to the user and allow explicit selection (or implement “smart selection” that picks the best available combination).
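One way the "smart selection" could look is sketched below. It works on plain objects shaped like ytdl-core formats (hasVideo, hasAudio, height, audioBitrate), so it can be tried without a network call. The function name pickFormats, its return shape, and the fallback order are assumptions made for illustration, not part of ytdl-core.

```javascript
// Sketch of "smart selection" over ytdl-core-style format objects.
// pickFormats and its return shape are hypothetical names for illustration.
function pickFormats(formats, requestedHeight) {
  const byHeightDesc = (a, b) => (b.height || 0) - (a.height || 0);
  const byBitrateDesc = (a, b) => (b.audioBitrate || 0) - (a.audioBitrate || 0);

  // 1. A combined format at the exact requested height needs no merge step.
  const combined = formats
    .filter(f => f.hasVideo && f.hasAudio && f.height === requestedHeight);
  if (combined.length > 0) {
    return { mode: 'combined', video: combined[0], audio: null };
  }

  // 2. Otherwise take the tallest video-only stream not exceeding the request...
  const video = formats
    .filter(f => f.hasVideo && !f.hasAudio && (f.height || 0) <= requestedHeight)
    .sort(byHeightDesc)[0];
  // ...and the highest-bitrate audio-only stream.
  const audio = formats
    .filter(f => f.hasAudio && !f.hasVideo)
    .sort(byBitrateDesc)[0];
  if (video && audio) {
    return { mode: 'separate', video, audio };
  }

  // 3. Last resort: the best combined format available, whatever its height.
  const fallback = formats
    .filter(f => f.hasVideo && f.hasAudio)
    .sort(byHeightDesc)[0];
  return fallback ? { mode: 'combined', video: fallback, audio: null } : null;
}

// Example with hand-written format objects (not real YouTube data).
const sample = [
  { itag: 18, hasVideo: true, hasAudio: true, height: 360, audioBitrate: 96 },
  { itag: 137, hasVideo: true, hasAudio: false, height: 1080 },
  { itag: 136, hasVideo: true, hasAudio: false, height: 720 },
  { itag: 140, hasVideo: false, hasAudio: true, audioBitrate: 128 },
];
const choice = pickFormats(sample, 1080);
console.log(choice.mode, choice.video.itag, choice.audio.itag); // separate 137 140
```

Logging the returned mode and itags in verbose mode makes it easy to see why a given download ended up at a particular resolution.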
3.2 — Downloading streams (concurrent with progress)
Use streaming writes and a progress library (e.g., cli-progress) so the user sees download progress:
const fs = require('fs');
const ytdl = require('ytdl-core');

function downloadFormat(url, format, outputPath) {
  return new Promise((resolve, reject) => {
    const stream = ytdl(url, { format });
    const file = fs.createWriteStream(outputPath);
    stream.on('progress', (chunkLength, downloaded, total) => {
      // update progress bar, e.g. with downloaded / total
    });
    // Resolve only once the file is fully flushed to disk, not when the
    // download stream ends
    file.on('finish', resolve);
    // Surface both network and filesystem errors
    stream.on('error', reject);
    file.on('error', reject);
    stream.pipe(file);
  });
}
Retries: wrap downloads with retry/backoff to handle transient network or YouTube throttling issues.
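A minimal sketch of such a wrapper follows, using only Node built-ins. The names withRetry and backoffDelay and the default delays (500 ms base, 30 s cap) are assumptions for illustration; tune them to what you observe in practice.

```javascript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs.
function backoffDelay(attempt, baseMs = 500, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Runs `task` until it resolves or `retries` extra attempts are used up.
async function withRetry(task, { retries = 3, baseMs = 500, maxMs = 30000 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        await sleep(backoffDelay(attempt, baseMs, maxMs));
      }
    }
  }
  throw lastError;
}

// Usage sketch: a fake task that fails twice, then succeeds.
let calls = 0;
withRetry(async () => {
  calls += 1;
  if (calls < 3) throw new Error('transient failure');
  return 'done';
}, { retries: 3, baseMs: 10 })
  .then(result => console.log(result, 'after', calls, 'calls')); // done after 3 calls
```

In the downloader this would wrap the download call, for example withRetry(() => downloadFormat(url, format, outputPath)). Note that each retry in this sketch restarts the file from the beginning.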
3.3 — Merge with FFmpeg (basic merge, copy streams)
When formats are compatible (same container/codecs), prefer -c copy.
ffmpeg -i video.mp4 -i audio.m4a -c copy merged.mp4
When to re-encode: if the audio codec does not fit the output container, or you want to apply filters, you must re-encode the audio:
ffmpeg -i video.mp4 -i audio.m4a -c:v copy -c:a aac -b:a 192k merged.mp4
Note: re-encoding audio may slightly change the quality; choose sensible defaults (AAC 128–192 kbps or keep original bitrate if possible).
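The copy-versus-re-encode decision can be made in code before ffmpeg is launched. The sketch below assumes an .mp4 output and treats only AAC audio (codec strings starting with "mp4a", as in ytdl-core's audioCodec field) as safe to stream-copy; that rule, and the names buildMergeArgs and runFfmpeg, are assumptions for illustration.

```javascript
const { spawn } = require('child_process');

// Builds ffmpeg arguments for muxing one video file and one audio file
// into an .mp4. Assumption: only AAC audio ("mp4a...") is stream-copied;
// anything else is re-encoded to AAC at `audioBitrate`.
function buildMergeArgs({ videoPath, audioPath, outputPath, audioCodec, audioBitrate = '192k' }) {
  const canCopyAudio = /^mp4a/i.test(audioCodec || '');
  const audioArgs = canCopyAudio
    ? ['-c:a', 'copy']
    : ['-c:a', 'aac', '-b:a', audioBitrate];
  return [
    '-y',              // overwrite the output file without prompting
    '-i', videoPath,
    '-i', audioPath,
    '-map', '0:v:0',   // first video stream of input 0
    '-map', '1:a:0',   // first audio stream of input 1
    '-c:v', 'copy',    // never re-encode the video
    ...audioArgs,
    outputPath,
  ];
}

// Runs ffmpeg and resolves on exit code 0; stderr is kept for error reports.
function runFfmpeg(args, ffmpegPath = 'ffmpeg') {
  return new Promise((resolve, reject) => {
    const proc = spawn(ffmpegPath, args, { stdio: ['ignore', 'ignore', 'pipe'] });
    let stderr = '';
    proc.stderr.on('data', chunk => { stderr += chunk; });
    proc.on('error', reject); // e.g. the ffmpeg binary was not found
    proc.on('close', code => {
      if (code === 0) resolve();
      else reject(new Error(`ffmpeg exited with code ${code}\n${stderr}`));
    });
  });
}

// Opus audio does not match the assumed copy rule, so it gets re-encoded.
const args = buildMergeArgs({
  videoPath: 'video.mp4',
  audioPath: 'audio.webm',
  outputPath: 'merged.mp4',
  audioCodec: 'opus',
});
console.log(args.join(' '));
```

Passing arguments as an array to spawn avoids shell quoting problems with file names that contain spaces.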
3.4 — The audio noise problem and solutions
Symptom: merged audio has extra background noise / hiss that wasn’t obvious in the original stream.
Possible causes:
- Mismatch in sample rates or codecs requiring ffmpeg to resample/re-encode.
- Bad muxing path when using copy and then later rewrapping.
- Inadvertent transforms (volume normalization, resampling) when using filters or wrong flags.
- Source streams themselves may be compressed with artifacts that become more noticeable after merging.
Practical fixes:
- Prefer native ffmpeg denoisers if artifacts are present. Example:
# afftdn: frequency-domain denoiser; -c:v copy leaves the video stream untouched
ffmpeg -i merged.mp4 -c:v copy -af "afftdn=nf=-25" final_clean.mp4
- Highpass/lowpass to remove low-frequency rumble and high-frequency hiss:
ffmpeg -i merged.mp4 -c:v copy -af "highpass=f=70, lowpass=f=12000" final_clean.mp4
- Custom filters / internal filter. If you have a custom filter or tuned profile (for me, named mp.rnnn internally), pass it through the -af parameter. A bare file name is not a valid filter, so name the filter that loads it; assuming mp.rnnn is an RNNoise model, that is ffmpeg's arnndn filter:
ffmpeg -i merged.mp4 -c:v copy -af "arnndn=m=mp.rnnn" final_clean.mp4
Advice: start with afftdn and simple highpass/lowpass. Only add complex or proprietary filters after testing. Always keep the original audio saved so you can A/B test.
- Ensure consistent sampling. Force a sample rate to avoid hidden resampling artifacts:
ffmpeg -i merged.mp4 -c:v copy -ar 48000 -ac 2 -af "afftdn=nf=-25" final_clean.mp4
- Automate detection + conditional filtering. Run a quick audio quality check (e.g., compute RMS or detect the noise floor) and apply the filter only when the noise exceeds a threshold.
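The conditional check in the last bullet could be sketched as follows. It assumes the audio has already been decoded to mono 32-bit float samples in [-1, 1] (for example by piping `ffmpeg -i merged.mp4 -vn -ac 1 -f f32le -` into a Float32Array). The 10th-percentile heuristic, the -50 dBFS threshold, and all function names are assumptions to tune against your own test files, not calibrated values.

```javascript
// RMS level of each fixed-size window, in dBFS (silence is clamped to -120 dB).
function windowLevelsDb(samples, windowSize = 2048) {
  const levels = [];
  for (let start = 0; start + windowSize <= samples.length; start += windowSize) {
    let sumSquares = 0;
    for (let i = start; i < start + windowSize; i++) {
      sumSquares += samples[i] * samples[i];
    }
    const rms = Math.sqrt(sumSquares / windowSize);
    levels.push(rms > 0 ? Math.max(-120, 20 * Math.log10(rms)) : -120);
  }
  return levels;
}

// Noise floor estimate: the 10th-percentile window level, i.e. how loud the
// quietest passages are. This is a heuristic, not a calibrated measurement.
function estimateNoiseFloorDb(samples, windowSize = 2048) {
  const levels = windowLevelsDb(samples, windowSize).sort((a, b) => a - b);
  if (levels.length === 0) return -120;
  return levels[Math.floor(levels.length * 0.1)];
}

// Hypothetical rule: denoise only when quiet passages sit above the threshold.
function needsDenoise(samples, thresholdDb = -50) {
  return estimateNoiseFloorDb(samples) > thresholdDb;
}

// Synthetic check with constant-level signals instead of real audio.
const hissy = new Float32Array(48000).fill(0.01);  // about -40 dBFS
const clean = new Float32Array(48000).fill(0.001); // about -60 dBFS
console.log(needsDenoise(hissy), needsDenoise(clean)); // true false
```

When needsDenoise returns false, the merged file can be kept as-is, which avoids re-encoding audio that did not need it.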
4. Playlist handling & ranges
- List extraction: get playlist video IDs, then loop over them with pagination.
- Range downloads: accept "start-end" strings and map them to indices.
- Concurrency limits: for large playlists, download in small batches (e.g., 3 concurrent downloads) to avoid bandwidth spikes or hitting YouTube rate limits.
- Resume support: keep partial files and resume if possible (ytdl supports range requests in many cases).
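The range and batching ideas above can be sketched with two small helpers. Treating "start-end" as 1-based and inclusive, clamping the end to the playlist length, and the names parseRange and runInBatches are all assumptions made for this example.

```javascript
// Converts a 1-based, inclusive "start-end" string into 0-based playlist indices.
// An end beyond the playlist is clamped to its length; bad input throws.
function parseRange(range, total) {
  const match = /^(\d+)-(\d+)$/.exec(range.trim());
  if (!match) throw new Error(`Invalid range "${range}", expected "start-end"`);
  const start = Number(match[1]);
  const end = Math.min(Number(match[2]), total);
  if (start < 1 || start > end) {
    throw new Error(`Range "${range}" is empty or out of bounds`);
  }
  const indices = [];
  for (let i = start; i <= end; i++) indices.push(i - 1);
  return indices;
}

// Processes items in batches of `limit`, waiting for each batch to settle
// before starting the next. One failed item does not abort the others.
async function runInBatches(items, limit, worker) {
  const results = [];
  for (let i = 0; i < items.length; i += limit) {
    const batch = items.slice(i, i + limit);
    const settled = await Promise.allSettled(batch.map(async item => worker(item)));
    results.push(...settled);
  }
  return results;
}

// Usage sketch with placeholder IDs and a fake download worker.
const ids = ['a', 'b', 'c', 'd', 'e', 'f', 'g'];
const selected = parseRange('2-5', ids.length).map(i => ids[i]);
runInBatches(selected, 3, async id => `downloaded ${id}`)
  .then(results => console.log(results.map(r => r.status)));
```

Promise.allSettled keeps per-item outcomes, so a playlist summary can report which videos failed without stopping the whole run.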
5. Packaging ffmpeg & cross-platform considerations
- Use ffmpeg-static as a convenient bundled binary for most platforms. It simplifies CI and end-user installs.
- If you need full control or want a smaller distribution, document how to install a system ffmpeg and allow an environment variable override (e.g., FFMPEG_PATH).
- Watch the license: ffmpeg has LGPL/GPL components depending on build; pick builds accordingly and document licenses.
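A lookup that combines these options might look like the sketch below. The resolution order (FFMPEG_PATH override, then the ffmpeg-static package if it is installed, then whatever "ffmpeg" is on PATH) and the function name resolveFfmpegPath are assumptions for illustration.

```javascript
// Resolution order (an assumption for this sketch): explicit override,
// then the bundled ffmpeg-static binary, then the "ffmpeg" found on PATH.
function resolveFfmpegPath(env = process.env) {
  if (env.FFMPEG_PATH) {
    return env.FFMPEG_PATH;
  }
  try {
    // ffmpeg-static exports the path of its bundled binary, or null on
    // platforms it has no build for.
    const bundled = require('ffmpeg-static');
    if (bundled) return bundled;
  } catch (err) {
    // Package not installed: fall through to the system binary.
  }
  return 'ffmpeg';
}

console.log(resolveFfmpegPath({ FFMPEG_PATH: '/opt/ffmpeg/bin/ffmpeg' }));
console.log(resolveFfmpegPath({}));
```

Printing the resolved path in verbose mode helps when a user reports that ffmpeg cannot be found.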
6. UX & CLI tips
- Interactive mode for beginners: prompt for URL, quality, format.
- Flags for automation: --url, --type, --quality, --output, --no-progress, --verbose.
- Verbose logs: include format selection details (which format IDs you picked and why). This is invaluable for debugging format/quality issues.
- Progress bars: show per-stream progress and an overall task progress for playlists.
- Exit codes: make exit codes meaningful (network error, ffmpeg error, invalid URL, permission denied).
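The flags and exit codes above could be wired up with Node's built-in util.parseArgs (available in recent Node releases), as in this sketch. The default values, the numeric exit codes, and the names parseCli, EXIT, and exitCodeFor are assumptions chosen for the example.

```javascript
const { parseArgs } = require('node:util');

// Exit code values are arbitrary choices for this sketch; what matters is
// that each failure class gets its own stable number.
const EXIT = Object.freeze({
  OK: 0,
  GENERIC: 1,
  INVALID_URL: 2,
  NETWORK_ERROR: 3,
  FFMPEG_ERROR: 4, // set by the caller when the ffmpeg process fails
  PERMISSION_DENIED: 5,
});

// Parses the flags listed above; unknown flags make parseArgs throw.
function parseCli(argv) {
  const { values } = parseArgs({
    args: argv,
    options: {
      url: { type: 'string' },
      type: { type: 'string' },
      quality: { type: 'string' },
      output: { type: 'string' },
      'no-progress': { type: 'boolean' },
      verbose: { type: 'boolean' },
    },
  });
  return {
    url: values.url,
    type: values.type ?? 'video',       // assumed default
    quality: values.quality ?? '1080p', // assumed default
    output: values.output ?? '.',       // assumed default
    progress: !values['no-progress'],
    verbose: Boolean(values.verbose),
  };
}

// Maps common Node error codes to the exit codes above.
function exitCodeFor(err) {
  if (err.code === 'EACCES' || err.code === 'EPERM') return EXIT.PERMISSION_DENIED;
  if (['ENOTFOUND', 'ECONNRESET', 'ETIMEDOUT'].includes(err.code)) return EXIT.NETWORK_ERROR;
  if (err.code === 'ERR_INVALID_URL') return EXIT.INVALID_URL;
  return EXIT.GENERIC;
}

const opts = parseCli(['--url', 'https://youtube.com/watch?v=ID', '--quality', '720p', '--no-progress']);
console.log(opts);
```

In the main entry point, a top-level catch can then set process.exitCode = exitCodeFor(err) so scripts calling the tool can react to the failure type.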
7. Testing & validation
- A/B test: compare original audio stream and final audio after merge + filter. Keep both and allow quick playback.
- Sample videos: maintain a test set containing different encodings/resolutions (360p combined, 720p+audio separate, DASH formats).
- Automated CI: run unit tests on format selection logic and integration tests for small merges. For ffmpeg steps, you can run smoke checks (the output file exists, and its duration matches the input within a small delta).
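A smoke check along those lines is sketched below. It assumes ffprobe is installed next to ffmpeg (ffmpeg-static does not ship it), and the 0.5 second tolerance and the function names are placeholders to adjust.

```javascript
const fs = require('fs');
const { execFile } = require('child_process');

// Reads a media file's duration in seconds using ffprobe.
function probeDuration(file, ffprobePath = 'ffprobe') {
  const args = [
    '-v', 'error',
    '-show_entries', 'format=duration',
    '-of', 'default=noprint_wrappers=1:nokey=1',
    file,
  ];
  return new Promise((resolve, reject) => {
    execFile(ffprobePath, args, (err, stdout) => {
      if (err) return reject(err);
      const seconds = Number.parseFloat(stdout);
      if (Number.isNaN(seconds)) {
        return reject(new Error(`No duration reported for ${file}`));
      }
      resolve(seconds);
    });
  });
}

// True when two durations differ by at most `deltaSeconds`.
function durationsMatch(a, b, deltaSeconds = 0.5) {
  return Math.abs(a - b) <= deltaSeconds;
}

// Smoke check: the output exists, is non-empty, and is as long as the video input.
async function smokeCheck(inputVideo, output, deltaSeconds = 0.5) {
  const stat = await fs.promises.stat(output); // rejects if the file is missing
  if (stat.size === 0) return false;
  const [expected, actual] = await Promise.all([
    probeDuration(inputVideo),
    probeDuration(output),
  ]);
  return durationsMatch(expected, actual, deltaSeconds);
}

console.log(durationsMatch(212.4, 212.6), durationsMatch(212.4, 180.0)); // true false
```

Keeping durationsMatch separate from the ffprobe call means the comparison logic can be unit-tested in CI without any media files.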
8. Troubleshooting checklist (quick)
- If the downloader keeps giving 360p:
  - Inspect available formats (ytdl-core info) and look for video-only formats at the desired resolution.
- If merge fails:
  - Check ffmpeg's stderr output (ffmpeg writes its log there, not to stdout), look for container/codec mismatches, and try re-encoding instead of -c copy.
- If audio is noisy after merge:
  - Try afftdn, highpass/lowpass; check sample rates and re-encode audio to a stable sample rate (e.g., 48k).
- If downloads are slow or fail intermittently:
  - Implement retries with exponential backoff; reduce concurrency.
9. Example end-to-end shell flow (what the tool automates)
# download best video-only at requested resolution
ytdl 'https://youtube.com/watch?v=ID' --format videoonly -o video.mp4
# download best audio-only
ytdl 'https://youtube.com/watch?v=ID' --format audioonly -o audio.m4a
# merge (prefer copy if compatible)
ffmpeg -i video.mp4 -i audio.m4a -c:v copy -c:a copy merged.mp4
# if artifacts exist -> filter audio
ffmpeg -i merged.mp4 -c:v copy -af "afftdn=nf=-25" final_clean.mp4
11. Small section: about the ready tool I built
If you’d rather try a ready implementation of the above ideas, I created a small CLI called Downtube that automates the workflow described here: format detection, video-only + audio-only downloading, ffmpeg merging, and an optional noise-filter step. It also supports playlists, range downloads, progress indicators, and a simple interactive CLI.
Try it / see source:
https://github.com/ahegazy0/downtube