Ingo Steinke, web developer

Posted on Sep 12 • Edited on Oct 30

Microphone Choice and Cancelling Noise

#productivity

Don't you love Canva, Capcut, Insta, and TikTok? So-called social media give ordinary people a chance to get creative and produce multimedia content.

Canva Questions are the new Printer Support

Just as IT/CS professionals used to be assigned as "printer support" for their friends and family, we have become unpaid Instagram husbands and Canva consultants in our spare time.

I've heard there are paid Instagram husbands for hire, but I wouldn't consider applying for that job. (I've got at least two profiles on Insta though, openmindculture and webdeveloperingo.)

Canva/Capcut Confusion

Apps aren't inherently bad. Scarce screen space and a lack of traditional technical knowledge made app designers reinvent UI interaction, putting controls at the top or at the bottom, in this burger menu or in that third-level advanced setting of one of their more or less intuitive assistants. There's just too much functionality to fit into a simplistic interface.

Microphone Choice

Mobile hardware has also advanced incredibly. However, you can't make every device smaller and lighter without sacrificing storage, battery, and quality. Did you ever go to an electronics store to buy "a microphone like the influencers use" for a family member? Prices ranged from 15 € to 500 € when I last checked.

Clip-on microphones often consist of two gadgets: the actual mic with a furry, brush-like "dead cat" attachment and a built-in transmitter, plus a separate receiver device connected to the phone by plug or cable. Fixed microphones are usually much larger and ready to fit more add-ons like an anti-pop screen and an elastic fixture, all for the sake of natural noise reduction.

When choosing a clip-on mic for video recording, you might also consider what it looks like. Some resemble cosmetic brushes, while other, technically similar ones look more like a feather duster for cleaning.

Double check that you really use the external microphone, for example by speaking directly into it and then into your smartphone to compare which one you hear loud and clear.

Microphone technology and sound engineering are professional skills that you can't expect to pick up in half an hour, yet everybody tries to do just that. Ask ChatGPT or Perplexity, buy some stuff and "go pro". Welcome to the content creators' community!

Unless your home is but a bunk bed surrounded by walls, you'll have different places to record your videos. The most obvious choice is the one with good lighting, a decent background, and some space to move. When you add more professional equipment, like a tripod and a gigantic soft light, you'll probably increase the distance between yourself and your camera, and stand a little too close to that spot of your wall where there are no cluttered cupboards or hangers full of clothes - two types of furnishing that reduce ambient sound naturally and don't look like a recording studio.

Noise Cancelling

Sooner or later, you might end up with a video that either has too much echo, or, after figuring out how to use the Open Camera app and its hidden advanced video setting to "use external microphone if present", loud lisping sounds in every word that contains the letter S.

I mentioned "natural noise reduction" happening at recording time, because it's still much easier to start with good quality than to edit and try to improve afterwards. Still, there are ways to try.

Third-party services promise AI-assisted improvement of sound and video, while traditional open-source software can be enhanced by free plugins like Noise Gate for Audacity. Apps like Capcut offer to export and import sound tracks, but Handbrake or ffmpeg can do the same and might feel more convenient for developers.

Separate the Sound

Download the video to your computer.

Inspect the audio codec.

ffmpeg -i multimedia.mp4

will show you something like

Stream #0:0[0x1](eng): Audio: aac (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 191 kb/s

so we will use .aac as our audio file suffix.

ffmpeg -i multimedia.mp4 -vn -acodec copy output-audio.aac
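If you prefer to script this step, here is a minimal sketch that picks the suffix from the codec name and builds the same ffmpeg call. The mapping table and the function name `build_extract_command` are my own assumptions for illustration, and the table only covers a few common codecs.

```python
import shlex

# Assumed mapping from the codec name that ffmpeg reports to a file suffix
# that can hold the copied stream. Deliberately incomplete.
CODEC_SUFFIX = {
    "aac": ".aac",
    "mp3": ".mp3",
    "opus": ".opus",
    "flac": ".flac",
}


def build_extract_command(video_path, codec_name, output_stem="output-audio"):
    """Return the ffmpeg argument list that copies the audio stream unchanged."""
    suffix = CODEC_SUFFIX.get(codec_name)
    if suffix is None:
        raise ValueError(f"no suffix known for codec {codec_name!r}")
    # -vn drops the video, -acodec copy avoids re-encoding the audio.
    return ["ffmpeg", "-i", video_path, "-vn", "-acodec", "copy", output_stem + suffix]


if __name__ == "__main__":
    # Prints the command line instead of running it.
    print(shlex.join(build_extract_command("multimedia.mp4", "aac")))
```

The function only returns the argument list, so you can inspect it first and then hand it to `subprocess.run` once it looks right.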

Install and open Audacity. Go to the tools menu, open the plugin manager and ensure you have the Noise Gate plugin installed and activated. Audacity 3.4.2 should ship with Noise Gate already.

Reduce the Noise

What's "noise" after all? Noise is everything we don't consider part of the signal, like creaking wooden floorboards, barking dogs, or coughing people. While all of this can become tricky to remove, there's a kind of noise that seems easier to detect and filter.

Ambient sound, background hiss, and echoes usually sound much quieter than speaking or singing. Noise reduction software will remove or reduce that kind of sound while retaining the louder parts. Sounds simple, yet it's incredibly complicated in practice.
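To put a number on "much quieter", here is a small sketch that measures the RMS level of a stretch of audio in dB relative to full scale. The two signals are synthetic stand-ins that I made up (pure sine tones, not real recordings), and the 48000 Hz rate is simply the one from the ffmpeg output above.

```python
import math

RATE = 48000  # samples per second, as reported by ffmpeg for the example file


def rms_dbfs(samples):
    """RMS level of float samples (range -1.0..1.0) in dB relative to full scale."""
    if not samples:
        raise ValueError("need at least one sample")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence has no finite level
    return 10.0 * math.log10(mean_square)


# Hypothetical stand-ins: a "voice" at half scale and a faint "hiss" at 1 % scale,
# each 100 ms long.
voice = [0.5 * math.sin(2 * math.pi * 440 * n / RATE) for n in range(RATE // 10)]
hiss = [0.01 * math.sin(2 * math.pi * 3000 * n / RATE) for n in range(RATE // 10)]

if __name__ == "__main__":
    print(f"voice: {rms_dbfs(voice):.1f} dBFS")  # roughly -9 dBFS
    print(f"hiss:  {rms_dbfs(hiss):.1f} dBFS")   # roughly -43 dBFS
```

A threshold somewhere between the two levels separates them, which is the basic idea behind every level-based filter. Real speech and real room sound overlap far more than these two tones do.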

The first step is relatively easy: select some "noisy silence" where nobody is speaking, as a baseline (the noise profile) for the Noise Reduction effect. Then select all of the audio, choose noise reduction dB, sensitivity, and frequency smoothing bands. The "Residue" option, the inverse of "Reduce", strips the signal and keeps the noise so that we can preview the sounds that will be removed.

Simply removing everything supposed to be noise would make a voice sound unnatural and introduce awkward "artificial silence" between words or sentences, so we must keep a little of the unwanted sound and fade our cutting-off effect in and out to prevent abrupt transitions.

If you don't have much time, just use the presets: select some noise, then select everything and apply Noise Reduction. Finally, with the whole track still selected, open the effects menu and apply Noise Gate.

Fine-Tuning Parameters

Noise Gate has a preview button so that we can quickly decide and readjust without having to apply and undo the effect on the whole sound track. Preview can also give us a sort of intuitive understanding of the parameters:

A gate threshold that is too high will result in missing words or syllables (while still not reducing the echo-like room sound).
A level reduction that is too negative might cause unnatural silence while nobody speaks.
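The two effects above can be reproduced with a deliberately crude gate. This is a sketch under my own assumptions, not how Audacity implements Noise Gate: it works on short frames, compares each frame's peak to the threshold, and has no fading at all. The frame values are made up.

```python
def db_to_gain(db):
    """Convert a level in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def gate_frames(frames, threshold_db=-40.0, level_reduction_db=-24.0):
    """Attenuate every frame whose peak stays below the threshold."""
    threshold = db_to_gain(threshold_db)
    reduction = db_to_gain(level_reduction_db)
    gated = []
    for frame in frames:
        peak = max((abs(s) for s in frame), default=0.0)
        gain = 1.0 if peak >= threshold else reduction
        gated.append([s * gain for s in frame])
    return gated


# Hypothetical frames: a loud word, a quiet syllable, and room noise.
frames = [[0.5, -0.4], [0.03, -0.02], [0.005, -0.004]]

if __name__ == "__main__":
    # With -40 dB only the room noise is attenuated.
    print(gate_frames(frames))
    # With -25 dB the quiet syllable falls below the threshold as well.
    print(gate_frames(frames, threshold_db=-25.0))
```

Making `level_reduction_db` more negative pushes the attenuated frames towards digital silence, which is the unnatural gap you hear between words.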

Envelope curves are common in synthetic sound engineering and that's the way echo removal algorithms achieve softer transitions as well. Understanding the principle makes it easier to tune advanced parameters if you have to. Attack, hold (sustain), and decay can be specified in milliseconds for fine-tuning Noise Gate.

A longer hold can prevent cut-off words or syllables, and so can a shorter (lower) attack value, which opens the gate faster.
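To build some intuition for attack, hold, and decay, here is a simplified envelope with linear ramps. It is my own sketch and not Audacity's algorithm. The function names are hypothetical, times are counted in samples, and `ms_to_samples` assumes the 48000 Hz rate of the example file.

```python
def ms_to_samples(ms, rate=48000):
    """Convert a duration in milliseconds to a whole number of samples."""
    return round(ms * rate / 1000)


def gate_envelope(above, attack, hold, decay, floor=0.0):
    """Per-sample gain of a gate; attack, hold, and decay are counted in samples.

    above: one boolean per sample, True where the level exceeds the threshold.
    floor: the gain while the gate is closed (the linear level reduction).
    """
    up = (1.0 - floor) / max(attack, 1)    # gain step per sample while opening
    down = (1.0 - floor) / max(decay, 1)   # gain step per sample while closing
    gain, hold_left, envelope = floor, 0, []
    for is_above in above:
        if is_above:
            hold_left = hold               # restart the hold timer
            is_open = True
        else:
            is_open = hold_left > 0        # stay open until the hold runs out
            if is_open:
                hold_left -= 1
        gain = min(1.0, gain + up) if is_open else max(floor, gain - down)
        envelope.append(gain)
    return envelope


if __name__ == "__main__":
    # A word with a three-sample dip in the middle (toy scale, not real audio).
    above = [True] * 4 + [False] * 3 + [True] * 4
    print(gate_envelope(above, attack=2, hold=0, decay=2))  # gate closes in the dip
    print(gate_envelope(above, attack=2, hold=5, decay=2))  # hold bridges the dip
    print(ms_to_samples(50))  # a 50 ms hold expressed in samples at 48 kHz
```

Without hold, the gate closes during the dip and has to ramp up again, so the second half of the word starts attenuated. With a hold longer than the dip, the gain never leaves 1.0.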

In the above example, I increased the cut-off effect drastically with extreme level reduction and gate threshold values, which I compensated for with a shorter attack and a much longer hold value:

Gate threshold: default -40 dB, changed to -25 dB;
Level reduction: default -24 dB, changed to -82 dB;
Attack: default 10 ms, changed to 5 ms;
Hold: default 50 ms, changed to 350 ms.

Despite a visible change in the waveform, the supposedly optimized output sounds quite similar to the original. Hopefully you'll get a better result.

As I owe most of my noise reduction sound engineering expertise to one blogger, it's only fair that you read his original post:

How to Remove Echo in Audacity: Step by Step Guide using Audacity.

Reassembling Sound and Video

The multiplexing process puts sound and video back together. We can be sure that the visual parts are unaltered and, because we didn't change speed or duration, audio and video still match perfectly, down to the millisecond.

Either use a dubbing function like importing a sound track to replace the existing one in Canva or another content creation app, or use ffmpeg on the command line again to overwrite the audio track with the noise-reduced version.

ffmpeg -i multimedia.mp4 -i audio.aac -c:v copy -map 0:v:0 -map 1:a:0 optimized.mp4
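The same call can be built in a script. This is a small sketch with a hypothetical helper name, `build_remux_command`. The only thing it adds is a guard against naming the output like one of the inputs, because ffmpeg cannot edit a file in place.

```python
import shlex


def build_remux_command(video_path, audio_path, output_path):
    """Return the ffmpeg argument list that swaps in the cleaned audio track."""
    if output_path in (video_path, audio_path):
        raise ValueError("output must differ from both inputs")
    return [
        "ffmpeg",
        "-i", video_path,   # input 0: the original video
        "-i", audio_path,   # input 1: the noise-reduced audio
        "-c:v", "copy",     # keep the video stream bit for bit
        "-map", "0:v:0",    # first video stream of input 0
        "-map", "1:a:0",    # first audio stream of input 1
        output_path,
    ]


if __name__ == "__main__":
    # Prints the command line instead of running it.
    print(shlex.join(build_remux_command("multimedia.mp4", "audio.aac", "optimized.mp4")))
```

Because only the video codec is set to copy, ffmpeg re-encodes the audio with its default encoder for the output container.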

Hardware, Software, and AI Alternatives

I researched and tried the noise reduction process described above, and I used an AI trial plan for comparison. Both approaches failed to turn a wooden echo ambience into a clean studio sound. Noise reduction microphone hardware is better, but can lead to other undesirable effects especially with small and cheap solutions.

The AI-assisted online service performed so badly that I would never ever pay anything for similar automated services.

Conclusion

Recording in a real studio or a similar environment with closed windows and plenty of soft furnishings around, like a sofa or a carpet, is, in my experience, the simplest and most successful solution for clean sound in a video.
