<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Baojian Yuan | 袁保健</title>
    <description>The latest articles on DEV Community by Baojian Yuan | 袁保健 (@baojian_yuan).</description>
    <link>https://dev.to/baojian_yuan</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3695993%2Fd62d4bf0-0c4d-4989-a593-d4c6425c4e37.png</url>
      <title>DEV Community: Baojian Yuan | 袁保健</title>
      <link>https://dev.to/baojian_yuan</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/baojian_yuan"/>
    <language>en</language>
    <item>
      <title>Moving FFmpeg to the Browser: How I Saved 100% on Server Costs Using WebAssembly</title>
      <dc:creator>Baojian Yuan | 袁保健</dc:creator>
      <pubDate>Tue, 06 Jan 2026 14:45:22 +0000</pubDate>
      <link>https://dev.to/baojian_yuan/moving-ffmpeg-to-the-browser-how-i-saved-100-on-server-costs-using-webassembly-4l9f</link>
      <guid>https://dev.to/baojian_yuan/moving-ffmpeg-to-the-browser-how-i-saved-100-on-server-costs-using-webassembly-4l9f</guid>
      <description>&lt;p&gt;&lt;strong&gt;By Baojian Yuan&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I am a part-time indie developer and a dad based in Shanghai. When I’m not changing diapers for my 3-year-old daughter, I am usually building AI tools or optimizing workflows.&lt;/p&gt;

&lt;p&gt;Recently, I needed a simple tool to convert massive audio files (WAV to MP3) for a local ASR (Automatic Speech Recognition) project. I looked at existing online converters and immediately hit two roadblocks:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Privacy:&lt;/strong&gt; Uploading private meeting recordings to a random server feels wrong. I have no idea where that data goes or how long it stays there.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Speed:&lt;/strong&gt; Uploading a 500MB file takes forever before the actual processing even starts.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I sat there looking at my laptop's specs—an Intel Core Ultra 9 with 32GB of RAM—and thought: &lt;strong&gt;"Why am I paying AWS for computing power when the user has a perfectly good CPU sitting idle?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgithub-production-user-asset-6210df.s3.amazonaws.com%2F28180652%2F532324618-4185f29d-c3ac-4747-9806-d31c63a32fe3.jpeg%3FX-Amz-Algorithm%3DAWS4-HMAC-SHA256%26X-Amz-Credential%3DAKIAVCODYLSA53PQK4ZA%252F20260106%252Fus-east-1%252Fs3%252Faws4_request%26X-Amz-Date%3D20260106T095938Z%26X-Amz-Expires%3D300%26X-Amz-Signature%3De0e8b79c8d8c181b01924d32c557fc896565c3396b464d2abc778547b7377755%26X-Amz-SignedHeaders%3Dhost" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgithub-production-user-asset-6210df.s3.amazonaws.com%2F28180652%2F532324618-4185f29d-c3ac-4747-9806-d31c63a32fe3.jpeg%3FX-Amz-Algorithm%3DAWS4-HMAC-SHA256%26X-Amz-Credential%3DAKIAVCODYLSA53PQK4ZA%252F20260106%252Fus-east-1%252Fs3%252Faws4_request%26X-Amz-Date%3D20260106T095938Z%26X-Amz-Expires%3D300%26X-Amz-Signature%3De0e8b79c8d8c181b01924d32c557fc896565c3396b464d2abc778547b7377755%26X-Amz-SignedHeaders%3Dhost" alt="Architecture Comparison" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;So, I decided to port FFmpeg to the browser using &lt;strong&gt;WebAssembly (WASM)&lt;/strong&gt;. The goal was simple: Zero server uploads, 100% privacy, and $0 server bills.&lt;/p&gt;

&lt;p&gt;Here is how I built &lt;a href="https://localaudioconvert.com" rel="noopener noreferrer"&gt;&lt;strong&gt;LocalAudioConvert.com&lt;/strong&gt;&lt;/a&gt;, the technical hurdles I faced, and why I believe "Local First" is the future of utility apps.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Challenge: The Browser is Not an OS
&lt;/h2&gt;

&lt;p&gt;Running FFmpeg—a heavy, complex C library—inside Chrome isn't straightforward. While tools like &lt;code&gt;ffmpeg.wasm&lt;/code&gt; (powered by Emscripten) exist, making them production-ready for large files requires solving several engineering nightmares.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. The SharedArrayBuffer Headache
&lt;/h3&gt;

&lt;p&gt;To make video/audio conversion bearable in a browser, you need multi-threading so that FFmpeg can use multiple cores. However, the threaded build depends on &lt;code&gt;SharedArrayBuffer&lt;/code&gt; (which lets threads share memory), and modern browsers only expose it to pages that are cross-origin isolated, a restriction introduced to mitigate Spectre-style attacks.&lt;/p&gt;

&lt;p&gt;If you just drop the WASM file in, the multi-threaded build won't work. You have to configure your static file server or host (Nginx, Vercel, or Netlify) to send two specific response headers:&lt;/p&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Cross-Origin-Embedder-Policy: require-corp
Cross-Origin-Opener-Policy: same-origin
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The Trade-off:&lt;/strong&gt; Enforcing these headers puts your page into a cross-origin isolated context, so the browser blocks any cross-origin resource that has not explicitly opted in. This broke my external image loading for a while (e.g., loading avatars from a CDN). I had to proxy those resources or ensure they were served from the same origin. It’s a classic security vs. convenience trade-off, but necessary for performance.&lt;/p&gt;
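
&lt;p&gt;As a rough sketch of what that configuration amounts to, the helper below sets both headers on a Node-style response object. The names &lt;code&gt;ISOLATION_HEADERS&lt;/code&gt; and &lt;code&gt;applyIsolationHeaders&lt;/code&gt; are made up for this example and are not part of the tool; on Nginx, Vercel, or Netlify you would express the same two headers in that platform's own config instead.&lt;/p&gt;

```javascript
// The two response headers quoted above.
const ISOLATION_HEADERS = {
  "Cross-Origin-Embedder-Policy": "require-corp",
  "Cross-Origin-Opener-Policy": "same-origin",
};

// Hypothetical helper: copies both headers onto any object that exposes
// setHeader(name, value), such as the response in a Node http handler.
function applyIsolationHeaders(res) {
  for (const [name, value] of Object.entries(ISOLATION_HEADERS)) {
    res.setHeader(name, value);
  }
  return res;
}

// Assumed usage inside a Node server (not executed here):
//   http.createServer((req, res) => {
//     applyIsolationHeaders(res);
//     /* ...serve the static file... */
//   });
```

&lt;p&gt;Once the headers are in effect, &lt;code&gt;self.crossOriginIsolated&lt;/code&gt; should report &lt;code&gt;true&lt;/code&gt; in the page, which is a quick way to confirm that &lt;code&gt;SharedArrayBuffer&lt;/code&gt; is available before loading the threaded build.&lt;/p&gt;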

&lt;h3&gt;
  
  
  2. The Memory OOM (Out of Memory)
&lt;/h3&gt;

&lt;p&gt;Browsers are notoriously stingy with WebAssembly memory allocation. In my early tests, when I dragged in a 1GB WAV file, the WASM instance would immediately crash with an &lt;code&gt;Out of Memory&lt;/code&gt; error. A 32-bit WebAssembly memory cannot grow past 4GB, and Chrome may refuse an allocation well before that limit.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Strategy:&lt;/strong&gt; Instead of loading the entire file into the WASM virtual file system (MEMFS) at once, I implemented a &lt;strong&gt;chunking mechanism&lt;/strong&gt;.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Read the file from the user's disk in small chunks.&lt;/li&gt;
&lt;li&gt;Feed the buffer into the WASM heap.&lt;/li&gt;
&lt;li&gt;Process and flush.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgithub-production-user-asset-6210df.s3.amazonaws.com%2F28180652%2F532326636-0045e7ee-521d-4336-ba50-bfc6fba95b85.JPEG%3FX-Amz-Algorithm%3DAWS4-HMAC-SHA256%26X-Amz-Credential%3DAKIAVCODYLSA53PQK4ZA%252F20260106%252Fus-east-1%252Fs3%252Faws4_request%26X-Amz-Date%3D20260106T100325Z%26X-Amz-Expires%3D300%26X-Amz-Signature%3Da8aed9f91c07c41268b1765f1421056f0e365a6dbbc69baabc8c8466c656c336%26X-Amz-SignedHeaders%3Dhost" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgithub-production-user-asset-6210df.s3.amazonaws.com%2F28180652%2F532326636-0045e7ee-521d-4336-ba50-bfc6fba95b85.JPEG%3FX-Amz-Algorithm%3DAWS4-HMAC-SHA256%26X-Amz-Credential%3DAKIAVCODYLSA53PQK4ZA%252F20260106%252Fus-east-1%252Fs3%252Faws4_request%26X-Amz-Date%3D20260106T100325Z%26X-Amz-Expires%3D300%26X-Amz-Signature%3Da8aed9f91c07c41268b1765f1421056f0e365a6dbbc69baabc8c8466c656c336%26X-Amz-SignedHeaders%3Dhost" alt="Solving Memory Crash" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This keeps the memory footprint low, regardless of the input file size.&lt;/p&gt;
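
&lt;p&gt;The read side of those steps could look roughly like the sketch below. It only shows slicing a &lt;code&gt;File&lt;/code&gt; (or any &lt;code&gt;Blob&lt;/code&gt;) into fixed-size pieces; how each piece is handed to FFmpeg depends on the build you use, so &lt;code&gt;processChunk&lt;/code&gt; is a placeholder callback, and the function names and the 8MB default are assumptions for illustration.&lt;/p&gt;

```javascript
// Assumed chunk size for illustration; tune it for your own build.
const DEFAULT_CHUNK_SIZE = 8 * 1024 * 1024;

// Splits [0, totalSize) into consecutive [start, end) byte ranges.
function chunkRanges(totalSize, chunkSize = DEFAULT_CHUNK_SIZE) {
  if (!(chunkSize > 0)) throw new RangeError("chunkSize must be positive");
  const ranges = [];
  for (let start = 0; totalSize > start; start += chunkSize) {
    ranges.push([start, Math.min(start + chunkSize, totalSize)]);
  }
  return ranges;
}

// Reads one slice at a time, so the script never asks for the whole file
// as a single buffer. processChunk is a hypothetical callback that feeds
// the bytes to the converter and flushes its output before the next read.
async function readInChunks(file, processChunk, chunkSize = DEFAULT_CHUNK_SIZE) {
  for (const [start, end] of chunkRanges(file.size, chunkSize)) {
    const buffer = await file.slice(start, end).arrayBuffer();
    await processChunk(new Uint8Array(buffer), start);
  }
}
```

&lt;p&gt;Awaiting &lt;code&gt;processChunk&lt;/code&gt; before the next read is the design choice that matters here: it keeps a slow encoder from piling up unread buffers behind it.&lt;/p&gt;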




&lt;h2&gt;
  
  
  Performance: Native vs. WASM vs. Cloud
&lt;/h2&gt;

&lt;p&gt;The biggest question I get is: &lt;em&gt;"Is it slower than native FFmpeg?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The short answer is: &lt;strong&gt;Yes.&lt;/strong&gt; WebAssembly is fast, but it still has overhead compared to native C code running directly on the OS.&lt;/p&gt;

&lt;p&gt;However, the more important question is: &lt;em&gt;"Is it slower than Cloud Converters?"&lt;/em&gt; The answer is: &lt;strong&gt;No.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Here is a rough benchmark for a &lt;strong&gt;100MB WAV to MP3&lt;/strong&gt; conversion:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Native FFmpeg (M1 Mac):&lt;/strong&gt; ~0.8 seconds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;WASM (Browser):&lt;/strong&gt; ~4.5 seconds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Traditional Online Tool:&lt;/strong&gt; ~45 seconds (30s Upload + 5s Process + 10s Download)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The User Experience Win:&lt;/strong&gt; The "perceived latency" is significantly lower because processing starts &lt;em&gt;instantly&lt;/em&gt;. There is no upload to wait for. In the rough numbers above, the WASM route finishes about ten times sooner than the cloud tool, and the gap only widens for a user on slow coffee shop Wi-Fi.&lt;/p&gt;




&lt;h2&gt;
  
  
  The "Killer" Feature: Batch Processing
&lt;/h2&gt;

&lt;p&gt;Most online converters limit you to 1 or 2 files at a time. This isn't a technical limitation; it's a financial one. They don't want you hogging their server CPU.&lt;/p&gt;

&lt;p&gt;Since I am using &lt;em&gt;your&lt;/em&gt; CPU, I don't care how many files you convert.&lt;/p&gt;

&lt;p&gt;I built a queue system using &lt;strong&gt;Web Workers&lt;/strong&gt;. This allows users to drop in &lt;strong&gt;100+ files&lt;/strong&gt; at once. The main UI thread remains responsive (no freezing) while the worker threads churn through the audio queue in the background.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5aeii9nt6cxa6eah9p6k.JPEG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5aeii9nt6cxa6eah9p6k.JPEG" alt="Batch Processing" width="800" height="871"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This effectively turns your browser into a desktop-grade batch processor.&lt;/p&gt;
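
&lt;p&gt;A minimal sketch of such a queue, under a few assumptions: the class name &lt;code&gt;ConversionQueue&lt;/code&gt; is invented for this example, each worker is assumed to post exactly one message per finished file, and error handling is left out. It is not the code behind the tool, just one way the idea can be wired up.&lt;/p&gt;

```javascript
// Hypothetical queue: hands pending files to whichever worker is idle.
class ConversionQueue {
  constructor(workers) {
    this.idle = [...workers]; // workers not currently converting
    this.pending = [];        // files waiting for a free worker
    for (const worker of workers) {
      // Assumed protocol: one message back from the worker means
      // "finished the file I was given".
      worker.onmessage = () => {
        this.idle.push(worker);
        this.dispatch();
      };
    }
  }

  // Queue any number of files; work starts as soon as a worker is free.
  add(files) {
    this.pending.push(...files);
    this.dispatch();
  }

  // Pair idle workers with pending files until one side runs out.
  dispatch() {
    while (this.idle.length > 0) {
      if (this.pending.length === 0) break;
      const worker = this.idle.pop();
      worker.postMessage(this.pending.shift());
    }
  }
}
```

&lt;p&gt;In a page, the workers might be created with &lt;code&gt;new Worker("convert-worker.js")&lt;/code&gt; (a hypothetical script name), with the pool sized from &lt;code&gt;navigator.hardwareConcurrency&lt;/code&gt;. Because the main thread only passes messages, dropping in 100 files costs it little more than 100 array pushes.&lt;/p&gt;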




&lt;h2&gt;
  
  
  What's Next? (VAD &amp;amp; ASR)
&lt;/h2&gt;

&lt;p&gt;Now that I have a stable audio processing pipeline running entirely client-side, I'm experimenting with more advanced AI features.&lt;/p&gt;

&lt;p&gt;I am currently working on running &lt;strong&gt;VAD (Voice Activity Detection)&lt;/strong&gt; and &lt;strong&gt;Whisper (ASR)&lt;/strong&gt; directly in the browser. Imagine being able to transcribe sensitive legal or medical recordings without the audio data ever leaving your laptop. That is the future I want to build.&lt;/p&gt;

&lt;p&gt;Follow me on X (@YuanAudio) to see the progress.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It Out
&lt;/h2&gt;

&lt;p&gt;I’ve bundled all this technology into a free tool. There are no ads, no sign-ups, and absolutely no data tracking.&lt;/p&gt;

&lt;p&gt;👉 &lt;strong&gt;Live Demo:&lt;/strong&gt; &lt;a href="https://localaudioconvert.com/" rel="noopener noreferrer"&gt;localaudioconvert.com&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;(Note: It works entirely offline once the page is loaded.)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I’d love to hear your feedback on the implementation or discuss any WASM edge cases you've encountered!&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>performance</category>
      <category>privacy</category>
    </item>
  </channel>
</rss>
