DEV Community

Cover image for I Built a Local Video Processing Workstation with AI — Here's the Complete Journey
mayu888
mayu888

Posted on

I Built a Local Video Processing Workstation with AI — Here's the Complete Journey

From idea to release in 3 weeks, using Claude Code to build ClipForge — a cross-platform desktop app powered by Electron and FFmpeg.

The Problem

Most video processing tools force you to:

  • Upload files to the cloud (privacy concerns )
  • Deal with file size limits
  • Pay for premium features

I wanted a fully local, feature-rich, good-looking video processing tool. So I built ClipForge.

What is ClipForge?

ClipForge is a desktop app that handles 20+ video/audio operations locally:

  • Transcode — MP4, WebM, MKV, MOV, AVI, GIF
  • Visual — Crop, watermark removal, rotate, color adjust, denoise
  • Speed — 0.25x to 4x, reverse, boomerang, loop, fade
  • Audio — Extract, mute, volume, normalize
  • Composite — Concatenate, side-by-side, picture-in-picture, overlay, subtitles

Three modes: Single operation, Stack (chain multiple ops), Batch processing.

Built with Electron, React, FFmpeg, and Zustand. Ships for Windows, macOS, and Linux.

The Tech Stack

Layer Tech Why
Desktop Electron 42 Cross-platform, Node.js for FFmpeg
UI React 18 + TypeScript Component ecosystem
Build Vite 5 + Electron Forge Fast HMR, clean packaging
State Zustand Simple, no boilerplate
Styling Tailwind CSS Rapid UI development
Video FFmpeg (bundled) Industry-standard processing
AI Claude Code Pair programming assistant

Development Journey

Step 1: Scaffold

npm create electron-app clipforge
Enter fullscreen mode Exit fullscreen mode

Electron Forge generated the boilerplate: main process, preload script, renderer with Vite.

Step 2: UI Layout

Built a 4-panel layout:

  • Left: Media pool + operation library
  • Center: Real-time preview canvas
  • Right: Parameter inspector
  • Bottom: Stack/Batch queue + logs

Dark theme with Tailwind CSS.

Step 3: FFmpeg Integration (The Hard Part)

This is the core challenge — wrapping FFmpeg's CLI into visual operations.

Architecture:

Renderer (React)
    │  invoke('process:start', request)
    ▼
Preload (IPC bridge)
    │
    ▼
Main Process (Node.js)
    │  composeArgs(request) → ffmpeg args array
    ▼
FFmpeg (child_process.spawn)
    │  progress parsing from stderr
    ▼
Events back to renderer
Enter fullscreen mode Exit fullscreen mode

Example: Watermark Removal

Instead of FFmpeg's delogo filter (which has boundary restrictions — x≥1, y≥1, no edge support), I used a crop + blur + overlay approach:

case 'delogo': {
  const x = Math.max(0, Math.round(Number(p.x) || 0));
  const y = Math.max(0, Math.round(Number(p.y) || 0));
  const w = Math.max(10, Math.round(Number(p.w) || 10));
  const h = Math.max(10, Math.round(Number(p.h) || 10));

  args.push('-filter_complex',
    `[0:v]split[a][b];` +
    `[b]crop=${w}:${h}:${x}:${y},gblur=sigma=30,format=rgba,colorchannelmixer=aa=0.7[b2];` +
    `[a][b2]overlay=${x}:${y}[out]`
  );
  args.push('-map', '[out]', '-map', '0:a?');
  args.push(...videoCodec(outExt));
  break;
}
Enter fullscreen mode Exit fullscreen mode

The filter graph:

  1. crop — extract the watermark region
  2. gblur — Gaussian blur (more natural than boxblur)
  3. colorchannelmixer=aa=0.7 — semi-transparent blend for smooth integration

Step 4: Real-time Preview

Users need to see changes immediately, not after processing completes.

Solution: Canvas-based preview simulation. Instead of running FFmpeg, read frames from the <video> element and apply operations on a <canvas>:

useEffect(() => {
  const render = () => {
    drawPreview(ctx, video, previewOps, { width: rect.width, height: rect.height });
  };
  render(); // immediate draw

  if (playing) {
    const loop = () => { render(); raf = requestAnimationFrame(loop); };
    raf = requestAnimationFrame(loop);
  }
  return () => cancelAnimationFrame(raf);
}, [playing, playhead, JSON.stringify(previewOps)]);
Enter fullscreen mode Exit fullscreen mode

Adjusting brightness, crop region, or rotation shows instant feedback.

Step 5: Mouse Region Selection

For watermark removal, users drag to select the area. Screen coordinates must convert to video pixel coordinates (accounting for letterbox scaling):

function screenToVideo(localX, localY, container, videoW, videoH) {
  const { scale, ox, oy } = getVideoMapping(container, videoW, videoH);
  return {
    x: Math.max(0, Math.min(Math.round((localX - ox) / scale), videoW)),
    y: Math.max(0, Math.min(Math.round((localY - oy) / scale), videoH)),
  };
}
Enter fullscreen mode Exit fullscreen mode

Bug I hit: The onUp callback captured stale state from useState. Fixed by using useRef for live coordinates during drag.

Step 6: Packaging & CI/CD

Electron packaging is tricky — FFmpeg binaries can't go inside the asar archive, and Linux needs lowercase executable names.

forge.config.ts:

packagerConfig: {
  asar: { unpackDir: 'src/main/ffmpeg' },
  extraResource: ['src/main/ffmpeg'],
  executableName: 'clipforge',
}
Enter fullscreen mode Exit fullscreen mode

GitHub Actions builds all three platforms in parallel:

jobs:
  build:
    strategy:
      matrix:
        os: [macos-latest, ubuntu-latest, windows-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
      - run: npm ci
      - run: npm run make
Enter fullscreen mode Exit fullscreen mode

Push a tag → auto-build → auto-publish to GitHub Releases.

Lessons Learned

AI Coding is "Efficient Coding", Not "No Coding"

Claude Code handled tedious work (Electron packaging, FFmpeg arg mapping, IPC boilerplate), but I still needed to:

  • Make architectural decisions
  • Review generated code
  • Debug edge cases (stale closures, boundary conditions)

Ship the Core Flow First

Got "open file → select operation → process → output" working before adding preview, batch mode, or i18n.

Packaging is the Last Minefield

Binary files, asar compression, platform-specific naming — expect to spend time here. Automate with CI early.

The Result

  • 3 weeks of part-time work
  • 20+ operations across 6 categories
  • 3 platforms supported
  • Fully local processing
  • Open source: github.com/mayu888/clipforge

License: MIT + Commons Clause (free for personal use, commercial use requires authorization).

What's Next

  • Drag-and-drop operations between panels
  • More filter effects (LUTs, stabilization)
  • Plugin system for custom operations

Built with Electron, FFmpeg, and a lot of help from AI. The future of indie development is here.

electron #ffmpeg #react #typescript #ai-coding #desktop-app #video-processing #indie-hacker

Top comments (0)