
Mathias Leonhardt

Posted on • Originally published at ki-mathias.de

I built a bird flight simulator you control by flapping your arms (Three.js + MediaPipe)

My 6-year-old son wanted a game where you "fly like a real bird." I'm a web developer, not a game dev — but I had Claude Code as a pair programmer.

The result: A 3D bird flight simulator running entirely in your browser. You control the bird by moving your arms in front of your webcam.

How it works

Arms up → glide. Arms down → dive. Flap → climb. No controller, no download.

The webcam feeds into MediaPipe's PoseLandmarker, which detects 33 body landmarks in real time. I map shoulder-to-wrist vectors to flight controls:

  • Hands above shoulders → lift increases
  • Hands below shoulders → pitch down
  • Rapid up-down motion → flap detected → altitude boost
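The mapping above can be sketched roughly like this. It's a minimal sketch, not the game's actual code: the landmark indices (11/12 for shoulders, 15/16 for wrists) come from MediaPipe's pose model, but `armHeight`, `FlapDetector`, `toFlightInput`, and the speed threshold are hypothetical names and values picked for illustration.

```typescript
// MediaPipe pose landmarks use normalized image coordinates:
// y = 0 is the top of the frame, y = 1 is the bottom.
interface Landmark { x: number; y: number; }

// Indices in MediaPipe's 33-point pose model.
const LEFT_SHOULDER = 11;
const RIGHT_SHOULDER = 12;
const LEFT_WRIST = 15;
const RIGHT_WRIST = 16;

// Average wrist height relative to the shoulders.
// Positive: hands above shoulders. Negative: hands below.
function armHeight(lm: Landmark[]): number {
  const left = lm[LEFT_SHOULDER].y - lm[LEFT_WRIST].y;
  const right = lm[RIGHT_SHOULDER].y - lm[RIGHT_WRIST].y;
  return (left + right) / 2;
}

// Hypothetical flap detector: a fast upstroke followed by a fast downstroke.
class FlapDetector {
  private prevHeight: number | null = null;
  private prevTime = 0;
  private sawUpstroke = false;
  private readonly speedThreshold: number;

  // speedThreshold is in frame heights per second (assumed value, tune by feel).
  constructor(speedThreshold = 1.5) {
    this.speedThreshold = speedThreshold;
  }

  // Feed one sample per video frame; returns true when a flap completes.
  update(height: number, timeSec: number): boolean {
    const prev = this.prevHeight;
    const dt = timeSec - this.prevTime;
    this.prevHeight = height;
    this.prevTime = timeSec;
    if (prev === null || dt <= 0) return false;

    const velocity = (height - prev) / dt;
    if (velocity > this.speedThreshold) {
      this.sawUpstroke = true;
    } else if (velocity < -this.speedThreshold && this.sawUpstroke) {
      this.sawUpstroke = false;
      return true;
    }
    return false;
  }
}

interface FlightInput { lift: number; pitchDown: number; flap: boolean; }

// Hands above shoulders add lift, hands below pitch the bird down.
function toFlightInput(height: number, flap: boolean): FlightInput {
  return { lift: Math.max(0, height), pitchDown: Math.max(0, -height), flap };
}
```

Per frame you would call `armHeight` on the detected landmarks, pass the result and a timestamp to `update`, and hand the resulting `FlightInput` to the physics step.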

It also works on mobile, where gravity-based gesture controls replace the webcam: tilt the device to steer.

The tech stack

Component        Tech
3D engine        Three.js + WebGL
Pose detection   MediaPipe PoseLandmarker
Terrain          Custom GLSL shader, 5 texture layers by altitude
Ocean            FFT-based (Tessendorf method), a close cousin of the transform JPEG uses
Trees            InstancedMesh × 100,000
Culling          Octree-based frustum culling
Physics          Custom aerodynamics (lift, drag, stall)
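To give a feel for what "lift, drag, stall" means in code, here is a minimal sketch built on the textbook lift equation (force = ½ · air density · speed² · wing area · coefficient). It is not the game's actual physics: `AeroParams`, its fields, and the linear fall-off after the stall angle are all assumptions made for illustration.

```typescript
// All fields are illustrative assumptions, not values from the game.
interface AeroParams {
  airDensity: number;   // kg/m^3
  wingArea: number;     // m^2
  liftSlope: number;    // lift coefficient gained per radian of angle of attack
  stallAngle: number;   // radians; beyond this the wing starts losing lift
  dragBase: number;     // drag coefficient at zero lift
  dragInduced: number;  // extra drag per unit of squared lift coefficient
}

// Lift coefficient grows linearly with angle of attack, then collapses
// linearly to zero between the stall angle and twice the stall angle.
function liftCoefficient(aoa: number, p: AeroParams): number {
  const a = Math.abs(aoa);
  if (a <= p.stallAngle) return p.liftSlope * aoa;
  const peak = p.liftSlope * p.stallAngle;
  const falloff = Math.max(0, 1 - (a - p.stallAngle) / p.stallAngle);
  return Math.sign(aoa) * peak * falloff;
}

// Lift and drag magnitudes in newtons for a given airspeed (m/s)
// and angle of attack (radians).
function aeroForces(speed: number, aoa: number, p: AeroParams): { lift: number; drag: number } {
  const q = 0.5 * p.airDensity * speed * speed * p.wingArea;
  const cl = liftCoefficient(aoa, p);
  const cd = p.dragBase + p.dragInduced * cl * cl;
  return { lift: q * cl, drag: q * cd };
}
```

The useful property for gameplay is that pulling up too hard stops paying off: past the stall angle the lift drops, so the bird sinks until it regains speed.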

The interesting parts

FFT ocean waves

The ocean uses a Fourier transform, a close cousin of the discrete cosine transform behind JPEG compression, but run in the inverse direction: from frequencies to a surface. A Phillips spectrum describes wave energy per frequency, and an inverse FFT turns that into a height map that deforms the water mesh. 512×512 spectral points = 262,144 overlapping wave components per frame.
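Here is a stripped-down sketch of that idea, under several assumptions. It works on a single row instead of a 2D grid, it sums the waves directly (O(N²)) where a real implementation would use an FFT to get the same sum in O(N log N), and it uses fixed amplitudes with caller-supplied phases rather than the Gaussian random amplitudes of the Tessendorf method. The names `phillips` and `heightRow` and all default values are hypothetical.

```typescript
const GRAVITY = 9.81; // m/s^2

// Phillips spectrum: energy of the wave with wave vector (kx, kz)
// for a given wind vector. Waves aligned with the wind get the most energy.
function phillips(kx: number, kz: number, windX: number, windZ: number, amplitude = 1e-4): number {
  const k2 = kx * kx + kz * kz;
  const windSpeed = Math.hypot(windX, windZ);
  if (k2 === 0 || windSpeed === 0) return 0;
  const largestWave = (windSpeed * windSpeed) / GRAVITY;
  const alignment = (kx * windX + kz * windZ) / (Math.sqrt(k2) * windSpeed);
  return (amplitude * Math.exp(-1 / (k2 * largestWave * largestWave)) / (k2 * k2)) * alignment * alignment;
}

// One row of the height map: the real part of an inverse Fourier sum,
// written out as a sum of cosine waves travelling along x.
// phases[j] is the starting phase of wave j (random in practice).
function heightRow(n: number, patchSize: number, windSpeed: number, timeSec: number, phases: number[]): number[] {
  const heights = new Array<number>(n).fill(0);
  for (let j = 1; j < n / 2; j++) {
    const k = (2 * Math.PI * j) / patchSize;       // wave number of component j
    const amp = Math.sqrt(phillips(k, 0, windSpeed, 0));
    const omega = Math.sqrt(GRAVITY * k);          // deep-water dispersion relation
    for (let i = 0; i < n; i++) {
      const x = (i * patchSize) / n;
      heights[i] += amp * Math.cos(k * x - omega * timeSec + phases[j]);
    }
  }
  return heights;
}
```

Advancing `timeSec` each frame makes every component travel at its own speed, which is what gives the surface its non-repeating look.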

Octree frustum culling

With 100k trees, you can't render everything. An octree partitions 3D space recursively, and frustum culling discards entire branches that fall outside the camera's view. Result: ~3,000–5,000 trees rendered per frame instead of 100k.
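The culling pass can be sketched without any Three.js types. This is an illustration under assumptions, not the project's code: `OctreeNode`, `Plane`, `boxOutsidePlane`, and `collectVisible` are hypothetical names, planes are assumed to have inward-pointing normals, and building the octree is left out.

```typescript
type Vec3 = [number, number, number];

interface Box { min: Vec3; max: Vec3; }

// A point p is inside the frustum side of the plane when dot(normal, p) + d >= 0.
interface Plane { normal: Vec3; d: number; }

interface OctreeNode {
  bounds: Box;
  items: number[];        // ids of the tree instances stored in this node
  children: OctreeNode[]; // empty for leaves
}

// True if the whole box lies on the outer side of the plane.
// Only the corner farthest along the plane normal needs checking.
function boxOutsidePlane(box: Box, plane: Plane): boolean {
  let dist = plane.d;
  for (let axis = 0; axis < 3; axis++) {
    const n = plane.normal[axis];
    dist += n * (n >= 0 ? box.max[axis] : box.min[axis]);
  }
  return dist < 0;
}

// Walks the octree and gathers items from every node whose box touches
// the frustum. A node that is fully outside is skipped with all its children.
function collectVisible(node: OctreeNode, frustum: Plane[], out: number[] = []): number[] {
  if (frustum.some((plane) => boxOutsidePlane(node.bounds, plane))) return out;
  out.push(...node.items);
  for (const child of node.children) collectVisible(child, frustum, out);
  return out;
}
```

The returned ids would then decide which instances get drawn that frame. The test is conservative: a box that only partly overlaps the frustum still counts as visible.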

The AI pair programming workflow

Claude Code handled the boilerplate, shader code, and physics implementation. I focused on:

  • Product decisions ("the bird needs to turn faster")
  • Visual taste (sunset colors, water transparency)
  • Architecture ("use an octree, not a flat array")

The hard part wasn't the code — it was knowing what to build. My 6-year-old was surprisingly good at that. His requirements were non-negotiable: "there need to be sharks."

Try it yourself

Built with Three.js, MediaPipe, and a very opinionated 6-year-old. Questions welcome.
