Day 11: The Photo Booth AI Application - Real‑Time Filters, Spatial Intelligence & Subagents
What if you could build a full AR‑style photo booth - camera access, face detection, real‑time filters, capture, download, and QR sharing - all in a single day? And what if you didn’t have to build it alone?
That’s exactly what Day 11 challenged me to do.
Using goose subagents, I built a complete Fun House Photo Booth web app with festive filters, MediaPipe face tracking, mobile support, and a full capture pipeline. It feels like having a small engineering team working in parallel - because that’s exactly what subagents simulate.
Day 11: Photo Booth AI Application 📸
The Challenge: Build a Real-Time Filter App in One Day
The festival director wanted a magical selfie booth:
- Open on your phone
- See yourself with fun filters
- Filters track your face
- Switch between effects
- Capture the photo
- Download it
- Share it
This is where subagents shine.
Enter: The Fun House Photo Booth (Built with Subagents)
I split the work into specialized subagents - just like a real dev team:
Subagent 1 - Core App Builder
- Built the HTML/CSS/JS structure
- Implemented camera access
- Created the live video preview
- Added capture + download
- Made everything mobile‑responsive
Subagent 2 - Filter Engineer
- Integrated MediaPipe Face Landmarker
- Implemented 468‑point face mesh
- Built the real‑time filter system
- Anchored filters to specific landmarks
- Added filter switching
Optional Subagents I Added
- Stylist - polished the UI (FilterSense branding)
- Documentation Writer - created usage notes
- Performance Optimizer - ensured smooth tracking
Subagents let me parallelize the work and keep the build clean and modular.
Tech Stack
- goose Subagents - task orchestration
- Claude Sonnet 4.5 by Anthropic - powering goose
- HTML/CSS/JS - core app
- MediaPipe Face Landmarker - local spatial intelligence
- Canvas API - rendering filters + mesh
- SessionStorage - storing captured images
- QR workflow - sharing
- Mobile‑first UI - responsive layout
No backend. No server. Everything runs locally.
My Experience (From Camera to AR Filters)
I started by building a clean UI - a glowing camera icon, a “FilterSense” title, and an Enter button. Once inside, the app activates the camera, loads MediaPipe, and begins tracking the user’s face in real time.
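Camera startup is the standard getUserMedia flow. Here's a minimal sketch of what that step looks like (the element id and constraints are illustrative, not my exact code):

```javascript
// Minimal camera startup sketch; the element id and constraints are illustrative.
const video = document.getElementById('camera-preview');

async function startCamera() {
  // Prefer the front-facing camera on phones, with a reasonable HD resolution.
  const stream = await navigator.mediaDevices.getUserMedia({
    video: { facingMode: 'user', width: { ideal: 1280 }, height: { ideal: 720 } },
    audio: false
  });
  video.srcObject = stream;
  await video.play();
}

startCamera().catch(err => console.error('Camera unavailable or permission denied:', err));
```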
Then the fun begins:
- Select a filter
- Watch it attach to your face
- Move, tilt, smile - it follows
- Capture the moment
- Download or share
The entire experience feels like a lightweight AR app running directly in the browser.
What My Application Does
- Opens the camera instantly
- Tracks the face using MediaPipe
- Renders a 468‑point mesh
- Applies filters anchored to landmarks
- Lets users switch filters
- Captures a clean photo
- Stores it safely
- Redirects to an export page
- Supports download + QR sharing
- Works smoothly on mobile
It’s a complete photo booth system.
Spatial Intelligence (MediaPipe Face Landmarker)
One of the most advanced parts of this build is the spatial intelligence. Instead of sending video frames to a server, the entire face‑tracking pipeline runs in the browser using MediaPipe’s Face Landmarker.
Why this matters
- Real‑time performance
- Low latency
- Offline capability
- Privacy‑preserving
- No external compute required
How it works
I load the FaceLandmarker and FilesetResolver modules, which give me:
- 468 face landmarks
- 3D positional data
- Stable tracking across movement
- Mesh topology
- Mesh can be removed at any time

These landmarks drive the entire filter system.
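In sketch form, initialization and per-frame tracking look roughly like this (the CDN path and model URL follow MediaPipe's documentation; double-check the current versions before copying):

```javascript
import { FaceLandmarker, FilesetResolver } from '@mediapipe/tasks-vision';

let faceLandmarker;

async function initFaceTracking() {
  // Load the WASM runtime, then the Face Landmarker model (URLs per MediaPipe's docs).
  const vision = await FilesetResolver.forVisionTasks(
    'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision/wasm'
  );
  faceLandmarker = await FaceLandmarker.createFromOptions(vision, {
    baseOptions: {
      modelAssetPath:
        'https://storage.googleapis.com/mediapipe-models/face_landmarker/face_landmarker/float16/1/face_landmarker.task',
      delegate: 'GPU'
    },
    runningMode: 'VIDEO',
    numFaces: 1
  });
}

// Called once per animation frame with the live <video> element.
function trackFrame(video) {
  const results = faceLandmarker.detectForVideo(video, performance.now());
  // results.faceLandmarks[0] is an array of normalized {x, y, z} landmarks.
  return results.faceLandmarks?.[0] ?? null;
}
```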
Mesh Rendering
I implemented a full tessellation renderer using the MediaPipe FACEMESH_TESSELATION array. It draws:
- glowing neon nodes
- connecting edges
- animated mesh movement
This visualizes the underlying AI in real time.
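Stripped down, the renderer looks something like the sketch below. In the tasks-vision API the same topology is exposed as FaceLandmarker.FACE_LANDMARKS_TESSELATION, an array of { start, end } index pairs (the older face_mesh package calls it FACEMESH_TESSELATION):

```javascript
// Sketch of the mesh renderer: edges from the tessellation, glowing nodes on top.
function drawMesh(ctx, landmarks, width, height) {
  ctx.strokeStyle = 'rgba(0, 255, 200, 0.35)';
  ctx.lineWidth = 1;
  for (const { start, end } of FaceLandmarker.FACE_LANDMARKS_TESSELATION) {
    const a = landmarks[start];
    const b = landmarks[end];
    ctx.beginPath();
    ctx.moveTo(a.x * width, a.y * height);   // landmarks are normalized 0..1
    ctx.lineTo(b.x * width, b.y * height);
    ctx.stroke();
  }

  // Neon nodes with a soft glow.
  ctx.fillStyle = '#00ffc8';
  ctx.shadowColor = '#00ffc8';
  ctx.shadowBlur = 6;
  for (const p of landmarks) {
    ctx.beginPath();
    ctx.arc(p.x * width, p.y * height, 1.5, 0, Math.PI * 2);
    ctx.fill();
  }
  ctx.shadowBlur = 0;
}
```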
Filter Anchoring
Each filter is mapped to a specific landmark:
```javascript
'Crown': { landmark: 10, offsetY: -60 }
'Beard': { landmark: 152, offsetY: 40 }
'Reindeer Eyelashes': { landmark: 159, offsetY: -10 }
```
This ensures perfect alignment as the user moves.
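Turning that mapping into a screen position is just a matter of scaling the normalized landmark to canvas pixels and applying the offset, roughly like this (FILTER_ANCHORS and drawFilterSprite are stand-in names, not my exact code):

```javascript
// Positioning sketch: FILTER_ANCHORS is the mapping shown above (name is illustrative).
function drawFilter(ctx, landmarks, filterName, canvasWidth, canvasHeight) {
  const { landmark, offsetY } = FILTER_ANCHORS[filterName];
  const point = landmarks[landmark];             // normalized {x, y, z} from MediaPipe
  const x = point.x * canvasWidth;               // scale to canvas pixels
  const y = point.y * canvasHeight + (offsetY ?? 0); // apply per-filter vertical offset
  drawFilterSprite(ctx, filterName, x, y);       // hypothetical renderer for the crown/beard/etc.
}
```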
Clean Capture Pipeline
To avoid tainted canvases, I built a safe capture flow:
- Create a fresh canvas
- Draw the video frame
- Draw only the mesh (no external PNGs)
- Export as PNG
- Store in sessionStorage
- Redirect to export page
This guarantees consistent captures across browsers.
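In code, the capture step boils down to something like this (the storage key and export page name are illustrative):

```javascript
// Capture sketch: fresh canvas -> video frame -> mesh only -> PNG -> sessionStorage.
function capturePhoto(video, landmarks) {
  const canvas = document.createElement('canvas');
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  const ctx = canvas.getContext('2d');

  ctx.drawImage(video, 0, 0, canvas.width, canvas.height); // current video frame
  drawMesh(ctx, landmarks, canvas.width, canvas.height);   // mesh only, no external PNGs

  const dataUrl = canvas.toDataURL('image/png');           // safe: nothing taints the canvas
  sessionStorage.setItem('capturedPhoto', dataUrl);        // key name is illustrative
  window.location.href = 'export.html';                    // page name is illustrative
}
```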
Technical Highlights
The app uses structured subagents to divide responsibilities cleanly. The Core App Builder handles UI, camera access, capture, and mobile responsiveness. The Filter Engineer manages MediaPipe initialization, mesh rendering, and filter anchoring. The system uses a clean canvas pipeline to avoid CORS issues and ensures safe PNG export.
Spatial intelligence runs entirely on‑device, enabling real‑time AR effects without external compute. Filters follow the user’s face with sub‑pixel accuracy thanks to landmark‑driven positioning. The UI is fully responsive, and the workflow supports capture, download, and QR‑based sharing.
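The export page itself is tiny: it reads the PNG back out of sessionStorage, wires up a download link, and hands the shareable URL to a QR library. A sketch with illustrative element ids and key names:

```javascript
// Export-page sketch: restore the captured PNG and offer download/sharing.
const dataUrl = sessionStorage.getItem('capturedPhoto');  // same illustrative key as above

const img = document.getElementById('captured-photo');    // <img> preview
img.src = dataUrl;

const link = document.getElementById('download-link');    // <a> element
link.href = dataUrl;
link.download = 'funhouse-photo.png';                     // filename is illustrative

// QR sharing is delegated to whatever QR library you prefer; the call below is a placeholder.
// renderQrCode(document.getElementById('qr'), shareUrl);
```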
Insights
- Subagents feel like having a real dev team, even though it was just me, my code, and my design
- MediaPipe’s local inference is incredibly powerful
- Clean capture pipelines matter
- Spatial intelligence unlocks AR‑level experiences
- Declarative workflows scale beautifully
- Mobile‑first design is essential for real‑world use
Powered By
- goose by Block, powered by Claude Sonnet 4.5 by Anthropic
- MediaPipe by Google
- HTML/CSS/JS
- My own design + engineering workflow
My Final Thoughts
This was one of the most fun builds so far. Using subagents, I created a full AR‑style photo booth with real‑time filters, spatial intelligence, and a polished UI all running locally in the browser. The combination of MediaPipe, canvas rendering, and goose orchestration made it possible to build something that feels magical.
Day 11: Solved. FilterSense Photo Booth: Delivered. Festival magic: Activated.
This post is part of my Advent of AI journey: AI Engineering - Advent of AI with goose, Day 11.
Follow along for more AI adventures with Eri!


