Everyone has that moment where they look at a tedious, repetitive task, squint, and think: "I wish a computer could do this for me." For me, that moment came while cropping hundreds of student ID photos at work — and the existing solutions were either painfully slow, wildly inaccurate, or locked behind a subscription.
So I built Face Crop Studio: an open-source, GPU-accelerated face detection and cropping tool written almost entirely in Rust. Here's the story of why, what went wrong, and what I learned.
## The Problem
I work in IT for a school. Every year, I need to process student photos — headshots for ID cards, passport-format images for our student database. The tools available fell into two camps:
- Online services that upload your images to someone else's server (a non-starter for student data).
- Desktop tools that choke on batch jobs or produce inconsistent results.
I needed something that could handle hundreds of images locally, produce deterministic results, and do it fast.
## Why Rust?
First, the ecosystem. Rust's crate ecosystem has reached near-Python levels of richness. Need GPU compute? There's wgpu. Face detection inference? Build it with the ndarray and image crates. GUI? egui. Batch data ingestion from CSV, Excel, Parquet, SQLite? All covered with mature, well-maintained crates. I rarely hit a wall where I needed to write bindings or roll my own solution — the building blocks were already there.
Second, and this is the underrated one: Rust is a great language for vibe coding. When you're building with an LLM as your copilot, the compiler's error messages become a superpower. Rust doesn't just tell you something went wrong — it tells you exactly what went wrong, where, and often how to fix it. Feed a Rust compiler error to an LLM, and it can resolve it in one shot. Try that with a segfault in C or a runtime panic in a dynamically typed language, and you're playing twenty questions. The tight feedback loop between Rust's compiler and AI-assisted development made me dramatically more productive than I would have been in any other systems language.
## The Architecture
Face Crop Studio is built around a few key pieces:
YuNet for face detection — a lightweight neural network that's fast enough for real-time use. I implemented the inference pipeline from scratch with custom WGSL compute shaders, rather than relying on ONNX Runtime. This gave me full control over the GPU pipeline and eliminated a heavy dependency.
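Single-shot detectors like YuNet emit many overlapping candidate boxes per face, so the post-processing stage needs non-maximum suppression before cropping. A minimal sketch of that step; the `Detection` struct and threshold value here are illustrative, not the project's actual types:

```rust
// Greedy non-maximum suppression over YuNet-style detections.
// (x, y, w, h) is a pixel-space bounding box; score is confidence.

#[derive(Clone, Copy, Debug)]
struct Detection { x: f32, y: f32, w: f32, h: f32, score: f32 }

fn iou(a: &Detection, b: &Detection) -> f32 {
    // Intersection-over-union of two axis-aligned boxes.
    let x1 = a.x.max(b.x);
    let y1 = a.y.max(b.y);
    let x2 = (a.x + a.w).min(b.x + b.w);
    let y2 = (a.y + a.h).min(b.y + b.h);
    let inter = (x2 - x1).max(0.0) * (y2 - y1).max(0.0);
    let union = a.w * a.h + b.w * b.h - inter;
    if union <= 0.0 { 0.0 } else { inter / union }
}

fn nms(mut dets: Vec<Detection>, iou_thresh: f32) -> Vec<Detection> {
    // Highest-confidence boxes first; keep a box only if it does not
    // overlap an already-kept box too much.
    dets.sort_by(|a, b| b.score.partial_cmp(&a.score).unwrap());
    let mut kept: Vec<Detection> = Vec::new();
    for d in dets {
        if kept.iter().all(|k| iou(k, &d) < iou_thresh) {
            kept.push(d);
        }
    }
    kept
}

fn main() {
    let dets = vec![
        Detection { x: 10.0, y: 10.0, w: 50.0, h: 50.0, score: 0.9 },
        Detection { x: 12.0, y: 11.0, w: 50.0, h: 50.0, score: 0.8 }, // overlaps the first
        Detection { x: 200.0, y: 40.0, w: 48.0, h: 48.0, score: 0.7 },
    ];
    let kept = nms(dets, 0.45);
    println!("kept {} faces", kept.len()); // prints "kept 2 faces"
}
```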
Seven custom compute shaders handle everything from image pre-processing to face detection inference to post-processing enhancements. The entire pipeline stays on the GPU when possible, avoiding expensive CPU↔GPU data transfers.
A full enhancement pipeline — auto colour correction, exposure, brightness, contrast, saturation, sharpening, skin smoothing, red-eye removal, and portrait background blur. Each has both a GPU and CPU path, with automatic fallback.
Batch processing with data mapping — import CSV, Excel, Parquet, or SQLite files to drive batch naming. Feed in a spreadsheet of student names and photo filenames, and the tool handles the rest.
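The data-mapping idea reduces to a filename-to-output-name table. A naive sketch that assumes a header row and no quoted commas; the real tool uses proper readers for each format:

```rust
use std::collections::HashMap;

// Build a filename -> output-name map from CSV text.
// Assumed header: filename,student_name
fn load_mapping(csv: &str) -> HashMap<String, String> {
    csv.lines()
        .skip(1) // skip the header row
        .filter_map(|line| {
            let mut cols = line.splitn(2, ',');
            Some((cols.next()?.trim().to_string(),
                  cols.next()?.trim().to_string()))
        })
        .collect()
}

// Fall back to the original filename when no mapping exists.
fn output_name(map: &HashMap<String, String>, file: &str) -> String {
    map.get(file).cloned().unwrap_or_else(|| file.to_string())
}

fn main() {
    let csv = "filename,student_name\nimg_001.jpg,Ada Lovelace\nimg_002.jpg,Alan Turing\n";
    let map = load_mapping(csv);
    println!("{}", output_name(&map, "img_001.jpg")); // prints "Ada Lovelace"
}
```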
## The Hard Parts
### VRAM Management
GPU memory isn't like system RAM — you can't just allocate freely and let the OS page things out. When processing large batches, I had to carefully manage buffer lifetimes and implement a staging system that processes images in chunks without exhausting VRAM. Getting this wrong meant either crashes or silently falling back to the CPU for everything.
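A back-of-envelope version of the chunking decision: given a VRAM budget and a per-image footprint (including headroom for intermediate buffers), cap how many images are staged at once. The numbers and names here are illustrative, not measurements from the tool:

```rust
// How many images fit in a VRAM budget, given each image's byte size
// and a multiplier for intermediate textures/buffers it spawns.
fn chunk_size(vram_budget_bytes: u64, image_bytes: u64, overhead_factor: f64) -> usize {
    let per_image = (image_bytes as f64 * overhead_factor) as u64;
    // Always process at least one image, even if it blows the budget.
    (vram_budget_bytes / per_image).max(1) as usize
}

fn main() {
    // 2 GiB budget, ~32 MB per 4000x2000 RGBA image, 3x overhead
    // for intermediates.
    let n = chunk_size(2_u64 << 30, 4000 * 2000 * 4, 3.0);
    println!("{n} images per chunk"); // prints "22 images per chunk"
}
```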
### Multi-Face Detection
Single-face cropping worked beautifully from day one. Multi-face detection? That produced hilarious 3-pixel-wide vertical strips for weeks. The issue turned out to be in how bounding box coordinates were being translated into crop rectangles — a subtle off-by-one in the aspect ratio calculation that only manifested with multiple detections.
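The shape of the fix: derive the crop's width and height together from the padded face box and the target aspect ratio, so neither dimension can collapse independently. An illustrative sketch under those assumptions, not the project's actual code:

```rust
#[derive(Debug, PartialEq)]
struct CropRect { x: u32, y: u32, w: u32, h: u32 }

// Expand a detected face box to a target aspect ratio around its
// centre, then clamp the result to the image bounds.
fn crop_for_face(
    face: (f32, f32, f32, f32), // x, y, w, h in pixels
    target_aspect: f32,         // width / height, e.g. 3.0 / 4.0
    margin: f32,                // e.g. 0.4 = 40% padding around the face
    img_w: u32,
    img_h: u32,
) -> CropRect {
    let (fx, fy, fw, fh) = face;
    let cx = fx + fw / 2.0;
    let cy = fy + fh / 2.0;
    // Start from the padded face height and derive width from the
    // aspect ratio, so the two dimensions always move together.
    let mut h = fh * (1.0 + margin);
    let mut w = h * target_aspect;
    if w < fw * (1.0 + margin) {
        // Face is wider than the target aspect: grow both from width.
        w = fw * (1.0 + margin);
        h = w / target_aspect;
    }
    let w = w.min(img_w as f32).round() as u32;
    let h = h.min(img_h as f32).round() as u32;
    // Centre the crop on the face, then clamp inside the image.
    let x = (cx - w as f32 / 2.0).clamp(0.0, (img_w - w) as f32).round() as u32;
    let y = (cy - h as f32 / 2.0).clamp(0.0, (img_h - h) as f32).round() as u32;
    CropRect { x, y, w, h }
}

fn main() {
    let r = crop_for_face((100.0, 100.0, 100.0, 120.0), 0.75, 0.4, 1000, 1000);
    println!("{:?}", r);
}
```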
### Cross-Platform GPU Support
wgpu abstracts the graphics backends, but "abstracts" doesn't mean "eliminates differences." Shader behaviour, texture format support, and buffer alignment requirements vary across Vulkan, Metal, and DirectX. Testing on one platform and assuming it works on others is a trap I fell into more than once.
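Buffer alignment is a concrete example: wgpu requires the bytes-per-row of texture-to-buffer copies to be a multiple of 256 (`wgpu::COPY_BYTES_PER_ROW_ALIGNMENT`), and row layouts that happen to work on one backend can fail validation on another. The padding helper is tiny; the constant is inlined here to keep the sketch dependency-free:

```rust
// Mirrors wgpu::COPY_BYTES_PER_ROW_ALIGNMENT.
const COPY_BYTES_PER_ROW_ALIGNMENT: u32 = 256;

// Round a row's byte length up to the nearest multiple of the
// required copy alignment.
fn padded_bytes_per_row(width: u32, bytes_per_pixel: u32) -> u32 {
    let unpadded = width * bytes_per_pixel;
    let align = COPY_BYTES_PER_ROW_ALIGNMENT;
    (unpadded + align - 1) / align * align
}

fn main() {
    // A 1023-pixel-wide RGBA row is 4092 bytes; it must be padded to 4096.
    println!("{}", padded_bytes_per_row(1023, 4)); // prints "4096"
}
```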
## What I Shipped
Face Crop Studio today includes:
- 6+ crop presets: LinkedIn, Passport, Instagram, ID Card, Avatar, Headshot, and fully custom dimensions
- Quality scoring: Laplacian-variance sharpness analysis categorises each crop as Low, Medium, or High quality
- Native GUI built with egui — live preview, undo/redo, and processing history
- CLI mode for scripting and automation
- 4 export formats with configurable quality settings
- MIT licensed and fully open source
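The quality scoring is the classic Laplacian-variance sharpness measure: convolve the grayscale image with the 4-neighbour Laplacian kernel and take the variance of the response; blurry images yield a flat response and a low variance. A sketch with illustrative thresholds (the tool's real cut-offs may differ):

```rust
// Variance of the 4-neighbour Laplacian over a grayscale image
// (row-major, w x h). Higher variance means sharper edges.
fn laplacian_variance(gray: &[u8], w: usize, h: usize) -> f64 {
    let mut responses = Vec::with_capacity((w - 2) * (h - 2));
    for y in 1..h - 1 {
        for x in 1..w - 1 {
            let c = gray[y * w + x] as f64;
            let lap = gray[y * w + x - 1] as f64
                + gray[y * w + x + 1] as f64
                + gray[(y - 1) * w + x] as f64
                + gray[(y + 1) * w + x] as f64
                - 4.0 * c;
            responses.push(lap);
        }
    }
    let n = responses.len() as f64;
    let mean = responses.iter().sum::<f64>() / n;
    responses.iter().map(|r| (r - mean).powi(2)).sum::<f64>() / n
}

fn quality_label(variance: f64) -> &'static str {
    // Illustrative cut-offs, not the tool's actual thresholds.
    match variance {
        v if v < 50.0 => "Low",
        v if v < 300.0 => "Medium",
        _ => "High",
    }
}

fn main() {
    let flat = vec![100u8; 25]; // a perfectly flat 5x5 image
    println!("{}", quality_label(laplacian_variance(&flat, 5, 5))); // prints "Low"
}
```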
The codebase is 97% Rust.
## What I Learned
Write the GPU path first, not last. If you design around CPU processing and bolt on GPU acceleration later, you end up with awkward data flow and unnecessary copies. Design for GPU from the start and add CPU fallback where needed.
Batch processing exposes every edge case. A tool that works on 10 images will find new and creative ways to fail on 1,000. Memory leaks that are invisible in single-image mode become showstoppers at scale.
Deterministic output matters more than you think. When processing official documents like ID photos, getting slightly different crops from the same input is unacceptable. Floating-point reproducibility across GPU and CPU paths took real effort to achieve.
## Try It Yourself
Face Crop Studio is free, open source, and available for Windows. If you work with batch photo processing — schools, HR departments, photography studios — give it a look.
→ GitHub: github.com/gregorycarnegie/face_crop_studio
→ Website: facecropstudio.com
If you found this interesting, I'd love to connect and hear your vibe-coding stories.