jg-noncelogic
Originally published at understudy-ai.github.io

Show HN: I made an AI that reviews iPhone apps – 1h of autonomous GUI work

I ran a GUI agent for about an hour that installs and reviews iPhone apps on a real device. The agent, Understudy, browses the App Store in Chrome, mirrors a real iPhone via macOS, explores the app, records screenshots and video, stitches a narrated review together with FFmpeg, and uploads it to YouTube.

Key architecture: split the work into typed child sessions so the context doesn't explode. There are six stages: scrape the listing, install via mirroring, exploratory testing, targeted checks, media capture, and compose + upload + cleanup. Workers handle deterministic I/O; skills make the agentic decisions.
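To make the session split concrete, here is a minimal Python sketch of the idea, not Understudy's actual code. The stage names paraphrase the list above; `StageResult`, `run_child_session`, and `run_pipeline` are hypothetical names, and the worker/skill label on each stage is my guess for illustration. The point is that each child session only receives short typed summaries from earlier stages, never their full transcripts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageResult:
    """Typed hand-off between child sessions (hypothetical shape)."""
    stage: str
    summary: str                      # short text, not the full transcript
    artifacts: tuple[str, ...] = ()   # paths to files produced by the stage


# (name, kind) pairs. Which stage counts as a "worker" and which as a
# "skill" is an assumption made for this sketch.
STAGES = [
    ("scrape_listing", "worker"),
    ("install_via_mirroring", "worker"),
    ("exploratory_testing", "skill"),
    ("targeted_checks", "skill"),
    ("media_capture", "worker"),
    ("compose_upload_cleanup", "worker"),
]


def run_child_session(name: str, kind: str,
                      handoff: tuple[StageResult, ...]) -> StageResult:
    """Placeholder for a real child session.

    A real implementation would start a fresh session here and give it
    only the typed hand-off, so earlier context never leaks in.
    """
    summary = f"{kind} '{name}' finished with {len(handoff)} upstream summaries"
    return StageResult(name, summary, (f"{name}/result.json",))


def run_pipeline(stages=STAGES) -> list[StageResult]:
    """Run the stages in order, passing forward only StageResult records."""
    results: list[StageResult] = []
    for name, kind in stages:
        results.append(run_child_session(name, kind, tuple(results)))
    return results


if __name__ == "__main__":
    for result in run_pipeline():
        print(result.stage, "->", result.summary)
```

Because the hand-off is a small frozen record, the context each stage sees grows with the number of stages rather than with the number of GUI steps taken so far.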

Robustness wins here. Every action is re-grounded from the live screenshot, so unexpected dialogs don't kill the run. Device control stays deterministic, each run collects screenshots, UI dumps, and logs, and media is composed locally with FFmpeg, which keeps the pipeline auditable. The project is MIT licensed.
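A rough sketch of what re-grounding can look like, using a fake in-memory device so it runs anywhere. `FakeDevice`, `choose_action`, and `run_step` are hypothetical stand-ins, not Understudy's API; in a real run the screenshot would come from the mirrored iPhone and the decision from a model.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDevice:
    """In-memory stand-in for a mirrored iPhone (illustration only)."""
    screens: list[str]
    log: list[str] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real device would return image bytes; a label is enough here.
        return self.screens[0]

    def tap(self, target: str) -> None:
        self.log.append(f"tap:{target}")
        if len(self.screens) > 1:
            self.screens.pop(0)  # the UI moves on to the next screen


def choose_action(screen: str, goal: str) -> str:
    """Decide from the *current* screen, never from a cached plan."""
    if screen.startswith("dialog:"):
        return "dismiss"
    return goal


def run_step(device: FakeDevice, goal: str, max_attempts: int = 5) -> bool:
    """Take a fresh screenshot before every action until the goal is tapped."""
    for _ in range(max_attempts):
        screen = device.screenshot()          # re-ground on the live screen
        action = choose_action(screen, goal)
        device.tap(action)
        if action == goal:
            return True
    return False


if __name__ == "__main__":
    device = FakeDevice(["dialog:rate_this_app", "home", "settings"])
    print(run_step(device, "open_settings"), device.log)
```

The unexpected rating dialog costs one extra iteration instead of derailing the step, because nothing is acted on without first looking at the screen again.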

Takeaway for builders: long-running GUI agents need a clear separation of concerns. Use deterministic workers for device and browser operations, agentic skills for discovery, isolated sessions, and artifact bundling plus human review before publishing. Repo and process videos: https://understudy-ai.github.io/understudy/
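One way to wire up "artifact bundling plus human review before publishing" is a manifest with file hashes and an explicit approval flag. This is a sketch under my own assumptions: the `manifest.json` layout and the `build_manifest`, `approve`, and `can_publish` helpers are hypothetical, not part of Understudy.

```python
import hashlib
import json
from pathlib import Path

MANIFEST = "manifest.json"


def _sha256(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


def build_manifest(run_dir: Path) -> dict:
    """Hash every artifact in the run directory and mark it unapproved."""
    files = {
        path.relative_to(run_dir).as_posix(): _sha256(path)
        for path in sorted(run_dir.rglob("*"))
        if path.is_file() and path.name != MANIFEST
    }
    manifest = {"files": files, "approved": False}
    (run_dir / MANIFEST).write_text(json.dumps(manifest, indent=2))
    return manifest


def approve(run_dir: Path) -> None:
    """Called only after a human has looked at the bundle."""
    manifest = json.loads((run_dir / MANIFEST).read_text())
    manifest["approved"] = True
    (run_dir / MANIFEST).write_text(json.dumps(manifest, indent=2))


def can_publish(run_dir: Path) -> bool:
    """Allow publishing only if approved and no artifact changed since."""
    manifest = json.loads((run_dir / MANIFEST).read_text())
    if not manifest.get("approved"):
        return False
    for rel_path, digest in manifest["files"].items():
        path = run_dir / rel_path
        if not path.is_file() or _sha256(path) != digest:
            return False
    return True
```

The upload step would call `can_publish` first and refuse to run otherwise, so a review edited after approval has to be approved again.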
