TL;DR
I built AikitPros — a creative hub that takes a single brief and orchestrates Midjourney, Flux, Suno, Sora, Luma, GPT and Claude into one campaign output (script + images + music + video). Total cost in production: $0.35 per campaign. Live demo: https://x.com/aikitpros/status/2046596943023890780
This post is the architectural walk-through.
The problem
Most "all-in-one AI" tools are thin wrappers that just expose 7 buttons. The real value is coordination: the music tempo has to match the video cut, the on-screen copy has to fit the image composition, the voice-over script has to land in 8 seconds. Calling 7 APIs in parallel and concatenating the outputs gets you ~40% usable rate. Not good enough.
Architecture
A brief goes to the Planner (Claude), which produces the script, shot list, audio direction, and brand constraints. A fan-out then routes the plan to ImageRouter (Midjourney V8 / Flux / GPT Image 2), MusicRouter (Suno), VideoRouter (Sora / Luma / Hailuo), and CopyRouter (GPT + Claude). A judge model (GPT-4o-mini) reviews the assembled output and either delivers it or re-runs only the failing modality.
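Here is a minimal sketch of that flow in Python. Everything in it is illustrative: `plan_campaign`, `gen_image`, `gen_music`, `gen_video` and `gen_copy` are hypothetical wrappers around each provider's API (stubbed so the sketch runs), not the hub's actual client code.

```python
import asyncio

# Hypothetical async wrappers around each provider's API, stubbed so the sketch runs.
async def plan_campaign(brief: str) -> dict:
    return {"script": "...", "shot_list": ["..."], "audio_direction": "...", "brand": {}}

async def gen_image(shots, brand):   return "image.png"  # Midjourney / Flux / GPT Image
async def gen_music(direction):      return "track.mp3"  # Suno
async def gen_video(shots, script):  return "cut.mp4"    # Sora / Luma / Hailuo
async def gen_copy(script, brand):   return "copy.md"    # GPT + Claude

async def run_campaign(brief: str) -> dict:
    # Step 1: the planner decomposes the brief into a structured plan.
    plan = await plan_campaign(brief)

    # Step 2: fan out to one router per modality, in parallel.
    image, music, video, copy = await asyncio.gather(
        gen_image(plan["shot_list"], plan["brand"]),
        gen_music(plan["audio_direction"]),
        gen_video(plan["shot_list"], plan["script"]),
        gen_copy(plan["script"], plan["brand"]),
    )
    return {"plan": plan, "image": image, "music": music, "video": video, "copy": copy}
```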
The judge step is the single decision that took the usable rate from ~40% to 90%+.
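The judge loop, continuing the sketch above. `judge_output` here is a hypothetical stand-in for the GPT-4o-mini review call; the property that matters is that it returns a per-modality verdict, so only the failing modalities get regenerated while passing assets are kept.

```python
# Hypothetical judge call: GPT-4o-mini reviews the assembled output and returns
# a per-modality verdict, e.g. {"image": "pass", "video": "fail", ...}.
async def judge_output(plan: dict, assets: dict) -> dict:
    return {m: "pass" for m in ("image", "music", "video", "copy")}

# Map each modality to its regeneration call (the wrappers from the first sketch).
REGEN = {
    "image": lambda plan: gen_image(plan["shot_list"], plan["brand"]),
    "music": lambda plan: gen_music(plan["audio_direction"]),
    "video": lambda plan: gen_video(plan["shot_list"], plan["script"]),
    "copy":  lambda plan: gen_copy(plan["script"], plan["brand"]),
}

MAX_ROUNDS = 2  # assumption: cap retries so one stubborn modality can't loop forever

async def judge_and_repair(plan: dict, assets: dict) -> dict:
    for _ in range(MAX_ROUNDS):
        verdict = await judge_output(plan, assets)
        failing = [m for m, v in verdict.items() if v == "fail"]
        if not failing:
            break  # everything passed: deliver
        # Re-run only the failing modalities; passing assets stay untouched.
        redone = await asyncio.gather(*(REGEN[m](plan) for m in failing))
        assets.update(dict(zip(failing, redone)))
    return assets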
Three things I would do differently next time
- Cache the planner output. Same brief = same plan, and I burned a lot of Claude tokens regenerating identical decompositions (first sketch below).
- Stream partial results. Users tolerate 60s of waiting if they see the script appear at 5s, the image at 20s, and the music at 35s (second sketch below).
- Pre-warm the video model. Cold starts on Sora/Luma can add 30s; a keep-alive ping fired during the image step parallelizes that latency away (third sketch below).
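A minimal version of the planner cache, keyed on a hash of the normalized brief. `plan_campaign` is the same hypothetical wrapper as in the first sketch, and the in-memory dict is just for illustration.

```python
import hashlib

_plan_cache: dict[str, dict] = {}  # in-memory for the sketch; Redis or disk in production

async def plan_cached(brief: str) -> dict:
    # Same brief = same plan, so key on a hash of the normalized brief text.
    key = hashlib.sha256(brief.strip().lower().encode()).hexdigest()
    if key not in _plan_cache:
        _plan_cache[key] = await plan_campaign(brief)  # only pay Claude on a miss
    return _plan_cache[key]
```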
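Streaming partials, sketched as an async generator that yields each modality the moment its provider returns instead of awaiting the full gather; a server would forward these as SSE or WebSocket events. Same hypothetical wrappers as above.

```python
async def _tagged(name: str, coro):
    return name, await coro  # label each result with its modality

async def stream_campaign(brief: str):
    plan = await plan_cached(brief)
    yield "plan", plan  # the script is visible within seconds

    pending = [
        _tagged("image", gen_image(plan["shot_list"], plan["brand"])),
        _tagged("music", gen_music(plan["audio_direction"])),
        _tagged("video", gen_video(plan["shot_list"], plan["script"])),
        _tagged("copy",  gen_copy(plan["script"], plan["brand"])),
    ]
    # Yield each asset the moment its provider returns, fastest first.
    for fut in asyncio.as_completed(pending):
        name, asset = await fut
        yield name, asset

# Usage: async for name, asset in stream_campaign(brief): push(name, asset)
```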
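And the pre-warm trick: fire a throwaway keep-alive at the video backend as a background task while the image step runs, so the cold start overlaps work you are doing anyway. `ping_video_backend` is a hypothetical stand-in for whatever cheap request wakes the model.

```python
async def ping_video_backend() -> None:
    # Hypothetical keep-alive: a cheap request that forces the video model to spin up.
    await asyncio.sleep(0)

async def run_video_step(plan: dict) -> dict:
    # Fire the warm-up as a background task while images render.
    warmup = asyncio.create_task(ping_video_backend())

    image = await gen_image(plan["shot_list"], plan["brand"])
    await warmup  # usually already done by now; the cold start overlapped the image step
    video = await gen_video(plan["shot_list"], plan["script"])
    return {"image": image, "video": video}
```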
Try it
The hub is live at https://aikitpros.com. First credits are on me; if you build something with it, I would love to see it.
If you are building something similar and want to compare notes on judge models or routing heuristics, drop a comment.