I Built an Open-Source LLM Harness — AI Agents Interview, Plan, Build & Deploy Entire Apps From One Prompt | Any LLM | CLI

#devtools #agents #ai #opensource

We hit submit on the Amazon Nova Hackathon with 2 minutes to spare.

12 days. 30,000 lines of Python. 9 playable games — all built by AI agents, zero human code. That's right, agent-inception!

See what it built →

## One Prompt In, Deployed App Out

> Build a tower defense with 6 tower types and chain lightning

Nova Forge interviews you, decomposes the work into parallel tasks, assigns AI agents, runs adversarial quality review, and deploys. That prompt produced an 802-line playable game in 341 seconds.

https://forge.herakles.dev/demos/

## What Blew Our Minds

We wanted Nova Forge to produce a working app, but our test game had 5 bugs. So we pointed the framework at its own output. Nova Pro found every bug and fixed them all in 26 seconds, including a structural refactor using a tool we had 'invented' that morning (replace_lines).

That debug session is now proof-of-work alongside the demo. The model fixing its own mistakes became our best feature: zero manual debug loops.
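For readers wondering what a line-range edit tool might look like, here is a minimal sketch. It is not Nova Forge's actual replace_lines: the signature, the 1-indexed inclusive range, and the error handling are all assumptions made for illustration.

```python
def replace_lines(text: str, start: int, end: int, replacement: str) -> str:
    """Replace lines start..end (1-indexed, inclusive) of text.

    Hypothetical signature: the real tool's arguments are not documented here.
    """
    lines = text.splitlines(keepends=True)
    if not (1 <= start <= end <= len(lines)):
        raise ValueError(f"invalid range {start}-{end} for {len(lines)} lines")

    new_lines = replacement.splitlines(keepends=True)
    # Ensure the replacement ends with a newline so the next line doesn't fuse onto it.
    if new_lines and not new_lines[-1].endswith("\n"):
        new_lines[-1] += "\n"

    # Everything before the range, the replacement, everything after the range.
    return "".join(lines[: start - 1] + new_lines + lines[end:])


source = "a = 1\nb = 2\nc = 3\n"
patched = replace_lines(source, 2, 2, "b = 20")
```

A tool shaped like this lets a model patch a small span instead of rewriting the whole file, which is one plausible reason a structural refactor could finish in a single short session.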

## The 6 Bugs Every Agent Framework Will Hit

We found these the hard way. Saving you the pain:

1. Agents describe code instead of writing it: your prompt must say "call write_file," not "complete the task."
2. Specs get summarized to death: "pseudo-3D racer with ACCEL=0.9" becomes "build a game," and the output is wrong.
3. Building agents never see the spec: only task summaries reach them. Inject the full spec.
4. Verify phase kills multi-file builds: the agent writes 80 lines, enters verify mode, and wastes all its remaining turns.
5. Missing tools are silent failures: without a think tool, models dump their reasoning as text.
6. No path guidance = wrong directory: models love creating src/ when you want the project root.
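Several of these fixes come down to how the per-agent prompt is assembled. The sketch below shows one way to do it; it is not Nova Forge's code. The `Task` class, the `build_agent_prompt` function, and the prompt wording are hypothetical. Only the write_file and think tool names come from the list above. It covers the describe-instead-of-write, lost-spec, missing-think, and wrong-directory problems, and does not address the verify-phase issue.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical task record: a short summary plus the files to produce."""
    summary: str
    files: list[str]


def build_agent_prompt(task: Task, full_spec: str, project_root: str = ".") -> str:
    """Assemble a build-agent prompt that keeps the full spec and names the tools."""
    file_list = "\n".join(f"- {project_root}/{name}" for name in task.files)
    sections = [
        # Pass the spec verbatim so constants like ACCEL=0.9 survive.
        "FULL SPEC (verbatim, do not summarize):\n" + full_spec,
        "YOUR TASK:\n" + task.summary,
        # Explicit paths, so the agent does not invent a src/ directory.
        "FILES TO CREATE (in the project root, do not create src/):\n" + file_list,
        # Name the tool call explicitly instead of saying "complete the task".
        "Call write_file once per file with its complete contents. "
        "Do not describe the code in prose. Use the think tool for reasoning.",
    ]
    return "\n\n".join(sections)


prompt = build_agent_prompt(
    Task(summary="Implement the road renderer", files=["game.py"]),
    full_spec="pseudo-3D racer with ACCEL=0.9",
)
```

The design choice is to treat the task summary as an addition to the spec rather than a replacement for it, so the detail that matters reaches the agent that writes the code.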

Each one took hours to trace. We built fast, and broke things. But we finished the race! https://github.com/herakles-dev/nova-forge

## The Stack