My First “AI-Generated Game” Looked Great on Paper and Boring on Screen

#gamedev #ai #webdev #discuss

I’ve been building web stuff long enough to see a few hype cycles come and go.

When the “AI will generate your game from one sentence” wave started, I told myself I’d stay calm.

I didn’t.

We’re building SeaGames – a browser-based gaming platform – and at some point the idea of an “AI Game Creator” became too tempting:

Player types a prompt, AI spits out a playable WebGL game, no downloads, runs in the browser.

On a whiteboard it looked beautiful. In my editor, not so much.

Prompts vs real design

The first reality check came from prompts that sounded fine to humans but were useless as actual specs.

Someone typed:

“Make a fast game with monsters and cool skills.”

Great mood. Zero structure.

From a game-design point of view, that sentence doesn’t tell you:

what “fast” means,
how you lose,
what the core loop is,
how input works on mobile vs desktop.
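
To make that gap concrete, here is a minimal sketch of the questions written down as a typed spec. The `GameSpec` shape and every field name in it are assumptions made up for illustration, not the actual SeaGames schema.

```typescript
// Hypothetical spec shape: all field names and options are illustrative assumptions.
interface GameSpec {
  playerSpeed: number; // what "fast" means, in units per second
  loseCondition: "health_zero" | "timer_expired" | "fell_off_map";
  coreLoop: string; // e.g. "dodge, collect, upgrade"
  input: {
    desktop: "keyboard" | "mouse";
    mobile: "tap" | "swipe" | "virtual_stick";
  };
}

// Returns the design questions that a partially filled spec still leaves open.
function openQuestions(spec: Partial<GameSpec>): string[] {
  const questions: string[] = [];
  if (spec.playerSpeed === undefined) questions.push("what does 'fast' mean?");
  if (spec.loseCondition === undefined) questions.push("how do you lose?");
  if (spec.coreLoop === undefined) questions.push("what is the core loop?");
  if (spec.input === undefined) questions.push("how does input work on mobile vs desktop?");
  return questions;
}

// The monster prompt sets a mood but fills in none of the fields above.
const fromMonsterPrompt: Partial<GameSpec> = {};
const stillOpen = openQuestions(fromMonsterPrompt);
```

Run against the monster prompt, every question is still open, which is roughly what "great mood, zero structure" means in practice.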

The model happily generated pages of “design”: enemies, abilities, item drops, zones, you name it. None of it hung together. It felt like reading notes from a very enthusiastic intern who never had to ship anything.

The more prompts I tried, the more I realized: LLMs are excellent at describing games, terrible at committing to one.

AI-written logic is the wrong kind of “almost working”

Next I tried letting the model produce small pieces of logic: movement rules, scoring, a bit of enemy AI.

The code looked decent at a glance. Variable names made sense. Comments were grammatically correct. Then I actually ran it:

enemies get stuck in corners,
difficulty spikes out of nowhere,
the player loses health “every few seconds while standing still” for no clear reason.

Debugging this kind of code is surprisingly tiring. It’s not obviously broken; it’s just not thought through. You end up doing code review for a ghost contributor who never explains why anything exists.

At some point I realized I was spending more time fixing “smart” AI suggestions than I would have spent writing the logic myself.

Variety without a spine

The content side was similar.

Letting the model invent enemies, obstacles and level layouts produced a lot of variety and almost no identity. One level behaved like a bullet-hell shooter, the next like a slow puzzle, the next like a physics toy that forgot it was supposed to be a game.

It confirmed something I already knew but had conveniently ignored:

Good games are mostly constraints. Pace, rhythm, failure states, repetition, small bits of friction in the right places. A model that’s rewarded for novelty has no reason to respect any of that unless you force it to.

And I hadn’t.

The real failure wasn’t technical

Eventually we did ship a working pipeline:

take the prompt,
turn it into a structured config,
feed it into a WebGL template,
build and host the result on SeaGames.
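
The "structured config" step is where a sketch helps most. The version below treats the model's output as untrusted JSON and checks it before anything reaches a template. The `GameConfig` fields, the template names and `parseGameConfig` itself are hypothetical stand-ins; the real config format isn't shown in this post.

```typescript
// Hypothetical config shape and template names, assumed for illustration only.
type TemplateName = "runner" | "arena" | "puzzle";
const TEMPLATE_NAMES: TemplateName[] = ["runner", "arena", "puzzle"];

interface GameConfig {
  template: TemplateName;
  enemyCount: number;
  playerSpeed: number;
}

interface ParseOutcome {
  config: GameConfig | null; // null when validation failed
  errors: string[];
}

function isTemplateName(v: unknown): v is TemplateName {
  return typeof v === "string" && TEMPLATE_NAMES.indexOf(v as TemplateName) !== -1;
}

function isPositiveNumber(v: unknown): v is number {
  return typeof v === "number" && isFinite(v) && v > 0;
}

// Parses raw model output and either returns a usable config or a list of problems.
function parseGameConfig(modelOutput: string): ParseOutcome {
  let raw: unknown;
  try {
    raw = JSON.parse(modelOutput);
  } catch (_err) {
    return { config: null, errors: ["model output is not valid JSON"] };
  }
  if (typeof raw !== "object" || raw === null) {
    return { config: null, errors: ["model output is not a JSON object"] };
  }

  const fields = raw as { [key: string]: unknown };
  const template = fields["template"];
  const enemyCount = fields["enemyCount"];
  const playerSpeed = fields["playerSpeed"];

  const errors: string[] = [];
  if (!isTemplateName(template)) errors.push("template must be one of: " + TEMPLATE_NAMES.join(", "));
  if (!isPositiveNumber(enemyCount)) errors.push("enemyCount must be a positive number");
  if (!isPositiveNumber(playerSpeed)) errors.push("playerSpeed must be a positive number");

  if (isTemplateName(template) && isPositiveNumber(enemyCount) && isPositiveNumber(playerSpeed)) {
    return { config: { template, enemyCount, playerSpeed }, errors: [] };
  }
  return { config: null, errors };
}
```

A check like this only guarantees the config is well-formed. It says nothing about whether the resulting game is worth playing.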

No errors, no crashes. From the outside it looked like the demo everyone tweets about.

Then I sat down and played the first fully AI-generated game.

It ran. It responded. It just… wasn’t fun. It felt like a collection of features that had never met each other.

That was the moment I stopped calling it a “technical experiment” and started calling it a failure. The stack was fine; the experience wasn’t.

Where AI actually fits now

After that, I changed how I use these models.

I don’t ask them to be designers or programmers. I treat them like very fast assistants:

The core mechanics, camera, input model and failure conditions are designed by humans.
AI helps with variations: extra levels, small modifiers, flavor text, names, sometimes a rough first pass at a pattern I’ll later rewrite.
Everything lives inside opinionated templates that already feel good to play before AI ever touches them.
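
One way to picture that split, as a sketch under assumed names: the template owns the tuning ranges, and whatever the model suggests is clamped into them or dropped. `LevelTuning`, the bounds and the defaults below are invented numbers for illustration, not values from a real SeaGames template.

```typescript
// Designer-owned tuning for a hypothetical template; names and numbers are assumptions.
interface LevelTuning {
  enemySpeed: number;
  spawnIntervalSec: number;
}

interface Range {
  min: number;
  max: number;
}

const DEFAULTS: LevelTuning = { enemySpeed: 2, spawnIntervalSec: 1.5 };

const BOUNDS: { enemySpeed: Range; spawnIntervalSec: Range } = {
  enemySpeed: { min: 1, max: 4 },
  spawnIntervalSec: { min: 0.5, max: 3 },
};

// Clamps a suggested value into the designer's range, or falls back to the default
// when the suggestion is missing or not a finite number.
function pick(value: unknown, range: Range, fallback: number): number {
  if (typeof value !== "number" || !isFinite(value)) return fallback;
  return Math.min(range.max, Math.max(range.min, value));
}

// Builds a level variation from model suggestions. Keys the template does not
// know about are ignored, so the model cannot add new mechanics here.
function applyVariation(suggested: { [key: string]: unknown }): LevelTuning {
  return {
    enemySpeed: pick(suggested["enemySpeed"], BOUNDS.enemySpeed, DEFAULTS.enemySpeed),
    spawnIntervalSec: pick(suggested["spawnIntervalSec"], BOUNDS.spawnIntervalSec, DEFAULTS.spawnIntervalSec),
  };
}

// An over-eager suggestion: far too fast, plus a mechanic the template never defined.
const level = applyVariation({ enemySpeed: 40, gravityFlip: true });
```

Here the out-of-range speed is pulled back to the template's maximum, the missing spawn interval keeps its default, and the invented `gravityFlip` key never reaches the game.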

The goal is no longer “this game was made by AI”.

The goal is “this game was made faster because AI handled some of the boring parts”.

It’s a much less magical sentence, but it actually ships.

If you want the long version

This is the short, Medium-friendly cut of the story.

In a longer write-up I go into more detail about the pipeline, the WebGL side, and the specific things that broke in production.

If you’re curious about that version, it’s here:

👉 Full story on Hashnode