Every new AI model release raises the same question:
Can it actually build something real?
So I ran a small experiment: vibe coding classic arcade games inside VS Code using GPT-5.4, then comparing the results with an earlier attempt using GPT-5.3 Codex.
The rules were simple:
- Minimal prompting (true vibe coding)
- Classic games: Tetris, Snake, Pac-Man, Space Invaders
- Everything generated inside a .NET Razor Pages project
- Code must run locally, not just inside a chat sandbox
In other words: real source code, not a demo trick.
The Setup
The project shell was already created in Visual Studio Code using a simple .NET Razor Pages frontend.
Why?
Because it avoids:
- installing frameworks
- setting up build tools
- dependency chaos
The AI just has to generate the games.
Sounds easy… right?
Flashback: GPT-5.3 Codex
I had already tried this challenge with GPT-5.3 Codex.
The result?
Not great.
Problems included:
- weak UI design
- games that didn't run on the first try
- missing mechanics
- and the highlight…
Pac-Man wasn’t Pac-Man.
It produced something vaguely yellow and circular that moved around a screen. That was about it.
Enter GPT-5.4
For this run I used GPT-5.4 through the Codex extension in VS Code.
First task: build Tetris.
The result was… interesting.
The game worked, but the layout looked like a blog article instead of a game.
Seriously.
The page looked like it was designed for reading documentation. The play area required scrolling, which is not exactly ideal for a fast-paced arcade game.
A quick prompt fixed the layout, and the game itself actually played fine.
Already better than GPT-5.3.
Pac-Man: Redemption Arc
Pac-Man was the real test.
GPT-5.3 completely failed here.
GPT-5.4 did much better, but it still wasn’t perfect.
Problems included:
- ghosts not actively chasing Pac-Man
- weird map boundaries
- extremely basic graphics
The overall vibe felt like:
“Impressive school project from 1998.”
Still… at least this time it actually resembled Pac-Man.
Progress.
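The missing ghost behavior is the kind of thing a tiny greedy-chase rule fixes. As a rough sketch (this is my own illustration, not the generated code; the grid encoding and all names are assumptions): each tick, the ghost moves to the open neighboring tile that most reduces its Manhattan distance to Pac-Man.

```javascript
// Greedy chase step for a ghost.
// Assumptions (not from the generated project): grid is a 2D array
// where 0 = open tile and 1 = wall; positions are {x, y} objects.
const MOVES = [{ x: 1, y: 0 }, { x: -1, y: 0 }, { x: 0, y: 1 }, { x: 0, y: -1 }];

function chaseStep(grid, ghost, pacman) {
  // Collect the walkable neighbors of the ghost's tile.
  const open = MOVES
    .map(m => ({ x: ghost.x + m.x, y: ghost.y + m.y }))
    .filter(p => grid[p.y] !== undefined && grid[p.y][p.x] === 0);
  if (open.length === 0) return ghost; // boxed in: stay put

  // Manhattan distance is enough for a classic "chase" feel.
  const dist = p => Math.abs(p.x - pacman.x) + Math.abs(p.y - pacman.y);
  return open.reduce((best, p) => (dist(p) < dist(best) ? p : best));
}
```

Real Pac-Man ghosts use per-ghost targeting rules, but even this one-liner-per-tick version is the difference between "wandering yellow-adjacent shapes" and something that feels like Pac-Man.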
The Simpler Games
Games like Snake and Space Invaders worked reasonably well.
Which makes sense.
AI tends to do well when the logic is simple:
- player movement
- collisions
- scoring
Once you add enemy AI and complex mechanics, things start to wobble.
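Snake is a good illustration of that simple-logic sweet spot. A minimal sketch of the core tick (my own illustration with assumed names and a 20x20 grid, not the generated code) covers all three bullets in a handful of lines:

```javascript
// One Snake tick: movement, collision, scoring.
// Assumptions: 20x20 grid, snake is an array of {x, y} with the head
// first, dir is a unit vector, food is a single {x, y} tile.
// Food respawning is omitted to keep the sketch short.
const SIZE = 20;

function step(state) {
  const { snake, dir, food } = state;
  const head = { x: snake[0].x + dir.x, y: snake[0].y + dir.y };

  // Collision: hitting a wall or the snake's own body ends the game.
  const hitWall = head.x < 0 || head.y < 0 || head.x >= SIZE || head.y >= SIZE;
  const hitSelf = snake.some(s => s.x === head.x && s.y === head.y);
  if (hitWall || hitSelf) return { ...state, gameOver: true };

  // Movement: prepend the new head; drop the tail unless food was eaten.
  const ate = head.x === food.x && head.y === food.y;
  const body = [head, ...snake];
  if (!ate) body.pop();

  // Scoring: one point per food tile.
  return { ...state, snake: body, score: state.score + (ate ? 1 : 0) };
}
```

Everything is deterministic state-in, state-out, which is exactly the kind of code a language model reproduces reliably. Ghost pathfinding, enemy wave patterns, and timing-sensitive mechanics are where it starts to wobble.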
The Unexpected Part
GPT-5.4 took about 90 minutes to generate the full implementation.
That’s much slower than the flashy demos you often see online.
Why?
Because those demos usually run inside web chat environments, where the model only has to emit a quick, self-contained prototype.
In this challenge the model had to produce real source code that compiles and runs locally inside a project.
That’s a very different problem.
Plot Twist: The Best Result So Far
Surprisingly, the best implementation I've seen for this arcade challenge is still:
GitHub Copilot + Claude Sonnet 4.5
That combination produced the most accurate games overall.
Better UI.
Better gameplay.
Less weird behavior.
Final Verdict
GPT-5.4 is better than GPT-5.3 Codex.
But only slightly in this experiment.
Improvements:
- more complete implementations
- fewer obvious missing features
- Pac-Man actually resembles Pac-Man
Still not perfect:
- UI needs manual fixing
- complex mechanics struggle
- generation time is long
The Takeaway
AI coding tools are great at generating code.
But generating correct systems is still harder.
Vibe coding works surprisingly well for:
- prototypes
- small apps
- experiments
But for anything complex…
You’ll still want a human in the loop.
Curious to hear from other devs:
What’s the most impressive AI-generated project you’ve seen so far?



