This is a submission for the Google AI Studio Multimodal Challenge
A live demo of the applet is right here: Selfsprite Maze Demo
What I Built
Generic gaming avatars are dead. I built Selfsprite Maze to fix the disconnect between player and character.
It's a retro game that uses multimodal GenAI to rip your actual face from a selfie and mint a custom, animated 8-bit sprite. You are the hero.
The gameplay loop is brutally simple:
- 📸 Create Your Hero: Snap a selfie, pick a class like 'Wizard' or 'Cyberpunk', and the AI spits out a personalized sprite sheet. Done in seconds.
- 😈 Design Your Enemy: Here's the twist. Run the process again, but this time you're creating the enemy guards. Now you can literally fight your friends, a celebrity, or a weird alternate-reality version of yourself.
- 🏃 Escape the Maze: You're dropped into a procedurally generated maze. The goal? Hit the exit. The problem? The guards you just made are running pathfinding algorithms to hunt you down. No pressure.
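The post doesn't say which algorithm builds the maze, so treat the following as a minimal sketch under that assumption: an iterative depth-first "recursive backtracker" on a grid, driven by a small seeded random number generator. The names `generateMaze` and `mulberry32` and the 1 = wall / 0 = floor encoding are illustrative choices, not taken from the game.

```typescript
// Hypothetical sketch: a "recursive backtracker" maze, NOT necessarily the game's algorithm.
type MazeGrid = number[][]; // 1 = wall, 0 = floor

// Small seeded PRNG so the same seed always produces the same maze.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

function generateMaze(cellsWide: number, cellsHigh: number, seed: number): MazeGrid {
  // Cells sit on odd coordinates; the even coordinates between them are walls.
  const width = cellsWide * 2 + 1;
  const height = cellsHigh * 2 + 1;
  const grid: MazeGrid = Array.from({ length: height }, () => new Array<number>(width).fill(1));
  const rand = mulberry32(seed);
  const steps: Array<[number, number]> = [[2, 0], [-2, 0], [0, 2], [0, -2]];

  const stack: Array<[number, number]> = [[1, 1]];
  grid[1][1] = 0;
  while (stack.length > 0) {
    const [x, y] = stack[stack.length - 1];
    // Unvisited neighbouring cells, two steps away in each direction.
    const options = steps
      .map(([dx, dy]): [number, number] => [x + dx, y + dy])
      .filter(([nx, ny]) => nx > 0 && ny > 0 && nx < width - 1 && ny < height - 1 && grid[ny][nx] === 1);
    if (options.length === 0) {
      stack.pop(); // dead end: backtrack
      continue;
    }
    const [nx, ny] = options[Math.floor(rand() * options.length)];
    grid[(y + ny) / 2][(x + nx) / 2] = 0; // knock down the wall in between
    grid[ny][nx] = 0;
    stack.push([nx, ny]);
  }
  return grid;
}
```

Every cell ends up reachable from every other cell, so placing the player at the top-left cell and the exit at the bottom-right cell always leaves a valid route. Changing the seed per level is one simple way to get a fresh maze each run.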
The game itself is a creative engine.
- Infinite Replayability: AI-driven level-gen means you'll never play the same maze twice.
- Smart Enemies: Guards use line-of-sight and A* pathfinding. They aren't stormtroopers; they will find you.
- Digital Swag: Beat the level and you get to download your character as a high-quality GIF and PNG frames. Your new profile pic is waiting.
- Offline Mode: Upload an existing sprite sheet to save API calls, or build sprite sheets directly in Google AI Studio.
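For readers curious what the A* part of the guard behavior can look like, here is a minimal sketch, assuming a tile grid where 1 is a wall and 0 is floor, four-way movement, and a Manhattan-distance heuristic. The function name `findPath` and the data layout are illustrative; the game's actual guard code isn't shown in this post.

```typescript
// Hypothetical sketch of A* on a tile grid (1 = wall, 0 = floor), 4-way movement.
type Tile = { x: number; y: number };

function findPath(grid: number[][], start: Tile, goal: Tile): Tile[] | null {
  const height = grid.length;
  const width = grid[0].length;
  const key = (p: Tile) => p.y * width + p.x;
  // Manhattan distance never overestimates on a 4-connected grid.
  const h = (p: Tile) => Math.abs(p.x - goal.x) + Math.abs(p.y - goal.y);

  const gScore = new Map<number, number>([[key(start), 0]]);
  const cameFrom = new Map<number, Tile>();
  const open: Tile[] = [start];
  const closed = new Set<number>();

  while (open.length > 0) {
    // Pick the open tile with the lowest f = g + h (linear scan keeps the sketch short).
    let best = 0;
    for (let i = 1; i < open.length; i++) {
      const fi = (gScore.get(key(open[i])) ?? Infinity) + h(open[i]);
      const fb = (gScore.get(key(open[best])) ?? Infinity) + h(open[best]);
      if (fi < fb) best = i;
    }
    const current = open.splice(best, 1)[0];

    if (current.x === goal.x && current.y === goal.y) {
      // Walk the parent links back to the start.
      const path: Tile[] = [current];
      let k = key(current);
      while (cameFrom.has(k)) {
        const prev = cameFrom.get(k)!;
        path.unshift(prev);
        k = key(prev);
      }
      return path;
    }

    closed.add(key(current));
    for (const [dx, dy] of [[1, 0], [-1, 0], [0, 1], [0, -1]]) {
      const next: Tile = { x: current.x + dx, y: current.y + dy };
      if (next.x < 0 || next.y < 0 || next.x >= width || next.y >= height) continue;
      if (grid[next.y][next.x] === 1 || closed.has(key(next))) continue;
      const tentative = (gScore.get(key(current)) ?? Infinity) + 1;
      if (tentative < (gScore.get(key(next)) ?? Infinity)) {
        gScore.set(key(next), tentative);
        cameFrom.set(key(next), current);
        if (!open.some((p) => p.x === next.x && p.y === next.y)) open.push(next);
      }
    }
  }
  return null; // no route to the goal
}
```

A guard could call something like this whenever it spots the player and then step along the returned tiles; a binary heap would replace the linear scan on larger mazes.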
Demo
A live demo of the applet is available (Selfsprite Maze Demo). Unfortunately, it runs without a paid API key. To test it, use the upload option in the game with a sprite sheet generated directly in Google AI Studio, which is free and doesn't require an API key.
Demo Video
- Intro: Kicks off with a retro instruction screen. You'll know how to play in 10 seconds.
- Character Creation: Watch the full flow: snap a selfie, select the 'character' class, and fire it off to the AI.
- Motion Creation: Watch the full flow: select the 'motion' class, and fire it off to the AI.
- Real-Time Generation: The AI-generated sprite sheet appears and gets sliced into animation frames on the fly. Pure visual feedback.
- Showcase: A classic "VS" screen showcases your hero and the enemy you designed, building hype for the showdown.
- Gameplay: The real deal. My sprite navigating a maze, getting spotted, and a tense chase kicking off with NPC AI Players.
- Victory & Download: Make it to the exit, get the "Level Complete" screen, and one-click download a ZIP file with your GIF and all the frames. Ship it.
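The "sliced into animation frames" step from the list above can be sketched as plain arithmetic once the column and row counts are known. This assumes a uniform grid with no padding between frames; `sliceFrames` and `FrameRect` are illustrative names, not the game's own code.

```typescript
// Hypothetical sketch: compute the source rectangle of each frame in a sprite sheet.
// Assumes a uniform grid with no margins or gutters between frames.
interface FrameRect {
  x: number;
  y: number;
  width: number;
  height: number;
}

function sliceFrames(sheetWidth: number, sheetHeight: number, cols: number, rows: number): FrameRect[] {
  if (!Number.isInteger(cols) || !Number.isInteger(rows) || cols < 1 || rows < 1) {
    throw new RangeError("cols and rows must be positive integers");
  }
  const width = Math.floor(sheetWidth / cols);
  const height = Math.floor(sheetHeight / rows);
  const frames: FrameRect[] = [];
  // Row-major order: left to right, then top to bottom.
  for (let row = 0; row < rows; row++) {
    for (let col = 0; col < cols; col++) {
      frames.push({ x: col * width, y: row * height, width, height });
    }
  }
  return frames;
}
```

In a browser, each rectangle can be fed to the source-rectangle arguments of the canvas `drawImage` call to draw or export a single frame.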
How I Used Google AI Studio
This entire project runs on Google AI. I used AI Studio for rapid-fire prompt engineering, and the final build uses the @google/genai SDK exclusively. The magic is in how it orchestrates two different Gemini models.
- Nano Banana for Multimodal Generation: This is the creative engine. It takes two inputs, an image prompt (your selfie) and a text prompt (my instructions for style, class, etc.), and fuses them into a brand new sprite sheet. This image-plus-text-to-image pipeline is the core feature.
- Vision-Based Analysis with gemini-2.5-flash: After an image is generated, I need its grid dimensions. Instead of guessing, I just show the image back to a vision model and ask, "How many columns and rows?" I use responseSchema to force the output into a clean JSON object. The AI becomes a reliable, automated data-processing tool.
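To make the responseSchema step concrete, here is a rough sketch of what the request and the parsing could look like. It builds a plain object in the shape that `generateContent` in @google/genai accepts (`model`, `contents`, `config.responseMimeType`, `config.responseSchema`) instead of calling the SDK, so it runs without an API key. The field names `columns` and `rows`, the prompt wording, and both function names are assumptions for illustration; in the real SDK the `"OBJECT"` and `"INTEGER"` strings would normally come from its `Type` enum.

```typescript
// Hypothetical sketch: request shape and response parsing for the grid analysis step.
interface GridInfo {
  columns: number;
  rows: number;
}

function buildGridRequest(imageBase64: string, mimeType: string) {
  return {
    model: "gemini-2.5-flash",
    contents: [
      { inlineData: { mimeType, data: imageBase64 } }, // the generated sprite sheet
      { text: "How many columns and rows are in this sprite sheet grid?" },
    ],
    config: {
      responseMimeType: "application/json",
      // Assumed schema: field names are illustrative, not from the original app.
      responseSchema: {
        type: "OBJECT",
        properties: {
          columns: { type: "INTEGER" },
          rows: { type: "INTEGER" },
        },
        required: ["columns", "rows"],
      },
    },
  };
}

// Validate the JSON text returned by the model before trusting it.
function parseGridInfo(jsonText: string): GridInfo {
  const data: unknown = JSON.parse(jsonText);
  if (typeof data !== "object" || data === null) {
    throw new Error("Expected a JSON object");
  }
  const { columns, rows } = data as Record<string, unknown>;
  if (
    typeof columns !== "number" ||
    typeof rows !== "number" ||
    !Number.isInteger(columns) ||
    !Number.isInteger(rows) ||
    columns < 1 ||
    rows < 1
  ) {
    throw new Error("columns and rows must be positive integers");
  }
  return { columns, rows };
}
```

Even with a schema in place, checking the values before slicing is cheap insurance: a grid of zero columns would otherwise break the frame math.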
Multimodal Features
Multimodal isn't just a feature here; it's the entire foundation of the app.
- Deep Personalization through Image-to-Image Transformation: This isn't just text-to-image; it's image-plus-text-to-image. The user's photo is the actual foundational reference, not just a loose inspiration. Seeing an animated 8-bit version of yourself being chased through a dungeon hits different than playing as a generic knight.
- Vision-Powered Automation: I built a closed-loop pipeline. The AI generates a creative asset, and then another AI analyzes that asset to provide the technical data needed for the next step (Image -> JSON). It bridges the gap between the creative and technical, making a complex process feel instant and working around hallucination limitations.
- Creativity as a Gameplay Mechanic: The AI is so fast that the creation process is part of the gameplay. The user is both the hero designer and the monster designer. This dual role is a novel gameplay loop that's only possible with powerful and flexible multimodal AI.
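The image-plus-text-to-image idea boils down to sending two parts in one request and pulling the image part out of the reply. Below is a small sketch of both halves as plain functions. The model identifier is left as a caller-supplied string because this post doesn't spell it out, the prompt wording is made up for illustration, and the part shape (`text` or `inlineData` with `mimeType` and `data`) is assumed to follow what @google/genai returns in a candidate's content parts.

```typescript
// Hypothetical sketch: build an image-plus-text request and extract the generated image.
interface InlineImage {
  mimeType: string;
  data: string; // base64-encoded bytes
}

interface ResponsePart {
  text?: string;
  inlineData?: InlineImage;
}

function buildSpriteRequest(model: string, selfie: InlineImage, characterClass: string) {
  // Illustrative prompt only; the real instructions are not shown in the post.
  const prompt =
    `Turn the person in this photo into an 8-bit ${characterClass} character. ` +
    `Return a sprite sheet of animation frames laid out on a uniform grid.`;
  return {
    model, // caller supplies the image-generation model name
    contents: [{ inlineData: selfie }, { text: prompt }],
  };
}

// The reply can mix text and image parts, so scan for the first image.
function extractFirstImage(parts: ResponsePart[]): InlineImage | null {
  for (const part of parts) {
    if (part.inlineData) return part.inlineData;
  }
  return null;
}
```

Running the same two functions with a different photo and class is all it takes to produce the enemy guard, which is what makes the hero-designer and monster-designer roles share one code path.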