This is a submission for the Google AI Studio Multimodal Challenge.
What I Built
I built Mecha Morph: Gundam Genesis!
It’s a dream tool for every fan of anime, mecha, and model kits.
Have you ever looked at your favorite character and wondered...
"What would they look like as a giant, epic robot?"
Now, you can find out.
My applet takes any character image you provide.
It then uses the incredible power of Gemini's multimodal AI to completely reimagine them.
The result? A stunning, entirely original, battle-ready Gundam, designed right before your eyes.
But it doesn't stop there.
To capture the true spirit of the "Gunpla" hobby, the AI also generates the collectible box art and packaging for your new mecha.
It's more than just a filter; it's a creative partner, a playground for generating unique, high-quality concept art that feels like it fell right out of a hobby shop in Akihabara.
Demo
You can try out the live applet here: Live App Link Here
Screenshots
Here's a quick look at the process.
1. Uploading the Character
The journey begins with a simple image.
2. Customizing the Build
The user fine-tunes their creation, choosing colors, weapons, and art style.
3. The Final Masterpiece!
The AI delivers the final, incredible result: the mecha posed heroically next to its custom packaging.
How I Used Google AI Studio
Google AI Studio was my command center for this project.
It was absolutely essential for prototyping the core creative logic of Mecha Morph.
I spent countless hours in the Studio, rapidly experimenting with different multimodal prompts. My main goal was to perfect the complex set of instructions given to the AI. I needed to ensure it understood the nuanced request:
- Transform the character into a mecha.
- Pose the mecha outside of the box.
- Design the box art separately.
- Include accessories on the box art.
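The four requirements above could be kept as a reusable instruction block that is prepended to every request. The sketch below shows one way to do that in TypeScript. The names `BASE_INSTRUCTIONS` and `buildBaseInstructions` are hypothetical, and the wording paraphrases the list; it is not the prompt actually used in the applet.

```typescript
// Hypothetical base instructions, paraphrasing the four requirements listed above.
// The real prompt used by the applet is not shown in this post.
const BASE_INSTRUCTIONS: readonly string[] = [
  "Transform the character in the attached image into an original mecha.",
  "Pose the finished mecha outside of its box.",
  "Design the box art as a separate element of the scene.",
  "Show the mecha's accessories on the box art.",
];

// Join the requirements into a numbered list so each one stays a distinct instruction.
function buildBaseInstructions(rules: readonly string[] = BASE_INSTRUCTIONS): string {
  return rules.map((rule, index) => `${index + 1}. ${rule}`).join("\n");
}

console.log(buildBaseInstructions());
```

Keeping each requirement as its own array entry makes it easy to tweak a single sentence and rerun a test image, which matches the one-change-at-a-time iteration described in this section.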
The model of choice was, of course, gemini-2.5-flash-image-preview. It's an absolute powerhouse for this kind of creative, image-based task.
The ability to quickly iterate in the Studio was a complete game-changer. I could tweak a single sentence in my prompt, upload a test image, and see the new result in seconds. This tight feedback loop allowed me to refine the instructions from a vague idea into a precise, repeatable, and magical creative process.
Multimodal Features
The entire soul of Mecha Morph is built upon Gemini's deep multimodal capabilities.
The core magic lies in its sophisticated Image + Text -> Image pipeline.
Input Modality 1: The Image
The user provides the visual foundation: the character. The AI doesn't just see pixels; it understands the essence of the character. It analyzes the design, the color palette, the silhouette, and even the perceived personality to inform the mecha's final look.
Input Modality 2: The Text
This is where the user becomes the art director. The detailed text prompt, dynamically built from the user's selections (e.g., Primary Color: Crimson Red, Weapon: Heat Hawk Axe, Box Art Style: 80s Vintage Anime), provides the AI with specific, creative constraints.
Output Modality: The Fused Image
The final generated image is the most impressive part. It isn't just a filtered version of the original. It is a true synthesis of the two input modalities. It's a completely new piece of art that understands and masterfully combines the visual cues from the input image with the explicit instructions from the text prompt.
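As a rough illustration of the text side of this pipeline, the user's selections could be turned into labelled constraint lines and appended to a base prompt. This is a minimal sketch under assumptions: the `BuildSelections` interface, its field names, and `buildConstraintPrompt` are hypothetical and simply mirror the example selections mentioned under Input Modality 2.

```typescript
// Hypothetical shape of the user's selections; the field names are assumptions
// chosen to mirror the examples in the text (primary color, weapon, box art style).
interface BuildSelections {
  primaryColor: string;
  weapon: string;
  boxArtStyle: string;
}

// Turn the selections into labelled constraint lines appended to a base prompt.
function buildConstraintPrompt(basePrompt: string, selections: BuildSelections): string {
  const constraints = [
    `Primary Color: ${selections.primaryColor}`,
    `Weapon: ${selections.weapon}`,
    `Box Art Style: ${selections.boxArtStyle}`,
  ];
  const bulletList = constraints.map((line) => `- ${line}`).join("\n");
  return `${basePrompt.trim()}\n\nConstraints:\n${bulletList}`;
}

const examplePrompt = buildConstraintPrompt(
  "Reimagine the attached character as an original mecha with matching box art.",
  { primaryColor: "Crimson Red", weapon: "Heat Hawk Axe", boxArtStyle: "80s Vintage Anime" },
);
console.log(examplePrompt);
```

Building the prompt from structured fields rather than free text keeps every request in the same format, so a change in the output can be traced back to a specific selection.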
This deep fusion of image and text is what makes Mecha Morph feel like magic. It creates a profoundly engaging and personalized experience that a text-only or image-only model could never hope to achieve.
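To make the Image + Text -> Image flow more concrete, here is a hedged sketch of how the two inputs might be combined into a single request object. Only the model name comes from this post. The `inlineData` and `text` part shapes are assumed to follow the Gemini API's content format and should be checked against the current reference; `MechaRequest` and `buildMechaRequest` are hypothetical names, and the sketch stops short of sending anything.

```typescript
// Model name taken from the post; everything else here is an illustrative assumption.
const MODEL_NAME = "gemini-2.5-flash-image-preview";

// Part shapes assumed to match the Gemini API's text and inline-image parts.
type TextPart = { text: string };
type ImagePart = { inlineData: { mimeType: string; data: string } };

interface MechaRequest {
  model: string;
  contents: Array<{ role: "user"; parts: Array<ImagePart | TextPart> }>;
}

// Combine both input modalities into a single user turn: image first, then text.
function buildMechaRequest(imageBase64: string, mimeType: string, prompt: string): MechaRequest {
  if (imageBase64.length === 0) {
    throw new Error("A character image is required.");
  }
  return {
    model: MODEL_NAME,
    contents: [
      {
        role: "user",
        parts: [{ inlineData: { mimeType, data: imageBase64 } }, { text: prompt }],
      },
    ],
  };
}

// "aGVsbG8=" is placeholder base64, not a real image.
const exampleRequest = buildMechaRequest("aGVsbG8=", "image/png", "Primary Color: Crimson Red");
console.log(JSON.stringify(exampleRequest, null, 2));
```

An object like this could then be handed to whatever client call the applet uses to reach the model; that call is left out because the post does not describe it.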