Arunav Maitra
Try ChromaFlip Chronicles

This is a submission for the Google AI Studio Multimodal Challenge

What I Built

I built ChromaFlip Chronicles, a digital experience that breathes new life into the classic photo album.

Imagine a scrapbook, but one that's alive.
One that's interactive.
And one that's powered by your own imagination.

That's ChromaFlip Chronicles.

It's a beautifully designed, hand-drawn style notebook that you can flip through, page by page. But here's the magic: it's not just a gallery. It's a creative canvas. Each page allows you to take a photo—a memory, a piece of art, a random snapshot—and completely remix it using the power of generative AI.

It solves a simple but profound problem: our digital photos often sit stagnant in folders. This applet turns passive viewing into an active, creative process, allowing anyone to become a digital artist and storyteller.

It’s your AI-powered visual diary, where memories are not just stored, but wonderfully reimagined.

Demo

Here's a look at the enchanting world of ChromaFlip Chronicles.

A live demo link for you to try

Screenshots:

Image description
A glimpse of the main notebook interface, where users can navigate through their visual diary.

Image description

Image description
Here, you can see the intuitive controls for remixing an image with a simple text prompt.

How I Used Google AI Studio

Google AI Studio was the creative engine behind this project. The star of the show is the Gemini 2.5 Flash Image Preview model (also known as nano-banana).

My entire application is built around its unique multimodal capabilities.

Here’s the technical breakdown:

  1. The Request: When a user wants to "remix" an image, I send a request to the Gemini API using the @google/genai library.
  2. Multimodal Input: This isn't just a text prompt. The request is multimodal because it sends two distinct pieces of information together:
    • The user's existing image (as a base64-encoded string).
    • The user's creative text prompt (e.g., "make this black and white photo burst with color").
  3. The Magic: The gemini-2.5-flash-image-preview model treats the text prompt as a set of editing instructions and applies them to the provided image.
  4. The Response: The model then sends back a brand new, AI-generated image, which my app seamlessly displays on the notebook page.
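
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the helper names (`parseDataUrl`, `buildRemixRequest`, `extractImageDataUrl`) and the local types are hypothetical, and the response shape (image bytes returned as an `inlineData` part of the first candidate) is an assumption about how the Gemini API typically returns images. The sketch deliberately leaves out the network call so that it runs without the SDK or an API key.

```typescript
// Local types approximating the request/response shape (assumed, not
// imported from @google/genai, so this sketch has no dependencies).
interface InlineData { mimeType: string; data: string; }
interface Part { text?: string; inlineData?: InlineData; }
interface RemixRequest { model: string; contents: Part[]; }
interface RemixResponse { candidates?: { content?: { parts?: Part[] } }[]; }

// Model name as given in the post.
const MODEL = "gemini-2.5-flash-image-preview";

// Hypothetical helper: split a data URL (for example, one produced by
// FileReader.readAsDataURL) into its MIME type and raw base64 payload.
function parseDataUrl(dataUrl: string): InlineData {
  const match = /^data:([^;,]+);base64,(.+)$/.exec(dataUrl);
  if (!match) throw new Error("Expected a base64 data URL");
  return { mimeType: match[1], data: match[2] };
}

// Steps 1 and 2: one request carrying both the image and the text prompt.
function buildRemixRequest(imageDataUrl: string, prompt: string): RemixRequest {
  return {
    model: MODEL,
    contents: [{ inlineData: parseDataUrl(imageDataUrl) }, { text: prompt }],
  };
}

// Step 4: find the first image part in the response and turn it back into
// a data URL that an <img> element could display. Returns null if the
// model answered with text only.
function extractImageDataUrl(response: RemixResponse): string | null {
  const parts = response.candidates?.[0]?.content?.parts ?? [];
  for (const part of parts) {
    if (part.inlineData?.data) {
      return `data:${part.inlineData.mimeType};base64,${part.inlineData.data}`;
    }
  }
  return null;
}
```

With the @google/genai library, the object from `buildRemixRequest` would typically be passed to `ai.models.generateContent(...)` on a `GoogleGenAI` client, and the result handed to `extractImageDataUrl`. Keeping the request building and response parsing as pure functions makes both easy to test without calling the API.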

It was surprisingly simple to implement, yet incredibly powerful in its results.

Multimodal Features

The core of ChromaFlip Chronicles is its multimodal functionality. It's not just a feature; it's the entire premise.

Why does this enhance the user experience?

  • It's Personal: Instead of generating images from scratch, users start with their own photos. This makes the creative process deeply personal and grounded in their own memories. You're not just creating art; you're transforming a piece of your own life.

  • It's Intuitive: The interaction is as simple as talking. You just tell the AI what you want to change about your picture. This removes the barrier of complex photo editing software and opens up creative expression to everyone.

  • It's a Creative Partnership: Combining an image (what you have) with a text prompt (what you imagine) creates a partnership between the user and the AI. It feels less like using a tool and more like working with a collaborator who can quickly bring your ideas to life.

This fusion of image and text input is what makes ChromaFlip Chronicles a truly magical and engaging experience.
