DEV Community

Arion Dev.ed


Stellar Scenes using ai.dev

This post is my submission for DEV Education Track: Build Apps with Google AI Studio.

What I Built

I built StellarScenes, an AI-powered application that generates a gallery of rich, detailed film scenes from a single user prompt. It uses Google's Gemini API to brainstorm at least 10 complete scene concepts (each with a title, setting, action, and character descriptions) and then uses the Imagen 3 API to generate a cinematic image for each concept.

The app also features AI-powered prompt suggestions to help spark creativity. The core of the application relies on two key prompts: one to generate the scene concepts in a structured JSON format, and another to generate the images.

Key Gemini Prompt for Scene Generation:

You are an expert creative screenwriter and concept artist. Your task is to generate at least 10 compelling and imaginative film scenes... Output the entire response as a single, valid JSON object with a "scenes" key...
IMPORTANT: Base the entire set of scenes on the following user idea...

Key Imagen Prompt for Image Generation:
This prompt is dynamically generated by Gemini as the imagePrompt field for each scene, ensuring a unique and highly detailed visual description for every concept.
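
To make the shape of that response concrete, here is a minimal Python sketch of validating the parsed JSON before any images are requested. It is an illustration, not the app's actual code: only the "scenes" key and the imagePrompt field come from the prompt above, while the other field names (title, setting, action, characters), the Scene class, and the parse_scenes helper are assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Scene:
    title: str
    setting: str
    action: str
    characters: str
    image_prompt: str  # later passed to the image model as-is


# Only "imagePrompt" is named in the post; the other keys are assumed.
REQUIRED_FIELDS = ("title", "setting", "action", "characters", "imagePrompt")


def parse_scenes(payload: object, minimum: int = 10) -> list[Scene]:
    """Validate a decoded response of the form {"scenes": [...]}."""
    if not isinstance(payload, dict) or not isinstance(payload.get("scenes"), list):
        raise ValueError('expected a JSON object with a "scenes" list')
    raw_scenes = payload["scenes"]
    if len(raw_scenes) < minimum:
        raise ValueError(f"expected at least {minimum} scenes, got {len(raw_scenes)}")

    scenes = []
    for index, raw in enumerate(raw_scenes):
        if not isinstance(raw, dict):
            raise ValueError(f"scene {index} is not an object")
        # Every required field must be a non-empty string.
        missing = [
            name
            for name in REQUIRED_FIELDS
            if not isinstance(raw.get(name), str) or not raw[name].strip()
        ]
        if missing:
            raise ValueError(f"scene {index} is missing fields: {missing}")
        scenes.append(
            Scene(
                title=raw["title"],
                setting=raw["setting"],
                action=raw["action"],
                characters=raw["characters"],
                image_prompt=raw["imagePrompt"],
            )
        )
    return scenes
```

Checking the structure up front means a response with too few scenes or a missing imagePrompt fails with a clear message before any image requests are made.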

Demo

StellarScenes provides a dynamic and interactive experience. Users can input their own idea, get AI-powered suggestions, and generate a beautiful grid of scene concepts.

Visit the site

Screenshots

Initial State & Prompt Suggestions:
A screenshot showing the main UI with the text area and clickable AI-generated suggestion chips.

Generating Scenes:
A screenshot showing the loading state after the user has clicked "Generate Scenes".

Scene Gallery:
A screenshot showing the final grid of generated scenes, each with a title, image, and details.

My Experience

Working with the Google AI SDK was a fantastic experience. The Gemini model's ability to return structured JSON from a natural language prompt is incredibly powerful, and it significantly streamlined the process of generating complex, related data for the application. I was surprised by the creativity and coherence of the generated scenes; they weren't just random collections of ideas but often felt like parts of a larger, unseen narrative.

One key lesson was the importance of robust error handling, specifically when parsing JSON from the model. Initially, the app would fail whenever the model returned a slightly malformed string, such as JSON wrapped in markdown fences. I implemented a more resilient parsing function that cleans the response (e.g., stripping markdown fences) before parsing, which made the application much more reliable.
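
As a rough illustration of that kind of cleanup, here is a minimal Python sketch. It is not the app's actual parsing function: the name parse_model_json is hypothetical, and the fallback that extracts the outermost braces is an extra assumption beyond the fence stripping described above.

```python
import json
import re

# Matches a response wrapped in a single markdown fence, with an optional
# language tag after the opening fence (e.g. "json").
FENCE_RE = re.compile(r"^```[a-zA-Z0-9_-]*\s*\n(.*?)\n?```\s*$", re.DOTALL)


def parse_model_json(text: str):
    """Parse JSON from a model response, tolerating fences and stray prose.

    Raises ValueError if no valid JSON can be recovered.
    """
    cleaned = text.strip()

    # Step 1: strip a surrounding markdown fence if there is one.
    match = FENCE_RE.match(cleaned)
    if match:
        cleaned = match.group(1).strip()

    # Step 2: try a direct parse.
    try:
        return json.loads(cleaned)
    except json.JSONDecodeError:
        pass

    # Step 3 (assumed fallback): keep only the outermost {...} span, which
    # drops any chatter before or after the JSON object.
    start = cleaned.find("{")
    end = cleaned.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model response")
    return json.loads(cleaned[start : end + 1])
```

For example, parse_model_json('```json\n{"scenes": []}\n```') returns {"scenes": []}, and a response with no JSON at all raises ValueError, which the UI can catch to show a retry message.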

Integrating Imagen 3 was seamless. The synergy between Gemini creating a descriptive prompt and Imagen visualizing it is the core magic of this app. It truly feels like a collaborative creative process between the user, the language model, and the image generation model. The speed of generating 10+ images in parallel was also impressive and essential for a good user experience.
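
The parallel step can be sketched in Python with asyncio. This is a hedged illustration rather than the app's code: generate_image stands in for whatever async call actually reaches the image model, and the concurrency cap and the choice to return None for a failed image are assumptions added for the example.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional

# Hypothetical signature: takes a prompt, returns raw image bytes.
ImageGenerator = Callable[[str], Awaitable[bytes]]


async def generate_all_images(
    prompts: List[str],
    generate_image: ImageGenerator,
    max_concurrency: int = 4,
) -> List[Optional[bytes]]:
    """Generate one image per prompt concurrently, preserving input order."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(prompt: str) -> Optional[bytes]:
        async with semaphore:  # cap the number of in-flight requests
            try:
                return await generate_image(prompt)
            except Exception:
                # One failed image should not sink the whole gallery;
                # the caller can render a placeholder for None.
                return None

    # gather() returns results in the same order as the prompts.
    return await asyncio.gather(*(run_one(p) for p in prompts))
```

Because results come back in prompt order, each image can be paired with its scene by index, and a single failure leaves the rest of the grid intact.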
