AI Bug Slayer 🐞

Visualize the Blueprint of Life with GenoCraft AI

This is a submission for the Google AI Studio Multimodal Challenge

What I Built

I built GenoCraft, a sleek and futuristic web application designed to spark creativity for artists, designers, and storytellers.

At its core, GenoCraft is an AI-powered concept generator. It tackles the "blank canvas problem" by transforming simple text ideas into rich, multimodal DNA profiles for imaginary organisms.

Instead of just getting a single image, users receive a set of three unique variations, each with:

  • A stunning, AI-generated visual of the DNA helix.
  • A compelling, imaginative title.
  • A descriptive paragraph hinting at a potential business or scientific application.

It's a tool for turning a flicker of an idea into a tangible, visual, and narrative starting point. Imagine a game designer creating new alien species, a writer visualizing a key plot device, or a branding expert developing a biotech company's identity—GenoCraft is their launchpad.

Demo

Check out the GenoCraft app in action!

Link

Here’s a glimpse of the user journey:

Step 1: The Idea Spark
A user selects a base organism and provides a simple, creative prompt.

Step 2: AI-Powered Creation
The app generates three distinct, visually stunning variations of the concept, complete with titles and detailed descriptions.

Step 3: Refine and Edit
The user can then select any profile and use natural language to perform powerful image edits, like changing colors or adding new elements.

How I Used Google AI Studio

GenoCraft is powered entirely by the Gemini API, orchestrated through the @google/genai library. I leveraged a suite of models to create a seamless, multimodal experience.

  • gemini-2.5-flash for Text and Logic: This model is the creative brain of the operation. I use it for all text-based tasks, and most importantly I pass it a responseSchema. That lets me request structured JSON in a single API call and get back the titles, descriptions, and a unique image prompt for each variation in a predictable shape that the app can parse directly. One call covers all three variations.

  • imagen-4.0-generate-001 for Image Generation: This model is the artist. It takes the detailed text prompts generated by gemini-2.5-flash and renders the beautiful, abstract DNA visualizations that form the core of the app's output. Its ability to interpret artistic and scientific concepts is key.

  • gemini-2.5-flash-image-preview for Image Editing: This model provides the "magic" editing feature. It can take an existing image and a text prompt (e.g., "make it more purple") and return a modified image, making the creative process truly iterative.
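
The structured-output call from the first bullet might look roughly like the sketch below. It only builds the request object and parses the reply, so it runs without an API key; in the app, an object of this shape would be passed to ai.models.generateContent from @google/genai. The field names (title, description, imagePrompt), the prompt wording, and both helper names are assumptions for illustration, not GenoCraft's actual code. The schema uses plain string type names; with the SDK you would normally use its Type enum instead.

```typescript
// Hypothetical shape of one generated variation (field names are assumed).
interface DnaVariation {
  title: string;
  description: string;
  imagePrompt: string;
}

// Build a gemini-2.5-flash request that asks for the variations as JSON.
// The object follows the parameter shape of ai.models.generateContent.
function buildVariationRequest(organism: string, idea: string) {
  return {
    model: "gemini-2.5-flash",
    contents:
      `Base organism: ${organism}. Idea: ${idea}. ` +
      "Propose three distinct DNA profile concepts.",
    config: {
      responseMimeType: "application/json",
      responseSchema: {
        type: "ARRAY",
        items: {
          type: "OBJECT",
          properties: {
            title: { type: "STRING" },
            description: { type: "STRING" },
            imagePrompt: { type: "STRING" },
          },
          required: ["title", "description", "imagePrompt"],
        },
      },
    },
  };
}

// Parse the JSON text of the reply and check every field before using it.
function parseVariations(jsonText: string): DnaVariation[] {
  const data: unknown = JSON.parse(jsonText);
  if (!Array.isArray(data)) {
    throw new Error("Expected a JSON array of variations");
  }
  return data.map((item: unknown, i: number) => {
    if (typeof item !== "object" || item === null) {
      throw new Error(`Variation ${i} is not an object`);
    }
    const v = item as Partial<DnaVariation>;
    if (
      typeof v.title !== "string" ||
      typeof v.description !== "string" ||
      typeof v.imagePrompt !== "string"
    ) {
      throw new Error(`Variation ${i} is missing a required string field`);
    }
    return { title: v.title, description: v.description, imagePrompt: v.imagePrompt };
  });
}
```

Even with a schema in place, validating the parsed result is a cheap safeguard: the UI only ever sees variations that carry all three fields.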

Multimodal Features

Multimodality isn't just a feature in GenoCraft; it's the entire foundation. The app thrives on the interplay between text and images.

  1. Conceptual Text-to-Image Generation: The primary user flow chains two models. The user's text prompt is first interpreted and expanded by gemini-2.5-flash into richer, more detailed prompts, which imagen-4.0-generate-001 then renders as the final images. It's a two-step process where one model creatively directs the other.

  2. Text-Guided Image Editing: The edit functionality is a powerful demonstration of multimodal input. The user provides both an image (the DNA profile they want to change) and text (their desired edit). gemini-2.5-flash-image-preview understands the context of both inputs to produce a new image that seamlessly incorporates the change. This creates a fluid, intuitive editing experience that feels like a conversation with a creative partner.

  3. Synchronized Content Generation: The app generates and presents image and text pairs that are contextually linked. The title and description for each DNA profile aren't generic; they are specifically crafted by the AI to match the visual it also helped create. This ensures a cohesive and immersive final output for the user.
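
The three flows above can be sketched as a small pipeline in which each model call is passed in as a function. Everything named here (Concept, DnaProfile, TextModel, ImageModel, generateProfiles, buildEditParts) is hypothetical and chosen for illustration. In the app, the two function types would presumably wrap calls to ai.models.generateContent and ai.models.generateImages, and the default image/png MIME type is also an assumption.

```typescript
// All names below are hypothetical; each function type stands in for one API call.
interface Concept {
  title: string;
  description: string;
  imagePrompt: string;
}

interface DnaProfile {
  title: string;
  description: string;
  imageBase64: string;
}

// Would wrap gemini-2.5-flash: expands the user's idea into concepts with image prompts.
type TextModel = (idea: string) => Promise<Concept[]>;

// Would wrap imagen-4.0-generate-001: renders one prompt into base64 image bytes.
type ImageModel = (prompt: string) => Promise<string>;

// Flows 1 and 3: the text model directs the image model, and every image
// stays paired with the title and description written for it.
async function generateProfiles(
  idea: string,
  text: TextModel,
  image: ImageModel,
): Promise<DnaProfile[]> {
  const concepts = await text(idea);
  return Promise.all(
    concepts.map(async (c) => ({
      title: c.title,
      description: c.description,
      imageBase64: await image(c.imagePrompt),
    })),
  );
}

// Flow 2: multimodal input for an edit, as one image part plus one text part.
// This mirrors the inlineData/text part shape accepted by generateContent.
function buildEditParts(
  imageBase64: string,
  instruction: string,
  mimeType: string = "image/png",
) {
  return [
    { inlineData: { data: imageBase64, mimeType } },
    { text: instruction },
  ];
}
```

Passing the model calls in as functions keeps the pairing logic testable with stubs, and the images for all variations are requested in parallel rather than one after another.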

Thanks for checking out my project!