Ha3k

Artisan Social

This is a submission for the Google AI Studio Multimodal Challenge

What I Built

I built Artisan Social, your personal AI design partner for brainstorming social media applications!

Ever had a brilliant idea for a new app but struggled to visualize it? Artisan Social is here to help. It's a creative studio that bridges the gap between a simple text idea and stunning, tangible design concepts.

Here's the magic:

  1. You start with a spark—a simple idea, like "a social network for urban gardeners."
  2. The app, powered by Gemini, brainstorms ten distinct design angles for you.
  3. Then, it brings each concept to life by generating a high-quality visual representation.
  4. Finally, you can dive in and iteratively edit any design using simple text commands, truly making it your own.

Artisan Social is designed to crush creative blocks and accelerate the journey from imagination to visualization.

Demo

You can find a live demo of the applet here:
Link to Deployed Applet

Here’s a quick walkthrough of the experience:

Step 1: The Spark of an Idea
A user enters their social app concept into a clean, inviting interface.

Step 2: AI-Powered Ideation
In moments, the app displays a gallery of ten distinct visual concepts generated by the AI, each with a unique name and description.

Step 3: The Multimodal Editor
The user selects a design and enters the editor. By providing a text prompt like "change the color scheme to dark mode with neon green accents," they can instantly see their vision come to life in a new, edited image.

How I Used Google AI Studio

Google AI Studio and the Gemini models are the heart and soul of Artisan Social. I used the @google/genai SDK to orchestrate a trio of powerful models, each playing a specialized role.

  • gemini-2.5-flash for Structured Brainstorming:
    I used this model for the initial ideation phase. The goal wasn't just to get text, but to get structured data. By defining a responseSchema, I instructed Gemini to return a clean JSON array of design ideas, each with a name, description, and a visual_prompt. This makes the output reliable and easy to parse, avoiding messy string manipulation.

  • imagen-4.0-generate-001 for Visual Creation:
    This is the artist. It takes the detailed visual_prompt generated by gemini-2.5-flash and transforms it into a beautiful, high-resolution concept image. The results are vibrant, professional, and truly capture the essence of the idea.

  • gemini-2.5-flash-image-preview for Multimodal Magic:
    This is where the true collaboration happens. This model's ability to understand both an image and a text prompt simultaneously is the core of the editing feature. It's not just applying a filter; it's comprehending a visual context and a linguistic instruction to create something entirely new.
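
To make the structured brainstorming step more concrete, here is a minimal sketch of what the schema and the parsing side could look like. It is not the applet's actual code. The three field names (name, description, visual_prompt) come from the description above; the `designIdeasSchema` object, its type constants, and the `parseDesignIdeas` helper are assumptions for illustration, and the exact form the SDK expects for responseSchema may differ.

```typescript
// Shape of one brainstormed idea, using the three fields named in the post.
interface DesignIdea {
  name: string;
  description: string;
  visual_prompt: string;
}

// Assumption: an OpenAPI-style schema object like this is what would be
// supplied as `responseSchema`. The type constants are illustrative.
const designIdeasSchema = {
  type: "ARRAY",
  items: {
    type: "OBJECT",
    properties: {
      name: { type: "STRING" },
      description: { type: "STRING" },
      visual_prompt: { type: "STRING" },
    },
    required: ["name", "description", "visual_prompt"],
  },
} as const;

const REQUIRED_FIELDS = designIdeasSchema.items.required;

// Hypothetical helper: parse the model's JSON text and verify that every
// entry carries the three required string fields before the app uses it.
function parseDesignIdeas(jsonText: string): DesignIdea[] {
  const parsed: unknown = JSON.parse(jsonText);
  if (!Array.isArray(parsed)) {
    throw new Error("Expected a JSON array of design ideas");
  }
  return parsed.map((entry: unknown, index: number): DesignIdea => {
    if (typeof entry !== "object" || entry === null) {
      throw new Error(`Idea ${index} is not an object`);
    }
    const record = entry as Record<string, unknown>;
    for (const field of REQUIRED_FIELDS) {
      if (typeof record[field] !== "string") {
        throw new Error(`Idea ${index} is missing "${field}"`);
      }
    }
    return {
      name: record["name"] as string,
      description: record["description"] as string,
      visual_prompt: record["visual_prompt"] as string,
    };
  });
}
```

Even with a schema in place, a small validation pass like this keeps a malformed response from reaching the image step, where each visual_prompt is handed to the image model.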

Multimodal Features

The star of the show is the AI Design Editor, a powerful multimodal tool that makes visual editing feel like a conversation.

This feature accepts two different types of input—or modalities—at once:

  1. An Image: The existing design concept the user wants to tweak.
  2. Text: A natural language command describing the desired change.

The result is a seamless, iterative workflow. Instead of having to write a brand new, complex prompt to make a small change, the user can simply refine what's already there.
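
The iterative loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the applet's code: the `Part` shape (an inline base64 image plus a text part) follows the request format I understand the Gemini API to use, while `buildEditParts`, `applyEdits`, and the `EditImage` callback are hypothetical names. The callback stands in for the call to the image-editing model and is synchronous only to keep the sketch short; a real model call would be asynchronous.

```typescript
// One piece of a multimodal request: either text or inline image bytes.
// Assumption: these field names mirror the Gemini API part format.
type Part =
  | { text: string }
  | { inlineData: { mimeType: string; data: string } };

// Hypothetical helper: pair the current image with the user's instruction.
function buildEditParts(
  imageBase64: string,
  mimeType: string,
  instruction: string
): Part[] {
  const trimmed = instruction.trim();
  if (imageBase64 === "") {
    throw new Error("An image is required to edit");
  }
  if (trimmed === "") {
    throw new Error("Edit instruction must not be empty");
  }
  return [{ inlineData: { mimeType, data: imageBase64 } }, { text: trimmed }];
}

// Hypothetical stand-in for the model call: takes the request parts and
// returns the edited image as a base64 string.
type EditImage = (parts: Part[]) => string;

// Apply instructions one after another. Each edited image becomes the input
// for the next instruction, which is what makes the workflow iterative.
function applyEdits(
  startImageBase64: string,
  mimeType: string,
  instructions: string[],
  editImage: EditImage
): string {
  let current = startImageBase64;
  for (const instruction of instructions) {
    current = editImage(buildEditParts(current, mimeType, instruction));
  }
  return current;
}
```

The key design point is that the image travels with every request. Continuity comes from passing the latest image back in, so each instruction only needs to describe the change.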

It's the difference between hiring a new artist for every revision and collaborating with one who always starts from your latest draft.

This fundamentally enhances the user experience by making the creative process:

  • Faster: Small tweaks take seconds, not minutes.
  • More Intuitive: Users can express changes naturally, without needing to learn complex "prompt engineering" jargon.
  • More Creative: It encourages experimentation. When the cost of trying a new idea is just typing a sentence, users are more likely to explore wild and wonderful variations.

By combining image and text understanding, Artisan Social transforms a simple image generator into a dynamic and interactive design partner.