DEV Community

Pranjal

Wear Obscure Folk Arts

This is a submission for the Google AI Studio Multimodal Challenge

What I Built
I built "FolkOut," an AI-powered platform that connects users with the heritage of indigenous and traditional art from around the world. The applet lets users discover art forms such as Indian Warli or Mexican Otomi, and also reimagine their own outfits in these styles.

A user can take a photo of their plain t-shirt, dress, or jacket, and FolkOut uses multimodal AI to generate a unique, culturally-inspired design and create a photorealistic mockup of the design on their item of clothing. It's a tool for personal expression, cultural discovery, and sustainable fashion.

Demo
You can try the live applet here: folkout.netlify.app
Applet Link for Gemini AI Studio: AI Studio App

Sam in Plain Clothes

Sam in Folk Art

The video demo walks through the following user journey: Video Demo

Upload: A user uploads a photo of themselves in their favorite outfit.

Product Recognition: The AI instantly recognizes the object as an "outfit" and identifies the optimal printable surface area, masking it for design application.

Style Selection: The user browses a curated library of art forms (including Madhubani, Kalamkari, Warli, etc.) and selects "Australian Aboriginal Dot Art."

AI Design Generation: The user enters a simple text prompt, like "a journey to the waterhole." The AI generates a beautiful, authentic-style dot art pattern based on the prompt.

Mockup Creation: The generated design is applied to the garment in the user's original photo, showing a high-fidelity preview of the final product from the same angle and under the same lighting.
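
The journey above can be read as three chained steps: recognize the garment, generate the design, apply it to the photo. Here is a minimal sketch of that chain, not FolkOut's actual code. The names `run_journey`, `Mockup`, and the three stage callables are hypothetical, and the stages are passed in as plain functions so the flow can be run with stand-ins instead of real model calls.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Mockup:
    garment: str    # label from the recognition step, e.g. "t-shirt"
    design: bytes   # generated artwork (opaque bytes in this sketch)
    preview: bytes  # photo with the design applied


def run_journey(
    photo: bytes,
    style: str,
    idea: str,
    recognize: Callable[[bytes], Tuple[str, bytes]],
    generate_design: Callable[[str, str], bytes],
    apply_design: Callable[[bytes, bytes, bytes], bytes],
) -> Mockup:
    """Chain the three stages: photo -> (garment, mask) -> design -> preview."""
    garment, mask = recognize(photo)
    design = generate_design(style, idea)
    preview = apply_design(photo, mask, design)
    return Mockup(garment=garment, design=design, preview=preview)


# Stand-in stages (assumptions for illustration; a real app would call a model here).
def fake_recognize(photo: bytes) -> Tuple[str, bytes]:
    return "t-shirt", b"mask"


def fake_generate(style: str, idea: str) -> bytes:
    return f"{style}: {idea}".encode()


def fake_apply(photo: bytes, mask: bytes, design: bytes) -> bytes:
    return photo + b"|" + mask + b"|" + design


result = run_journey(
    b"photo",
    "Australian Aboriginal Dot Art",
    "a journey to the waterhole",
    fake_recognize,
    fake_generate,
    fake_apply,
)
print(result.garment)
```

Keeping each stage behind a plain function boundary means any one of them can be swapped or tested on its own.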

How I Used Google AI Studio
Google AI Studio would be the engine for this entire experience. I would leverage a model like Gemini for its powerful, integrated multimodal capabilities.

Image Understanding: Gemini's ability to analyze and understand the content of an uploaded image is the first critical step. It would be used to identify the apparel in the user's photo (e.g., "t-shirt"), determine its contours, and isolate the surface area suitable for printing.

Text-to-Image Generation: I would use Gemini's image generation capabilities, guided by a sophisticated prompt that includes the user's text and strict stylistic parameters based on the chosen art form. For example: "Generate a monochromatic, minimalist pattern in the style of Warli painting, depicting a family celebrating a harvest."

Image-to-Image / Inpainting: The most advanced step would involve using the model's image editing capabilities. By providing the original image, a mask of the printable area, and the newly generated art, Gemini could create the final, photorealistic mockup, intelligently handling fabric folds, shadows, and perspective.

Text Generation: To ensure cultural sensitivity and educate users, Gemini would also power an information hub, generating rich descriptions of the history, symbolism, and cultural significance of each art form.
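
For the Text-to-Image step above, the style-constrained prompt could be assembled roughly as follows. This is a sketch under assumptions: the `ArtStyle` type, the `STYLES` table, and the constraint wording are hypothetical and only mirror the Warli example quoted earlier. Sending the prompt to a model is left out.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ArtStyle:
    name: str                    # how the style is named in the prompt
    constraints: Tuple[str, ...]  # stylistic parameters enforced for this art form


# Hypothetical style library; only the Warli entry mirrors the example in the text.
STYLES: Dict[str, ArtStyle] = {
    "warli": ArtStyle("Warli painting", ("monochromatic", "minimalist")),
}


def build_design_prompt(style_key: str, user_text: str) -> str:
    """Combine the user's idea with the fixed constraints of the chosen art form."""
    style = STYLES[style_key]
    traits = ", ".join(style.constraints)
    idea = user_text.strip().rstrip(".")
    return (
        f"Generate a {traits} pattern in the style of {style.name}, "
        f"depicting {idea}."
    )


prompt = build_design_prompt("warli", "a family celebrating a harvest")
print(prompt)
```

Keeping the constraints in a table rather than in the user's text means the stylistic rules stay fixed no matter what the user types.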

Multimodal Features
FolkOut is fundamentally multimodal, creating a seamless flow between the user's physical world and digital creation.

Apparel-Aware Design (Image -> Text -> Image): The core feature is the chain of operations where the app sees a user's outfit, listens to their creative idea, and shows them a finished concept. This goes beyond simple image generation by grounding the creation in a real-world clothing item provided by the user.

Visual Style Adherence (Text + Style -> Image): The AI doesn't just generate a picture from a prompt; it generates it within the specific visual language of a chosen cultural art form. This enhances the user experience by allowing them to create something that feels authentic and respectful to the source style.

Interactive Mockups (Image + Image -> Image): By merging the user's photo with the AI-generated art, the app provides an immediate, tangible preview that is far more engaging than just seeing a pattern on a blank background. This powerful visual feedback is key to the user's confidence in their design.
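
In FolkOut the Image + Image -> Image merge is delegated to the model. As a rough classical analogue of what the three inputs (photo, mask, design) mean, the sketch below does a multiply blend inside the mask so that the fabric's shading carries over to the design. It is an illustration only, not the app's method. Images are tiny nested lists of RGB tuples, and the function names are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]


def luminance(pixel: Pixel) -> float:
    """Perceived brightness in [0, 1] using the Rec. 601 luma weights."""
    r, g, b = pixel
    return (0.299 * r + 0.587 * g + 0.114 * b) / 255.0


def composite(photo: Image, mask: List[List[bool]], design: Image) -> Image:
    """Inside the mask, shade the design by the photo's brightness; elsewhere keep the photo."""
    result: Image = []
    for y, row in enumerate(photo):
        out_row: List[Pixel] = []
        for x, photo_px in enumerate(row):
            if mask[y][x]:
                shade = luminance(photo_px)  # darker fabric (folds, shadows) darkens the design
                dr, dg, db = design[y][x]
                out_row.append((round(dr * shade), round(dg * shade), round(db * shade)))
            else:
                out_row.append(photo_px)
        result.append(out_row)
    return result


photo = [[(255, 255, 255), (128, 128, 128)],
         [(10, 20, 30), (255, 255, 255)]]
mask = [[True, True],
        [False, False]]
design = [[(200, 100, 0), (200, 100, 0)],
          [(200, 100, 0), (200, 100, 0)]]

mockup = composite(photo, mask, design)
```

A multiply blend like this handles shading but not perspective or fabric distortion, which is where a generative model has more to offer.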

This multimodal approach makes the experience highly personal, educational, and creatively fulfilling.
