Khethelo Mafuleka

Posted on Dec 1, 2025

🎨 VisionVerse: From Image to AI-Generated Poetry in Minutes

#deved #learngoogleaistudio #ai #gemini

Education Track: Build Apps with Google AI Studio

This post is my submission for DEV Education Track: Build Apps with Google AI Studio.

✨ What I Built
I created VisionVerse, an AI-powered web application that transforms images into beautifully crafted poems. Users upload any picture, select a poetry style (from sonnets to haikus), and watch as Gemini AI analyzes the visual content and generates a unique poem in their chosen format.

Key Prompts Used:

1. Main Google AI Studio Prompt:

Create a web application called 'VisionVerse' with the tagline 'Where Images Whisper Poems'. The app should have:
A clean, elegant header with:
App name: 'VisionVerse' in artistic typography
Tagline: 'Where Images Whisper Poems' below it
Soft, poetic color scheme (lavender, cream, soft blues)
Two main columns/layout:
LEFT COLUMN (Controls):
Image upload section with:
Title: 'Upload Image' (not Videos - fix this from the design)
A dashed border upload zone with text 'Drag and drop image here'
'Browse Files' button
Support for common image formats (JPG, PNG, WebP)
Poetry Style Selection:
Title: 'Poetry Style'
8 style buttons arranged in a grid: Sonnet, Haiku, Narrative, Limerick, Ballad, Ode, Acrostic, Free Verse
Each button should have a subtle icon representing the style
Selected style should be highlighted
'Generate Poem' button at the bottom of left column
RIGHT COLUMN (Results):
Display area for uploaded image preview
Generated poem display with:
Parchment-style background
Elegant serif font for poem text
Proper line spacing and formatting
Action buttons below poem:
'Copy Text' button with clipboard icon
'Download Poem' button with download icon
'Regenerate' button with refresh icon
Functionality:
Users can upload an image file
Image preview should show after upload
User selects one poetry style
Clicking 'Generate Poem' sends the image to Gemini for analysis
Gemini should analyze the image content and generate a poem in the selected style
Display the generated poem in the results area
Action buttons should work (copy to clipboard, download as .txt, regenerate new poem)
Technical Requirements:
Use React with TypeScript
Use the latest Gemini SDK for image analysis
Implement proper error handling
Use responsive design for mobile and desktop
Add loading states during image processing
API keys should be handled securely
Use a clean, artistic CSS design with poetic aesthetics
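
To make the upload and style requirements in this prompt concrete, here is a minimal TypeScript sketch of how the accepted formats and the eight styles could be checked. It is not the code AI Studio generated for me; the names `validateUpload` and `isPoetryStyle` and the 10 MB size cap are assumptions made for the example.

```typescript
// The eight styles named in the prompt, as a closed union type.
const POETRY_STYLES = [
  "Sonnet", "Haiku", "Narrative", "Limerick",
  "Ballad", "Ode", "Acrostic", "Free Verse",
] as const;
type PoetryStyle = (typeof POETRY_STYLES)[number];

// MIME types for the formats the prompt asks for (JPG, PNG, WebP).
const SUPPORTED_IMAGE_TYPES: readonly string[] = [
  "image/jpeg", "image/png", "image/webp",
];

// Assumption: a 10 MB cap. The post does not state a size limit.
const MAX_IMAGE_BYTES = 10 * 1024 * 1024;

// Only the two fields we inspect; a browser File object satisfies this shape.
interface UploadCandidate {
  type: string;
  size: number;
}

type ValidationResult = { ok: true } | { ok: false; reason: string };

function validateUpload(file: UploadCandidate): ValidationResult {
  if (!SUPPORTED_IMAGE_TYPES.includes(file.type)) {
    return { ok: false, reason: "Please upload a JPG, PNG, or WebP image." };
  }
  if (file.size > MAX_IMAGE_BYTES) {
    return { ok: false, reason: "Image is larger than 10 MB." };
  }
  return { ok: true };
}

// Type guard so an arbitrary string (for example a button's value)
// can be narrowed to PoetryStyle before it is used.
function isPoetryStyle(value: string): value is PoetryStyle {
  return (POETRY_STYLES as readonly string[]).includes(value);
}
```

Returning a `reason` string rather than throwing keeps the check easy to wire into an inline error message next to the upload zone.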

2. Custom Instructions for Consistency:

VisionVerse App Custom Instructions

General Guidelines

Always use TypeScript with React functional components
Use modern React hooks (useState, useEffect, useContext when appropriate)
Implement proper TypeScript interfaces for all props and state
Follow Google's Material Design principles with poetic adaptations
Use responsive design (mobile-first approach)

Styling & Design

Use a color palette: primary (#E6E6FA - lavender), secondary (#FFFDD0 - cream), accent (#9C89B8 - muted purple)
Implement CSS modules for component styling
Use Google Fonts: "Playfair Display" for headings, "Crimson Text" for poem text
Create an elegant, minimalist design with poetic touches
Add subtle animations for state transitions (fade-in for poems)
Use emojis instead of SVG icons where possible (📷 for upload, ✍️ for generate, 📋 for copy, etc.)

Layout Requirements

Two-column layout for desktop (60/40 or 70/30 split)
Single column stacked layout for mobile
Left column: All controls (upload, style selection, generate button)
Right column: Image preview and poem display
Fixed header with app name and tagline
Footer with attribution to Google AI

Gemini API Implementation

Use Gemini 1.5 Flash for image analysis and poem generation
Keep API key handling secure (use environment variables in deployment)
Implement proper error handling for API calls
Add loading states for image analysis
Cache image analysis results to minimize API calls
Use streaming response for poem generation if possible

Specific Features

Image upload: Support drag-and-drop AND file browser
Image preview: Show thumbnail after upload with option to remove
Poetry styles: 8 options with visual selection feedback
Generated poems: Display with proper line breaks and formatting
Action buttons: Copy (to clipboard), Download (.txt file), Regenerate
Add a "Try with example image" feature for demo purposes

Code Quality

Write clean, commented code
Use meaningful variable and function names
Implement proper error boundaries
Add accessibility features (ARIA labels, keyboard navigation)
Optimize images for web performance
Implement proper form validation

Testing & Debugging

Add console logs for debugging (but remove in production)
Implement error messages for users in UI
Test with various image types (JPG, PNG, WebP)
Ensure mobile responsiveness
Test all user interactions

DO NOT

Do not expose API keys in client-side code
Do not use complex gradients (subtle ones only if needed)
Do not use heavy animations that affect performance
Do not change model strings found in the generated code
Do not use deprecated React patterns
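
One instruction above asks to "cache image analysis results to minimize API calls". I did not inspect how the generated app does this, so the following is only a sketch of one possible approach: a small in-memory cache keyed by a hash of the image data. The `AnalysisCache` class, the `fnv1a` helper, and the 20-entry limit are all hypothetical, and the sketch assumes image analysis is a separate async step that returns text.

```typescript
// FNV-1a 32-bit hash over the string's UTF-16 code units.
// Cheap and non-cryptographic; collisions are possible, so the
// cache key below also includes the input length.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16);
}

type Analyze = (imageBase64: string) => Promise<string>;

class AnalysisCache {
  private readonly entries = new Map<string, Promise<string>>();
  private readonly maxEntries: number;

  constructor(maxEntries = 20) {
    this.maxEntries = maxEntries;
  }

  // Returns the cached promise for this image, or starts a new analysis.
  // Storing the promise (not the result) also de-duplicates requests
  // that are still in flight.
  getOrCompute(imageBase64: string, analyze: Analyze): Promise<string> {
    const key = `${imageBase64.length}:${fnv1a(imageBase64)}`;
    const cached = this.entries.get(key);
    if (cached !== undefined) return cached;

    const pending = analyze(imageBase64);
    // Drop failed requests so a later retry is not served the failure.
    pending.catch(() => {
      this.entries.delete(key);
    });

    if (this.entries.size >= this.maxEntries) {
      // Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, pending);
    return pending;
  }
}
```

With a cache like this, "Regenerate" could request a fresh poem while reusing the stored description of the same image.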

3. Nano Banana Design Prompt:

"Design a complete web application interface for an AI-powered poetry generator called 'VisionVerse' or 'PictoPoet'. The app should have:

A clean, artistic header with the app name and tagline
Left panel with:
- Image upload area with drag-and-drop zone
- Poetry style selection: 6-8 style buttons (Sonnet, Haiku, Narrative, Limerick, Free Verse, Ballad, Ode, Acrostic) with artistic icons
- Generate button with poetic styling
Right panel with:
- Display area for uploaded image with elegant frame
- Generated poem displayed in beautiful typography with parchment-like background
- Action buttons (Copy, Download, Regenerate)
Aesthetic: Soft, creative color palette with poetic elements (ink splashes, quill pens, subtle poetry-themed patterns)
Mobile-responsive layout showing how it adapts to smaller screens

Style: Modern minimalist with artistic flourishes, soft shadows, elegant typography, pastel color scheme with accent colors for interactive elements."

🚀 Live Demo
🔗 Deployed App: https://visionverse-742365824692.us-west1.run.app

📸 Screenshots:

Desktop Interface:

[Screenshot: VisionVerse desktop interface]

Mobile Responsive:

[Screenshot: VisionVerse mobile view]

Poem Generation in Action:

[Screenshot: generated sonnet about suicidal thoughts]

🔧 How It Works
Upload an Image: Drag-and-drop or browse for any image (JPG, PNG, WebP)

Choose Poetry Style: Select from 8 styles: 📜 Sonnet, 🎋 Haiku, 📖 Narrative, 😄 Limerick, 🎵 Ballad, 🏆 Ode, 🔤 Acrostic, 🎨 Free Verse

Generate Poem: Click the ✍️ button to let Gemini analyze your image

Enjoy & Share: Read your unique poem, copy it, download it, or regenerate
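
For readers curious what the "Generate Poem" step might send to Gemini, here is a hedged sketch. It turns the data URL produced by a browser `FileReader` into the text-plus-inline-image parts that the `@google/generative-ai` SDK's `generateContent` accepts. The part types are declared locally so the example has no dependencies, and the function names and prompt wording are my own assumptions, not the generated app's actual prompt.

```typescript
// Local mirrors of the SDK's text and inline-image part shapes.
type TextPart = { text: string };
type InlineImagePart = { inlineData: { data: string; mimeType: string } };
type RequestPart = TextPart | InlineImagePart;

// Splits "data:image/png;base64,AAAA..." into its MIME type and payload.
function splitDataUrl(dataUrl: string): { mimeType: string; base64: string } {
  const match = /^data:([^;,]+);base64,(.+)$/.exec(dataUrl);
  if (!match) {
    throw new Error("Expected a base64 data URL, e.g. data:image/png;base64,...");
  }
  return { mimeType: match[1], base64: match[2] };
}

// Builds the request parts: one instruction, then the image itself.
// Assumption: this prompt wording is illustrative only.
function buildPoemRequest(dataUrl: string, style: string): RequestPart[] {
  const { mimeType, base64 } = splitDataUrl(dataUrl);
  const prompt =
    `Look closely at this image and write a ${style} inspired by it. ` +
    "Follow the conventions of that form and return only the poem.";
  return [{ text: prompt }, { inlineData: { data: base64, mimeType } }];
}
```

In an app using that SDK, the returned array would be passed to `model.generateContent(parts)` and the poem read from the response text.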

💡 Key Features Implemented
✅ Intuitive Image Upload with drag-and-drop support
✅ 8 Poetry Styles with visual selection feedback
✅ Gemini 1.5 Flash Integration for image analysis and poem generation
✅ Responsive Design that works perfectly on desktop and mobile
✅ Action Tools to copy, download, or regenerate poems
✅ Example Images for instant testing without upload
✅ Poetic Design with lavender/cream color palette and elegant typography
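
As an illustration of the download action in the list above, here is a small sketch of how a poem could be prepared as a `.txt` file. The helper names and the `visionverse-<style>-<date>.txt` naming scheme are hypothetical; I have not checked what filename the generated app actually uses.

```typescript
interface PoemDownload {
  filename: string;
  mimeType: string;
  content: string;
}

// Lowercases and replaces runs of non-alphanumerics with single hyphens.
function slugify(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Normalizes line endings and builds a dated filename for the poem.
function buildPoemDownload(
  poem: string,
  style: string,
  date: Date = new Date(),
): PoemDownload {
  const stamp = date.toISOString().slice(0, 10); // YYYY-MM-DD in UTC
  const content = poem.replace(/\r\n?/g, "\n").trim() + "\n";
  return {
    filename: `visionverse-${slugify(style)}-${stamp}.txt`,
    mimeType: "text/plain;charset=utf-8",
    content,
  };
}
```

In the browser, the result would typically be wrapped in a `Blob`, turned into a link with `URL.createObjectURL`, and saved through an anchor element's `download` attribute; the copy action can use `navigator.clipboard.writeText` on the same `content`.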

🎯 What I Learned
1. The Power of Layered Instructions
Using main prompts, custom instructions, and design references together created remarkably consistent results. The AI maintained color schemes (#E6E6FA lavender, #FFFDD0 cream), fonts ("Playfair Display", "Crimson Text"), and coding patterns across all generated files.

2. Custom Instructions = Production Quality
The custom instructions transformed the output from "working code" to "production-ready code":

Proper TypeScript interfaces and error boundaries

Mobile-first responsive design

Secure API key handling via environment variables

Accessibility features (ARIA labels, keyboard navigation)

Performance optimizations
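
To show what "proper TypeScript interfaces" can look like for the loading and error states, here is a minimal sketch using a discriminated union and a reducer. The type and action names are assumptions for illustration and are not copied from the generated code.

```typescript
// Each state carries only the data that is valid in that state.
type GenerationState =
  | { status: "idle" }
  | { status: "loading" }
  | { status: "success"; poem: string }
  | { status: "error"; message: string };

type GenerationAction =
  | { type: "generate" }
  | { type: "succeeded"; poem: string }
  | { type: "failed"; message: string }
  | { type: "reset" };

function generationReducer(
  state: GenerationState,
  action: GenerationAction,
): GenerationState {
  switch (action.type) {
    case "generate":
      return { status: "loading" };
    case "succeeded":
      // Ignore results that arrive when no request is in progress.
      return state.status === "loading"
        ? { status: "success", poem: action.poem }
        : state;
    case "failed":
      return state.status === "loading"
        ? { status: "error", message: action.message }
        : state;
    case "reset":
      return { status: "idle" };
  }
}
```

A reducer in this shape can be passed to React's `useReducer`, and the compiler then prevents the UI from reading `poem` unless `status` is `"success"`.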

3. AI's Self-Correction Capability
Watching Gemini detect and fix its own errors was mind-blowing. During generation, it:

Identified 15+ type mismatches and import conflicts

Fixed them in real-time without intervention

Maintained functional integrity throughout

4. Constraint-Based Development Works
The "DO NOT" section in custom instructions prevented common pitfalls:

No API keys exposed client-side

No heavy animations affecting performance

No deprecated React patterns

No complex gradients (subtle ones only)
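
The first rule in that list usually comes down to reading the key on the server and never shipping it in the browser bundle. Here is a minimal sketch under that assumption; the variable name `GEMINI_API_KEY` and both helper functions are hypothetical, and AI Studio may handle this differently behind the scenes.

```typescript
// Assumption: the key is provided to the server as GEMINI_API_KEY.
const API_KEY_VAR = "GEMINI_API_KEY";

type Env = Record<string, string | undefined>;

// Reads the key from an environment map and fails loudly if it is missing.
function readApiKey(env: Env): string {
  const key = env[API_KEY_VAR]?.trim();
  if (!key) {
    throw new Error(
      `${API_KEY_VAR} is not set. Configure it in the deployment environment.`,
    );
  }
  return key;
}

// Log-safe form of the key: only the last four characters are shown.
function maskKey(key: string): string {
  return key.length <= 4 ? "****" : `****${key.slice(-4)}`;
}
```

On a Node server this would be called as `readApiKey(process.env)`, with `maskKey` used anywhere the key might otherwise end up in logs.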

🚧 Challenges & Solutions
Challenge 1: Color Consistency
Problem: Initial builds used random colors instead of my specified palette.
Solution: Added exact hex codes (#E6E6FA, #FFFDD0, #9C89B8) to custom instructions.

Challenge 2: Font Loading Issues
Problem: Google Fonts weren't loading in preview mode.
Solution: Explicitly specified font names and added proper font loading logic.

Challenge 3: Mobile Layout Breakpoints
Problem: Mobile responsiveness failed at certain screen sizes.
Solution: Added "mobile-first approach" and specific breakpoint instructions.

Challenge 4: Example Feature Implementation
Problem: "Try with example image" feature wasn't included initially.
Solution: Added explicit requirement to both main prompt and custom instructions.

📊 Technical Stack Generated
Frontend: React 18 + TypeScript

Styling: CSS Modules with responsive design

AI Integration: Gemini 1.5 Flash via @google/generative-ai SDK

Fonts: Google Fonts (Playfair Display, Crimson Text)

Icons: Emoji-based system (no SVG dependencies)

Build Tool: Vite

Deployment: Google Cloud Run ready

Accessibility: ARIA labels and keyboard navigation

💭 Reflection & Tips for Others
What Surprised Me Most:
The AI's ability to maintain consistency across dozens of files without explicit coordination. Every component followed the same patterns automatically.

Most Valuable Insight:
Constraints are as important as requirements. Clear "DO NOT" rules prevented problematic "creative" choices by the AI.

Tips for Your Build:
Use the Three-Layer Approach:

Layer 1: Main Prompt (What to build)
Layer 2: Custom Instructions (How to build it)
Layer 3: Visual Reference (What it should look like)

Be Specific About Negatives: Tell the AI what NOT to do.

Combine AI Tools: Use design generators (Nano Banana) alongside code generators.

Iterate in Small Steps: Make small, specific adjustments rather than massive rewrites.

Watch the Thinking Process: Pay attention to the AI's reasoning in the Code Assistant panel.

🎉 Final Thoughts
This Education Track demonstrated that structured AI guidance produces professional results. Google AI Studio's "Build apps with Gemini" feature isn't about replacing developers—it's about augmenting creativity.

By handling boilerplate and consistency, AI lets us focus on what matters: the core idea. In my case, that was creating a magical experience where images whisper poems.

The ability to go from a detailed prompt to a fully deployed application in minutes is revolutionary for:

Rapid prototyping and idea validation

Learning modern development patterns

Accelerating development without sacrificing quality

Exploring creative concepts without technical barriers

Try building your own app with Google AI Studio—you'll be amazed at what you can create in minutes!

❓ Questions for You
Have you tried Google AI Studio's Build feature? What kind of apps are you excited to create? What challenges did you face, and how did you overcome them?

Share your experiences in the comments below! 👇


Top comments (2)

Jason C • Dec 2 '25 • Edited

I like your concepts and prompts, although they seem a bit heavy. For example, AI Studio says it hides API keys automatically, so no additional prompt is needed for that. My projects never saw that problem.

I tested your site with my Thanksgiving turkey and laughed at the results. Nicely done.

Khethelo Mafuleka • Dec 2 '25

Noted, I will surely do likewise.