This is a submission for the Google AI Studio Multimodal Challenge
What I Built
I built StoryWeaver AI, a multimodal storytelling web application powered by Google Gemini 2.5 Flash.
The app lets anyone input text, an image, or audio (individually or combined) and instantly transforms it into an engaging 300–400 word creative story with a short narration script.
The goal is simple: to make storytelling more accessible, fun, and creative by blending traditional story crafting with cutting-edge AI capabilities.
Built with Flask + TailwindCSS and deployed on AWS EC2 with a custom domain and HTTPS, StoryWeaver AI provides a smooth, secure, and visually appealing experience.
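Under the hood, the output format (a 300–400 word story plus a narration script) is pinned down by the instruction sent to the model. The exact prompt isn't reproduced in this post; the constant below is only a sketch of what it could look like, and its name and wording are assumptions:

```python
# Hypothetical prompt constant; the app's real prompt may differ.
STORY_PROMPT = (
    "You are a creative storyteller. Using the provided text, image, "
    "and/or audio as inspiration, write an engaging story of roughly "
    "300-400 words, followed by a short narration script suitable "
    "for reading the story aloud."
)
```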
Demo
🎥 YouTube Walkthrough:
🚀 Live App → https://story.praveshsudha.com
🧑‍💻 Full Source Code → Dev.to Challenges repo by Pravesh Sudha (navigate to the google-studio-challenge directory)
📸 Screenshots
How I Used Google AI Studio
I used Google AI Studio with the Gemini 2.5 Flash model to handle multimodal input. By integrating the API into my Flask backend, I was able to process different forms of content:
- Text prompts are directly turned into narrative-rich stories.
- Image inputs are interpreted, and the AI builds a story inspired by visual details.
- Audio inputs are analyzed, and the context is woven into a creative narrative.
This combination makes the app versatile and fun: users are free to interact with it however they like.
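For context, here is a minimal sketch of how such an endpoint can be wired up with the google-genai Python SDK. The route name, form-field names, and inline prompt are illustrative assumptions, not the app's exact code:

```python
from flask import Flask, request, jsonify
from google import genai
from google.genai import types

app = Flask(__name__)
client = genai.Client()  # picks up GEMINI_API_KEY from the environment

@app.post("/generate")  # hypothetical route name
def generate_story():
    # Gather whichever modalities the user supplied; all are optional.
    parts = ["Write an engaging 300-400 word story with a short narration script."]
    if text := request.form.get("text"):
        parts.append(text)
    if image := request.files.get("image"):
        parts.append(types.Part.from_bytes(data=image.read(), mime_type=image.mimetype))
    if audio := request.files.get("audio"):
        parts.append(types.Part.from_bytes(data=audio.read(), mime_type=audio.mimetype))

    # A single generate_content call accepts any mix of text, image, and audio parts.
    response = client.models.generate_content(model="gemini-2.5-flash", contents=parts)
    return jsonify({"story": response.text})
```

Sending the image and audio inline as raw bytes with their MIME types works well for small uploads; much larger files would need the Gemini Files API instead.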
Multimodal Features
The standout feature is that users aren't restricted to just one form of input. They can:
- Provide just text for a direct storytelling experience.
- Provide an image to get a narrative based on visuals.
- Provide audio for stories generated from sound-based input.
- Or combine all three for richer, more context-aware responses (see the request sketch below).
This flexibility showcases the true strength of Gemini's multimodal capabilities, turning it into more than just a text generator; it becomes a storytelling partner.
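A combined request to the hypothetical endpoint sketched earlier could look like this (the endpoint path and field names are the same assumptions as above):

```python
import requests

# All three modalities in a single request (field names match the sketch above).
resp = requests.post(
    "https://story.praveshsudha.com/generate",  # assumed endpoint path
    data={"text": "A lighthouse keeper finds a map that glows at night."},
    files={
        "image": open("coastline.jpg", "rb"),
        "audio": open("waves.mp3", "rb"),
    },
)
print(resp.json()["story"])
```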
Why It Matters
For centuries, stories have been humanity's default way of sharing ideas, culture, and imagination. From cave paintings to epics, from bedtime tales to novels, stories shape how we learn, dream, and connect.
But creating stories isn't always easy for everyone. That's where AI helps. With StoryWeaver AI, anyone (whether a child imagining a dragon, a student preparing for class, or a casual dreamer) can bring their ideas to life instantly.
By blending human creativity with AI's multimodal understanding, we're expanding the ways people can express themselves.
Conclusion
StoryWeaver AI is my way of showing how AI and storytelling can beautifully merge. With the power of Google Gemini 2.5 Flash, this project highlights how multimodal inputs can enrich experiences beyond plain text.
✨ Try it out here: https://story.praveshsudha.com
I hope this inspires you to imagine what's possible when we combine AI and creativity. After all, "If you can think it, you can build it!"
🌐 Connect with me:
- 🐙 GitHub: Pravesh-Sudha
- 💼 LinkedIn: Pravesh Sudha
- 🐦 Twitter/X: @praveshstwt
- 📺 YouTube: @pravesh-sudha