CurioShorts

#devchallenge #googleaichallenge #ai #gemini

CurioShorts is an AI-powered educational content generator. It turns any question a user asks into a short, engaging, TikTok-style video. Users pick a fun character (such as Spider-Man or a custom one) as the narrator and choose a visual art style. The app then automatically generates a simple song with lyrics, a custom image for each line, and a voice-over narration, all explaining the answer to the question over background music.

What Problem It Solves
The app directly tackles the problem of "brain rot": the passive, often low-value content consumption common on short-form video platforms. Its goal is to "swap brain rot for brain fuel" by making learning as engaging, accessible, and fun as scrolling through social media. It offers a creative, educational alternative, especially for younger audiences.

Tech Stack

Frontend: TypeScript, HTML5, CSS3
AI Model: Google Gemini API (@google/genai)
Client-Side Storage: IndexedDB for persisting generated shorts.
Browser APIs:
1. Web Speech API (SpeechSynthesis): for text-to-speech voice narration
2. Intersection Observer: to manage video playback efficiently as the user scrolls
Libraries: marked for rendering lyrics.
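
To make the scroll-driven playback idea concrete, here is a minimal sketch of the decision logic, assuming each short reports how much of it is on screen. The `Visibility` shape, the `pickActiveShort` name, and the 0.6 threshold are illustrative assumptions, not taken from the app's source.

```typescript
// Hypothetical record: one entry per short currently being observed.
interface Visibility {
  id: string;    // identifier of the short (assumed)
  ratio: number; // fraction of the element in view, from 0 to 1
}

// Return the id of the short that should be playing, or null when no short
// is visible enough. On a tie, the earlier entry wins.
function pickActiveShort(entries: Visibility[], threshold = 0.6): string | null {
  let best: Visibility | null = null;
  for (const entry of entries) {
    if (entry.ratio >= threshold && (best === null || entry.ratio > best.ratio)) {
      best = entry;
    }
  }
  return best === null ? null : best.id;
}
```

In the browser, the `ratio` values would come from the `intersectionRatio` of each `IntersectionObserverEntry`, and the returned id would decide which short plays and is narrated while the others are paused.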

Demo
App: https://ai.studio/apps/drive/1o-mWyYV79577vjypSg8wIzIjMHyVfWu0

YouTube demo: https://youtu.be/OCWtzV25rGQ?si=HMBppSJ1SRkF9jUc

How I Used Google AI Studio
This app was built entirely in Google AI Studio. The core logic, including the prompts for generating song structures and image descriptions, was developed and tested in AI Studio's flexible environment before being integrated into the application code with the Gemini API.

Multimodal Features
The app taps into Gemini’s multimodal capabilities through a two-step pipeline:

Structured Text Generation (gemini-2.5-flash): The user’s question is passed to the model with instructions to act as a songwriter. The model outputs a JSON structure containing the song’s lyrics, a detailed image prompt for each lyric line, and a suggested music style.
Text-to-Image Generation (gemini-2.5-flash-image-preview): Each image prompt is then fed into Gemini’s image model, producing unique, stylized visuals that match the lyrics.

By chaining these steps, the app turns a single text query into an integrated audio-visual learning experience, blending storytelling, imagery, and music.
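
The two-step chain can be sketched roughly as follows. This is an illustration under assumptions, not the app's actual code: the JSON field names (`musicStyle`, `lines`, `lyric`, `imagePrompt`), the prompt wording, and the `TextModel` / `ImageModel` adapters are hypothetical stand-ins for the real Gemini calls, which are left out so the sketch stays self-contained.

```typescript
// Assumed JSON shape for step 1; the field names are illustrative.
interface SongLine { lyric: string; imagePrompt: string; }
interface SongPlan { musicStyle: string; lines: SongLine[]; }

// Hypothetical adapters wrapping the text and image model calls.
type TextModel = (prompt: string) => Promise<string>;
type ImageModel = (prompt: string) => Promise<string>; // e.g. resolves to an image URL

// Step 1 prompt: ask the model to act as a songwriter and reply with JSON.
function buildSongPrompt(question: string, narrator: string): string {
  return [
    `You are a songwriter writing as ${narrator}.`,
    `Write a short, simple song that answers: "${question}"`,
    'Reply with JSON only: {"musicStyle": string, "lines": [{"lyric": string, "imagePrompt": string}]}',
  ].join("\n");
}

// Validate the model's reply before any image generation is attempted.
function parseSongPlan(raw: string): SongPlan {
  const data = JSON.parse(raw) as Partial<SongPlan> | null;
  if (!data || typeof data.musicStyle !== "string" || !Array.isArray(data.lines)) {
    throw new Error("Song plan is missing musicStyle or lines");
  }
  const lines = data.lines.map((line, i) => {
    if (!line || typeof line.lyric !== "string" || typeof line.imagePrompt !== "string") {
      throw new Error(`Line ${i} is missing lyric or imagePrompt`);
    }
    return { lyric: line.lyric, imagePrompt: line.imagePrompt };
  });
  return { musicStyle: data.musicStyle, lines };
}

// Step 2: feed every image prompt to the image model, one image per lyric line.
async function generateShort(
  question: string,
  narrator: string,
  artStyle: string,
  textModel: TextModel,
  imageModel: ImageModel,
): Promise<{ plan: SongPlan; images: string[] }> {
  const plan = parseSongPlan(await textModel(buildSongPrompt(question, narrator)));
  const images = await Promise.all(
    plan.lines.map((line) => imageModel(`${line.imagePrompt}, in ${artStyle} style`)),
  );
  return { plan, images };
}
```

Validating the JSON between the two steps means a malformed reply fails early, before any image requests are made.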
