_This is a submission for the Google AI Studio Multimodal Challenge_
What I Built
I built Client Videos to Notes App, a productivity tool that converts video content into structured notes.
The app solves the problem of information overload in video content. Instead of watching long videos and manually writing notes, users can simply upload a client video and instantly receive concise, organized notes.
This makes it especially helpful for:
- Students reviewing lecture videos
- Freelancers handling client video briefs
- Professionals processing meeting recordings
Demo
GitHub Repo: Videos_to_Notes_Repo
Live App: Videos To Notes App
Demo Video: Watch on YouTube
How I Used Google AI Studio
I integrated Gemini 2.5 Pro through Google AI Studio (via GitHub Pro free tier).
Here’s how it works:
1. The user uploads a video.
2. Audio is extracted and passed to Gemini 2.5 Pro.
3. Gemini processes the audio + video content and generates structured text notes.
4. The notes are displayed in an interactive React interface.
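The flow above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the function names (`buildNotesRequest`, `extractNotes`, `generateNotes`) and the prompt text are hypothetical, and the sketch skips the audio-extraction step by sending the video file directly as base64 inline data, which only suits short clips (larger files would usually go through the Files API instead). Only the endpoint and request body shape follow the public Gemini REST API.

```typescript
// Hypothetical helper names; only the endpoint and JSON shape come from the Gemini REST API.
const GEMINI_URL =
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-pro:generateContent";

interface GeminiPart {
  text?: string;
  inline_data?: { mime_type: string; data: string };
}

interface GeminiRequest {
  contents: { parts: GeminiPart[] }[];
}

// Build the request body: the video as base64 inline data, followed by a text prompt.
function buildNotesRequest(base64Video: string, mimeType: string): GeminiRequest {
  return {
    contents: [
      {
        parts: [
          { inline_data: { mime_type: mimeType, data: base64Video } },
          { text: "Summarize this video as structured notes with headings and bullet points." },
        ],
      },
    ],
  };
}

// Pull the generated text out of a generateContent response; returns "" if there is none.
function extractNotes(response: {
  candidates?: { content?: { parts?: { text?: string }[] } }[];
}): string {
  const parts = response.candidates?.[0]?.content?.parts ?? [];
  return parts.map((p) => p.text ?? "").join("");
}

// Send the request. Assumption: this runs server-side so the API key stays out of the React bundle.
async function generateNotes(
  base64Video: string,
  mimeType: string,
  apiKey: string
): Promise<string> {
  const res = await fetch(GEMINI_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json", "x-goog-api-key": apiKey },
    body: JSON.stringify(buildNotesRequest(base64Video, mimeType)),
  });
  if (!res.ok) throw new Error(`Gemini request failed: ${res.status}`);
  return extractNotes(await res.json());
}
```

Keeping the request builder and the response parser as pure functions means they can be unit tested without calling the API or spending free-tier quota.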
Multimodal Features
- Video + Audio Understanding: Gemini 2.5 Pro processes both video and audio streams.
- Text Summarization: Key insights and points are extracted automatically.
- Cross-format Processing: The app accepts multiple video formats and converts them into readable notes.
This multimodal approach enhances the user experience by transforming complex, time-consuming video content into instantly accessible knowledge.
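For the cross-format processing mentioned above, one plausible approach is to map the uploaded file's extension to a MIME type before sending it to Gemini. The post does not say which formats the app accepts, so the table below is an assumption based on video types Gemini documents as supported, and `videoMimeType` is a hypothetical helper name.

```typescript
// Assumed format table; the app's real list of accepted formats is not stated in the post.
const VIDEO_MIME_TYPES: Record<string, string> = {
  mp4: "video/mp4",
  mpeg: "video/mpeg",
  mov: "video/mov",
  avi: "video/avi",
  webm: "video/webm",
  wmv: "video/wmv",
  "3gp": "video/3gpp",
};

// Return the MIME type for a file name, or null if the extension is missing or unknown.
function videoMimeType(fileName: string): string | null {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0 || dot === fileName.length - 1) return null;
  const ext = fileName.slice(dot + 1).toLowerCase();
  return VIDEO_MIME_TYPES[ext] ?? null;
}
```

Rejecting unknown extensions up front lets the interface show a clear error before any upload or API call happens.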
Notes for Judges
_This project was built solo by @aayuamor.
I did not use paid cloud services. Instead of Cloud Run, the app is deployed on Vercel free hosting.
All AI functionality relies on Gemini 2.5 Pro (free tier), showcasing what’s possible even without paid APIs._