Videos to Notes App – Productivity Boost with Multimodal AI

#devchallenge #googleaichallenge #gemini #ai

_This is a submission for the Google AI Studio Multimodal Challenge_
What I Built

I built the Videos to Notes App, a productivity tool that converts video content into structured notes.

The app addresses information overload in video content. Instead of watching long videos and writing notes by hand, users upload a video and receive concise, organized notes.

This makes it especially helpful for:

Students reviewing lecture videos
Freelancers handling client video briefs
Professionals processing meeting recordings

Demo

GitHub Repo: Videos_to_Notes_Repo

Live App: Videos To Notes App

Demo Video: Watch on YouTube

How I Used Google AI Studio

I integrated Gemini 2.5 Pro through Google AI Studio (via GitHub Pro free tier).

Here’s how it works:

User uploads a video.
Audio is extracted and passed to Gemini 2.5 Pro.
Gemini processes the audio + video content and generates structured text notes.
Notes are displayed in an interactive React interface.
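
The steps above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: it assumes the media is sent as base64 inline data to the public Gemini REST `generateContent` endpoint, and the names `buildNotesRequest`, `extractNotes`, `generateNotes`, and `NOTES_PROMPT` are hypothetical. Inline data only suits small files, so larger uploads would likely need a different upload path.

```typescript
// Sketch only: the request shape follows the public Gemini REST API
// (generateContent). Function names and the prompt text are hypothetical.

interface GeminiPart {
  text?: string;
  inline_data?: { mime_type: string; data: string };
}

interface GeminiRequest {
  contents: { parts: GeminiPart[] }[];
}

interface GeminiResponse {
  candidates?: { content?: { parts?: { text?: string }[] } }[];
}

// Assumed prompt; the real app's prompt is not shown in the post.
const NOTES_PROMPT =
  "Summarize this recording as structured notes with headings and bullet points.";

// Build the JSON body: the media goes first as base64 inline data, then the prompt.
function buildNotesRequest(base64Media: string, mimeType: string): GeminiRequest {
  return {
    contents: [
      {
        parts: [
          { inline_data: { mime_type: mimeType, data: base64Media } },
          { text: NOTES_PROMPT },
        ],
      },
    ],
  };
}

// Join the text parts of the first candidate; return "" if no text came back.
function extractNotes(response: GeminiResponse): string {
  const parts = response.candidates?.[0]?.content?.parts ?? [];
  return parts.map((p) => p.text ?? "").join("").trim();
}

// Send the request. Needs a runtime with global fetch (browser or Node 18+).
async function generateNotes(
  apiKey: string,
  base64Media: string,
  mimeType: string
): Promise<string> {
  const url =
    "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-pro:generateContent";
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json", "x-goog-api-key": apiKey },
    body: JSON.stringify(buildNotesRequest(base64Media, mimeType)),
  });
  if (!res.ok) throw new Error(`Gemini request failed: ${res.status}`);
  return extractNotes((await res.json()) as GeminiResponse);
}
```

A React component could call `generateNotes` after the upload finishes and render the returned string as the notes view. Keeping request building and response parsing in pure functions makes both easy to check without a network call.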

Multimodal Features

Video + Audio Understanding: Gemini 2.5 Pro processes both video and audio streams.
Text Summarization: Key insights and points are extracted automatically.
Cross-format Processing: The app accepts multiple video formats and converts them into readable notes.
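
As a rough illustration of cross-format handling, the sketch below maps a file extension to a MIME type before upload. The table is an assumption: the post does not list which formats the app accepts, and the MIME strings are ones I believe the Gemini video documentation names. `videoMimeType` is a hypothetical helper.

```typescript
// Assumed extension-to-MIME table; the app's real accepted formats are not stated.
const VIDEO_MIME_TYPES = new Map<string, string>([
  ["mp4", "video/mp4"],
  ["webm", "video/webm"],
  ["mpeg", "video/mpeg"],
  ["mov", "video/mov"],
]);

// Return the MIME type for a file name, or null if the extension is not in the table.
function videoMimeType(fileName: string): string | null {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0) return null; // no extension at all
  const ext = fileName.slice(dot + 1).toLowerCase();
  return VIDEO_MIME_TYPES.get(ext) ?? null;
}
```

Rejecting unknown extensions up front lets the interface show an error immediately instead of waiting for a failed API call.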

This multimodal approach enhances the user experience by transforming complex, time-consuming video content into instantly accessible knowledge.

Notes for Judges

_This project was built solo by @aayuamor._

_I did not use paid cloud services. Instead of Cloud Run, the app is deployed on Vercel free hosting._

_All AI functionality relies on Gemini 2.5 Pro (free tier), showcasing what’s possible even without paid APIs._