
Amogh Sunil


VidyaChitra — AI Study Companion That Turns Indian Textbook PDFs into Videos, Audio & Exam Prep in Regional Languages

This is a submission for the DEV Weekend Challenge: Community

The Community

Indian school students who study in regional languages — Kannada, Hindi, Tamil, Telugu, and Marathi. Over 250 million students across India learn from State Board and NCERT textbooks written in these languages, yet almost every AI-powered study tool available today is English-first. A Class 10 student in rural Karnataka studying electromagnetism from a Kannada textbook has no AI tutor, no animated explainer, no spoken narration — just a dense PDF and an overworked teacher. VidyaChitra is built for them.

What I Built

VidyaChitra (विद्याचित्र — "picture of knowledge" in Sanskrit) is an AI study companion that transforms any school textbook chapter PDF into a complete study kit in the student's own language:

  • Chapter summary — teacher-style explanation, streamed within seconds
  • Animated explainer video — AI writes a Manim animation script for the key concept, rendered as MP4 with labels in the student's language
  • Audio narration — spoken teacher-style explanation via Gemini TTS
  • Board-pattern exam questions — MCQs and short-answer questions framed exactly as they appear in Karnataka SSLC, CBSE, Maharashtra SSC, or Tamil Nadu board exams
  • Grounded AI chat — answers questions strictly from the chapter, no hallucinations

Zero configuration needed. Upload a PDF — language, board, and class are auto-detected by Gemini.

Demo

Code

VidyaChitra — AI Study Companion for Indian School Students


विद्याचित्र (VidyaChitra) means "picture of knowledge" in Sanskrit. It is an AI-powered study companion that transforms any NCERT or State Board textbook PDF into a complete, personalised study kit — in the student's own language — within seconds.


The Problem

Over 250 million school students in India study from State Board and NCERT textbooks written in regional languages like Kannada, Hindi, Tamil, Telugu, and Marathi. These students face three major challenges:

  1. Comprehension gap — Dense textbook language is hard to understand without a teacher's explanation, especially for first-generation learners.
  2. No visual aids — Diagrams in textbooks are static. Complex science and math concepts — ray diagrams, circuit diagrams, biological processes — are very hard to learn from a flat image alone.
  3. Exam unpreparedness — Students don't know how questions will be framed in their specific board's pattern (Karnataka SSLC, CBSE, Maharashtra…

How I Built It

Everything is powered by a single model family: Google Gemini 2.5 Flash, including its TTS variant for audio, accessed via the google-genai SDK.

Backend: Python 3.11 + FastAPI. The PDF is passed as raw bytes to Gemini's native PDF mode (mime_type="application/pdf") — it reads all pages, Indic scripts, diagrams, and formulas in one API call. No OCR needed.
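A minimal sketch of what that single call could look like with the google-genai SDK. The function names, the prompt text, the `GEMINI_API_KEY` environment variable, and the `%PDF-` header guard are illustrative assumptions, not code from the repo.

```python
import os


def looks_like_pdf(data: bytes) -> bool:
    """Cheap sanity check (an assumed guard): PDF files begin with '%PDF-'."""
    return data.startswith(b"%PDF-")


def summarise_chapter(
    pdf_bytes: bytes,
    prompt: str = "Summarise this chapter like a teacher, in the chapter's own language.",
) -> str:
    """Send the raw PDF bytes to Gemini in one request and return the text reply."""
    if not looks_like_pdf(pdf_bytes):
        raise ValueError("expected raw PDF bytes")

    # Imported lazily so this module loads even where the SDK is not installed.
    from google import genai
    from google.genai import types

    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=[
            # The PDF travels inline as bytes; no OCR or page splitting beforehand.
            types.Part.from_bytes(data=pdf_bytes, mime_type="application/pdf"),
            prompt,
        ],
    )
    return response.text
```

In a FastAPI handler, `pdf_bytes` would typically come from `await file.read()` on the uploaded file.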

Video generation uses a two-step pipeline. First, Gemini reads the chapter summary (already in the student's language) and writes a structured 3-step JSON concept script. Second, it writes a Python Manim scene from that script, which is rendered to MP4. Indic text is placed with self.add() instead of FadeIn() because Cairo crashes when rendering Indic glyphs at partial opacity; this platform-specific fix took significant debugging.
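Because the second step depends on the first, it can help to validate the 3-step concept script before asking for Manim code. The sketch below shows one way to do that with the standard library. The JSON shape (`steps`, `title`, `visual`) and the names `Step` and `parse_concept_script` are hypothetical; the post does not show the real schema.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    title: str   # short label, in the student's language (hypothetical field)
    visual: str  # what the animation should show (hypothetical field)


def parse_concept_script(raw: str, expected_steps: int = 3) -> list[Step]:
    """Parse the model's JSON reply and insist on exactly `expected_steps` steps."""
    data = json.loads(raw)
    steps = data.get("steps") if isinstance(data, dict) else None
    if not isinstance(steps, list) or len(steps) != expected_steps:
        raise ValueError(f"concept script must contain {expected_steps} steps")
    parsed = []
    for item in steps:
        if not isinstance(item, dict) or "title" not in item or "visual" not in item:
            raise ValueError("each step needs 'title' and 'visual'")
        parsed.append(Step(title=str(item["title"]), visual=str(item["visual"])))
    return parsed
```

Rejecting a malformed script at this point is cheaper than discovering the problem after a failed Manim render.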

Audio: Gemini writes a spoken narration script, then Gemini 2.5 Flash TTS synthesises it. The TTS call returns raw 16-bit PCM at 24 kHz, which is wrapped in a WAV container using Python's wave module.
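The WAV wrapping step can be sketched with the standard library alone. The function name `pcm_to_wav` is illustrative, and mono output is an assumption; the post only states the sample width and sample rate.

```python
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate: int = 24_000, channels: int = 1) -> bytes:
    """Wrap raw 16-bit PCM samples in a WAV container, entirely in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)     # assumed mono
        wav.setsampwidth(2)            # 16-bit samples are 2 bytes wide
        wav.setframerate(sample_rate)  # 24 kHz, as stated above
        wav.writeframes(pcm)
    # wave does not close a file object it did not open, so the buffer is still readable.
    return buf.getvalue()
```

The returned bytes can be uploaded to storage or served directly with an `audio/wav` content type.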

Streaming: All three pipelines (video, audio, questions) run concurrently via asyncio.create_task + asyncio.Queue. Results are pushed to the frontend via SSE the moment each one finishes. A 15-second keepalive ping prevents browsers from dropping the connection during long Manim renders.
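The fan-out pattern described here can be sketched as an async generator that yields SSE frames. This is a simplified illustration under stated assumptions: the names `sse_event` and `stream_results`, the event names, and the error-event shape are hypothetical and not taken from the repo.

```python
import asyncio
import json
from typing import AsyncIterator, Awaitable, Callable, Dict

KEEPALIVE_SECONDS = 15.0


def sse_event(event: str, data: dict) -> str:
    """Format one Server-Sent Events frame; json.dumps keeps the data on one line."""
    return f"event: {event}\ndata: {json.dumps(data, ensure_ascii=False)}\n\n"


async def stream_results(
    jobs: Dict[str, Callable[[], Awaitable[dict]]],
    keepalive: float = KEEPALIVE_SECONDS,
) -> AsyncIterator[str]:
    """Run every job concurrently and yield each result as soon as it is ready."""
    queue: asyncio.Queue = asyncio.Queue()

    async def run(name: str, job: Callable[[], Awaitable[dict]]) -> None:
        try:
            await queue.put((name, await job()))
        except Exception as exc:  # report the failure instead of stalling the stream
            await queue.put(("error", {"pipeline": name, "message": str(exc)}))

    tasks = [asyncio.create_task(run(name, job)) for name, job in jobs.items()]
    pending = len(tasks)
    try:
        while pending:
            try:
                name, payload = await asyncio.wait_for(queue.get(), timeout=keepalive)
            except asyncio.TimeoutError:
                # SSE comment line: ignored by EventSource, but keeps the socket active.
                yield ": keepalive\n\n"
                continue
            pending -= 1
            yield sse_event(name, payload)
    finally:
        for task in tasks:
            task.cancel()  # stop unfinished work if the client disconnects
```

With FastAPI, a generator like this could be returned as `StreamingResponse(stream_results(jobs), media_type="text/event-stream")`. Routing every pipeline through one queue means the slowest job (usually the Manim render) never blocks faster results.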

Frontend: React 18 + TypeScript + Vite + TailwindCSS with a custom useSSEStream hook for EventSource lifecycle management.

Stack: Python, FastAPI, React, TypeScript, Google Gemini 2.5 Flash, Gemini TTS, Manim, PyMuPDF, Google Cloud Storage, SSE, Docker
