🎥 Local AI Video Analyzer (viddesc)

Local AI Video Analyzer is an open-source, lightweight, and quota-optimized engine for deep video understanding. It bridges the gap between local privacy/efficiency and powerful cloud-based reasoning.

By combining local OCR and audio extraction with Groq's high-speed Llama models, this tool analyzes a video's visual scenes, on-screen text, and spoken dialogue while keeping API usage, and therefore cost, low.

🌟 Why This Project?

Most video AI tools are either purely cloud-based (expensive) or purely local (slow). viddesc uses a hybrid approach:

- Local Heavy Lifting: OCR and frame extraction happen on your machine.
- Turbocharged Reasoning: Groq handles the complex narrative merging in milliseconds.
- Privacy First: Temporary files are purged immediately after processing.
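
As a rough illustration of the "purged immediately after processing" idea, the sketch below keeps all intermediate files in a scratch directory that is deleted as soon as processing ends. It is a minimal sketch using Python's standard `tempfile` module and is not taken from the viddesc source: `analyze_with_scratch_dir`, `fake_process`, and the `viddesc_` prefix are hypothetical names chosen for the example.

```python
import tempfile
from pathlib import Path
from typing import Callable


def analyze_with_scratch_dir(video_path: str, process: Callable[[Path, Path], str]) -> str:
    """Run `process` with a scratch directory that is deleted on exit.

    `process` is a hypothetical callback standing in for the real pipeline
    (frame extraction, OCR, audio extraction). It receives the video path
    and the scratch directory and returns the analysis text.
    """
    # TemporaryDirectory removes the directory and everything in it when the
    # `with` block exits, including when `process` raises an exception.
    with tempfile.TemporaryDirectory(prefix="viddesc_") as scratch:
        return process(Path(video_path), Path(scratch))


def fake_process(video: Path, scratch: Path) -> str:
    # Stand-in for real work: write an intermediate file, then summarize it.
    frame = scratch / "frame_0001.txt"
    frame.write_text("placeholder OCR text")
    # Remember the scratch path only so the cleanup can be checked afterwards.
    fake_process.last_scratch = scratch
    return f"{video.name}: {frame.read_text()}"


result = analyze_with_scratch_dir("demo.mp4", fake_process)
print(result)
print(fake_process.last_scratch.exists())
```

Tying cleanup to a context manager, rather than to an explicit delete call at the end of the pipeline, means the intermediate frames and audio are removed even if a step fails partway through.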

✨ Key Features

Hybrid Vision & OCR
- Vision: Uses meta-llama/llama-4-scout-17b-16e-instruct on 5 keyframes for visual context.
- Local OCR: High-frequency Tesseract OCR (1 fps) captures on-screen text with zero API…