DEV Community

tumf

Posted on • Originally published at blog.tumf.dev

Announcing the Universal Explanation Video Generator: Just Enter a URL and Let Zundamon and Friends Explain

Originally published on 2026-01-31
Original article (Japanese): なんでも解説動画ジェネレーターを公開: URLを入れるだけでずんだもん達が解説

I enjoy reading technical blogs and documentation, but sometimes I prefer to watch a video. That's why I created a tool that generates explanation videos just by entering a URL.

The Universal Explanation Video Generator is a web application that automatically generates videos where characters like Zundamon and Shikoku Metan explain the content when you input a URL.

In this article, I will introduce how to use it and the technical details behind it.

How It Works

Let's take a look at the flow from operating the Web UI to generating the video.

By simply entering a URL and waiting, a video with the following characteristics is generated:

  • Voice: Character voices generated by VOICEVOX
  • Slides: Visuals summarizing the content of the article
  • Dialogue Format: Multiple characters explain the content in a conversational manner

It supports basically any text-based content, including technical blog articles, official documentation, and news articles.

How to Use: Complete in 3 Steps

Using it is very simple.

1. Enter the URL

Paste the URL of the page you want to turn into an explanation video into the input field.

2. Wait for Generation

The following processes are automatically executed in the background:

  1. Content retrieval and analysis
  2. Script generation using LLM (Large Language Model)
  3. Voice synthesis (VOICEVOX)
  4. Slide image generation
  5. Video rendering

The progress is displayed in real-time, so you can easily see how far along the process is.
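The five steps above can be sketched as a simple Python pipeline. The function names and the progress-callback shape here are illustrative assumptions, not the project's actual API; each step is stubbed out so the flow is visible.

```python
def fetch_content(url: str) -> str:
    """Step 1: retrieve and extract the article text (stubbed here)."""
    return f"article text from {url}"

def generate_script(text: str) -> list[dict]:
    """Step 2: turn the text into speaker/line pairs via an LLM (stubbed)."""
    return [{"speaker": "zundamon", "line": text[:40]}]

def synthesize_voice(script: list[dict]) -> list[bytes]:
    """Step 3: one audio clip per line (VOICEVOX in the real service)."""
    return [b"" for _ in script]

def generate_slides(text: str) -> list[str]:
    """Step 4: slide images summarizing the content (stubbed)."""
    return ["slide-001.png"]

def render_video(script, audio, slides) -> str:
    """Step 5: the final MP4 (Remotion in the real service)."""
    return "output.mp4"

def run_pipeline(url: str, on_progress=print) -> str:
    """Run all five steps, reporting progress after each one."""
    on_progress("1/5 fetch");  text = fetch_content(url)
    on_progress("2/5 script"); script = generate_script(text)
    on_progress("3/5 voice");  audio = synthesize_voice(script)
    on_progress("4/5 slides"); slides = generate_slides(text)
    on_progress("5/5 render")
    return render_video(script, audio, slides)
```

In the real service, `on_progress` would correspond to the job-status updates that the Web UI streams to the browser.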

3. Download

Once the generation is complete, you can download the video in MP4 format.

Note: The generated video will be automatically deleted after 24 hours. Please download it promptly if needed.

Features of the Generated Videos

Multi-Speaker Dialogue

Instead of a monotonous monologue, multiple characters explain the content in a dialogue format.

  • Zundamon: A cheerful and lively fairy from Tohoku
  • Shikoku Metan: A calm character from Shikoku
  • Kasukabe Tsumugi and others: additional characters are also supported

Characters are randomly selected, so even with the same URL, different combinations will be generated each time.

AI-Generated Slides

Slides summarizing the content of the article are automatically generated.

  • Visualizing important points
  • Explaining technical terms
  • Displaying code examples and commands

Transitions between slides (fade, slide, wipe, etc.) are also applied automatically.

Character Animation

While based on still images, the following animations are applied:

  • Lip Sync: The mouth moves in sync with the voice
  • Blinking: Regular blinking occurs
  • Swaying: Natural movements are simulated (sway/bounce)
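These three animations boil down to simple functions of time and audio volume. The formulas and default parameters below are a minimal sketch of the idea, not the project's actual animation code.

```python
import math

def sway_offset(t: float, amplitude: float = 4.0, period: float = 2.0) -> float:
    """Horizontal sway in pixels at time t (seconds): a plain sine wave."""
    return amplitude * math.sin(2 * math.pi * t / period)

def is_blinking(t: float, interval: float = 3.5, duration: float = 0.12) -> bool:
    """Blink for `duration` seconds once every `interval` seconds."""
    return (t % interval) < duration

def mouth_open(volume: float, threshold: float = 0.05) -> bool:
    """Crude lip sync: open the mouth whenever the audio is loud enough."""
    return volume > threshold
```

Evaluated once per video frame, functions like these are enough to make a still image feel alive.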

Supported Content

It supports the following types of content:

  • Technical Blogs: Zenn, Qiita, personal blogs, etc.
  • Official Documentation: GitHub README, technical specifications
  • News Articles: Text-based articles

It also supports pages rendered with JavaScript (using Firecrawl via MCP).

Technical Background (Briefly)

For those interested, here’s a quick overview of the tech stack.

Architecture

It consists of three components:

  1. API (FastAPI): Web UI and job management
  2. PocketBase: Storage of job data and real-time updates
  3. Worker: Background video generation processing

Technologies Used

  • Voice Synthesis: VOICEVOX
  • LLM: Supports multiple models via OpenRouter
  • Video Rendering: Remotion (a React-based video generation engine)
  • Web Scraping: httpx + BeautifulSoup, with Firecrawl via MCP as needed
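The service scrapes with httpx + BeautifulSoup; as a dependency-free sketch of the same extraction idea, here is the standard library's `HTMLParser` pulling visible text out of a page while skipping scripts, styles, and navigation chrome.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, ignoring content inside SKIP tags."""

    SKIP = {"script", "style", "nav", "footer"}

    def __init__(self):
        super().__init__()
        self.parts: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())

def extract_text(html: str) -> str:
    """Return the page's visible text as a single space-joined string."""
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.parts)
```

BeautifulSoup handles malformed real-world HTML far more robustly, which is presumably why the project uses it, but the shape of the task is the same.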

Why Certain CLI Features Were Removed

The CLI version available on GitHub has many rich options, such as:

  • --scenes: Generate only specific scenes
  • --persona-pool-count: Specify the number of characters
  • --persona-pool-seed: For reproducible selection
  • --allow-placeholder: Test mode without VOICEVOX

However, all of these have been stripped away in the Web UI version.

Reasons:

  1. For General Users: A "just enter a URL" experience that is accessible to non-technical users
  2. Operational Costs: To minimize resource consumption as a free public service
  3. Simplicity: Too many options can lead to confusion

The decision to "reduce features" is crucial for publicly available services.

Limitations

As a free public service, there are the following limitations:

  • Generation Limit: Rate limiting per IP address
  • 24-Hour Deletion: Generated videos are automatically deleted
  • Queue Management: Limits on the number of simultaneous generations

These limits balance usability against operational costs.

How to Try It Out

You can easily start it up using Docker Compose:

# Clone the repository
git clone https://github.com/tumf/movie-generator.git
cd movie-generator

# Set environment variables
cd web
cp .env.example .env
# Edit the .env file (set OPENROUTER_API_KEY, etc.)
cd ..

# Start with Docker Compose
docker compose -f ./web/docker-compose.yml up -d --build

You can access it right away by navigating to http://localhost:8000 in your browser.

Conclusion

The Universal Explanation Video Generator is a web application that automatically generates videos where characters explain content just by entering a URL. It is available as open-source under the MIT license on GitHub.

  • Simple: Just enter a URL
  • Dialogue Format: Multiple characters in conversation
  • Automatically Generated: Script, voice, slides, and video are all generated automatically

If you want to turn a technical blog into a video or listen to documentation, please give it a try.

If you have any requests or feedback, feel free to open an Issue or PR on GitHub.

