WonderLab

Open Source Project of the Day (Part 16): Code2Video - Intelligent Framework for Generating High-Quality Educational Videos

Introduction

"What if generating educational videos were as simple as writing code?"

This is Part 16 of the "Open Source Project of the Day" series. Today we explore Code2Video (GitHub).

In the AI video generation space, most models are pixel-based text-to-video systems that produce visual results but often lack clarity, coherence, and reproducibility for educational use. Code2Video proposes a revolutionary code-centric paradigm: using executable Manim code as a unified medium, a multi-agent system automatically generates high-quality educational videos. Whether it's the Tower of Hanoi, LLM principles, or Fourier series β€” Code2Video can automatically produce clear, beautiful, reproducible educational videos.

Why this project?

  • 🎬 Code-centric paradigm: Executable code as the unified medium for video generation, ensuring clarity and reproducibility
  • πŸ€– Three-agent system: Planner (storyboard expansion), Coder (debuggable code synthesis), and Critic (layout optimization) work together
  • πŸ“š MMMC benchmark: The first code-driven video generation benchmark covering 117 curated learning topics
  • πŸŽ“ Education-optimized: Designed specifically for educational videos, referencing 3Blue1Brown's high-quality standards
  • πŸ† Academic recognition: Accepted by the Deep Learning for Code (DL4C) Workshop at NeurIPS 2025
  • πŸ§ͺ Multi-dimensional evaluation: Systematic evaluation of efficiency, aesthetics, and end-to-end knowledge transfer

What You'll Learn

  • Code2Video's code-centric paradigm and design philosophy
  • How the three-agent system (Planner, Coder, Critic) works
  • How to use Manim code to generate educational videos
  • MMMC benchmark construction and evaluation methods
  • How to configure and use Code2Video to generate videos
  • Comparative analysis with other video generation approaches
  • Application cases in real educational scenarios

Prerequisites

  • Basic understanding of AI video generation
  • Familiarity with multi-agent system concepts
  • Python programming knowledge (optional, helpful for understanding code generation)
  • Basic understanding of educational video design (optional)

Project Background

Project Introduction

Code2Video is a code-centric agent framework for generating high-quality educational videos from knowledge points. Unlike pixel-based text-to-video models, Code2Video uses executable Manim code to ensure video clarity, coherence, and reproducibility. Through the collaboration of a three-agent system (Planner, Coder, Critic), it automatically converts knowledge points into structured educational videos.

Core problems the project solves:

  • Traditional text-to-video models produce educational videos lacking clarity and coherence
  • Video generation is not reproducible, making it hard to debug and optimize
  • Lack of tools specifically designed for educational video generation
  • No systematic standards for video quality evaluation
  • Creating high-quality educational videos requires extensive manual work

Target user groups:

  • Educators and curriculum designers
  • Online education platform developers
  • AI video generation researchers
  • Institutions that need to generate educational videos at scale
  • Technologists interested in code-driven video generation

Author/Team Introduction

Team: Show Lab @ National University of Singapore

  • Key authors:
    • Yanzhe Chen
    • Kevin Qinghong Lin
    • Mike Zheng Shou (advisor)
  • Background: Show Lab at the National University of Singapore, focused on video understanding and generation research
  • Project creation date: 2025
  • Academic achievement: Accepted by the Deep Learning for Code (DL4C) Workshop at NeurIPS 2025
  • Philosophy: Create clear, beautiful, reproducible educational videos through code

Project Stats

  • ⭐ GitHub Stars: 1.5k+ (continuously growing)
  • 🍴 Forks: 203+
  • πŸ“¦ Version: Continuously updated (95+ commits)
  • πŸ“„ License: MIT (fully open source, free to use)
  • 🌐 Project website: showlab.github.io/Code2Video
  • πŸ“„ Paper: arXiv:2510.01174
  • πŸ’¬ Community: Active GitHub Issues, 1 open Issue
  • πŸ‘₯ Contributors: 3 core contributors

Project development history:

  • September 2025: Project created, initial version released
  • September 22, 2025: Accepted by the NeurIPS 2025 DL4C Workshop
  • October 2, 2025: arXiv paper, code, and dataset published
  • October 3, 2025: Went viral on Twitter
  • October 6, 2025: Updated MMMC dataset with real hand-crafted videos and metadata
  • October 11, 2025: Updated icon collection source (from IconFinder to MMMC)
  • November 6, 2025: Optimized requirements.txt, reducing install time by 80-90%
  • November 25, 2025: Reached 1000 Stars milestone
  • Ongoing maintenance: Project remains active with continuous community contributions

Main Features

Core Purpose

Code2Video's core purpose is to automatically generate high-quality educational videos through code, with main features including:

  1. Automated knowledge-to-video conversion: Input a knowledge point, automatically generate a complete educational video
  2. Three-agent collaboration: Planner plans the storyboard, Coder generates Manim code, Critic optimizes layout and aesthetics
  3. Executable code generation: Generates debuggable, modifiable Manim code rather than pixel-level video
  4. Multi-topic support: Supports educational topics across mathematics, computer science, physics, and more
  5. High-quality output: References 3Blue1Brown standards to generate clear, beautiful educational videos
  6. Systematic evaluation: Provides multi-dimensional evaluation of knowledge transfer, aesthetic quality, and efficiency

Use Cases

Code2Video is suitable for a variety of educational scenarios:

  1. Online course production

    • Quickly generate educational videos for online courses
    • Batch generate video content for multiple knowledge points
    • Maintain visual style consistency across videos
  2. Educational content creation

    • Educators quickly produce teaching videos
    • Visualize complex concepts
    • Create interactive teaching content
  3. Research and evaluation

    • Research code-driven video generation methods
    • Evaluate the effectiveness of different video generation techniques
    • Build benchmarks for educational video generation
  4. Automated content production

    • Education platforms batch generate video content
    • Automatically convert text curricula into video curricula
    • Rapidly generate multilingual educational content

Quick Start

Installation

Code2Video installation requires a few steps:

# 1. Clone the repository
git clone https://github.com/showlab/Code2Video.git
cd Code2Video/src

# 2. Install dependencies
pip install -r requirements.txt

# 3. Install Manim Community v0.19.0
# See the official installation guide: https://docs.manim.community/en/stable/installation.html

Main dependencies:

  • manim: Mathematical animation engine (Manim Community v0.19.0)
  • LLM API: For Planner and Coder (Claude-4-Opus recommended)
  • VLM API: For Critic (gemini-2.5-pro-preview-05-06 recommended)
  • Other Python dependencies (see requirements.txt)

Configure API Keys

Configure API credentials in api_config.json (the // comments below are annotations for this article; strict JSON does not allow comments, so remove them from your actual config file):

{
  "LLM_API": {
    "provider": "anthropic",  // or other LLM provider
    "api_key": "your-api-key-here",
    "model": "claude-4-opus"  // recommended for best code quality
  },
  "VLM_API": {
    "provider": "google",
    "api_key": "your-api-key-here",
    "model": "gemini-2.5-pro-preview-05-06"  // for layout and aesthetic optimization
  },
  "ICONFINDER_API_KEY": "your-iconfinder-api-key"  // optional, for richer video icons
}

Simplest Usage Example

Generate a video for a single knowledge point:

# Use the run_agent_single.sh script
sh run_agent_single.sh --knowledge_point "Linear transformations and matrices"

This command will:

  • Run the Planner agent to plan the video storyboard
  • Run the Coder agent to generate Manim code
  • Run the Critic agent to optimize layout and aesthetics
  • Execute the generated Manim code to render the video
  • Save the output to the CASES/TEST-single/ directory

Common Command Examples

# Generate a video for a single knowledge point
sh run_agent_single.sh --knowledge_point "Hanoi Problem"

# Batch generate multiple topics (using long_video_topics_list.json)
sh run_agent.sh

# Configurable parameters in run_agent.sh:
# - API: Specify the LLM to use
# - FOLDER_PREFIX: Output folder prefix (e.g., TEST-LIST)
# - MAX_CONCEPTS: Number of concepts to include (-1 for all)
# - PARALLEL_GROUP_NUM: Number of groups to run in parallel

Core Features

Code2Video's core features include:

  1. Code-centric paradigm

    • Uses executable Manim code as the video generation medium
    • Code is debuggable, modifiable, and reproducible
    • Ensures video clarity and coherence
  2. Three-agent system

    • Planner: Expands knowledge points into detailed storyboards
    • Coder: Generates executable Manim code from the storyboard
    • Critic: Uses a visual language model to optimize layout and aesthetics
  3. MMMC benchmark

    • The first benchmark for code-driven video generation
    • Covers 117 curated learning topics
    • References 3Blue1Brown's high-quality standards
    • Includes real hand-crafted videos as evaluation baselines
  4. Multi-dimensional evaluation

    • Knowledge Transfer (TeachQuiz): Evaluates the teaching effectiveness of videos
    • Aesthetic and Structural Quality (AES): Evaluates visual quality of videos
    • Efficiency metrics: Token usage and execution time
  5. Modular design

    • Agent system can be used independently
    • Supports custom prompt templates
    • Flexible configuration system
  6. High-quality output

    • References 3Blue1Brown's design standards
    • Supports icon and visual asset integration
    • Generates clear, beautiful educational videos

Project Advantages

Compared to other video generation methods, Code2Video's advantages:

| Comparison | Code2Video | Traditional Text-to-Video | Manual Video Production |
| --- | --- | --- | --- |
| Clarity | High (code-generated) | Medium-low (pixel-generated) | High (human-controlled) |
| Reproducibility | Fully reproducible | Not reproducible | Reproducible but time-consuming |
| Debuggability | Debuggable code | Not debuggable | Editable but complex |
| Generation speed | Fast (automated) | Fast | Slow (manual production) |
| Cost | Low (API calls) | Medium (compute resources) | High (labor cost) |
| Consistency | High (code-controlled) | Medium-low | Depends on creator |
| Educational suitability | Specifically optimized | General but not clear enough | High quality but time-consuming |

Why choose Code2Video?

  • βœ… Code-driven: Generates executable, debuggable code rather than pixels
  • βœ… Education-optimized: Specifically designed for educational videos, following best practices
  • βœ… Agent system: Three-agent collaboration with high automation
  • βœ… Systematic evaluation: Provides a complete evaluation framework and benchmark
  • βœ… Academic recognition: Accepted by NeurIPS 2025
  • βœ… Open source and free: MIT license, fully open source

Detailed Project Analysis

Architecture Design

Code2Video's overall architecture uses a three-agent collaborative design:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Planner   │────▢│    Coder    │────▢│   Critic    β”‚
β”‚ (Storyboard)β”‚     β”‚ (Code Gen)  β”‚     β”‚(Layout Opt) β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
      β”‚                    β”‚                    β”‚
      β–Ό                    β–Ό                    β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚         Manim Code Execution and Video Generation    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
      β”‚
      β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Video Outputβ”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Core workflow:

  1. Planner stage: Expands the knowledge point into a detailed storyboard including scenes, animations, and narration
  2. Coder stage: Generates executable Manim code from the storyboard
  3. Critic stage: Uses a visual language model to evaluate and optimize code layout and aesthetics
  4. Execution stage: Runs Manim code to generate the final video
  5. Evaluation stage: Evaluates video quality using multi-dimensional metrics

Core Module Analysis

1. Planner Agent

The Planner is responsible for expanding knowledge points into detailed storyboards:

Functions:

  • Analyzes the core concepts of the knowledge point
  • Plans the video structure and flow
  • Determines the animations and visualizations to display
  • Generates detailed storyboard descriptions

Implementation:

  • Uses LLM (Claude-4-Opus recommended) for storyboard expansion
  • Generates structured storyboards based on prompt templates
  • Considers best practices for educational videos
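The Planner's role can be sketched in a few lines of Python. Everything below is illustrative: the template wording, the storyboard JSON schema, and the function names are assumptions made for this article (the project's real prompts live in the prompts/ directory), and the LLM call is replaced by a mocked response:

```python
import json

# Hypothetical Planner prompt template; the real templates live in prompts/.
PLANNER_TEMPLATE = """You are an expert educational video planner.
Expand the knowledge point below into a storyboard: a JSON list of scenes,
each with a title, a narration line, and the visual elements to animate.

Knowledge point: {knowledge_point}
"""

def build_planner_prompt(knowledge_point: str) -> str:
    # Fill the template before sending it to the LLM
    return PLANNER_TEMPLATE.format(knowledge_point=knowledge_point)

def parse_storyboard(llm_output: str) -> list:
    # Parse the LLM's JSON storyboard so the Coder can consume it scene by scene
    scenes = json.loads(llm_output)
    # Basic schema check so downstream agents can rely on these fields
    assert all({"title", "narration", "visuals"} <= set(s) for s in scenes)
    return scenes

# Mocked LLM response standing in for a real API call:
mock_output = json.dumps([
    {"title": "Vectors as arrows",
     "narration": "A vector is a point with a direction.",
     "visuals": ["axes", "arrow"]},
    {"title": "Applying the matrix",
     "narration": "The matrix moves every vector in the plane.",
     "visuals": ["grid", "transformed_arrow"]},
])
storyboard = parse_storyboard(mock_output)
print(len(storyboard))  # 2
```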

2. Coder Agent

The Coder is responsible for generating Manim code from the storyboard:

Functions:

  • Converts storyboards into Manim code
  • Generates executable animation code
  • Ensures code correctness and readability
  • Handles complex math and visualization requirements

Implementation:

  • Uses LLM to generate Manim code
  • Code can be directly executed and debugged
  • Supports Manim Community v0.19.0 syntax

Manim code example:

# Example of generated Manim code (simplified)
from manim import *
import numpy as np

class LinearTransformation(Scene):
    def construct(self):
        # Show the transformation matrix
        matrix_mob = Matrix([[2, 1], [1, 2]])
        self.play(Create(matrix_mob))

        # Show a vector to be transformed
        vector = Arrow(ORIGIN, [1, 1, 0], buff=0)
        self.play(Create(vector))

        # Apply the transformation numerically (the Matrix mobject is for
        # display only, so multiply with a NumPy array instead)
        m = np.array([[2, 1], [1, 2]])
        new_end = np.append(m @ vector.get_end()[:2], 0)
        self.play(Transform(vector, Arrow(ORIGIN, new_end, buff=0)))

3. Critic Agent

The Critic is responsible for optimizing code layout and aesthetics:

Functions:

  • Uses a visual language model to evaluate generated code
  • Optimizes layout and positioning of visual elements
  • Ensures aesthetic quality and visual clarity
  • Uses anchors for layout optimization

Implementation:

  • Uses VLM (gemini-2.5-pro-preview-05-06 recommended)
  • Analyzes the visual effect of the generated code
  • Provides layout optimization suggestions
  • Iteratively improves code quality
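The iterative improvement idea can be sketched as a small loop. The render and VLM calls below are stand-in stubs invented for this article; in the real system, frames come from Manim and reviews from a VLM such as gemini-2.5-pro-preview-05-06:

```python
# Minimal sketch of a Critic iterate-until-good-enough loop (illustrative,
# not the project's actual implementation).
def critic_loop(code, render, vlm_review, max_rounds=3):
    for _ in range(max_rounds):
        frame = render(code)              # preview frame of the current code
        review = vlm_review(frame, code)  # VLM: layout score + suggested revision
        if review["score"] >= 0.9:        # layout is good enough, stop
            return code
        code = review["revised_code"]     # adopt the suggested fix and retry
    return code

# Stubs: the first review flags a layout issue, the second passes.
def render(code):
    return f"<frame of {code}>"

def vlm_review(frame, code):
    if "aligned" in code:
        return {"score": 0.95, "revised_code": code}
    return {"score": 0.6, "revised_code": code + " # aligned via anchors"}

final = critic_loop("title + axes", render, vlm_review)
print(final)  # title + axes # aligned via anchors
```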

4. MMMC Benchmark

MMMC (Manim-based Multi-topic Multi-quality Code) is the first benchmark for code-driven video generation:

Features:

  • Covers 117 curated learning topics
  • References 3Blue1Brown's high-quality standards
  • Includes real hand-crafted videos as evaluation baselines
  • Covers multiple domains including mathematics, computer science, and physics

Evaluation dimensions:

  1. Knowledge Transfer (TeachQuiz): Evaluates the teaching effectiveness of videos
  2. Aesthetic and Structural Quality (AES): Evaluates visual quality of videos
  3. Efficiency metrics: Token usage and execution time

Usage:

# Evaluate knowledge transfer
python3 eval_TQ.py

# Evaluate aesthetic and structural quality
python3 eval_AES.py

Key Technical Implementation

1. Code-Centric Paradigm

The core innovation of Code2Video is using code as the unified medium for video generation:

Advantages:

  • Reproducibility: Code can be repeatedly executed to generate identical videos
  • Debuggability: Code can be modified to adjust video effects
  • Clarity: Code-generated videos are clearer than pixel-generated ones
  • Extensibility: New animations and effects can be easily added

Implementation:

  • Uses Manim as the code execution engine
  • Generated code conforms to Manim Community standards
  • Supports complex math and visualization requirements

2. Multi-Agent Collaboration

The three-agent system achieves collaboration through modular design:

Planner β†’ Coder β†’ Critic workflow:

# Simplified workflow
def generate_video(knowledge_point):
    # 1. Planner generates storyboard
    storyboard = planner.expand(knowledge_point)

    # 2. Coder generates code
    manim_code = coder.generate(storyboard)

    # 3. Critic optimizes code
    optimized_code = critic.optimize(manim_code)

    # 4. Execute code to generate video
    video = execute_manim(optimized_code)

    return video

Information passing between agents:

  • Planner outputs a structured storyboard
  • Coder receives the storyboard and generates code
  • Critic receives the code and provides optimization suggestions
  • Supports iterative optimization
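One way to picture the hand-offs is as typed containers. These dataclasses are an assumption for this article; the project may pass plain dicts or JSON files between agents instead:

```python
from dataclasses import dataclass, field

@dataclass
class Scene:
    title: str
    narration: str
    visuals: list = field(default_factory=list)

@dataclass
class Storyboard:        # Planner -> Coder
    knowledge_point: str
    scenes: list

@dataclass
class CritiqueResult:    # Critic -> Coder (input for another iteration)
    score: float
    suggestions: list
    revised_code: str

sb = Storyboard("Linear transformations",
                [Scene("Intro", "What does a matrix do to space?", ["grid"])])
print(sb.scenes[0].title)  # Intro
```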

3. Prompt Engineering

Code2Video uses carefully designed prompt templates to guide agents:

Prompt template location: prompts/ directory

Prompt types:

  • Planner prompts: Guide storyboard expansion
  • Coder prompts: Guide code generation
  • Critic prompts: Guide layout optimization

Prompt optimization:

  • Based on best practices for educational videos
  • References 3Blue1Brown's design standards
  • Supports customization and extension

Practical Use Cases

Case 1: Online Math Course Production

Scenario: An online education platform needs to batch generate educational videos for a linear algebra course.

Implementation steps:

# 1. Prepare the knowledge point list (long_video_topics_list.json)
# Contains: linear transformations, matrix operations, eigenvalues, etc.

# 2. Configure the batch generation script
sh run_agent.sh

# 3. Configure parameters
# API: claude-4-opus
# FOLDER_PREFIX: LinearAlgebra-Course
# MAX_CONCEPTS: -1  # generate all concepts
# PARALLEL_GROUP_NUM: 4  # generate 4 videos in parallel

Result: Automatically generates a series of stylistically consistent, clear, and beautiful linear algebra educational videos, greatly reducing manual production time.

Case 2: Computer Science Concept Visualization

Scenario: Generate an educational video on the topic "Large Language Model Principles."

Implementation steps:

# Generate a video for a single knowledge point
sh run_agent_single.sh --knowledge_point "Large Language Model"

# The system will automatically:
# 1. Planner analyzes core LLM concepts (attention mechanism, Transformer, etc.)
# 2. Coder generates Manim code showing LLM architecture
# 3. Critic optimizes layout for clarity
# 4. Generate the final video

Result: Generates a video clearly showing how LLMs work, including visualizations of attention mechanisms and Transformer architecture animations.

Case 3: Physics Concept Teaching

Scenario: Generate an educational video for Fourier series.

Implementation steps:

sh run_agent_single.sh --knowledge_point "Pure Fourier Series"

# The generated video will show:
# - Mathematical formula for Fourier series
# - Superposition process of different frequency components
# - Animation of decomposing a square wave into sine waves

Result: Intuitively demonstrates the mathematical principles of Fourier series through animation, helping students understand abstract concepts.

Case 4: Batch Course Content Generation

Scenario: An educational institution needs to generate educational videos for multiple subjects.

Implementation steps:

# 1. Prepare a multi-subject topic list
# 117 topics across mathematics, physics, computer science, etc.

# 2. Batch generate
sh run_agent.sh

# 3. Use parallel processing to speed up
PARALLEL_GROUP_NUM=8  # 8 parallel tasks

Result: Quickly generates a large number of high-quality educational videos with style consistency, ideal for large-scale online course production.


Advanced Configuration Tips

1. Custom Prompt Templates

Code2Video uses prompt templates to guide agents; you can customize these:

Prompt template location: prompts/ directory

Custom Planner prompt:

# prompts/planner_prompt.txt
You are an expert educational video planner. Please create a detailed storyboard for the following knowledge point:

Knowledge point: {knowledge_point}

Requirements:
1. Analyze the core concepts of the knowledge point
2. Plan the video structure (introduction, body, conclusion)
3. Determine the animations and visualizations to display
4. Consider best practices for educational videos
5. Reference the style of 3Blue1Brown

Please generate a detailed storyboard...

Custom Coder prompt:

# prompts/coder_prompt.txt
You are an expert Manim code generator. Please generate Manim code based on the following storyboard:

Storyboard: {storyboard}

Requirements:
1. Use Manim Community v0.19.0 syntax
2. Code should be clean, readable, and executable
3. Include necessary animations and visualizations
4. Follow Manim best practices

Please generate complete Manim code...

2. Optimize API Configuration

Choose the best LLM:

{
  "LLM_API": {
    "provider": "anthropic",
    "api_key": "your-key",
    "model": "claude-4-opus",  // best code quality
    "temperature": 0.7,  // controls creativity
    "max_tokens": 4000   // controls output length
  }
}

Configure VLM for Critic:

{
  "VLM_API": {
    "provider": "google",
    "api_key": "your-key",
    "model": "gemini-2.5-pro-preview-05-06",  // best visual understanding
    "temperature": 0.3  // lower temperature ensures consistency
  }
}

3. Batch Generation Optimization

Parallel processing configuration:

# Configure in run_agent.sh
PARALLEL_GROUP_NUM=8  # adjust based on CPU cores

# Limit generation count (for testing)
MAX_CONCEPTS=10  # only generate the first 10 concepts
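If you prefer to drive batch generation from Python rather than run_agent.sh, the parallelism idea can be sketched with a thread pool. This is a hypothetical driver, not project code; in practice `generate` would shell out to run_agent_single.sh, whereas here it is any callable taking a topic:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical parallel driver mirroring PARALLEL_GROUP_NUM: run up to
# `workers` video generations at once.
def generate_all(topics, generate, workers=8):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(generate, topics))

# Demo with a stand-in generator that just reports an output path:
done = generate_all(["Hanoi", "Fourier", "LLM"],
                    lambda t: f"CASES/{t}/video.mp4")
print(done)  # ['CASES/Hanoi/video.mp4', 'CASES/Fourier/video.mp4', 'CASES/LLM/video.mp4']
```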

Output organization:

# Use a meaningful folder prefix
FOLDER_PREFIX="Math-Course-2026"

# Output structure:
# CASES/
#   └── Math-Course-2026/
#       β”œβ”€β”€ concept_1/
#       β”‚   β”œβ”€β”€ video.mp4
#       β”‚   β”œβ”€β”€ manim_code.py
#       β”‚   └── storyboard.json
#       └── concept_2/
#           └── ...

4. Manim Code Post-Processing

The generated Manim code can be further optimized:

# Example of generated code
from manim import *

class GeneratedVideo(Scene):
    def construct(self):
        # Can be manually edited and optimized
        title = Text("Linear Transformations")
        self.play(Write(title))
        # ... more code

Optimization tips:

  • Adjust animation duration and easing functions
  • Optimize color and font choices
  • Add more visual effects
  • Improve layout and typography

5. Integration into Workflows

Python API integration (the Code2VideoAgent interface shown here is illustrative; check the repo's src/ for the actual entry point):

# Use in your Python project
from agent import Code2VideoAgent

# Initialize the agent
agent = Code2VideoAgent(
    llm_api_key="your-key",
    vlm_api_key="your-key"
)

# Generate video
video_path = agent.generate_video(
    knowledge_point="Linear transformations and matrices",
    output_dir="./output"
)

print(f"Video generated: {video_path}")

Automation script:

#!/usr/bin/env python3
# auto_generate.py

import subprocess
import json

# Load topic list
with open('long_video_topics_list.json', 'r') as f:
    topics = json.load(f)

# Batch generate
for topic in topics[:10]:  # generate first 10
    knowledge_point = topic['name']
    print(f"Generating video: {knowledge_point}")

    subprocess.run([
        'sh', 'run_agent_single.sh',
        '--knowledge_point', knowledge_point
    ])

    print(f"βœ“ Done: {knowledge_point}\n")

Evaluation Methods in Detail

1. Knowledge Transfer Evaluation (TeachQuiz)

Purpose: Evaluate the teaching effectiveness of videos β€” whether students can learn from them.

Evaluation method:

# Run the evaluation script
python3 eval_TQ.py

Evaluation process:

  1. Generate quiz questions for each video
  2. Have students answer questions after watching the video
  3. Calculate accuracy as the knowledge transfer metric
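The accuracy step of this process reduces to comparing answers against a key. A minimal sketch, with field names chosen for illustration rather than taken from eval_TQ.py:

```python
# Score a learner's quiz answers against the answer key and report accuracy,
# the core of a TeachQuiz-style knowledge-transfer metric.
def teachquiz_accuracy(answers: dict, key: dict) -> float:
    graded = [answers.get(q) == a for q, a in key.items()]
    return sum(graded) / len(graded)

key = {"q1": "B", "q2": "A", "q3": "D", "q4": "C"}
answers = {"q1": "B", "q2": "A", "q3": "C", "q4": "C"}  # one wrong answer
print(teachquiz_accuracy(answers, key))  # 0.75
```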

Evaluation metrics:

  • Accuracy: Students' answer accuracy rate
  • Depth of understanding: How well students understand the concepts
  • Knowledge retention: Retention rate after some time

2. Aesthetic and Structural Quality Evaluation (AES)

Purpose: Evaluate visual quality and structural soundness of videos.

Evaluation method:

# Run the evaluation script
python3 eval_AES.py

Evaluation dimensions:

  • Visual clarity: Whether text and graphics are clear
  • Layout rationality: Whether element placement is reasonable
  • Animation smoothness: Whether animations are fluid and natural
  • Overall aesthetics: Whether the video looks visually appealing

Evaluation standards:

  • References 3Blue1Brown's video quality standards
  • Uses real hand-crafted videos as baseline
  • Multi-dimensional composite scoring
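Composite scoring over these dimensions could look like a weighted sum. The dimension names and weights below are assumptions for illustration, not the formula used by eval_AES.py:

```python
# Illustrative weighted composite over the AES dimensions listed above.
AES_WEIGHTS = {"clarity": 0.3, "layout": 0.3, "smoothness": 0.2, "aesthetics": 0.2}

def aes_score(scores: dict) -> float:
    # Each per-dimension score is assumed to be normalized to [0, 1]
    return sum(AES_WEIGHTS[d] * scores[d] for d in AES_WEIGHTS)

composite = aes_score({"clarity": 0.9, "layout": 0.8,
                       "smoothness": 0.7, "aesthetics": 0.85})
print(round(composite, 3))  # 0.82
```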

3. Efficiency Metrics

Purpose: Evaluate the efficiency and cost of the generation process.

Evaluation metrics:

  1. Token usage

    • Tokens used by Planner
    • Tokens used by Coder
    • Tokens used by Critic
    • Total token usage
  2. Execution time

    • Storyboard generation time
    • Code generation time
    • Code execution time (Manim rendering)
    • Total generation time
  3. Cost estimate

    • API call cost
    • Compute resource cost
    • Total cost
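A back-of-the-envelope cost estimate just multiplies per-stage token counts by per-token prices. The prices and token counts below are placeholder values for illustration, not real API rates:

```python
# Placeholder per-1K-token prices in USD (illustrative only).
PRICE_PER_1K = {"planner_llm": 0.015, "coder_llm": 0.015, "critic_vlm": 0.010}

# Hypothetical token usage for one generated video.
usage = {"planner_llm": 3_000, "coder_llm": 12_000, "critic_vlm": 5_000}

total = sum(PRICE_PER_1K[stage] * tokens / 1000 for stage, tokens in usage.items())
print(f"${total:.3f}")  # $0.275
```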

Optimization suggestions:

  • Use a stronger LLM (e.g., Claude-4-Opus) to reduce failed generations and retries
  • Optimize prompt templates to reduce token usage
  • Use parallel processing to improve efficiency
  • Cache intermediate results to avoid redundant computation
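The caching suggestion can be sketched by keying each stage's output on a hash of its input, so re-runs skip already-computed storyboards or code. The cache layout and function names here are illustrative, not the project's:

```python
import hashlib
import json
import tempfile
from pathlib import Path

# Hypothetical on-disk cache for intermediate agent outputs.
CACHE_DIR = Path(tempfile.mkdtemp(prefix="c2v-cache-"))

def cached_call(stage, payload, compute):
    # Key the cache entry on a stable hash of the stage input
    key = hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()
    path = CACHE_DIR / f"{stage}-{key}.json"
    if path.exists():                 # cache hit: reuse the stored result
        return json.loads(path.read_text())
    result = compute(payload)         # cache miss: compute and store
    path.write_text(json.dumps(result))
    return result

calls = []
def fake_planner(payload):
    calls.append(1)                   # count how often we actually compute
    return {"scenes": [payload["topic"]]}

cached_call("planner", {"topic": "Fourier series"}, fake_planner)
out = cached_call("planner", {"topic": "Fourier series"}, fake_planner)
print(len(calls), out)  # 1 {'scenes': ['Fourier series']}
```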

4. Benchmark Comparison

Code2Video is compared against the following methods on the MMMC benchmark:

| Method | Knowledge Transfer | Aesthetic Quality | Generation Speed | Reproducibility |
| --- | --- | --- | --- | --- |
| Code2Video | High | High | Medium | Fully reproducible |
| Veo3 | Medium | Medium | Fast | Not reproducible |
| Wan2.2 | Medium | Medium | Fast | Not reproducible |
| Human-made | High | High | Slow | Reproducible but time-consuming |

Code2Video's advantages:

  • Knowledge transfer and aesthetic quality approaching human-made videos
  • Much faster generation speed than human production
  • Fully reproducible, can be debugged and optimized
  • Much lower cost than human production

Comparison with Other Video Generation Tools

Code2Video vs Traditional Text-to-Video Models

Traditional text-to-video models (e.g., Veo3, Wan2.2):

Advantages:

  • Fast generation speed
  • Supports multiple styles
  • Doesn't require programming knowledge

Disadvantages:

  • Generated videos are not reproducible
  • Insufficient clarity
  • Not suitable for educational scenarios
  • Hard to debug and optimize

Code2Video:

Advantages:

  • Generated code is reproducible and debuggable
  • High clarity, suitable for education
  • Specifically optimized for educational videos
  • Can be iteratively improved

Disadvantages:

  • Requires Manim environment
  • Relatively slower generation speed
  • Primarily suited for educational scenarios

Code2Video vs Manual Video Production

Manual production (e.g., After Effects, Premiere):

Advantages:

  • Complete control over every detail
  • Can create complex effects
  • Highest quality

Disadvantages:

  • Time-consuming and labor-intensive
  • Requires professional skills
  • High cost
  • Difficult to batch produce

Code2Video:

Advantages:

  • Highly automated
  • Can be batch generated
  • Low cost
  • Consistent style

Disadvantages:

  • Less flexible than manual production
  • Limited support for complex effects
  • Requires debugging and optimization

Choosing the Right Tool

Choose Code2Video when:

  • βœ… Need to batch generate educational videos
  • βœ… Require clear, reproducible videos
  • βœ… Need to iterate and optimize quickly
  • βœ… Budget-conscious but need high quality

Choose traditional text-to-video when:

  • βœ… Need to quickly generate general-purpose videos
  • βœ… Reproducibility is not required
  • βœ… Non-educational scenarios

Choose manual production when:

  • βœ… Need complete control over details
  • βœ… Need complex special effects
  • βœ… Budget and time are not constraints

Troubleshooting Common Issues

Issue 1: Manim Installation Failure

Symptoms: Error when installing Manim Community v0.19.0.

Solutions:

  1. Check system dependencies:
   # Ubuntu/Debian
   sudo apt-get install build-essential python3-dev libcairo2-dev libpango1.0-dev

   # macOS
   brew install cairo pango
  2. Use a virtual environment:
   python3 -m venv venv
   source venv/bin/activate  # Linux/macOS
   # or
   venv\Scripts\activate  # Windows
   pip install manim
  3. See the official Manim installation guide.

Issue 2: API Call Failure

Symptoms: LLM or VLM API call fails.

Solutions:

  1. Check API key:
   {
     "LLM_API": {
       "api_key": "your-actual-api-key"  // make sure it's correct
     }
   }
  2. Check network connection:

    • Ensure you can access the API service
    • Check firewall settings
  3. Check API quota:

    • Confirm the API account has sufficient quota
    • Check if there are rate limits
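When rate limits are the culprit, a generic retry-with-exponential-backoff wrapper around the API call usually helps. This is a common pattern, not code from the Code2Video repo:

```python
import random
import time

def with_retries(call, max_attempts=5, base_delay=1.0):
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Back off exponentially, with jitter so parallel workers
            # don't all retry at the same instant
            time.sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.5))

# Demo with a call that fails twice (e.g., rate limited), then succeeds:
attempts = []
def flaky_api():
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("429: rate limited")
    return "ok"

print(with_retries(flaky_api, base_delay=0.01))  # ok
```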

Issue 3: Generated Code Execution Failure

Symptoms: Manim code errors during execution.

Solutions:

  1. Check Manim version:
   manim --version  # should be 0.19.0
  2. Manually debug the code:
   # View generated code
   # CASES/TEST-single/concept_name/manim_code.py

   # Manually run test
   manim -pql manim_code.py GeneratedVideo
  3. Check dependencies:
   pip install -r requirements.txt

Issue 4: Poor Video Quality

Symptoms: Generated video lacks clarity or visual appeal.

Solutions:

  1. Use a better LLM:

    • Recommend using Claude-4-Opus
    • Ensure API configuration is correct
  2. Optimize prompt templates:

    • Customize prompts in the prompts/ directory
    • Reference 3Blue1Brown's style requirements
  3. Use Critic optimization:

    • Ensure VLM API is configured correctly
    • Use gemini-2.5-pro-preview-05-06
  4. Manual post-processing:

    • Edit the generated Manim code
    • Adjust colors, fonts, layout, etc.

Issue 5: Slow Generation Speed

Symptoms: Video generation takes a very long time.

Solutions:

  1. Use parallel processing:
   PARALLEL_GROUP_NUM=8  # increase parallel count
  2. Optimize Manim rendering:
   # Use low quality preview
   manim -pql  # low quality, fast preview
   # High quality rendering
   manim -pqh  # high quality, slower
  3. Reduce the concept count:
   MAX_CONCEPTS=5  # test with fewer concepts first

Issue 6: Out of Memory

Symptoms: Memory overflow when generating large videos.

Solutions:

  1. Reduce video complexity:

    • Simplify storyboard
    • Reduce the number of elements displayed simultaneously
  2. Process in batches:

   # Process in batches to avoid handling too many at once
   MAX_CONCEPTS=10
  3. Increase system memory:
    • If possible, add more RAM
    • Use a more powerful machine



Who Should Use This

Code2Video is suitable for:

1. Educators and Curriculum Designers

  • βœ… Teachers who need to quickly produce educational videos
  • βœ… Content creators on online learning platforms
  • βœ… Educators who want to visualize complex concepts

2. Online Education Platform Developers

  • βœ… Platforms that need to batch generate video content
  • βœ… Teams that want to automate content production
  • βœ… Organizations that need to maintain stylistic consistency

3. AI Video Generation Researchers

  • βœ… Researching code-driven video generation methods
  • βœ… Evaluating different video generation techniques
  • βœ… Building benchmarks for educational video generation

4. Tech Enthusiasts and Developers

  • βœ… Interested in multi-agent systems
  • βœ… Want to learn code generation techniques
  • βœ… Want to explore AI applications in education

5. Content Creators

  • βœ… YouTubers who need to produce educational content
  • βœ… Creators who want to improve video production efficiency
  • βœ… Bloggers who need to visualize knowledge

Summary

Code2Video is an innovative code-driven video generation framework that provides a brand-new paradigm for educational video generation through executable Manim code and a multi-agent system.


